1113 Commits

Author SHA1 Message Date
Antonio Frighetto
e977b28c37
[InstCombine] Match intrinsic recurrences when known to be hoisted
For value-accumulating recurrences of the kind:
```
  %umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ]
  %umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b)
```
The binary intrinsic may be simplified into an intrinsic taking the init
value and the other operand, provided the latter is loop-invariant:
```
  %umax = call i8 @llvm.umax.i8(i8 %a, i8 %b)
```

Proofs: https://alive2.llvm.org/ce/z/ea2cVC.

Fixes: https://github.com/llvm/llvm-project/issues/145875.
2025-08-08 09:31:50 +02:00
Nikita Popov
16d73839b1
[InstCombine] Support folding intrinsics into phis (#151115)
Call foldOpIntoPhi() for speculatable intrinsics. We already do this for
FoldOpIntoSelect().

Among other things, this partially subsumes
https://github.com/llvm/llvm-project/pull/149858.
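A hedged sketch of the kind of fold this enables (hypothetical IR, values invented for illustration): a speculatable intrinsic whose phi operand has constant incoming values can be evaluated along each incoming edge.

```llvm
declare i8 @llvm.umax.i8(i8, i8)

define i8 @fold_into_phi(i1 %c) {
entry:
  br i1 %c, label %then, label %join
then:
  br label %join
join:
  %p = phi i8 [ 4, %entry ], [ 8, %then ]
  ; umax is speculatable, so it can be folded into the phi,
  ; giving: %r = phi i8 [ 6, %entry ], [ 8, %then ]
  %r = call i8 @llvm.umax.i8(i8 %p, i8 6)
  ret i8 %r
}
```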
2025-07-31 12:32:37 +02:00
Pedro Lobo
d9952a7a5f
[InstCombine] Propagate neg nsw when folding abs(-x) to abs(x) (#150460)
We can propagate the nsw in the neg to abs, as `-x` is only poison if x
== INT_MIN.
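A hedged before/after sketch (hypothetical IR, as I read the description): the nsw neg is poison exactly when x == INT_MIN, so that fact can be carried into abs's int_min_poison flag.

```llvm
declare i8 @llvm.abs.i8(i8, i1)

define i8 @src(i8 %x) {
  ; -x is poison when %x == INT_MIN because of nsw
  %neg = sub nsw i8 0, %x
  %abs = call i8 @llvm.abs.i8(i8 %neg, i1 false)
  ret i8 %abs
}

define i8 @tgt(i8 %x) {
  ; the poison case moves into abs's int_min_poison bit
  %abs = call i8 @llvm.abs.i8(i8 %x, i1 true)
  ret i8 %abs
}
```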
2025-07-24 21:45:37 +01:00
Prabhu Rajasekaran
921c6dbeca
[llvm] Introduce callee_type metadata
Introduce `callee_type` metadata which will be attached to the indirect
call instructions.

The `callee_type` metadata will be used to generate `.callgraph` section
described in this RFC:
https://lists.llvm.org/pipermail/llvm-dev/2021-July/151739.html

Reviewers: morehouse, petrhosek, nikic, ilovepi

Reviewed By: nikic, ilovepi

Pull Request: https://github.com/llvm/llvm-project/pull/87573
2025-07-18 14:40:54 -07:00
Jeremy Morse
5b8c15c6e7
[DebugInfo] Remove getPrevNonDebugInstruction (#148859)
With the advent of intrinsic-less debug-info, we no longer need to
scatter calls to getPrevNonDebugInstruction around the codebase. Remove
most of them -- there are one or two that have the "SkipPseudoOp" flag
turned on; however, they don't seem to be in positions where skipping
anything would be reasonable.
2025-07-16 11:41:32 +01:00
Ahmed Bougacha
77bcab835a
[InstCombine] Combine ptrauth intrin. callee into same-key bundle. (#94707)
Try to optimize a call to the result of a ptrauth intrinsic, potentially
into the ptrauth call bundle:
    call(ptrauth.resign(p)), ["ptrauth"()] ->  call p, ["ptrauth"()]
    call(ptrauth.sign(p)),   ["ptrauth"()] ->  call p

as long as the key/discriminator are the same in sign and auth-bundle,
and we don't change the key in the bundle (to a potentially-invalid
key.)

Generating a plain call to a raw unauthenticated pointer is generally
undesirable, but if we ended up seeing a naked ptrauth.sign in the first
place, we already have suspicious code. Unauthenticated calls are also
easier to spot than naked signs, so let the indirect call shine.


Note that there is an arguably unsafe extension to this, where we don't
bother checking that the key in bundle and intrinsic are the same (and
also allow folding away an auth into a bundle.)

This can end up generating calls with a bundle that has an invalid key
(which an informed frontend wouldn't have otherwise done), which can be
problematic. The C that generates that is straightforward but arguably
unreasonable. That wouldn't be an issue if we were to bite the bullet
and make these fully AArch64-specific, allowing key knowledge to be
embedded here.
2025-07-15 14:39:53 -07:00
Ahmed Bougacha
42d2ae1034
[InstCombine] Combine ptrauth constant callee into bundle. (#94706)
Try to optimize a call to a ptrauth constant, into its ptrauth bundle:
    call(ptrauth(f)), ["ptrauth"()] ->  call f
as long as the key/discriminator are the same in constant and bundle.
2025-07-15 13:37:07 -07:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Hooray!
2025-07-15 15:34:10 +01:00
Luke Lau
04c614327c
[InstCombine] Pull vector reverse through intrinsics (#146384)
This is the intrinsic version of #146349, and handles fabs as well as
other intrinsics.

It's largely a copy of InstCombinerImpl::foldShuffledIntrinsicOperands
but a bit simpler since we don't need to find a common mask.

Creating a separate function seems to be cleaner than trying to shoehorn
it into the existing one.
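A hedged before/after sketch of the reordering (hypothetical IR; fabs chosen as the example intrinsic):

```llvm
declare <4 x float> @llvm.vector.reverse.v4f32(<4 x float>)
declare <4 x float> @llvm.fabs.v4f32(<4 x float>)

define <4 x float> @src(<4 x float> %x) {
  %rev = call <4 x float> @llvm.vector.reverse.v4f32(<4 x float> %x)
  %abs = call <4 x float> @llvm.fabs.v4f32(<4 x float> %rev)
  ret <4 x float> %abs
}

define <4 x float> @tgt(<4 x float> %x) {
  ; the reverse is pulled through the lane-wise intrinsic
  %abs = call <4 x float> @llvm.fabs.v4f32(<4 x float> %x)
  %rev = call <4 x float> @llvm.vector.reverse.v4f32(<4 x float> %abs)
  ret <4 x float> %rev
}
```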
2025-07-01 16:49:10 +01:00
Luke Lau
6f7370ced6
[InstCombine] Pull vector reverse through fneg (#146349)
This follows on from
https://github.com/llvm/llvm-project/pull/144933#issuecomment-2992372627,
and allows us to remove the reverse (fneg (reverse x)) combine.

A separate patch will handle the case for fabs. I haven't checked whether we
perform this canonicalization for either unops or binops on vp.reverse.
2025-06-30 16:15:04 +01:00
AZero13
8c77191835
[InstCombine] smin(smax(X, -1), 1) -> scmp(X, 0) and smax(smin(X, 1), -1) -> scmp(X, 0) (#145736)
Motivating case: https://godbolt.org/z/Wxcc51jcj

Alive2: https://alive2.llvm.org/ce/z/-bPPAg
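A hedged before/after sketch (hypothetical IR) of the smin/smax clamp turning into scmp:

```llvm
declare i8 @llvm.smax.i8(i8, i8)
declare i8 @llvm.smin.i8(i8, i8)
declare i8 @llvm.scmp.i8.i8(i8, i8)

define i8 @src(i8 %x) {
  ; clamp %x to [-1, 1]
  %lo = call i8 @llvm.smax.i8(i8 %x, i8 -1)
  %r  = call i8 @llvm.smin.i8(i8 %lo, i8 1)
  ret i8 %r
}

define i8 @tgt(i8 %x) {
  ; three-way compare against zero: -1, 0 or 1
  %r = call i8 @llvm.scmp.i8.i8(i8 %x, i8 0)
  ret i8 %r
}
```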
2025-06-30 15:44:37 +02:00
Luke Lau
d0c1ea928c
[InstCombine] Pull unary shuffles through fneg/fabs (#144933)
This canonicalizes fneg/fabs (shuffle X, poison, mask) -> shuffle
(fneg/fabs X), poison, mask

This undoes part of b331a7ebc1e02f9939d1a4a1509e7eb6cdda3d38 and
a8f13dbdeb31be37ee15b5febb7cc2137bbece67, but keeps the binary shuffle
case i.e. shuffle fneg, fneg, mask.

By pulling out the shuffle we bring it in line with the canonicalisation
we perform on binary ops and intrinsics, which the original commit
acknowledged goes in the opposite direction.

However nowadays VectorCombine is more powerful and can do more
optimisations when the shuffle is pulled out, so I think we should
revisit this. In particular we get more shuffles folded and can perform
scalarization.
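A hedged before/after sketch of the canonicalisation (hypothetical IR, reverse mask chosen arbitrarily):

```llvm
define <4 x float> @src(<4 x float> %x) {
  %shuf = shufflevector <4 x float> %x, <4 x float> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %neg  = fneg <4 x float> %shuf
  ret <4 x float> %neg
}

define <4 x float> @tgt(<4 x float> %x) {
  ; unary shuffle is pulled out past the fneg
  %neg  = fneg <4 x float> %x
  %shuf = shufflevector <4 x float> %neg, <4 x float> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  ret <4 x float> %shuf
}
```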
2025-06-30 10:40:12 +01:00
Philip Reames
b5aaf9d988
[InstCombine] Implement vp.reverse reordering/elimination through binop/unop (#143963)
This simply copies the structure of the vector.reverse patterns from
just above, and reimplements them for the vp.reverse intrinsics when the
mask is all ones and the EVLs exactly match.

It's unfortunate that we have three different ways to represent a reverse
(shuffle, vector.reverse, and vp.reverse), but I don't see an obvious way
to remove any of them because the semantics are slightly different.

This significantly improves vectorization in TSVC_2's s112 and s1112
loops when using EVL tail folding.
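A hedged sketch of the binop case (hypothetical IR; all-ones mask and matching EVLs, as required):

```llvm
declare <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32>, <4 x i1>, i32)

define <4 x i32> @src(<4 x i32> %a, <4 x i32> %b, i32 %evl) {
  %ra  = call <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32> %a, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
  %rb  = call <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32> %b, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
  %add = add <4 x i32> %ra, %rb
  ret <4 x i32> %add
}

define <4 x i32> @tgt(<4 x i32> %a, <4 x i32> %b, i32 %evl) {
  ; a single reverse after the binop replaces the two reversed operands
  %add = add <4 x i32> %a, %b
  %rev = call <4 x i32> @llvm.experimental.vp.reverse.v4i32(<4 x i32> %add, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
  ret <4 x i32> %rev
}
```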
2025-06-18 08:53:45 -07:00
Philip Reames
25781221d6
[instcombine] Delete dead transform for reverse of binop (#143967)
We canonicalize reverse to after a binop in foldVectorBinop, and
simplify reverse pairs in InstSimplify, so these elimination transforms
are redundant.
2025-06-16 12:43:13 -07:00
Luke Lau
b81d5e06c7
[InstCombine] Fold shuffles through all trivially vectorizable intrinsics (#141979)
This addresses a TODO in foldShuffledIntrinsicOperands to use
isTriviallyVectorizable instead of a hardcoded list of intrinsics, which
in turn allows more intrinsics to be scalarized by VectorCombine.

From what I can tell every intrinsic here should be speculatable so an
assertion was added.

Because this enables intrinsics like abs which have a scalar operand, we
need to also check isVectorIntrinsicWithScalarOpAtArg.
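A hedged sketch using abs, one of the newly enabled intrinsics with a scalar operand (hypothetical IR, mask chosen arbitrarily):

```llvm
declare <4 x i32> @llvm.abs.v4i32(<4 x i32>, i1)

define <4 x i32> @src(<4 x i32> %a) {
  %s = shufflevector <4 x i32> %a, <4 x i32> poison, <4 x i32> <i32 1, i32 0, i32 3, i32 2>
  %r = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %s, i1 false)
  ret <4 x i32> %r
}

define <4 x i32> @tgt(<4 x i32> %a) {
  ; the shuffle is pulled out; the scalar i1 operand stays in place
  %r = call <4 x i32> @llvm.abs.v4i32(<4 x i32> %a, i1 false)
  %s = shufflevector <4 x i32> %r, <4 x i32> poison, <4 x i32> <i32 1, i32 0, i32 3, i32 2>
  ret <4 x i32> %s
}
```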
2025-06-13 18:25:07 +01:00
Ricardo Jesus
c70c0a86a5
[AArch64][InstCombine] Combine AES instructions with zero operands. (#142781)
We currently combine (AES (EOR (A, B)), 0) into (AES A, B) for Neon
intrinsics when the zero operand appears in the RHS of the AES
instruction.

This patch extends the combine to support AES SVE intrinsics and
the case where the zero operand appears in the LHS of the AES
instructions.
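A hedged sketch of the newly covered LHS-zero case for the Neon intrinsic (hypothetical IR; AESE XORs its two operands before the rest of the round, which is what makes the combine valid):

```llvm
declare <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8>, <16 x i8>)

define <16 x i8> @src(<16 x i8> %a, <16 x i8> %b) {
  %xor = xor <16 x i8> %a, %b
  %aes = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> zeroinitializer, <16 x i8> %xor)
  ret <16 x i8> %aes
}

define <16 x i8> @tgt(<16 x i8> %a, <16 x i8> %b) {
  %aes = call <16 x i8> @llvm.aarch64.crypto.aese(<16 x i8> %a, <16 x i8> %b)
  ret <16 x i8> %aes
}
```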
2025-06-09 08:27:58 +01:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing an easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
Luke Lau
618a399f86
[InstCombine] Explicitly match poison operand. NFCI (#141744)
This is a follow up from
https://github.com/llvm/llvm-project/pull/141300#discussion_r2109109224
2025-05-28 13:16:55 +01:00
Luke Lau
9262e37d8c
[InstCombine] Fold shuffled intrinsic operands with constant operands (#141300)
We currently pull shuffles through binops and intrinsics, which is an
important canonical form for VectorCombine to be able to scalarize
vector sequences. But while binops can be folded with a constant
operand, intrinsics currently require all operands to be shufflevectors.

This extends intrinsic folding to be in line with regular binops by
reusing the constant "unshuffling" logic.
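A hedged sketch of the constant "unshuffling" (hypothetical IR; the constant is reshuffled with the inverse mask so the result is unchanged):

```llvm
declare <4 x i32> @llvm.smax.v4i32(<4 x i32>, <4 x i32>)

define <4 x i32> @src(<4 x i32> %x) {
  %s = shufflevector <4 x i32> %x, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  %m = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %s, <4 x i32> <i32 1, i32 2, i32 3, i32 4>)
  ret <4 x i32> %m
}

define <4 x i32> @tgt(<4 x i32> %x) {
  ; the constant operand is "unshuffled" and the shuffle moves after the call
  %m = call <4 x i32> @llvm.smax.v4i32(<4 x i32> %x, <4 x i32> <i32 4, i32 3, i32 2, i32 1>)
  %s = shufflevector <4 x i32> %m, <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
  ret <4 x i32> %s
}
```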

As far as I can tell, the list of currently folded intrinsics doesn't
require any special UB handling.

This change in combination with #138095 and #137823 fixes the following
C:

```c
void max(int *x, int *y, int n) {
  for (int i = 0; i < n; i++)
    x[i] += *y > 42 ? *y : 42;
}
```

Previously, using the splatted vector form on RISC-V with `-O3 -march=rva23u64`, it generated:

```asm
	vmv.s.x	v8, a4
	li	a4, 42
	vmax.vx	v10, v8, a4
	vrgather.vi	v8, v10, 0
.LBB0_9:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
	vl2re32.v	v10, (a5)
	vadd.vv	v10, v10, v8
	vs2r.v	v10, (a5)
```

whereas it now generates

```asm
        li	a6, 42
        max	a6, a4, a6
.LBB0_9:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
	vl2re32.v	v8, (a5)
	vadd.vx	v8, v8, a6
	vs2r.v	v8, (a5)
```
2025-05-28 10:57:08 +01:00
Yingwei Zheng
8d0ebd823c
[InstCombine] Special handle va_copy(dst, src) + va_end(src) (#140250)
Closes https://github.com/llvm/llvm-project/issues/140215.
2025-05-17 16:04:03 +08:00
Iris Shi
f9783c559f
[InstCombine] Fix frexp(frexp(x)) -> frexp(x) fold (#138837)
Fixes #138819

When frexp is applied twice, the second result should be zero.
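A hedged sketch of the corrected fold (hypothetical IR): the mantissa returned by frexp is already normalized, so a second frexp returns it unchanged with a zero exponent.

```llvm
declare { float, i32 } @llvm.frexp.f32.i32(float)

define { float, i32 } @src(float %x) {
  %fr  = call { float, i32 } @llvm.frexp.f32.i32(float %x)
  %man = extractvalue { float, i32 } %fr, 0
  %fr2 = call { float, i32 } @llvm.frexp.f32.i32(float %man)
  ret { float, i32 } %fr2
}

define { float, i32 } @tgt(float %x) {
  %fr  = call { float, i32 } @llvm.frexp.f32.i32(float %x)
  %man = extractvalue { float, i32 } %fr, 0
  ; second result is zero, first result is the already-normalized mantissa
  %r0  = insertvalue { float, i32 } poison, float %man, 0
  %r1  = insertvalue { float, i32 } %r0, i32 0, 1
  ret { float, i32 } %r1
}
```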
2025-05-08 00:37:46 +08:00
sallto
1ee9576ee7
[InstCombine] Funnel shift with negative amount folds to funnel shift in opposite direction (#138334) (#138763)
Partially `fixes` #138334.

Combine fshl(X,X,Neg(Y)) into fshr(X,X,Y) and
fshr(X,X,Neg(Y)) into fshl(X,X,Y)
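A hedged before/after sketch (hypothetical IR; with equal first operands this is a rotate, so negating the amount flips its direction):

```llvm
declare i8 @llvm.fshl.i8(i8, i8, i8)
declare i8 @llvm.fshr.i8(i8, i8, i8)

define i8 @src(i8 %x, i8 %y) {
  %neg = sub i8 0, %y
  %rot = call i8 @llvm.fshl.i8(i8 %x, i8 %x, i8 %neg)
  ret i8 %rot
}

define i8 @tgt(i8 %x, i8 %y) {
  ; rotate-left by -y becomes rotate-right by y
  %rot = call i8 @llvm.fshr.i8(i8 %x, i8 %x, i8 %y)
  ret i8 %rot
}
```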
2025-05-07 15:50:49 +02:00
Philip Reames
650dca5d89
[IR] Remove the AtomicMem*Inst helper classes (#138710)
Migrate their usage to the `AnyMem*Inst` family, and add an isAtomic()
query on the base class for that hierarchy. This matches the idioms we
use for e.g. isAtomic on load, store, etc.. instructions, the existing
isVolatile idioms on mem* routines, and allows us to more easily share
code between atomic and non-atomic variants.

As with #138568, the goal here is to simplify the class hierarchy and
make it easier to reason about. I'm moving from easiest to hardest, and
will stop at some point when I hit "good enough". Longer term, I'd sorta
like to merge or reverse the naming on the plain Mem*Inst and the
AnyMem*Inst, but that's a much larger and more risky change. Not sure
I'm going to actually do that.
2025-05-06 14:24:40 -07:00
Philip Reames
0b8528e127
[instcombine] Adjust style of MemIntrinsic code to be more idiomatic [nfc] (#138715)
Use an existing helper function. Remove the use of a local Changed
variable which doesn't seem to interact with surrounding transforms in
any meaningful way. (Both memcpy and memmove are MemTransfer
instructions, so switching from one to the other doesn't change
results.)

Posted for review mostly for a sanity check that I'm not missing
something with the logic around the Changed flag.
2025-05-06 13:19:53 -07:00
Iris Shi
6db447f824
[InstCombine] Canonicalize max(min(X, MinC), MaxC) -> min(max(X, MaxC), MinC) (#136665)
Closes #121870.

https://alive2.llvm.org/ce/z/WjmAjz
https://alive2.llvm.org/ce/z/4KCjgL
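A hedged before/after sketch (hypothetical IR; constants chosen so that MaxC <= MinC, i.e. a genuine clamp):

```llvm
declare i8 @llvm.smax.i8(i8, i8)
declare i8 @llvm.smin.i8(i8, i8)

define i8 @src(i8 %x) {
  ; max(min(X, MinC=100), MaxC=-100)
  %lo = call i8 @llvm.smin.i8(i8 %x, i8 100)
  %r  = call i8 @llvm.smax.i8(i8 %lo, i8 -100)
  ret i8 %r
}

define i8 @tgt(i8 %x) {
  ; canonical form: min(max(X, MaxC), MinC)
  %hi = call i8 @llvm.smax.i8(i8 %x, i8 -100)
  %r  = call i8 @llvm.smin.i8(i8 %hi, i8 100)
  ret i8 %r
}
```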
2025-04-23 16:31:50 +08:00
Tim Gymnich
049f179606
[Analysis][NFC] Extract KnownFPClass (#133457)
- extract KnownFPClass for future use inside of GISelKnownBits

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-03-28 18:10:02 +01:00
Iris
1762f16f6c
[InstCombine] Fold umax/umin(nuw_shl(z, x), nuw_shl(z, y)) -> nuw_shl(z, umax/umin(x, y)) and umax/umin(nuw_shl(x, z), nuw_shl(y, z)) -> nuw_shl(umax/umin(x, y), z) (#131076)
- Closes #129947

This PR introduces the following transformations:
1. `umax(nuw_shl(z, x), nuw_shl(z, y)) -> nuw_shl(z, umax(x, y))`
2. `umin(nuw_shl(z, x), nuw_shl(z, y)) -> nuw_shl(z, umin(x, y))`
3. `umax(nuw_shl(x, z), nuw_shl(y, z)) -> nuw_shl(umax(x, y), z)`
4. `umin(nuw_shl(x, z), nuw_shl(y, z)) -> nuw_shl(umin(x, y), z)`


Alive2 live proof: 
- https://alive2.llvm.org/ce/z/6bM-p7 for 1 and 2
- https://alive2.llvm.org/ce/z/aqLRYA and
https://alive2.llvm.org/ce/z/twoVhb for 3 and 4 respectively
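A hedged sketch of case 1 (hypothetical IR; the nuw flags on both shifts are what make the factoring sound):

```llvm
declare i8 @llvm.umax.i8(i8, i8)

define i8 @src(i8 %z, i8 %x, i8 %y) {
  %s1 = shl nuw i8 %z, %x
  %s2 = shl nuw i8 %z, %y
  %r  = call i8 @llvm.umax.i8(i8 %s1, i8 %s2)
  ret i8 %r
}

define i8 @tgt(i8 %z, i8 %x, i8 %y) {
  ; with no unsigned wrap, the larger shift amount gives the larger value
  %m = call i8 @llvm.umax.i8(i8 %x, i8 %y)
  %r = shl nuw i8 %z, %m
  ret i8 %r
}
```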
2025-03-15 13:40:35 +08:00
Yingwei Zheng
2ebc69a521
[InstCombine] Add support for GEPs in simplifyNonNullOperand (#128365)
Alive2: https://alive2.llvm.org/ce/z/2KE8zG
2025-02-23 17:19:31 +08:00
Yingwei Zheng
126016b662
[InstCombine] Simplify nonnull pointers (#128111)
This patch is the follow-up of
https://github.com/llvm/llvm-project/pull/127979. It introduces a helper
`simplifyNonNullOperand` to avoid duplicate logic. It also addresses the
one-use issue in `visitLoadInst`, as discussed in
https://github.com/llvm/llvm-project/pull/127979#issuecomment-2671013972.
The `nonnull` attribute is also supported. Proof:
https://alive2.llvm.org/ce/z/MCKgT9
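A hedged sketch of the sort of simplification this covers, as I understand it (hypothetical IR; loading through a null pointer in address space 0 is UB, so the select's null arm can be dropped):

```llvm
define i32 @src(i1 %c, ptr %p) {
  %sel = select i1 %c, ptr %p, ptr null
  %v   = load i32, ptr %sel
  ret i32 %v
}

define i32 @tgt(i1 %c, ptr %p) {
  ; the pointer operand of the load is known non-null
  %v = load i32, ptr %p
  ret i32 %v
}
```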
2025-02-22 15:30:04 +08:00
Yingwei Zheng
cf37ae5cae
[InstCombine] Add one-use check when folding fabs over selects (#122270)
Fixes multi-use issue introduced by
https://github.com/llvm/llvm-project/pull/86390.
It still allows the folding of `fabs (select Cond, TrueC, FalseC)` to avoid a performance regression in OCIO.
2025-01-29 21:44:59 +08:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; however, to ensure some type safety, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Kazu Hirata
7fa1936947
[InstCombine] Avoid repeated hash lookups (NFC) (#123559) 2025-01-20 10:14:18 -08:00
Ruhung
9c7e02d579
[InstCombine] Fold umax(nuw_mul(x, C0), x + 1) into (x == 0 ? 1 : nuw_mul(x, C0)) (#123468)
This PR introduces the following transformations:

- If C0 is not 0:  
umax(nuw_shl(x, C0), x + 1) -> x == 0 ? 1 : nuw_shl(x, C0)  
- If C0 is not 0 or 1:  
umax(nuw_mul(x, C0), x + 1) -> x == 0 ? 1 : nuw_mul(x, C0)  

Fixes #122388.
Alive2 proof: https://alive2.llvm.org/ce/z/rkp_8U
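A hedged before/after sketch of the nuw_mul case with a hypothetical C0 of 3:

```llvm
declare i8 @llvm.umax.i8(i8, i8)

define i8 @src(i8 %x) {
  %mul = mul nuw i8 %x, 3
  %inc = add i8 %x, 1
  %r   = call i8 @llvm.umax.i8(i8 %mul, i8 %inc)
  ret i8 %r
}

define i8 @tgt(i8 %x) {
  ; for x == 0 the max is 1; otherwise the nuw multiply dominates x + 1
  %mul = mul nuw i8 %x, 3
  %cmp = icmp eq i8 %x, 0
  %r   = select i1 %cmp, i8 1, i8 %mul
  ret i8 %r
}
```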
2025-01-20 16:32:35 +01:00
Yingwei Zheng
94fee13d42
[InstCombine] Simplify FMF propagation. NFC. (#121899)
This patch uses new FMF interfaces introduced by
https://github.com/llvm/llvm-project/pull/121657 to simplify existing
code with `andIRFlags` and `copyFastMathFlags`.
2025-01-17 01:31:06 +08:00
Andreas Jonson
2570e354f1
[InstCombine] Handle trunc to i1 in align assume. (#122949)
proof: https://alive2.llvm.org/ce/z/EyAUA4
2025-01-15 17:39:01 +01:00
goldsteinn
3318a7248a
[InstCombine] Fold (ct{t,l}z Pow2) -> Log2(Pow2) (#122620)
- **[InstCombine] Add tests for folding `(ct{t,l}z Pow2)`; NFC**
- **[InstCombine] Fold `(ct{t,l}z Pow2)` -> `Log2(Pow2)`**

Doing so lets us find `Log2(Pow2)` for "free" with `takeLog2`.

https://alive2.llvm.org/ce/z/CL77fo
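A hedged sketch of the cttz side (hypothetical IR; `takeLog2` recognises `shl 1, %x` as a power of two whose log2 is %x):

```llvm
declare i8 @llvm.cttz.i8(i8, i1)

define i8 @src(i8 %x) {
  %pow2 = shl i8 1, %x
  %tz   = call i8 @llvm.cttz.i8(i8 %pow2, i1 true)
  ret i8 %tz
}

define i8 @tgt(i8 %x) {
  ; cttz of 1 << x is simply x
  ret i8 %x
}
```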
2025-01-13 09:38:09 -06:00
Amr Hesham
642e493d4d
[InstCombine] Convert fshl(x, 0, y) to shl(x, and(y, BitWidth - 1)) when BitWidth is pow2 (#122362)
Convert `fshl(x, 0, y)` to `shl(x, and(y, BitWidth - 1))` when BitWidth
is pow2

Alive2 proof: https://alive2.llvm.org/ce/z/3oTEop
Fixes: #122235
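A hedged before/after sketch (hypothetical IR, i32 so the mask is BitWidth - 1 = 31):

```llvm
declare i32 @llvm.fshl.i32(i32, i32, i32)

define i32 @src(i32 %x, i32 %y) {
  %r = call i32 @llvm.fshl.i32(i32 %x, i32 0, i32 %y)
  ret i32 %r
}

define i32 @tgt(i32 %x, i32 %y) {
  ; fshl takes its shift amount modulo the bit width, hence the mask
  %amt = and i32 %y, 31
  %r   = shl i32 %x, %amt
  ret i32 %r
}
```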
2025-01-11 11:48:05 +01:00
Yingwei Zheng
d80bdf7261
[IRBuilder] Add a helper function to intersect FMFs from two instructions (#122059)
Address review comment in
https://github.com/llvm/llvm-project/pull/121899#discussion_r1905765776
2025-01-09 14:36:42 +08:00
Yingwei Zheng
c05599966c
[InstCombine] Fix FMF propagation in copysign Mag, (copysign ?, X) -> copysign Mag, X (#121574)
Closes https://github.com/llvm/llvm-project/issues/121432.
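A hedged sketch of the underlying fold (hypothetical IR; the patch itself concerns which fast-math flags the replacement call is allowed to keep):

```llvm
declare float @llvm.copysign.f32(float, float)

define float @src(float %mag, float %a, float %x) {
  ; the inner copysign only contributes the sign of %x
  %inner = call fast float @llvm.copysign.f32(float %a, float %x)
  %outer = call float @llvm.copysign.f32(float %mag, float %inner)
  ret float %outer
}

define float @tgt(float %mag, float %a, float %x) {
  ; conservative: no fast-math flags blindly inherited from the inner call
  %outer = call float @llvm.copysign.f32(float %mag, float %x)
  ret float %outer
}
```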
2025-01-06 16:23:46 +08:00
Yingwei Zheng
2adcec7780
[InstCombine] Simplify with.overflow intrinsics with assumption information (#84016)
This patch recognizes never-overflow assumptions generated by rustc to
improve the codegen.
Please refer to https://github.com/rust-lang/hashbrown/issues/509 for
more details.

Closes https://github.com/rust-lang/hashbrown/issues/509
Closes https://github.com/llvm/llvm-project/issues/80637
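A plausible shape of what this enables, not the rustc pattern verbatim (hypothetical IR): the assumption proves the addition cannot wrap, so the overflow bit folds to false and the select collapses.

```llvm
declare { i8, i1 } @llvm.uadd.with.overflow.i8(i8, i8)
declare void @llvm.assume(i1)

define i8 @example(i8 %a) {
  ; assumption: %a < 100, so %a + 100 cannot overflow in i8
  %small = icmp ult i8 %a, 100
  call void @llvm.assume(i1 %small)
  %wo  = call { i8, i1 } @llvm.uadd.with.overflow.i8(i8 %a, i8 100)
  %ov  = extractvalue { i8, i1 } %wo, 1
  %sum = extractvalue { i8, i1 } %wo, 0
  ; %ov is known false here, so the select folds to %sum
  %r   = select i1 %ov, i8 -1, i8 %sum
  ret i8 %r
}
```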
2025-01-05 21:15:17 +08:00
David Green
18abc7e0c5
[PatternMatch] Introduce m_c_Select (#114328)
This matches m_Select(m_Value(), L, R) or m_Select(m_Value(), R, L).
2024-11-25 13:47:23 +00:00
Yingwei Zheng
a59976bea8
[InstCombine] Drop noundef attributes in foldCttzCtlz (#116718)
Closes https://github.com/llvm/llvm-project/issues/112068.
2024-11-19 20:06:34 +08:00
Jorge Botto
fcd51dee42
[InstCombine] Factorise Add and Min/Max using distributivity (#101717)
This PR fixes part of https://github.com/llvm/llvm-project/issues/92433.

It specifically adds the 4 cases mentioned in
https://github.com/llvm/llvm-project/issues/92433#issuecomment-2117064459.

I've added 8 positive tests, 4 of which are mentioned in the comment
above and 4 which are their commutative equivalents. Alive proof:
https://alive2.llvm.org/ce/z/z6eFTb
I've also added 8 negative tests, because we want to make sure we do not
optimise when the required flags are missing, as the optimisation
wouldn't be sound. Alive proof that the optimisation is invalid:
https://alive2.llvm.org/ce/z/NvNjTD
I did have to make the integer types `i4` so that Alive would not time out
and to fit them all on one page.
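A hedged sketch of one of the patterns (hypothetical IR; shown for smax with nsw adds, the unsigned cases use nuw analogously):

```llvm
declare i8 @llvm.smax.i8(i8, i8)

define i8 @src(i8 %a, i8 %b, i8 %c) {
  %ac = add nsw i8 %a, %c
  %bc = add nsw i8 %b, %c
  %r  = call i8 @llvm.smax.i8(i8 %ac, i8 %bc)
  ret i8 %r
}

define i8 @tgt(i8 %a, i8 %b, i8 %c) {
  ; the common addend is factored out of the max
  %m = call i8 @llvm.smax.i8(i8 %a, i8 %b)
  %r = add nsw i8 %m, %c
  ret i8 %r
}
```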
2024-11-02 17:08:12 +01:00
goldsteinn
c85611e858
[SimplifyLibCall][Attribute] Fix bug where we may keep range attr with incompatible type (#112649)
In a variety of places we change the bitwidth of a parameter but don't
update the attributes.

The issue in this case is from the `range` attribute when inlining
`__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an
`i8`, and if the `i32` had a `range` attr assosiated it will cause an
error.

Fixes #112633
2024-10-17 10:32:55 -05:00
Jay Foad
85c17e4092
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706)
Convert many instances of:
  Fn = Intrinsic::getOrInsertDeclaration(...);
  CreateCall(Fn, ...)
to the equivalent CreateIntrinsic call.
2024-10-17 16:20:43 +01:00
Ramkumar Ramachandra
c5f82f7893
ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011)
Factor out and unify common code from InstSimplify and InstCombine that
partially guard against cross-lane vector operations into
llvm::isNotCrossLaneOperation in ValueTracking.

Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka
2024-10-14 11:37:30 +01:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation for
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
David Green
c0e308ba3d
[InstCombine] Pass DomTree and DomTreeCache to LibCallSimplifier (#108446)
This allows any combines to pick up Known states from dominating
conditions.
2024-09-13 08:36:48 +01:00
Volodymyr Vasylkun
d163935585
[InstCombine] Fold scmp(x -nsw y, 0) to scmp(x, y) (#105583)
Proof: https://alive2.llvm.org/ce/z/v6VtXz
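A hedged before/after sketch (hypothetical IR):

```llvm
declare i8 @llvm.scmp.i8.i8(i8, i8)

define i8 @src(i8 %x, i8 %y) {
  ; nsw guarantees the sign of x - y reflects the ordering of x and y
  %sub = sub nsw i8 %x, %y
  %r   = call i8 @llvm.scmp.i8.i8(i8 %sub, i8 0)
  ret i8 %r
}

define i8 @tgt(i8 %x, i8 %y) {
  %r = call i8 @llvm.scmp.i8.i8(i8 %x, i8 %y)
  ret i8 %r
}
```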
2024-08-22 14:18:48 +01:00