1163 Commits

Author SHA1 Message Date
Pedro Lobo
76d614b7c1
[InstSimplify] Extend icmp-of-add simplification to sle/sgt/sge (#168900)
When comparing additions with the same base where one has `nsw`, the
following simplification can be performed:

```llvm
icmp slt/sgt/sle/sge (x + C1), (x +nsw C2)
=>
icmp slt/sgt/sle/sge C1, C2
```

Previously this was only done for `slt`. This patch extends it to the
`sgt`, `sle`, and `sge` predicates when either of the following
conditions holds:
- `C1 <= C2 && C1 >= 0`, or
- `C2 <= C1 && C1 <= 0`

This patch also handles the `C1 == C2` case, which was previously
excluded.

Proof: https://alive2.llvm.org/ce/z/LtmY4f
2025-11-20 21:35:14 +00:00
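The two conditions listed in #168900 can be sanity-checked by brute force on 8-bit integers. The sketch below is only a model of the IR semantics, not LLVM code: `evalPred`, `wrapToI8`, and `countMismatches` are hypothetical helpers, the plain add is modeled as wrapping, and inputs where the `nsw` add would overflow are skipped because the result is poison.

```cpp
enum class Pred { SLT, SGT, SLE, SGE };

// Evaluate a signed comparison predicate on two values.
bool evalPred(Pred P, int A, int B) {
  switch (P) {
  case Pred::SLT: return A < B;
  case Pred::SGT: return A > B;
  case Pred::SLE: return A <= B;
  case Pred::SGE: return A >= B;
  }
  return false;
}

// Two's-complement wrap of a wide value into the signed 8-bit range.
int wrapToI8(int V) {
  unsigned U = static_cast<unsigned>(V) & 0xFFu;
  return U >= 128 ? static_cast<int>(U) - 256 : static_cast<int>(U);
}

// Counts 8-bit inputs where `icmp P (x + C1), (x +nsw C2)` and
// `icmp P C1, C2` disagree while one of the two conditions holds.
long countMismatches(Pred P) {
  long Bad = 0;
  for (int X = -128; X <= 127; ++X)
    for (int C1 = -128; C1 <= 127; ++C1)
      for (int C2 = -128; C2 <= 127; ++C2) {
        bool Applies = (C1 <= C2 && C1 >= 0) || (C2 <= C1 && C1 <= 0);
        if (!Applies)
          continue;
        int Rhs = X + C2;
        if (Rhs < -128 || Rhs > 127)
          continue; // nsw violated: the add is poison, nothing to check
        int Lhs = wrapToI8(X + C1); // the add without nsw may wrap
        if (evalPred(P, Lhs, Rhs) != evalPred(P, C1, C2))
          ++Bad;
      }
  return Bad;
}
```

Under this model `countMismatches` returns 0 for all four predicates. The Alive2 proof linked in the commit remains the authoritative check.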
Paul Walker
f2b5d04f29
[LLVM][InstSimplify] Add folds for SVE integer reduction intrinsics. (#167519)
[andv, eorv, orv, s/uaddv, s/umaxv, s/uminv]
sve_reduce_##(none, ?) -> op's neutral value
sve_reduce_##(any, neutral) -> op's neutral value
    
[andv, orv, s/umaxv, s/uminv]
sve_reduce_##(all, splat(X)) -> X
    
[eorv]
sve_reduce_##(all, splat(X)) -> 0
2025-11-18 14:33:43 +00:00
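The SVE reduction folds in #167519 follow from how a predicated reduction behaves. The sketch below models one on a fixed vector of 8-bit lanes. It is a simplification with hypothetical names (`reduce`, `AndOp`, `XorOp`, `UMaxOp`): real SVE vectors are scalable, and the model assumes an even lane count for the `eorv` case.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

using Lane = std::uint8_t;
using BinOp = std::function<Lane(Lane, Lane)>;

// Predicated reduction: combine only the active lanes, starting from the
// operation's neutral value.
Lane reduce(const std::vector<bool> &Pred, const std::vector<Lane> &Vec,
            Lane Neutral, const BinOp &Op) {
  Lane Acc = Neutral;
  for (std::size_t I = 0; I < Vec.size(); ++I)
    if (Pred[I])
      Acc = Op(Acc, Vec[I]);
  return Acc;
}

// A few of the reductions named in the commit; neutral values are
// 0xFF for and, 0 for xor, and 0 for unsigned max.
const BinOp AndOp = [](Lane A, Lane B) -> Lane { return A & B; };
const BinOp XorOp = [](Lane A, Lane B) -> Lane { return A ^ B; };
const BinOp UMaxOp = [](Lane A, Lane B) -> Lane { return A > B ? A : B; };
```

With no active lanes the accumulator never changes, so the result is the neutral value. With all lanes active on a splat of `X`, `and` and `umax` return `X`, while `xor` cancels pairwise to 0.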
Igor Gorban
dd7a000a31
[InstSimplify] Fix crash when optimizing minmax with bitcast constant vectors (#168055)
When simplifying min/max intrinsics with fixed-size vector constants,
InstructionSimplify attempts to optimize element-wise. However,
getAggregateElement() can return null for certain constant expressions
like bitcasts, leading to a null pointer dereference.

This patch adds a check to bail out of the optimization when
getAggregateElement() returns null, preventing the crash while
maintaining correct behavior for normal constant vectors.

Fixes crash with patterns like:
  call <2 x half> @llvm.minnum.v2f16(<2 x half> %x,
      <2 x half> bitcast (<1 x i32> <i32 N> to <2 x half>))
2025-11-15 03:05:30 +08:00
Congzhe
e246fffb25
Reland "[InstructionSimplify] Enhance simplifySelectInst() (#163453)" (#164694)
This reverts commit f1c1063.

PR #163453 was merged and then reverted because it exposed a crash.
Investigation showed the crash was unrelated; it has since been fixed in #164628.

This is an attempt to reland #163453.
2025-10-26 00:39:49 -04:00
Arthur Eubanks
f1c1063acb
Revert "[InstCombinePHI] Enhance PHI CSE to remove redundant phis" (#164520)
Reverts llvm/llvm-project#163453

Causes crashes, see
https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732
2025-10-22 08:51:10 +00:00
Congzhe
9a9fbbba5c
[InstructionSimplify] Enhance simplifySelectInst() (#163453)
Fold select instructions with true and false values that act as the same
phi, which cleans up the IR and opens up opportunities for other passes
such as loop vectorization.
2025-10-21 16:12:26 -04:00
Nikita Popov
ec26f219ac
[InstSimplify] Support ptrtoaddr in simplifyGEPInst() (#164262)
This adds support for ptrtoaddr in the `ptradd p, ptrtoaddr(p2) -
ptrtoaddr(p) -> p2` fold.

This fold requires that p and p2 have the same underlying object
(otherwise the provenance may not be the same).

The argument I would like to make here is that because the underlying
objects are the same (and the pointers in the same address space), the
non-address bits of the pointer must be the same. Looking at some
specific cases of underlying object relationship:

* phi/select: Trivially true.
* getelementptr: Only modifies address bits, non-address bits must
remain the same.
* addrspacecast round-trip cast: Must preserve all bits because we
optimize such round-trip casts away.
* non-interposable global alias: I'm a bit unsure about this one, but I
guess the alias and the aliasee must have the same non-address bits?
* various intrinsics like launder.invariant.group, ptrmask. I think
these all either preserve all pointer bits (like the invariant.group
ones) or at least the non-address bits (like ptrmask). There are some
interesting cases like amdgcn.make.buffer.rsrc, but those are cross
address-space.

-----

There is a second `gep (gep p, C), (sub 0, ptrtoint(p)) -> C` transform
in this function, which I am not extending to handle ptrtoaddr, adding
negative tests instead. This transform is overall dubious for provenance
reasons, but especially dubious with ptrtoaddr, as then we don't have
the guarantee that provenance of `p` has been exposed.
2025-10-21 09:27:07 +02:00
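The address arithmetic behind the `ptradd p, ptrtoaddr(p2) - ptrtoaddr(p) -> p2` fold in #164262 can be illustrated with plain C++ pointers into one object. This is only an analogy: C++ on a flat address space has no separate non-address bits, and `rebase` and `foldHoldsInBuffer` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>

// Rebuild P2 from P plus the difference of the two addresses, the way the
// unfolded IR would. The unsigned difference wraps when P2 is below P and
// is reinterpreted as a signed offset.
char *rebase(char *P, char *P2) {
  std::uintptr_t Diff = reinterpret_cast<std::uintptr_t>(P2) -
                        reinterpret_cast<std::uintptr_t>(P);
  return P + static_cast<std::ptrdiff_t>(Diff);
}

// Both pointers are derived from the same 16-byte object, which mirrors
// the "same underlying object" requirement of the fold.
bool foldHoldsInBuffer(std::size_t I, std::size_t J) {
  static char Buf[16];
  char *P = Buf + I;
  char *P2 = Buf + J;
  return rebase(P, P2) == P2;
}
```

For indices inside the buffer, such as `foldHoldsInBuffer(2, 11)` or `foldHoldsInBuffer(11, 2)`, the rebuilt pointer compares equal to `P2`.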
Nikita Popov
ee50839700
[InstSimplify] Support ptrtoaddr in simplifyCastInst()
Handle ptrtoaddr the same way as ptrtoint. The fold already only
operates on the index/address bits.
2025-10-20 14:18:34 +02:00
Nikita Popov
573ca36753
[IR] Replace alignment argument with attribute on masked intrinsics (#163802)
The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter`
intrinsics currently accept a separate alignment immarg. Replace this
with an `align` attribute on the pointer / vector of pointers argument.

This is the standard representation for alignment information on
intrinsics, and is already used by all other memory intrinsics. This
means the signatures now match llvm.expandload, llvm.vp.load, etc.
(Things like llvm.memcpy used to have a separate alignment argument as
well, but were already migrated a long time ago.)

It's worth noting that the masked.gather and masked.scatter intrinsics
previously accepted a zero alignment to indicate the ABI type alignment
of the element type. This special case is gone now: If the align
attribute is omitted, the implied alignment is 1, as usual. If ABI
alignment is desired, it needs to be explicitly emitted (which the
IRBuilder API already requires anyway).
2025-10-20 08:50:09 +00:00
Nikita Popov
cf3765752b
[InstSimplify] Support ptrtoaddr in ptrmask fold
Treat it the same way as ptrtoint. ptrmask only operates on the
address bits of the pointer.
2025-10-14 13:55:04 +02:00
Nikita Popov
261580cacd
[InstSimplify] Support non-inbounds GEP in ptrdiff fold (#162676)
We can fold ptrdiff(ptradd(p, x), p) to x regardless of whether the
ptradd is inbounds.

Proof: https://alive2.llvm.org/ce/z/Xuvc7N
2025-10-10 08:03:25 +00:00
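A rough way to see why #162676 does not need `inbounds`: if addresses are modeled as 64-bit integers with wrapping arithmetic, the subtraction undoes the addition even when the add wraps. The model below ignores provenance entirely, and `Addr`, `ptrAdd`, `ptrDiff`, and `foldHolds` are hypothetical names.

```cpp
#include <cstdint>

using Addr = std::uint64_t;

// ptradd without inbounds: the address simply wraps modulo 2^64.
Addr ptrAdd(Addr P, Addr X) { return P + X; }

// ptrdiff: wrapping subtraction of two addresses.
Addr ptrDiff(Addr A, Addr B) { return A - B; }

// The fold replaces ptrdiff(ptradd(P, X), P) with X.
bool foldHolds(Addr P, Addr X) { return ptrDiff(ptrAdd(P, X), P) == X; }
```

`foldHolds` is true for every pair of inputs, including ones such as `foldHolds(UINT64_MAX - 3, 100)` where the addition wraps.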
Nikita Popov
187a8e3e08
[InstSimplify] Support ptrtoaddr in pointer subtraction fold (#162672)
Add a new m_PtrToIntOrAddr() matcher which matches both ptrtoint and
ptrtoaddr. Pointer arithmetic only works on the address bits, so
supporting ptrtoaddr is always fine here.
2025-10-10 09:30:30 +02:00
Nikita Popov
7e5bb1e58a
[IR] Require DataLayout for pointer cast elimination (#162279)
isEliminableCastPair() currently tries to support elimination of
ptrtoint/inttoptr cast pairs by assuming that the maximum possible
pointer size is 64 bits. Of course, this is no longer the case nowadays.

This PR changes isEliminableCastPair() to accept an optional DataLayout
argument, which is required to eliminate pointer casts.

This means that we no longer eliminate these cast pairs during ConstExpr
construction, and instead only do it during DL-aware constant folding.
This had a lot of annoying fallout on tests, most of which I've
addressed in advance of this change.
2025-10-07 17:19:48 +02:00
Lewis Crawford
17efa572c3
[InstSimplify] Optimize maximumnum and minimumnum (#139581)
Add support for the new maximumnum and minimumnum intrinsics in various
optimizations in InstSimplify.

Also, change the behavior of optimizing maxnum(sNaN, x) to simplify to
qNaN instead of x to better match the LLVM IR spec, and add more tests
for sNaN behavior for all 3 max/min intrinsic types.
2025-10-07 14:23:32 +01:00
Yingwei Zheng
ca5ece8939
[InstSimplify] Simplify fcmp implied by dominating fcmp (#161090)
This patch simplifies an fcmp into true/false if it is implied by a
dominating fcmp.
As initial support, it handles only two cases:
+ `fcmp pred1, X, Y -> fcmp pred2, X, Y`: use set operations.
+ `fcmp pred1, X, C1 -> fcmp pred2, X, C2`: use `ConstantFPRange` and
set operations.

Note: It doesn't fix https://github.com/llvm/llvm-project/issues/70985,
as the second fcmp in the motivating case is not dominated by the edge.
We may need to adjust JumpThreading to handle this case.

Comptime impact (~+0.1%):
https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u
IR diff: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2848
2025-10-05 16:15:51 +08:00
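The "set operations" case in #161090 can be sketched by treating each fcmp predicate as the set of outcomes (equal, greater, less, unordered) for which it is true. The enum values below follow the 4-bit predicate layout used by LLVM, but `FPred`, `Implied`, and `impliedBy` are hypothetical names for illustration and not the patch's code.

```cpp
// Bit 0: equal, bit 1: greater, bit 2: less, bit 3: unordered.
enum FPred : unsigned {
  F_FALSE = 0, OEQ = 1, OGT = 2, OGE = 3, OLT = 4, OLE = 5, ONE = 6, ORD = 7,
  UNO = 8, UEQ = 9, UGT = 10, UGE = 11, ULT = 12, ULE = 13, UNE = 14,
  F_TRUE = 15
};

enum class Implied { True, False, Unknown };

// Given that `fcmp Dom X, Y` is known to be true, decide `fcmp P X, Y`.
Implied impliedBy(FPred Dom, FPred P) {
  unsigned D = Dom, Q = P;
  if ((D & ~Q) == 0)
    return Implied::True;  // every outcome allowed by Dom satisfies P
  if ((D & Q) == 0)
    return Implied::False; // no outcome allowed by Dom satisfies P
  return Implied::Unknown;
}
```

For example, a dominating `olt` implies `ole` and `une`, refutes `ogt`, and a dominating `ole` says nothing about `olt`.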
Matthew Devereau
819e6b2043
[InstSimplify] Consider vscale_range for get active lane mask (#160073)
Scalable get_active_lane_mask intrinsic calls can be simplified to an i1
splat (ptrue) when their constant range is larger than or equal to the
maximum possible number of elements, which can be inferred from
vscale_range(x, y).
2025-09-24 11:35:15 +01:00
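The condition in #160073 can be sketched as follows, assuming lane `I` of `get.active.lane.mask(Base, N)` is active when `Base + I < N` with the addition evaluated without wrapping. `laneActive`, `alwaysAllTrue`, and `checkAllVScales` are hypothetical helpers, and `MinElts` stands for the minimum element count of the scalable vector type.

```cpp
#include <cstdint>

// Lane I is active iff Base + I < N (no wrapping in the addition).
bool laneActive(std::uint64_t Base, std::uint64_t N, std::uint64_t I) {
  return N > Base && I < N - Base;
}

// Sufficient condition for an all-true mask: the range [Base, N) covers
// the largest possible vector, MinElts * MaxVScale lanes.
bool alwaysAllTrue(std::uint64_t Base, std::uint64_t N, std::uint64_t MinElts,
                   std::uint64_t MaxVScale) {
  return N > Base && N - Base >= MinElts * MaxVScale;
}

// Cross-check by enumerating every lane for every permitted vscale.
bool checkAllVScales(std::uint64_t Base, std::uint64_t N,
                     std::uint64_t MinElts, std::uint64_t MaxVScale) {
  for (std::uint64_t VScale = 1; VScale <= MaxVScale; ++VScale)
    for (std::uint64_t I = 0; I < MinElts * VScale; ++I)
      if (!laneActive(Base, N, I))
        return false;
  return true;
}
```

With `MinElts = 4` and a maximum vscale of 16, a range of 64 elements is enough for an all-true mask, while 63 is not.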
Rajveer Singh Bharadwaj
5d39cae6ba
[InstCombine] Generalise optimisation of redundant floating point comparisons with ConstantFPRange (#159315)
Follow up of #158097

Similar to `simplifyAndOrOfICmpsWithConstants`, we can do so for
floating point comparisons.
2025-09-20 12:42:17 +00:00
Ramkumar Ramachandra
7fb3a91418
[PatternMatch] Introduce match functor (NFC) (#159386)
A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.

Co-authored-by: Luke Lau <luke@igalia.com>
2025-09-17 21:04:33 +01:00
Rajveer Singh Bharadwaj
08a58b2cea
[InstCombine] Optimize redundant floating point comparisons in or/and inst's (#158097)
Resolves #157371

We can eliminate one of the `fcmp`s when two comparisons with the same
`olt` or `ogt` predicate are matched in `or`/`and` simplification.
2025-09-16 20:52:11 +05:30
David Sherwood
1f49c9494e
[InstSimplify] Simplify get.active.lane.mask when 2nd arg is zero (#158018)
When the second argument passed to the get.active.lane.mask intrinsic is
zero we can simplify the instruction to return an all-false mask
regardless of the first operand.
2025-09-12 10:39:29 +01:00
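For #158018, a small model of the mask shows why a zero second argument gives an all-false result: no unsigned value is below 0, so no lane can be active whatever the first operand is. `activeLaneMask` and `allFalse` are hypothetical names, and the lane count is fixed here for simplicity.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Build the mask lane by lane: lane I is active iff Base + I < N, with the
// addition evaluated without wrapping.
std::vector<bool> activeLaneMask(std::uint64_t Base, std::uint64_t N,
                                 std::size_t Lanes) {
  std::vector<bool> Mask(Lanes);
  for (std::size_t I = 0; I < Lanes; ++I)
    Mask[I] = N > Base && I < N - Base;
  return Mask;
}

// True when no lane of the mask is set.
bool allFalse(const std::vector<bool> &Mask) {
  for (bool B : Mask)
    if (B)
      return false;
  return true;
}
```

`allFalse(activeLaneMask(Base, 0, 8))` holds for any `Base`, whereas a non-zero `N` such as in `activeLaneMask(0, 3, 8)` activates the first three lanes.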
Florian Hahn
b50ad945dd
[InstSimplify] Simplify extractvalue (umul_with_overflow(x, 1)). (#157307)
Look through extractvalue to simplify umul_with_overflow where one of
the operands is 1.

This removes some redundant instructions when expanding SCEVs, which in
turn makes the runtime check cost estimate more accurate, reducing the
minimum iterations for which vectorization is profitable.

PR: https://github.com/llvm/llvm-project/pull/157307
2025-09-07 18:32:40 +01:00
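The #157307 simplification rests on the fact that multiplying by 1 returns the other operand and never overflows. A minimal 32-bit model, with hypothetical names `umulWithOverflow` and `foldHolds`:

```cpp
#include <cstdint>
#include <utility>

// Model of an unsigned multiply-with-overflow on 32 bits: the wrapped
// product plus a flag that is set when the full product does not fit.
std::pair<std::uint32_t, bool> umulWithOverflow(std::uint32_t A,
                                                std::uint32_t B) {
  std::uint64_t Wide = static_cast<std::uint64_t>(A) * B;
  return {static_cast<std::uint32_t>(Wide), Wide > UINT32_MAX};
}

// With one operand equal to 1, the value is X and the overflow bit is clear.
bool foldHolds(std::uint32_t X) {
  std::pair<std::uint32_t, bool> R = umulWithOverflow(X, 1);
  return R.first == X && !R.second;
}
```

`foldHolds` is true even for `UINT32_MAX`, while a multiplier of 2 on the same input sets the overflow flag.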
Kazu Hirata
c7cd1d0ae3
[Analysis] Remove an unnecessary cast (NFC) (#150838)
getOpcode() already returns Instruction::CastOps.
2025-07-27 10:43:30 -07:00
jjasmine
68309adef3
[NFC] Clean up poison folding in simplifyBinaryIntrinsic (#147259)
Fixes #147116.
2025-07-07 23:21:09 +08:00
jjasmine
07286b1fcd
[InstCombine] Propagate poison pow[i], [us]add, [us]sub and [us]mul (#146750)
Fixes #146560 and also propagates poison for [us]add, [us]sub and
[us]mul.
2025-07-04 22:55:07 +01:00
Iris Shi
f51d8730b3
[InstSimplify] Simplify 'x u>= 1' to true when x is known non-zero (#145204)
2025-06-22 13:32:19 +08:00
Philip Reames
6f9cd79fa2
[InstSimplify] Add basic simplifications for vp.reverse (#144112)
Directly modeled after what we do for vector.reverse, but with
restrictions on EVL and mask added.
2025-06-16 10:07:56 -07:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing an easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
2025-06-03 17:12:24 +01:00
Nadharm
f71e4e9bc2
[InstSimplify] Handle nsz when simplifying X * 0.0 (#142181)
If ValueTracking can guarantee non-NaN and non-INF and the `nsz`
fast-math flag is set, we can simplify X * 0.0 ==> 0.0.

https://alive2.llvm.org/ce/z/XacRQZ
2025-05-31 13:50:22 +08:00
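The three preconditions in #142181 can be seen with ordinary IEEE doubles. In the sketch below (hypothetical helper names), a finite `X` always yields a zero whose sign follows `X`, which is the part `nsz` allows the fold to ignore, while infinity and NaN yield NaN.

```cpp
#include <cmath>

// X * 0.0 compares equal to 0.0 exactly when the product is +0.0 or -0.0.
bool productIsSomeZero(double X) { return X * 0.0 == 0.0; }

// For negative finite X the product is -0.0, so the fold to +0.0 is only
// valid when the sign of zero does not matter.
bool productIsNegativeZero(double X) {
  return productIsSomeZero(X) && std::signbit(X * 0.0);
}
```

`productIsSomeZero` holds for finite inputs such as 3.5 and -3.5, fails for `INFINITY` and `NAN`, and `productIsNegativeZero(-3.5)` shows the sign difference.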
Tim Gymnich
571a24c314
Reland [llvm] add GenericFloatingPointPredicateUtils #140254 (#141065)
#140254 was previously missing 2 files in the bazel build config.
2025-05-22 17:17:02 +02:00
Kewen12
c47a5fbb22
Revert "[llvm] add GenericFloatingPointPredicateUtils (#140254)" (#140968)
This reverts commit d00d74bb2564103ae3cb5ac6b6ffecf7e1cc2238. 

The PR breaks our buildbots and blocks downstream merge.
2025-05-21 19:31:14 -04:00
Tim Gymnich
d00d74bb25
[llvm] add GenericFloatingPointPredicateUtils (#140254)
Add `GenericFloatingPointPredicateUtils` to generalize the effects of
floating point comparisons on `KnownFPClass` for both IR and MIR.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-05-21 23:45:31 +02:00
Iris Shi
f9783c559f
[InstCombine] Fix frexp(frexp(x)) -> frexp(x) fold (#138837)
Fixes #138819

When frexp is applied twice, the second result should be zero.
2025-05-08 00:37:46 +08:00
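The behavior that #138837 relies on can be observed with `std::frexp` from the C++ standard library, used here as a stand-in for the LLVM intrinsic. The helper names are hypothetical, and the sketch only covers finite inputs.

```cpp
#include <cmath>

// Exponent reported by the second of two nested frexp calls.
int secondFrexpExponent(double X) {
  int E1 = 0, E2 = 0;
  double M1 = std::frexp(X, &E1);
  std::frexp(M1, &E2);
  return E2;
}

// The mantissa is unchanged by the second call.
bool mantissaIsFixedPoint(double X) {
  int E1 = 0, E2 = 0;
  double M1 = std::frexp(X, &E1);
  return std::frexp(M1, &E2) == M1;
}
```

The first call already normalizes the mantissa into [0.5, 1), so the second call returns the same mantissa with exponent 0. A fold of the nested call therefore has to produce a zero exponent rather than reuse the first one.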
Luke Lau
aa6d541065
[InstSimplify] Fold {u,s}{min,max} x, poison -> poison (#138166)
Following from the discussion in
https://github.com/llvm/llvm-project/pull/138095#discussion_r2070484664,
these intrinsics are poison if any of their operands are poison, and are
marked as such in propagatesPoison in ValueTracking.cpp.

This will help fold away leftover vectors produced by VectorCombine when
scalarizing intrinsics.
2025-05-02 07:49:27 +08:00
Yingwei Zheng
c37b2549ff
Revert "[InstSimplify] Fold getelementptr inbounds null, idx -> null (#130742)" (#138168)
Revert #130742 for now to avoid glibc breakage until the
workaround patches have landed.
2025-05-01 14:21:59 -07:00
Yingwei Zheng
5a993558c5
[InstSimplify] Fold getelementptr inbounds null, idx -> null (#130742)
Proof: https://alive2.llvm.org/ce/z/5ZkPx-
See also https://github.com/llvm/llvm-project/pull/130734 for the motivation.
2025-04-17 20:44:46 +08:00
Tim Gymnich
049f179606
[Analysis][NFC] Extract KnownFPClass (#133457)
- extract KnownFPClass for future use inside of GISelKnownBits

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-03-28 18:10:02 +01:00
Veera
bc35510725
[InstSimplify] Fold X * C >= X to true (#129352)
Proof: https://alive2.llvm.org/ce/z/T_ocLy

Discovered in: https://github.com/rust-lang/rust/issues/114386

This PR folds `X * C >= X` to `true` when `C` is known to be non-zero
and `mul` is `nuw`.

Folds for other math operators exist already:
https://llvm-ir.godbolt.org/z/GKcYEf5Kb
2025-03-01 12:02:57 -05:00
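The #129352 fold can be checked exhaustively on 8-bit values, assuming the comparison is unsigned. In this sketch (hypothetical helper names), products that exceed 8 bits are skipped because a `nuw` multiply that overflows is poison.

```cpp
// Counts 8-bit pairs (X, C) with C != 0 where the nuw product is below X.
long countNuwMismatches() {
  long Bad = 0;
  for (unsigned X = 0; X <= 255; ++X)
    for (unsigned C = 1; C <= 255; ++C) {
      unsigned Wide = X * C;
      if (Wide > 255)
        continue; // nuw violated: result is poison, nothing to check
      if (Wide < X)
        ++Bad;
    }
  return Bad;
}

// Without nuw the 8-bit product wraps, and the comparison can fail.
bool wrappedFoldHolds(unsigned X, unsigned C) {
  return ((X * C) & 0xFFu) >= X;
}
```

`countNuwMismatches()` returns 0, while `wrappedFoldHolds(128, 2)` is false because 256 wraps to 0. That is why the `nuw` flag is required.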
Nikita Popov
e56a6a2683
Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880) (#128020)
Relative to the previous attempt this includes two fixes:
 * Adjust callCapturesBefore() to not skip captures(ret: address,
    provenance) arguments, as these will not count as a capture
    at the call-site.
 * When visiting uses during stack slot optimization, don't skip
    the ModRef check for passthru captures. Calls can both modref
    and be passthru for captures.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
2025-02-27 09:38:29 +01:00
Nico Weber
e2ba1b6ffd
Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729.
Seems to break LTO builds of clang on Windows, see comments on
https://github.com/llvm/llvm-project/pull/125880
2025-02-19 11:32:57 -05:00
Nikita Popov
7e3735d1a1
Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)
Relative to the previous attempt, this adjusts isEscapeSource()
to not treat calls with captures(ret: address, provenance) or similar
arguments as escape sources. This addresses the miscompile reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577

The implementation uses a helper function on CallBase to make this
check a bit more efficient (e.g. by skipping the byval checks) as
checking attributes on all arguments is fairly expensive.

------

This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
2025-02-14 12:38:04 +01:00
Nikita Popov
1e64ea9914
Revert "[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)"
This reverts commit ee655ca27aad466bcc54f6eba03f7e564940ad5a.

A miscompilation has been reported at:
https://github.com/llvm/llvm-project/pull/125880#issuecomment-2656632577
2025-02-13 14:56:12 +01:00
Nikita Popov
ee655ca27a
[CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880)
This extends CaptureTracking to support inferring non-trivial
CaptureInfos. The focus of this patch is to only support FunctionAttrs,
other users of CaptureTracking will be updated in followups.

The key API changes here are:

* DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC
component specifies what is captured at that Use and the ResultCC
component specifies what may be captured via the return value of the
User. Usually only one or the other will be used (corresponding to
previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for
call captures.
* The CaptureTracking::captures() extension point is passed this
UseCaptureInfo as well and then can decide what to do with it by
returning an Action, which is one of:
  * Stop: stop traversal.
  * ContinueIgnoringReturn: continue traversal but don't follow the
    instruction return value.
  * Continue: continue traversal and follow the instruction return
    value if it has additional CaptureComponents.

For now, this patch retains the (unsound) special logic for comparison
of null with a dereferenceable pointer. I'd like to switch key code to
take advantage of address/address_is_null before dropping it.

This PR mainly intends to introduce necessary API changes and basic
inference support, there are various possible improvements marked with
TODOs.
2025-02-13 09:36:35 +01:00
Ramkumar Ramachandra
738cf5acc6
InstSimplify: improve computePointerICmp (NFC) (#126255)
The comment about inbounds protecting only against unsigned wrapping is
incorrect: it also protects against signed wrapping, but the issue is
that it could cross the sign boundary.
2025-02-10 11:42:06 +00:00
Yingwei Zheng
1af627b592
[InstSimplify] Add additional checks when substituting pointers (#125385)
Compile-time impact:
https://llvm-compile-time-tracker.com/compare.php?from=d09b521624f263b5f1296f8d4771836b97e600cb&to=e437ba2cb83bb965e13ef00727671896f03ff84f&stat=instructions:u
IR diff looks acceptable.
Closes https://github.com/llvm/llvm-project/issues/115574
2025-02-02 19:04:23 +08:00
Andreas Jonson
9399a1ddb8
[InstSimplify] Handle trunc to i1 in Select with bit test folds. (#122944)
Proof: https://alive2.llvm.org/ce/z/Jncqb2
2025-02-01 08:48:32 +01:00
Yingwei Zheng
626c23112f
[ValueTracking] Use SimplifyQuery in isKnownNonEqual (#124942)
It is needed by https://github.com/llvm/llvm-project/pull/117442.
2025-02-01 15:13:11 +08:00
goldsteinn
cc995ad064
[InstSimpify] Simplifying (xor (sub C_Mask, X), C_Mask) -> X (#122552)
- **[InstSimpify] Add tests for simplifying `(xor (sub C_Mask, X),
C_Mask)`; NFC**
- **[InstSimpify] Simplifying `(xor (sub C_Mask, X), C_Mask)` -> `X`**

Helps address regressions with folding `clz(Pow2)`.

Proof: https://alive2.llvm.org/ce/z/zGwUBp
2025-01-11 15:10:42 -06:00
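A brute-force sketch of the #122552 identity, under the assumption that `C_Mask` is a low-bit mask (2^K - 1) and that the subtraction does not wrap, so `X` fits inside the mask. The exact preconditions are those in the patch and the linked proof; `countLowMaskMismatches` and `xorOfSub` are hypothetical helpers.

```cpp
// Counts cases where (Mask - X) ^ Mask differs from X, over all low-bit
// masks of up to 8 bits and all X that do not exceed the mask.
long countLowMaskMismatches() {
  long Bad = 0;
  for (unsigned K = 0; K <= 8; ++K) {
    unsigned Mask = (1u << K) - 1;
    for (unsigned X = 0; X <= Mask; ++X)
      if (((Mask - X) ^ Mask) != X)
        ++Bad;
  }
  return Bad;
}

// The same expression for an arbitrary constant C.
unsigned xorOfSub(unsigned C, unsigned X) { return (C - X) ^ C; }
```

Subtracting from an all-ones mask never borrows, so `Mask - X` equals `Mask ^ X` and the second xor restores `X`. With a constant that is not a low-bit mask the identity can break: `xorOfSub(5, 2)` is 6, not 2.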
goldsteinn
6192fafe9c
[InstSimplify] Use multi-op replacement when simplify select (#121708)
- **[InstSimplify] Refactor `simplifyWithOpsReplaced` to allow multiple
replacements; NFC**
- **[InstSimplify] Use multi-op replacement when simplify `select`**

In the case of `select X | Y == 0 :...` or `select X & Y == -1 : ...`
we can do more simplifications by trying to replace both `X` and `Y`
with the respective constant at once.

Handles some cases for https://github.com/llvm/llvm-project/pull/121672
more generically.
2025-01-07 11:42:01 -06:00
Nikita Popov
c630e13676
[InstSimplify] Simplify both operands of select before comparing (#121753)
In the simplifySelectWithEquivalence fold, simplify both operands before
comparing them, instead of comparing one simplified operand with a
non-simplified operand. This is slightly more powerful.
2025-01-06 14:43:28 +01:00
Veera
6f8afafd30
[InstCombine] Fold A == MIN_INT ? B != MIN_INT : A < B to A < B (#120177)
This PR folds:
 `A == MIN_INT ? B != MIN_INT : A < B` to `A < B`
 `A == MAX_INT ? B != MAX_INT : A > B` to `A > B`

Proof: https://alive2.llvm.org/ce/z/bR6E2s

This helps in optimizing comparison of optional unsigned non-zero types
in https://github.com/rust-lang/rust/issues/49892.

Rust compiler's current output: https://rust.godbolt.org/z/9fxfq3Gn8
2024-12-19 22:52:55 +08:00
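Both select patterns from #120177 can be verified exhaustively for signed 8-bit values, where MIN_INT is -128 and MAX_INT is 127. The function below is a hypothetical checker, not code from the patch; the Alive2 link in the commit covers the general case.

```cpp
// Counts signed 8-bit pairs (A, B) where either select form differs from
// the plain comparison it is folded to.
long countSelectMismatches() {
  long Bad = 0;
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B) {
      bool MinForm = (A == -128) ? (B != -128) : (A < B);
      bool MaxForm = (A == 127) ? (B != 127) : (A > B);
      if (MinForm != (A < B) || MaxForm != (A > B))
        ++Bad;
    }
  return Bad;
}
```

`countSelectMismatches()` returns 0: when `A` is the minimum, `A < B` holds exactly when `B` is not the minimum, and the maximum case is symmetric.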