891 Commits

Author SHA1 Message Date
Florian Mayer
a1d12e944e
[NFC] [IndVars] test for multiple bounds for predicate-loop-traps
This can come from loops like

```
for (int i = 0; i < X; ++i) {
  if (i < N)
    __builtin_trap();
  if (i < M)
    __builtin_trap();
  x[i] = y[i];
}
```

Reviewers: nikic

Reviewed By: nikic

Pull Request: https://github.com/llvm/llvm-project/pull/181264
2026-02-13 10:13:31 -08:00
Nikita Popov
fe413f70fe
[SCEV] Discard samesign when analyzing loop invariant exits (#181171)
If the predicate has samesign set, we could either perform the checks
with the unsigned predicate and return an unsigned invariant predicate,
or we could perform them with the signed predicate and return a signed
invariant predicate. The current implementation can end up mixing both,
using a signed predicate for one check and an unsigned one for the
other.

Avoid this by dropping the samesign flag.
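
As a rough illustration of why the two predicates must not be mixed, the sketch below models `icmp slt` and `icmp ult` on 32-bit values. The helper names are hypothetical and are not LLVM APIs; the point is only that the predicates agree when the operands share a sign and disagree otherwise.

```cpp
#include <cstdint>

// Hypothetical helpers modelling `icmp slt` and `icmp ult` on i32 values.
constexpr bool icmpSlt(int32_t a, int32_t b) { return a < b; }
constexpr bool icmpUlt(int32_t a, int32_t b) {
  return static_cast<uint32_t>(a) < static_cast<uint32_t>(b);
}

// True when both operands have the same sign bit, which is the property the
// samesign flag states about an icmp's operands.
constexpr bool sameSign(int32_t a, int32_t b) { return (a < 0) == (b < 0); }

// With equal signs, the signed and unsigned predicates agree.
static_assert(sameSign(3, 7) && icmpSlt(3, 7) == icmpUlt(3, 7), "");
static_assert(sameSign(-5, -2) && icmpSlt(-5, -2) == icmpUlt(-5, -2), "");
// With mixed signs they differ, so a fact proven with one predicate says
// nothing about the other.
static_assert(!sameSign(-1, 1) && icmpSlt(-1, 1) != icmpUlt(-1, 1), "");
```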

Fixes https://github.com/llvm/llvm-project/issues/180870.
2026-02-13 12:23:31 +01:00
Anshil Gandhi
a89e766ca6
[IndVarSimplify] Add safety check for getTruncateExpr in genLoopLimit (#181296)
getTruncateExpr may not always return a SCEVAddRecExpr when truncating
loop bounds. Add a check to verify the result type before casting, and
bail out of the transformation if the cast would be invalid.

This prevents potential crashes from invalid casts when dealing with
complex loop bounds.
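
A minimal sketch of the check-before-cast pattern described above. The types and the function name are hypothetical stand-ins, not the LLVM classes, and standard `dynamic_cast` is used in place of LLVM's `dyn_cast` so that the example is self-contained.

```cpp
#include <cassert>

// Hypothetical stand-ins for SCEV expression kinds; these are not LLVM types.
struct Expr {
  virtual ~Expr() = default;
};
struct AddRecExpr : Expr {
  int step = 1;
};
struct OtherExpr : Expr {};

// Returns true and writes the step only when `e` really is an add-recurrence.
// For any other kind the caller bails out instead of doing an invalid cast.
bool getStepIfAddRec(const Expr &e, int &stepOut) {
  const auto *addRec = dynamic_cast<const AddRecExpr *>(&e);
  if (addRec == nullptr)
    return false;
  stepOut = addRec->step;
  return true;
}

// Small self-check exercising both the success path and the bail-out path.
void checkedCastExample() {
  AddRecExpr addRec;
  addRec.step = 4;
  OtherExpr other;
  int step = 0;
  assert(getStepIfAddRec(addRec, step) && step == 4);
  assert(!getStepIfAddRec(other, step) && step == 4); // step left untouched
}
```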

Co-authored by Michael Rowan

Resolves
[https://github.com/llvm/llvm-project/issues/153090](https://github.com/llvm/llvm-project/issues/153090)
2026-02-13 05:52:34 +00:00
Florian Mayer
ac974fddb9
[NFC] [IndVars] fix typo in test (#181262)
2026-02-12 23:28:28 +00:00
Nikita Popov
74bd92d6e2
Revert "[IndVarSimplify] Add safety check for getTruncateExpr in genLoopLimit (#172234)"
This reverts commit 4f551b55aeb316cd2d8f8f911908ea5bd4ced16b.

This change reformatted the file.
2026-02-12 14:52:14 +01:00
Anshil Gandhi
4f551b55ae
[IndVarSimplify] Add safety check for getTruncateExpr in genLoopLimit (#172234)
getTruncateExpr may not always return a SCEVAddRecExpr when truncating
loop bounds. Add a check to verify the result type before casting, and
bail out of the transformation if the cast would be invalid.

This prevents potential crashes from invalid casts when dealing with
complex loop bounds.

Co-authored by Michael Rowan

Resolves [#153090](https://github.com/llvm/llvm-project/issues/153090)
2026-02-11 10:08:41 +00:00
Matt Arsenault
2502e3b7ba
IR: Promote "denormal-fp-math" to a first class attribute (#174293)
Convert "denormal-fp-math" and "denormal-fp-math-f32" into a first
class denormal_fpenv attribute. Previously the query for the effective
denormal mode involved two string attribute queries with parsing. I'm
introducing more uses of this, so it makes sense to convert this
to a more efficient encoding. The old representation was also awkward
since it was split across two separate attributes. The new encoding
just stores the default and float modes as bitfields, largely avoiding
the need to consider if the other mode is set.

The syntax in the common cases looks like this:
  `denormal_fpenv(preservesign,preservesign)`
  `denormal_fpenv(float: preservesign,preservesign)`
  `denormal_fpenv(dynamic,dynamic float: preservesign,preservesign)`

I wasn't sure about reusing the float type name instead of adding a
new keyword. It's parsed as a type but only accepts float. I'm also
debating switching the name to subnormal to match the current
preferred IEEE terminology (also used by nofpclass and other
contexts).

This has a behavior change when using the command flag debug
options to set the denormal mode. The behavior of the flag
ignored functions with an explicit attribute set, per
the default and f32 version. Now that these are one attribute,
the flag logic can't distinguish which of the two components
were explicitly set on the function. Only one test appeared to
rely on this behavior, so I just avoided using the flags in it.

This also does not perform all the code cleanups this enables.
In particular the attributor handling could be cleaned up.

I also guessed at how to support this in MLIR. I followed
MemoryEffects as a reference; it appears bitfields are expanded
into arguments to attributes, so the representation there is
a bit uglier with the 2 2-element fields flattened into 4 arguments.
2026-02-05 13:31:26 +00:00
Mingjie Xu
fac9472593
[IR] Reland Optimize PHINode::removeIncomingValue() and PHINode::removeIncomingValueIf() to use the swapping strategy. (#174274)
Reland #171963, #172639 and #173444, which were reverted in
86b9f90b9574b3a7d15d28a91f6316459dcfa046 because they introduced
non-determinism in compiles.
The non-determinism has been fixed in
9b8addffa70cee5b2acc5454712d9cf78ce45710.
2026-01-04 09:24:53 +08:00
Walter Lee
86b9f90b95
Revert 159f1c048e08a8780d92858cfc80e723c90235e3 (#173893)
This causes non-determinism in compiles.

From nikic: "FYI the non-determinism is also visible on
llvm-opt-benchmark. Maybe repeatedly running test cases from
299446d99f
could reproduce the issue..."

Also revert dependent 796fafeff92fe5d2d20594859e92607116e30a16 and
e135447bda617125688b71d33480d131d1076a72.
2025-12-29 20:23:13 -05:00
Mingjie Xu
159f1c048e
[IR] Optimize PHINode::removeIncomingValue() by swapping removed incoming value with the last incoming value. (#171963)
Current implementation uses `std::copy` to shift all incoming values
after the removed index. This patch optimizes
`PHINode::removeIncomingValue()` by replacing the linear shift of
incoming values with a swap-with-last strategy.

After this change, the relative order of incoming values after removal
is not preserved.

This improves compile-time for PHI nodes with many predecessors.
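
The two removal strategies can be sketched on a plain `std::vector`. This is only an illustration of the technique; the function names are hypothetical and the code is not the `PHINode` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Removes v[idx] by shifting every later element down one slot: O(n), and
// the relative order of the remaining elements is preserved.
void removeByShifting(std::vector<int> &v, std::size_t idx) {
  assert(idx < v.size());
  v.erase(v.begin() + static_cast<std::ptrdiff_t>(idx));
}

// Removes v[idx] by swapping it with the last element and popping: O(1),
// but the former last element now sits at idx.
void removeBySwapWithLast(std::vector<int> &v, std::size_t idx) {
  assert(idx < v.size());
  std::swap(v[idx], v.back());
  v.pop_back();
}
```

Removing index 1 from `{10, 20, 30, 40}` leaves `{10, 30, 40}` with the shifting version and `{10, 40, 30}` with the swapping version, which is the order change callers have to tolerate.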

Depends:
https://github.com/llvm/llvm-project/pull/171955
https://github.com/llvm/llvm-project/pull/171956
https://github.com/llvm/llvm-project/pull/171960
https://github.com/llvm/llvm-project/pull/171962
2025-12-17 19:44:01 +08:00
Philip Reames
c752bb9203
[IndVars] Strengthen inference of samesign flags (#170363)
When reviewing another change, I noticed that we were failing to infer
samesign for two cases: 1) an unsigned comparison, and 2) when both
arguments were known negative.

Using CVP and InstCombine as a reference, we need to be careful to not
allow eq/ne comparisons. I'm a bit unclear on the why of that, and for
now am going with the low risk change. I may return to investigate that
in a follow up.

Compile time results look like noise to me, see:
https://llvm-compile-time-tracker.com/compare.php?from=49a978712893fcf9e5f40ac488315d029cf15d3d&to=2ddb263604fd7d538e09dc1f805ebc30eb3ffab0&stat=instructions:u
2025-12-03 16:16:22 +00:00
Philip Reames
49a9787128
[SCEV] Regenerate a subset of auto updated tests
Reducing spurious diff in an upcoming change.
2025-12-02 12:16:53 -08:00
Antonio Frighetto
2f56977aea
[IndVarSimplify] Add regression test for recently-added refactor (NFC)
Add a test case for commit f54c6b4306a3f92c08aeb8a9fa222b88985cb9ef, which was previously failing after the
refactor in b27af83120b32a4b8312ddf1e6317271122769e4.
2025-11-28 11:59:46 +01:00
Lucie Choi
356479191c
[IndVarSimplify] Fix IndVarSimplify to skip unfolding predicates when the loop contains control convergence operations. (#165643)
Skip constant folding the loop predicates if the loop contains control
convergence tokens referenced outside the loop.

Fixes https://github.com/llvm/llvm-project/issues/164496.

Verified
[loop_peeling.test](https://github.com/llvm/offload-test-suite/pull/473)
passes with the fix.

Similar control convergence issues are found on other passes.
https://github.com/llvm/llvm-project/issues/165642

HLSL used for tests:
```hlsl
RWStructuredBuffer<uint> Out : register(u0);

[numthreads(8,1,1)]
void main(uint3 TID : SV_GroupThreadID) {
    for (uint i = 0; i < 8; i++) {
        if (i == TID.x) {
            Out[TID.x] = WaveActiveMax(TID.x);
            break;
        }
    }
}
```
With nested loop:
```hlsl
RWStructuredBuffer<uint> Out : register(u0);

[numthreads(8,8,1)]
void main(uint3 TID : SV_GroupThreadID) {
    for (uint i = 0; i < 8; i++) {
        for (uint j = 0; j < 8; j++) {
            if (i == TID.x && j == TID.y) {
                uint index = TID.x * 8 + TID.y;
                Out[index] = WaveActiveMax(index);
                break;
            }
        }
    }
}
```
2025-11-26 09:04:41 -08:00
Alexander Belyaev
7ee0e0f956
Revert "[LICM] Sink unused l-invariant loads in preheader. #157559"
This reverts commit 469702c5d5cc4fa18c3a962afb971950a084f373.

https://github.com/llvm/llvm-project/issues/168048
2025-11-14 14:51:33 +01:00
Antonio Frighetto
eaf3a91722
[IndVarSimplify] Ensure fp values can be represented as exact integers
When transforming floating-point induction variables into integer ones,
make sure we stay within the bounds of fp values that can be represented
as integers without gaps, i.e., 2^24 and 2^53 for IEEE-754 single and
double precision respectively (both on negative and positive side).
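
The 2^24 and 2^53 limits can be checked with a small round-trip test. This is a sketch that assumes `float` and `double` are IEEE-754 binary32 and binary64 (asserted below); the helper name is hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Every integer up to 2^digits in magnitude is exactly representable, where
// digits is the significand width: 24 for binary32, 53 for binary64.
static_assert(std::numeric_limits<float>::digits == 24, "assumes binary32");
static_assert(std::numeric_limits<double>::digits == 53, "assumes binary64");

constexpr int64_t kFloatLimit = int64_t{1} << 24;
constexpr int64_t kDoubleLimit = int64_t{1} << 53;

// Hypothetical helper: does the integer survive a round trip through FP?
template <typename FP>
constexpr bool roundTrips(int64_t v) {
  return static_cast<int64_t>(static_cast<FP>(v)) == v;
}

static_assert(roundTrips<float>(kFloatLimit), "");
static_assert(roundTrips<float>(-kFloatLimit), "");
static_assert(!roundTrips<float>(kFloatLimit + 1), "");  // first gap
static_assert(roundTrips<double>(kDoubleLimit), "");
static_assert(!roundTrips<double>(kDoubleLimit + 1), ""); // first gap
```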

Fixes: https://github.com/llvm/llvm-project/issues/166496.
2025-11-11 10:30:58 +01:00
Antonio Frighetto
9100001cd0
[IndVarSimplify] Precommit tests for PR166649 (NFC)
2025-11-11 10:30:58 +01:00
Florian Hahn
d3fe1df194
[SCEV] Improve handling of divisibility information from loop guards. (#163021)
At the moment, the effectiveness of guards that contain divisibility
information (A % B == 0) depends on the order of the conditions.

This patch makes using divisibility information independent of the
order, by collecting and applying the divisibility information
separately.

We first collect all conditions in a vector, then collect the
divisibility information from all guards.

When processing other guards, we apply divisibility info collected
earlier.

After all guards have been processed, we add the divisibility info,
rewriting the existing rewrite. This ensures we apply the divisibility
info to the largest rewrite expression.

This helps to improve results in a few cases, one in
https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2921 and another one
in a different large C/C++ based IR corpus.

PR: https://github.com/llvm/llvm-project/pull/163021
2025-11-02 14:16:24 +00:00
Vigneshwar Jayakumar
469702c5d5
[LICM] Sink unused l-invariant loads in preheader. (#157559)
Unused loop invariant loads were not sunk from the preheader to the exit
block, increasing live range.

This commit moves the sinkUnusedInvariant logic from indvarsimplify to
LICM and also adds functionality to sink unused loads that are not
clobbered by the loop body.
2025-10-30 09:23:04 -05:00
paperchalice
249883d0c5
[test][Transforms] Remove unsafe-fp-math uses part 2 (NFC) (#164786)
Post cleanup for #164534.
2025-10-23 20:31:31 +08:00
Florian Hahn
a5d3522c13
[SCEV] Rewrite A - B = UMin(1, A - B) lazily for A != B loop guards. (#163787)
Follow-up to 2d027260b0f8
(https://github.com/llvm/llvm-project/pull/160500)

Creating the SCEV subtraction eagerly is very expensive. To soften the
blow, just collect a map with inequalities and check if we can apply the
subtract rewrite when rewriting SCEVAddExpr.

Restores most of the regression:

http://llvm-compile-time-tracker.com/compare.php?from=0792478e4e133be96650444f3264e89d002fc058&to=7fca35db60fe6f423ea6051b45226046c067c252&stat=instructions:u
stage1-O3: -0.10%
stage1-ReleaseThinLTO: -0.09%
stage1-ReleaseLTO-g: -0.10%
stage1-O0-g: +0.02%
stage1-aarch64-O3: -0.09%
stage1-aarch64-O0-g: +0.00%
stage2-O3: -0.17%
stage2-O0-g: -0.05%
stage2-clang: -0.07%

There is still some negative impact compared to before 2d027260b0f8, but
there's probably not much we could do to reduce this even more.

Compile-time improvement with 2d027260b0f8 reverted on top of the
current PR:
http://llvm-compile-time-tracker.com/compare.php?from=7fca35db60fe6f423ea6051b45226046c067c252&to=98dd152bdfc76b30d00190d3850d89406ca3c21f&stat=instructions:u

stage1-O3: 60628M (-0.03%)
stage1-ReleaseThinLTO: 76388M (-0.04%)
stage1-ReleaseLTO-g: 89228M (-0.02%)
stage1-O0-g: 18523M (-0.03%)
stage1-aarch64-O3: 67623M (-0.03%)
stage1-aarch64-O0-g: 22595M (+0.01%)
stage2-O3: 52336M (+0.01%)
stage2-O0-g: 16174M (+0.00%)
stage2-clang: 34890032M (-0.03%)

PR: https://github.com/llvm/llvm-project/pull/163787
2025-10-18 13:32:40 +01:00
Florian Hahn
0590c9e828
[IndVars] Add additional tests with ICMP_NE loop guards.
Extra test coverage for
https://github.com/llvm/llvm-project/pull/163787.
2025-10-17 12:56:25 +01:00
Florian Mayer
39b0cbe69c
[IndVarSimplify] Allow predicateLoopExit on some loops with thread-local writes (#155901)
This is important to optimize patterns that frequently appear with
bounds checks:

```
for (int i = 0; i < N; ++i) {
  bar[i] = foo[i] + 123;
}
```

which gets roughly turned into

```
for (int i = 0; i < N; ++i) {
  if (i >= size of foo)
     ubsan.trap();
  if (i >= size of bar)
     ubsan.trap();
  bar[i] = foo[i] + 123;
}
```

Motivating example:
https://github.com/google/boringssl/blob/main/crypto/fipsmodule/hmac/hmac.cc.inc#L138

I hand-verified the assembly and confirmed that this optimization
removes the check in the loop.
This also allowed the loop to be vectorized.

Alive2: https://alive2.llvm.org/ce/z/3qMdLF

I did a `stage2-check-all` for both normal and
`-DBOOTSTRAP_CMAKE_C[XX]_FLAGS="-fsanitize=array-bounds
-fsanitize-trap=all"`.

I also ran some Google-internal tests with `fsanitize=array-bounds`.
Everything passes.
2025-10-16 09:18:00 -07:00
Florian Hahn
2d027260b0
[SCEV] Collect guard info for ICMP NE w/o constants. (#160500)
When collecting information from loop guards, use UMax(1, %b - %a) for
ICMP NE %a, %b, if neither are constant.
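
A small sketch of why the rewrite is sound, modelling `%b - %a` as wrapping 32-bit arithmetic. The helper names are hypothetical; the only claim illustrated is that under `a != b` the difference is non-zero, so clamping it from below with 1 leaves it unchanged.

```cpp
#include <cstdint>

constexpr uint32_t umax(uint32_t x, uint32_t y) { return x > y ? x : y; }

// Wrapping difference b - a, as an i32 `sub` would compute it.
constexpr uint32_t diff(uint32_t a, uint32_t b) { return b - a; }

// The form recorded for the guard: UMax(1, b - a).
constexpr uint32_t guardedDiff(uint32_t a, uint32_t b) {
  return umax(1u, diff(a, b));
}

// Whenever a != b, the clamped and unclamped differences coincide.
static_assert(guardedDiff(3, 10) == diff(3, 10), "");
static_assert(guardedDiff(10, 3) == diff(10, 3), ""); // wraps, still non-zero
// Only the excluded case a == b would be changed by the clamp.
static_assert(guardedDiff(7, 7) != diff(7, 7), "");
```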

This improves results in some cases, and will be even more useful
together with
 * https://github.com/llvm/llvm-project/pull/160012
 * https://github.com/llvm/llvm-project/pull/159942

https://alive2.llvm.org/ce/z/YyBvoT

PR: https://github.com/llvm/llvm-project/pull/160500
2025-10-14 14:20:34 +00:00
Florian Hahn
6a0e5b2fd7
[IndVars] Add test for missed optimizations depending on guard order.
The added tests show missed optimizations, depending on the order of
loop guard conditions.
2025-10-11 16:40:24 +01:00
Florian Mayer
1c11f72344
[NFC] [IndVarSimplify] add overflowing tests (#159877)
Also use UTC for test instead.
2025-09-30 15:19:53 -07:00
Florian Hahn
f8a7f36a61
[IndVars,LV] Add tests with pointer-based loop guards.
Add tests with pointer-based loop guards.
2025-09-22 14:14:52 +01:00
Florian Mayer
370e007740
[NFC] [IndVarSimplify] Add non-overflowing usub test (#159683)
We would reenter the loop with %i.04 being 0, so the usub would
overflow to -1. This was the only test in this file that had
an overflow in the loop, the other ones did not.
2025-09-21 12:10:45 -07:00
Florian Hahn
8693ef16f6
[SCEV] Add tests that benefit from rewriting SCEVAddExpr with guards.
Add additional tests benefiting from rewriting existing SCEVAddExprs with
guards.
2025-09-20 19:24:19 +01:00
Florian Hahn
914374624f
[SCEV] Try to push op into ZExt: C * zext (A + B) -> zext (A*C + B*C) (#155300)
Try to push constant multiply operand into a ZExt containing an add, if
possible. In general we are trying to push down ops through ZExt if
possible. This is similar to
https://github.com/llvm/llvm-project/pull/151227 which did the same for
additions.

For now this is restricted to adds with a constant operand, which is
similar to some of the logic above.

This enables some additional simplifications.
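
The fold can be illustrated with an 8-bit narrow type and a 32-bit wide type. This is a sketch with hypothetical helper names; the exact no-wrap conditions the patch checks are not spelled out above, so the example only shows that the two forms agree when the narrow arithmetic does not wrap and can differ when it does.

```cpp
#include <cstdint>

// Wrapping i8 arithmetic and zero extension to i32.
constexpr uint8_t add8(uint8_t x, uint8_t y) { return static_cast<uint8_t>(x + y); }
constexpr uint8_t mul8(uint8_t x, uint8_t y) { return static_cast<uint8_t>(x * y); }
constexpr uint32_t zext(uint8_t x) { return x; }

// C * zext(A + B): add in i8, multiply in i32.
constexpr uint32_t original(uint8_t a, uint8_t b, uint8_t c) {
  return zext(c) * zext(add8(a, b));
}

// zext(A*C + B*C): everything in i8, then extend.
constexpr uint32_t pushed(uint8_t a, uint8_t b, uint8_t c) {
  return zext(add8(mul8(a, c), mul8(b, c)));
}

// No narrow wrap: 4 * (2 + 3) == 8 + 12.
static_assert(original(2, 3, 4) == pushed(2, 3, 4), "");
// Narrow wrap: 4 * 102 == 408, but 8 + (400 mod 256) == 152.
static_assert(original(2, 100, 4) != pushed(2, 100, 4), "");
```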

Alive2 Proof: https://alive2.llvm.org/ce/z/97pbSL

PR: https://github.com/llvm/llvm-project/pull/155300
2025-08-26 19:31:50 +01:00
Florian Hahn
f0df62f7b6
[IndVars,LV] Add tests for missed SCEV simplifications with muls.
2025-08-25 22:09:15 +01:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Florian Hahn
d74d841b65
[SCEV] Try to push the op into ZExt: A + zext (-A + B) -> zext (B) (#151227)
Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not
unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence
the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.
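
A sketch of the fold with an 8-bit narrow type and a 32-bit wide type. The helper names are hypothetical, and the simplification that A fits in the narrow type with B >= A is this example's reading of the no-wrap requirement, not the exact check in the patch.

```cpp
#include <cstdint>

// Wrapping i8 arithmetic and zero extension to i32.
constexpr uint8_t neg8(uint8_t x) { return static_cast<uint8_t>(0 - x); }
constexpr uint8_t add8(uint8_t x, uint8_t y) { return static_cast<uint8_t>(x + y); }
constexpr uint32_t zext(uint8_t x) { return x; }

// A + zext(-A + B): inner add in i8, outer add in i32.
constexpr uint32_t original(uint8_t a, uint8_t b) {
  return zext(a) + zext(add8(neg8(a), b));
}

// The folded form: zext(B).
constexpr uint32_t folded(uint8_t b) { return zext(b); }

// B >= A: the offset cancels and the fold is exact (3 + 7 == 10).
static_assert(original(3, 10) == folded(10), "");
// B < A: the inner value is 254, so 3 + 254 == 257, which is not zext(1).
static_assert(original(3, 1) != folded(1), "");
```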

Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with patterns
when forming ZExts, which try to push the ZExt into the add operands.

https://alive2.llvm.org/ce/z/q7d303

PR: https://github.com/llvm/llvm-project/pull/151227
2025-07-30 21:10:57 +01:00
Florian Hahn
446b3de5b6
[IndVars] Add tests showing missed folding opportunity.
2025-07-29 21:25:52 +01:00
Ramkumar Ramachandra
bdc8736b2d
[SCEV] Move a test into IndVars (#147360)
Move the guards.ll into IndVars, as it is really an IndVars test.
2025-07-09 13:32:30 +01:00
Florian Hahn
8f79754225
[SCEV] Better preserve wrapping info in SimplifyICmpOperands for UGE. (#144404)
Update SimplifyICmpOperands to only try subtracting 1 from RHS first, if
RHS is an op we can fold the subtract directly into. Otherwise try
adding to LHS first, as we can preserve NUW flags.
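
The two candidate rewrites of an unsigned `>=` can be sketched as follows. The helper names are hypothetical; each form is only equivalent under the side condition noted in its comment, which is why the order in which they are tried matters.

```cpp
#include <cstdint>

constexpr bool uge(uint32_t l, uint32_t r) { return l >= r; }

// RHS form: L uge R  ==>  L ugt (R - 1). Valid only when R != 0.
constexpr bool ugtRhsMinusOne(uint32_t l, uint32_t r) { return l > r - 1; }

// LHS form: L uge R  ==>  (L + 1) ugt R. Valid only when L != UINT32_MAX,
// i.e. when the add cannot wrap (the NUW case).
constexpr bool lhsPlusOneUgt(uint32_t l, uint32_t r) { return l + 1 > r; }

// Both forms agree with uge when their side conditions hold.
static_assert(uge(5, 5) == ugtRhsMinusOne(5, 5), "");
static_assert(uge(5, 5) == lhsPlusOneUgt(5, 5), "");
static_assert(uge(4, 5) == ugtRhsMinusOne(4, 5), "");
static_assert(uge(4, 5) == lhsPlusOneUgt(4, 5), "");
// Each form breaks when its side condition is violated.
static_assert(uge(0, 0) != ugtRhsMinusOne(0, 0), "");
static_assert(uge(UINT32_MAX, 5) != lhsPlusOneUgt(UINT32_MAX, 5), "");
```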

This improves results in a few cases, including the modified test case
from berkeley-abc and new code to be added in
https://github.com/llvm/llvm-project/pull/128061.

Note that there are more cases where the results can be improved by
better ordering here which I'll try to investigate as follow-up.

PR: https://github.com/llvm/llvm-project/pull/144404
2025-06-17 15:30:08 +01:00
Florian Hahn
c7d85813fd
[IndVars] Add tests showing missed simplifications.
2025-06-16 16:31:21 +01:00
Craig Topper
e0cc556ad4
[IndVars] Teach widenLoopCompare to use sext if narrow IV is positive and other operand is already sext. (#142703)
This prevents us from ending up with (zext (sext X)). The zext will
require an instruction on targets where zext isn't free like RISC-V.
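
A short sketch of why either extension is legal for a non-negative narrow value, using i32 to i64 as the widths. The helper names are hypothetical; the example only shows that sext and zext agree for non-negative inputs and differ for negative ones.

```cpp
#include <cstdint>

// Sign extension and zero extension from i32 to i64.
constexpr int64_t sext(int32_t x) { return x; }
constexpr int64_t zext(int32_t x) {
  return static_cast<int64_t>(static_cast<uint32_t>(x));
}

// For a non-negative value the two extensions are interchangeable, so sext
// can be chosen to match an operand that is already sign-extended.
static_assert(sext(42) == zext(42), "");
// For a negative value they differ, so the choice is not free in general.
static_assert(sext(-1) != zext(-1), "");
```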
2025-06-10 12:52:39 -07:00
Florian Hahn
1340ecf0ba
[SCEV] Add more tests with zext(add C, %var)<nsw>.
Add extra test coverage for
https://github.com/llvm/llvm-project/pull/142599.
2025-06-03 22:03:20 +01:00
Florian Hahn
0ba63b2f22
[SCEV] Add additional test coverage for loop-guards reasoning.
Add additional tests showing missed opportunities when using loop guards
for reasoning in SCEV, depending on the order the guards appear in the
IR.
2025-06-01 22:39:37 +01:00
Craig Topper
52d2b589b2
[IndVarSimplify] Set samesign when converting signed comparison to unsigned comparison in eliminateIVComparison. (#138215)
2025-05-02 08:17:45 -07:00
Alexander Richardson
ee13638362
[AMDGPU] Remove explicit datalayout from tests where not needed
Since e39f6c1844fab59c638d8059a6cf139adb42279a opt will infer the
correct datalayout when given a triple. Avoid explicitly specifying it
in tests that depend on the AMDGPU target being present to avoid the
string becoming out of sync with the TargetInfo value.
Only tests with REQUIRES: amdgpu-registered-target or a local lit.cfg
were updated to ensure that tests for non-target-specific passes that
happen to use the AMDGPU layout still pass when building with a limited
set of targets.

Reviewed By: shiltian, arsenm

Pull Request: https://github.com/llvm/llvm-project/pull/137921
2025-04-30 10:58:17 -07:00
Sirish Pande
7f107c3019
[IndVarsSimplify] sinkUnusedInvariants is skipping instructions while sinking. (#135205)
While sinking instructions (that are loop invariant) from preheader to
the exit block, we are skipping instructions due to decrementing
instruction iterator twice.
2025-04-17 19:21:18 -05:00
Stephen Tozer
1f224d889d
[DebugInfo][IndVarSimplify] Propagate source loc when simplifying rem (#135399)
When IndVarSimplify simplifies a rem of the induction variable to a cmp
and select, only the select currently receives the rem's source
location; this patch propagates it to the cmp as well.
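
For context, a rem-to-cmp-and-select rewrite can look like the sketch below. It assumes the rem has the shape `(iv + 1) % n` with `0 <= iv < n`; the commit message does not spell out the pattern, and the function names are hypothetical.

```cpp
#include <cstdint>

// Original form: a remainder of the incremented induction variable.
constexpr uint32_t remOriginal(uint32_t iv, uint32_t n) { return (iv + 1) % n; }

// Rewritten form: one compare feeding one select. Both instructions stand in
// for the original rem, so both should carry its source location.
constexpr uint32_t remAsSelect(uint32_t iv, uint32_t n) {
  const uint32_t next = iv + 1;
  return next == n ? 0 : next; // cmp + select
}

// The two forms agree for every iv in [0, n), shown here for n == 3.
static_assert(remOriginal(0, 3) == remAsSelect(0, 3), "");
static_assert(remOriginal(1, 3) == remAsSelect(1, 3), "");
static_assert(remOriginal(2, 3) == remAsSelect(2, 3), "");
```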

Found using https://github.com/llvm/llvm-project/pull/107279.
2025-04-17 17:30:09 +01:00
Yingwei Zheng
d14acb7806
[IndVarSimplify] Handle the case where both operands are the same when widening IV (#135207)
`WidenIV::widenWithVariantUse` assumes that exactly one of the binop
operands is the IV to be widened. This miscompilation happens when it
tries to sign-extend the "NonIV" operand while the IV is zero-extended.
Closes https://github.com/llvm/llvm-project/issues/135182.
2025-04-11 09:03:06 +08:00
Yingwei Zheng
f066d7504e
[Reland][SCEV] teach isImpliedViaOperations about samesign (#133711)
This patch relands https://github.com/llvm/llvm-project/pull/124270.
Closes https://github.com/llvm/llvm-project/issues/126409.

The root cause is that we incorrectly preserve the samesign flag after
truncating operands of an icmp:
https://alive2.llvm.org/ce/z/4NE9gS

---------

Co-authored-by: Ramkumar Ramachandra <ramkumar.ramachandra@codasip.com>
2025-04-02 18:45:33 +08:00
Ramkumar Ramachandra
c6b13a2871
Revert "SCEV: teach isImpliedViaOperations about samesign" (#126506)
The commit f5d24e6c is buggy, and the following miscompiles have been
reported: #126409 and
https://github.com/llvm/llvm-project/pull/124270#issuecomment-2647222903

Revert it while we investigate.
2025-02-10 13:31:18 +00:00
Nikita Popov
7aed53eb19
[ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236)
The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.

Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).

I considered changing the definition of properlyDominates() instead, but
decided against it. The current definition is useful in other contexts,
e.g. when deciding whether an expression is safe to expand in a given
block.

Fixes https://github.com/llvm/llvm-project/issues/126012.
2025-02-10 10:07:21 +01:00
Ramkumar Ramachandra
52b59476cd
SCEV: re-org a test, regen via UTC (#126237)
2025-02-07 13:19:34 +00:00
Nikita Popov
ae08969a20
[IndVars] Add test for #126012 (NFC)
2025-02-07 12:41:23 +01:00