648 Commits

Author SHA1 Message Date
Yingwei Zheng
4eac8daa38
[LoopPeel] Handle non-local instructions/arguments when updating exiting values (#142993)
Similar to 7e14161f49, the exiting value may be a non-local instruction
or an argument.

Closes https://github.com/llvm/llvm-project/issues/142895.
2025-06-06 12:56:28 +08:00
Florian Hahn
3a8b48862a
[LoopPeel] Add tests for peeling last iteration with loop guards.
Add additional test coverage for peeling the last iteration where
information from loop guards is needed.
2025-06-03 14:29:44 +01:00
Florian Hahn
f98bdd94e6
Reapply "[LoopPeel] Remove known trip count restriction when peeling last. (#140792)"
This reverts commit 580454526b936f7a576ddbc9bb932cf9be376ec4.

The recommitted version contains an extra check to not peel if the
latch exit is controlled by a pointer induction.

Original message:
Remove the restriction that the loop must be known to execute at least 2
iterations when peeling the last iteration. If we cannot prove at least
2 iterations are executed, a check and branch to skip the peeled loop is
inserted.

PR: https://github.com/llvm/llvm-project/pull/140792
2025-05-28 13:02:03 +01:00
Florian Hahn
f0f666bc32
[LoopPeel] Add peeling tests with debug value and pointer inductions
Adds extra test coverage for https://github.com/llvm/llvm-project/pull/140792.
2025-05-28 10:07:02 +01:00
Florian Hahn
580454526b
Revert "[LoopPeel] Remove known trip count restriction when peeling last. (#140792)"
This reverts commit 24b97756decb7bf0e26dcf0e30a7a9aaf27f417c.
Also reverts ac9a466e39bf97ffeab127982aa7c405cb257551.

Building CMake triggers a crash with the patch, revert while I
investigate.
2025-05-27 21:25:32 +01:00
Florian Hahn
ac9a466e39
[LoopPeel] Insert new phis before first non-PHI when peeling last iter.
Make sure the new phis are inserted before any non-phi instructions.
This fixes a crash when dbg_value instructions are present in the
original exit block.
2025-05-27 10:46:28 +01:00
Florian Hahn
24b97756de
[LoopPeel] Remove known trip count restriction when peeling last. (#140792)
Remove the restriction that the loop must be known to execute at least 2
iterations when peeling the last iteration. If we cannot prove at least
2 iterations are executed, a check and branch to skip the peeled loop is
inserted.

PR: https://github.com/llvm/llvm-project/pull/140792
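
A source-level sketch of the idea, not the IR the pass emits. The loop below is bottom-tested (as rotated loops are), so it always runs at least once, and the hand-peeled variant needs a guard for the case where only one iteration executes. The function names are hypothetical, and reading "the peeled loop" as the remaining loop that runs all but the last iteration is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original bottom-tested loop: runs v.size() >= 1 iterations and tests for
// the final iteration on every trip.
int sumLastDoubled(const std::vector<int> &v) {
  assert(!v.empty() && "sketch assumes at least one iteration");
  int sum = 0;
  std::size_t i = 0;
  do {
    sum += (i == v.size() - 1) ? 2 * v[i] : v[i];
    ++i;
  } while (i < v.size());
  return sum;
}

// Hand-peeled form: the remaining loop runs all but the last iteration and
// no longer needs the compare; the last iteration follows it.
int sumLastDoubledPeeled(const std::vector<int> &v) {
  assert(!v.empty() && "sketch assumes at least one iteration");
  int sum = 0;
  const std::size_t n = v.size();
  std::size_t i = 0;
  // Guard: with fewer than 2 iterations the bottom-tested loop must be
  // skipped entirely, otherwise it would run once too often.
  if (n >= 2) {
    do {
      sum += v[i];
      ++i;
    } while (i < n - 1);
  }
  // Peeled last iteration: the compare is known to be true here.
  sum += 2 * v[i];
  return sum;
}
```

When at least 2 iterations can be proven, the guard is unnecessary, which is the situation the earlier restriction required.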
2025-05-26 20:08:02 +01:00
Florian Hahn
3c9812eeea
[LoopPeel] Add tests for peeling last iteration with multiple exits. 2025-05-23 15:46:34 +01:00
Florian Hahn
4f869e0f5c
[LoopPeel] Add test for peeling last iteration with non-trivial BTC.
Additional test to https://github.com/llvm/llvm-project/pull/140792 with
different SCEV expansion costs.
2025-05-21 22:28:26 +01:00
Florian Hahn
705e27c234
[LoopPeel] Add tests for peeling from end with variable trip counts.
Add more test coverage for peeling the last iteration with variable trip
counts. Separate test cases for constant and variable trip counts in
different files.
2025-05-20 21:07:21 +01:00
Florian Hahn
a0a2a1e095
[LoopPeel] Make sure exit condition has a single use when peeling last.
Update the check in canPeelLastIteration to make sure the exiting
condition has a single use. When peeling the last iteration, we adjust
the condition in the loop body to be true one iteration early, which
would be incorrect for other users.

Fixes https://github.com/llvm/llvm-project/issues/140444.
2025-05-18 11:47:12 +01:00
Florian Hahn
7e14161f49
[LoopPeel] Handle constants when updating exit values when peeling last.
Account for constant values when updating exit values after peeling an
iteration from the end. This can happen if the inner loop gets unrolled
and simplified.

Fixes https://github.com/llvm/llvm-project/issues/140442.
2025-05-18 10:17:21 +01:00
Florian Hahn
3fcfce4c5e
Reapply "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"
This reverts the revert commit bf92b127d2637948f53d11a187e865aa10e2e74c.

This adds missing initialization of PeelLast in gatherPeelingPreferences.

Original message:
Generalize countToEliminateCompares to also consider peeling off the
last iteration if it eliminates a compare.

At the moment, codegen for peeling off the last iteration is quite
restrictive and callers have to make sure that the exit condition can be
adjusted when peeling and that the loop executes at least 2 iterations.

Both will be relaxed in follow-ups.

PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-17 10:51:05 +01:00
Florian Hahn
bf92b127d2
Revert "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"
This reverts commit bb10c3ba7f77d40a7fbfd4ac815015d3a4ae476a.

Also reverts 4f663cca15f2b53c2bc6a84d1b1f5bd81679356d:
  Revert "[LoopPeel] Make sure PeelLast is always initialized."

Revert for now to bring msan bots back to green

 https://lab.llvm.org/buildbot/#/builders/164/builds/9992
 https://lab.llvm.org/buildbot/#/builders/94/builds/7158
2025-05-16 08:33:12 +01:00
Florian Hahn
bb10c3ba7f
[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)
Generalize countToEliminateCompares to also consider peeling off the
last iteration if it eliminates a compare.

At the moment, codegen for peeling off the last iteration is quite
restrictive and callers have to make sure that the exit condition can be
adjusted when peeling and that the loop executes at least 2 iterations.

Both will be relaxed in follow-ups.

PR: https://github.com/llvm/llvm-project/pull/139551
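
To illustrate how peeling the last iteration can eliminate a compare, here is a hand-written source-level sketch (hypothetical function names, not output of the pass). It mirrors the initial restriction by assuming at least 2 iterations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Original loop: every iteration checks whether it is the last one.
std::string joinOriginal(const std::vector<std::string> &parts) {
  std::string out;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    out += parts[i];
    if (i + 1 != parts.size())
      out += ",";
  }
  return out;
}

// Last iteration peeled by hand: the compare is always true in the main
// loop and always false in the peeled iteration, so it disappears from both.
std::string joinPeeledLast(const std::vector<std::string> &parts) {
  assert(parts.size() >= 2 && "sketch assumes at least 2 iterations");
  std::string out;
  const std::size_t last = parts.size() - 1;
  for (std::size_t i = 0; i < last; ++i) {
    out += parts[i];
    out += ",";
  }
  out += parts[last];
  return out;
}
```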
2025-05-15 19:15:48 +01:00
Florian Hahn
310ed2b070
[LoopUnroll] Add tests with multiple exiting/latches and small BTCs.
Extra test coverage for cases mentioned during review of
https://github.com/llvm/llvm-project/pull/139551.
2025-05-15 12:54:00 +01:00
Florian Hahn
d39ca81fdd
[LoopPeel] Add initial tests for peeling the last iteration.
Precommit tests for upcoming PR.
2025-05-12 14:56:21 +01:00
Matt Arsenault
9bdd9dc895
AMDGPU: Mark workitem ID intrinsics with range attribute (#136196)
This avoids the need to have special handling at every use site.
Unfortunately this means we unnecessarily emit AssertZext in the DAG
(where we already directly understand the range of the intrinsic), and
we regress in undefined cases as we don't fold out asserts on undef.
2025-04-18 12:27:38 +02:00
Sirish Pande
7f107c3019
[IndVarsSimplify] sinkUnusedInvariants is skipping instructions while sinking. (#135205)
While sinking instructions (that are loop invariant) from preheader to
the exit block, we are skipping instructions due to decrementing
instruction iterator twice.
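
The class of bug can be modelled outside LLVM with a plain container. The sketch below is a simplified stand-in (a `std::vector` of names instead of a basic block, hypothetical function names) and does not reproduce the actual `sinkUnusedInvariants` code.

```cpp
#include <string>
#include <vector>

// Buggy backward walk: the iterator is stepped back at the top of the loop
// and again after an element is handled, so every other element is skipped.
std::vector<std::string>
visitBackwardBuggy(const std::vector<std::string> &insts) {
  std::vector<std::string> visited;
  auto it = insts.end();
  while (it != insts.begin()) {
    --it;
    visited.push_back(*it);
    if (it != insts.begin())
      --it; // second decrement: the element now under `it` is never visited
  }
  return visited;
}

// Fixed walk: exactly one decrement per visited element.
std::vector<std::string>
visitBackwardFixed(const std::vector<std::string> &insts) {
  std::vector<std::string> visited;
  auto it = insts.end();
  while (it != insts.begin()) {
    --it;
    visited.push_back(*it);
  }
  return visited;
}
```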
2025-04-17 19:21:18 -05:00
Yingwei Zheng
7e5317139d
[PowerPC] Pre-commit tests for PR130742. NFC. (#135606)
Needed by https://github.com/llvm/llvm-project/pull/130742.
2025-04-17 17:52:49 +08:00
Björn Pettersson
092b6e73e6
[InstCombine] Handle "add like" in ADD+GEP->GEP+GEP rewrites (#135156)
Considering that "or disjoint" is the canonical form for certain add
operations, I think we want to support such "add like" operations
when doing ADD+GEP->GEP+GEP rewrites to make things more consistent.

Problem was found when improving ValueTracking, which turned an ADD into
OR, and then suddenly optimizations got worse due to these rewrites no
longer triggering.
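
The reason an `or` with disjoint operands counts as "add like" can be checked with ordinary integer arithmetic. The sketch below uses hypothetical helper names and has nothing to do with the InstCombine implementation itself.

```cpp
#include <cstdint>

// True when a and b share no set bits, i.e. the operands are disjoint.
bool haveNoCommonBits(std::uint32_t a, std::uint32_t b) {
  return (a & b) == 0;
}

// The same offset computation written with add and with or.
std::uint32_t offsetWithAdd(std::uint32_t base, std::uint32_t low) {
  return base + low;
}
std::uint32_t offsetWithOr(std::uint32_t base, std::uint32_t low) {
  return base | low;
}

// Exhaustive check over all 8-bit pairs: whenever the operands are
// disjoint, or and add produce the same value.
bool orMatchesAddWhenDisjoint8() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      if ((a & b) == 0 && (a | b) != (a + b))
        return false;
  return true;
}
```

Since `a + b == (a | b) + (a & b)`, the two agree exactly when no carry can occur.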
2025-04-14 17:11:13 +02:00
David Sherwood
712c21336f
[AArch64] Enable unrolling for small multi-exit loops (#131998)
It can be highly beneficial to unroll small, two-block search loops
that look for a value in an array. An example of this would be
something that uses std::find to find a value in libc++. Older
versions of std::find in the libstdc++ headers are manually unrolled
in the source code, but this might change in newer releases where
the compiler is expected to either vectorise or unroll itself.
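
For reference, a small two-exit search loop and a hand-unrolled counterpart might look like the sketch below. Both functions are hypothetical illustrations written for this note; they are not copied from libc++ or libstdc++.

```cpp
#include <cassert>
#include <cstddef>

// Simple search loop with two exits: value found, or end of range reached.
const int *findSimple(const int *first, const int *last, int value) {
  for (; first != last; ++first)
    if (*first == value)
      return first;
  return last;
}

// Hand-unrolled by 4: the range check runs once per 4 elements, followed by
// a remainder loop for the final 0 to 3 elements.
const int *findUnrolled4(const int *first, const int *last, int value) {
  assert(first <= last && "sketch assumes a valid range");
  std::ptrdiff_t n = last - first;
  for (; n >= 4; n -= 4) {
    if (first[0] == value) return first;
    if (first[1] == value) return first + 1;
    if (first[2] == value) return first + 2;
    if (first[3] == value) return first + 3;
    first += 4;
  }
  for (; first != last; ++first)
    if (*first == value)
      return first;
  return last;
}
```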
2025-04-09 10:34:27 +01:00
Florian Hahn
a4573ee38d
[LoopUnroll] UnrollRuntimeMultiExit takes precedence over TTI. (#134259)
Update UnrollRuntimeLoopRemainder to always give priority to the
UnrollRuntimeMultiExit option, if provided.

After ad9da92cf6f7357 (https://github.com/llvm/llvm-project/pull/124462),
we would ignore the option if the backend indicates multi-exit is profitable.
This means it cannot be used to disable runtime unrolling.

To be consistent with canProfitablyRuntimeUnrollMultiExitLoop, always
respect the option.
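
The precedence rule can be modelled as below. This is a simplified sketch with hypothetical names: `userOption` is empty when the flag was not passed, and `profitable` stands in for whatever the target hook or the generic heuristics report. It is not the code in UnrollRuntimeLoopRemainder.

```cpp
#include <optional>

// Simplified model of the earlier behaviour: a "profitable" answer from the
// target wins, so passing the option as false cannot disable the unrolling.
bool allowMultiExitBefore(std::optional<bool> userOption, bool profitable) {
  return profitable || userOption.value_or(false);
}

// Simplified model of the behaviour after this change: an explicitly
// supplied option always takes priority; otherwise fall back to the
// profitability answer.
bool allowMultiExitAfter(std::optional<bool> userOption, bool profitable) {
  if (userOption.has_value())
    return *userOption;
  return profitable;
}
```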

This surfaced while discussing https://github.com/llvm/llvm-project/pull/131998.

PR: https://github.com/llvm/llvm-project/pull/134259
2025-04-04 10:16:50 +01:00
David Sherwood
aaf398c2e7
[AArch64] Regenerate apple-unrolling-multi-exit.ll test checks (#134257) 2025-04-04 09:03:49 +01:00
Yingwei Zheng
c5a491e9ea
[SCEV] Check whether the start is non-zero in ScalarEvolution::howFarToZero (#131522)
https://github.com/llvm/llvm-project/pull/94525 assumes that the loop
will be infinite when the stride is zero. However, it doesn't hold when
the start value of addrec is also zero.

Closes https://github.com/llvm/llvm-project/issues/131465.
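
A brute-force model shows why the start value matters. The helper below is hypothetical and uses 8-bit wrapping arithmetic; it is not how ScalarEvolution computes the answer.

```cpp
#include <cstdint>
#include <optional>

// Number of steps before the recurrence {start,+,stride} first equals zero,
// or nullopt if it never does. With 8-bit wrapping arithmetic the sequence
// repeats after at most 256 steps, so checking 256 values is exhaustive.
std::optional<unsigned> stepsUntilZero(std::uint8_t start,
                                       std::uint8_t stride) {
  std::uint8_t value = start;
  for (unsigned n = 0; n < 256; ++n) {
    if (value == 0)
      return n;
    value = static_cast<std::uint8_t>(value + stride);
  }
  return std::nullopt;
}
```

With a zero stride, the recurrence never reaches zero only if the start is non-zero; a zero start reaches it immediately.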
2025-03-17 13:59:16 +08:00
Jeremy Morse
792a6f8119
[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298)
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.

Nowadays, non-intrinsic format is the default and has been on for more
than a year, so there's no need for this flag to exist.

(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
2025-03-14 15:50:49 +00:00
Florian Hahn
46a13a5b17
[AArch64] Runtime-unroll small multi-exit loops on Apple Silicon. (#124751)
Extend unrolling preferences to allow more aggressive unrolling of
search loops with 2 exits, building on the TTI hook added in
ad9da92cf6.

In combination with eac23a5b97 this enables unrolling loops like
std::find, which can improve performance significantly (+15% end-to-end
on a workload that makes heavy use of std::find). It increases the total
number of unrolled loops by ~2.5% across a very large corpus of
workloads.

For SPEC2017, +1.6% more loops are unrolled and the following workloads
increase in size (`__text`):

    workload           base         patch        increase
    500.perlbench_r    1682884.00   1694104.00   0.7%
    523.xalancbmk_r    3001716.00   3003832.00   0.1%

PR: https://github.com/llvm/llvm-project/pull/124751
2025-02-27 14:42:45 +00:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Florian Hahn
3007f31e74
[LoopUnroll] Add AArch64 tests for multi-exit loop unrolling.
Test coverage to https://github.com/llvm/llvm-project/pull/124751.
2025-01-28 14:25:27 +00:00
Florian Hahn
d486b76823
[AArch64] Unroll some loops with early-continues on Apple Silicon. (#118499)
Try to runtime-unroll loops with early-continues depending on
loop-varying loads; this helps with branch-prediction for the
early-continues and can significantly improve performance
for such loops.

Builds on top of https://github.com/llvm/llvm-project/pull/118317.

PR: https://github.com/llvm/llvm-project/pull/118499.
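
The kind of loop meant here could look like the sketch below, shown with a hand-unrolled counterpart. The function names and the unroll factor of 2 are assumptions made for illustration; the factor the heuristics choose is not stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Loop with an early-continue that depends on a loop-varying load.
long sumSelected(const std::vector<int> &keys, const std::vector<int> &vals) {
  assert(vals.size() >= keys.size());
  long sum = 0;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    if (keys[i] == 0)
      continue; // early-continue on a value loaded in this iteration
    sum += vals[i];
  }
  return sum;
}

// Hand-unrolled by 2: each copy of the body gets its own branch, followed
// by a remainder step for an odd trip count.
long sumSelectedUnrolled2(const std::vector<int> &keys,
                          const std::vector<int> &vals) {
  assert(vals.size() >= keys.size());
  long sum = 0;
  const std::size_t n = keys.size();
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    if (keys[i] != 0)
      sum += vals[i];
    if (keys[i + 1] != 0)
      sum += vals[i + 1];
  }
  if (i < n && keys[i] != 0)
    sum += vals[i];
  return sum;
}
```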
2024-12-22 13:10:54 +00:00
Vladi Krapp
f8d270474c
[ARM] Reduce loop unroll when low overhead branching is available (#120065)
For processors with low overhead branching (LOB), runtime unrolling the
innermost loop is often detrimental to performance. In these cases the
loop remainder gets unrolled into a series of compare-and-jump blocks,
which in deeply nested loops get executed multiple times, negating the
benefits of LOB.

This is particularly noticeable when the loop trip count of the innermost
loop varies within the outer loop, such as in the case of triangular
matrix decompositions.

In these cases we will prefer to not unroll the innermost loop, with the
intention for it to be executed as a low overhead loop.
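
A source-level picture of the problem, written by hand with hypothetical names: in a triangular nest the inner trip count changes with every outer iteration, so after runtime unrolling by 4 the compare-and-branch remainder runs once per outer iteration. The unroll factor of 4 and the remainder placement are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Triangular loop nest over a row-major n x n matrix: the inner trip count
// (i + 1) varies within the outer loop.
double lowerTriangleSum(const std::vector<double> &a, std::size_t n) {
  assert(a.size() >= n * n);
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j <= i; ++j)
      sum += a[i * n + j];
  return sum;
}

// Inner loop runtime-unrolled by 4 by hand. The remainder is a chain of up
// to 3 compare-and-branch blocks executed on every outer iteration.
double lowerTriangleSumUnrolled4(const std::vector<double> &a, std::size_t n) {
  assert(a.size() >= n * n);
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const double *row = a.data() + i * n;
    const std::size_t trips = i + 1;
    const std::size_t rem = trips % 4;
    std::size_t j = 0;
    if (rem >= 1) sum += row[j++];
    if (rem >= 2) sum += row[j++];
    if (rem >= 3) sum += row[j++];
    for (; j < trips; j += 4) {
      sum += row[j];
      sum += row[j + 1];
      sum += row[j + 2];
      sum += row[j + 3];
    }
  }
  return sum;
}
```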
2024-12-18 10:10:51 +00:00
Florian Hahn
0bb7bd4b4e
[AArch64] Runtime-unroll small load/store loops for Apple Silicon CPUs. (#118317)
Add initial heuristics to selectively enable runtime unrolling for loops
where doing so is expected to be highly beneficial on Apple Silicon
CPUs.

To start with, we try to runtime-unroll small, single block loops, if
they have load/store dependencies, to expose more parallel memory
access streams [1] and to improve instruction delivery [2].

We also explicitly avoid runtime-unrolling for loop structures that may
limit the expected gains from runtime unrolling. Such loops include
loops with complex control flow (loops that aren't innermost, have
multiple exits, or have a large number of blocks), loops whose trip count
expansion is expensive, and loops expected to execute a small number of
iterations.

Note that the heuristics here may be overly conservative and we err on
the side of avoiding runtime unrolling rather than unroll excessively. 
They are all subject to further refinement.

Across a large set of workloads, this increases the total number of
unrolled loops by 2.9%.

[1] 4.6.10 in Apple Silicon CPU Optimization Guide
[2] 4.4.4 in Apple Silicon CPU Optimization Guide

Depends on https://github.com/llvm/llvm-project/pull/118316 for TTI
changes.

PR: https://github.com/llvm/llvm-project/pull/118317
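
The small, single-block load/store loops targeted by these heuristics resemble the first function below; the second shows what runtime unrolling by 4 with a remainder loop amounts to at the source level. Names and the unroll factor are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Small single-block loop with one load and one store per iteration.
void addOne(const std::vector<int> &src, std::vector<int> &dst) {
  assert(dst.size() >= src.size());
  for (std::size_t i = 0; i < src.size(); ++i)
    dst[i] = src[i] + 1;
}

// Hand-written equivalent of runtime unrolling by 4: four independent
// load/store pairs per trip, then a remainder loop for the leftover
// 0 to 3 elements, since the trip count is only known at run time.
void addOneUnrolled4(const std::vector<int> &src, std::vector<int> &dst) {
  assert(dst.size() >= src.size());
  const std::size_t n = src.size();
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    dst[i] = src[i] + 1;
    dst[i + 1] = src[i + 1] + 1;
    dst[i + 2] = src[i + 2] + 1;
    dst[i + 3] = src[i + 3] + 1;
  }
  for (; i < n; ++i)
    dst[i] = src[i] + 1;
}
```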
2024-12-09 14:28:31 +00:00
VladiKrapp-Arm
bb3eb0ca0c
[ARM] Test unroll behaviour on machines with low overhead branching (#118692)
Add test for existing loop unroll behaviour.

Current behaviour is the single loop with fmul gets runtime unrolled by
count of 4, with the loop remainder unrolled as the 3 for.body9.us.prol
sections. This is quite a lot of compare and branch, negating the
benefits of the low overhead loop mechanism.
2024-12-06 15:04:56 +00:00
Nikita Popov
f7685af4a5
[InstCombine] Move gep of phi fold into separate function
This makes sure that an early return during this fold doesn't end
up skipping later gep folds.
2024-12-05 15:20:56 +01:00
Nikita Popov
462cb3cd6c
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.

Proof: https://alive2.llvm.org/ce/z/ihztLy
2024-12-05 14:36:40 +01:00
Florian Hahn
21d27b3aab
[LoopUnroll] Add tests for loop unrolling on Apple platforms.
Add first set of tests where runtime unrolling can be highly beneficial
on Apple Silicon CPUs.
2024-12-02 15:48:48 +00:00
Lee Wei
abb9f9fa06
[llvm] Remove br i1 undef from some regression tests [NFC] (#117112)
This PR removes tests with `br i1 undef` under
`llvm/tests/Transforms/Loop*, Lower*`.
2024-11-21 08:06:56 +00:00
Stephen Tozer
92e0fb0c94
[DebugInfo][LoopUnroll] Preserve DebugLocs on optimized cond branches (#114225)
This patch fixes a simple error where as part of loop unrolling we
optimize conditional loop-exiting branches into unconditional branches
when we know that they will or won't exit the loop, but do not
propagate the source location of the original branch to the new one.

Found using https://github.com/llvm/llvm-project/pull/107279.
2024-11-08 16:52:30 +00:00
Yingwei Zheng
0b9f1cc024
[SCEV] Disallow simplifying phi(undef, X) to X (#115109)
See the following case:
```
@GlobIntONE = global i32 0, align 4

define ptr @src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
```
1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus,
`ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for
`%retval.2.peel`.
2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the
use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`.
3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand`
reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [
@GlobIntONE, %cleanup.loopexit.peel ]` to generate code for
`@GlobIntONE`. It is incorrect.

This patch disallows simplifying `phi(undef, X)` to `X` by setting
`CanUseUndef` to false.
Closes https://github.com/llvm/llvm-project/issues/114879.
2024-11-07 15:53:51 +08:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Florian Hahn
2f7ccaf4a8
[SCEV] Add predicate in SolveLinEq to ensure B is a multiple of A. (#108777)
This can help in cases where pointer alignment info is missing, e.g.
https://github.com/llvm/llvm-project/pull/108210

The predicate is formed for the complex expression that's passed to
SolveLinEquationWithOverflow and the checks could probably be pushed
closer to the root nodes, which in some cases may be cheaper to check.


PR: https://github.com/llvm/llvm-project/pull/108777
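
The role of the "B is a multiple of A" predicate can be seen in a much simplified model of a loop such as `for (x = 0; x != distance; x += step)`. The helper below is hypothetical and ignores the modular wrap-around that the real SolveLinEquationWithOverflow has to account for.

```cpp
#include <cstdint>
#include <optional>

// Exact trip count of `for (x = 0; x != distance; x += step)`, returned
// only when the runtime predicate "distance is a multiple of step" holds.
// Without the predicate, x could step past `distance` without ever
// comparing equal to it. Wrap-around is deliberately ignored here.
std::optional<std::uint64_t> exactTripCount(std::uint64_t distance,
                                            std::uint64_t step) {
  if (step == 0)
    return std::nullopt; // no progress, nothing to divide by
  if (distance % step != 0)
    return std::nullopt; // predicate fails: no exact count in this model
  return distance / step;
}
```

For a pointer loop with a 4-byte stride, known alignment would normally prove the predicate at compile time; when that information is missing it has to be checked at run time.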
2024-09-28 14:19:57 +01:00
Nikita Popov
5bcc82d433
[LoopPeel] Fix LCSSA phi node invalidation
In the test case, the BECount of the second loop uses %load,
but we only have an LCSSA phi node for %add, so that is what
gets invalidated. Use the forgetLcssaPhiWithNewPredecessor()
API instead, which invalidates the roots of the expression.

Fixes https://github.com/llvm/llvm-project/issues/109333.
2024-09-20 17:01:41 +02:00
Nikita Popov
4ec4ac15ed
[SCEVExpander] Fix addrec cost model (#106704)
The current isHighCostExpansion cost model for addrecs computes the cost
for some kind of polynomial expansion that does not appear to have any
relation to addrec expansion whatsoever.

A literal expansion of an affine addrec is a phi and add (plus the
expansion of start and step). For a non-affine addrec, we get another
phi+add for each additional addrec nested in the step recurrence.

This partially `fixes` https://github.com/llvm/llvm-project/issues/53205
(the runtime unroll test case in this PR).
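
The "phi and add" shape of a literal addrec expansion can be mimicked with loop-carried variables. The sketch below is a hypothetical illustration of that shape and not SCEVExpander code; each loop-carried variable plays the role of one phi.

```cpp
#include <cstdint>

// Affine addrec {start,+,step} evaluated at iteration n: one loop-carried
// variable (the phi) and one add per iteration.
std::int64_t affineAddRecAt(std::int64_t start, std::int64_t step,
                            unsigned n) {
  std::int64_t iv = start; // phi
  for (unsigned i = 0; i < n; ++i)
    iv += step; // add
  return iv;
}

// Non-affine addrec {start,+,step,+,step2}: the step is itself the
// recurrence {step,+,step2}, which costs one additional phi and add.
std::int64_t quadraticAddRecAt(std::int64_t start, std::int64_t step,
                               std::int64_t step2, unsigned n) {
  std::int64_t iv = start;      // phi for the outer recurrence
  std::int64_t curStep = step;  // phi for the nested step recurrence
  for (unsigned i = 0; i < n; ++i) {
    iv += curStep;    // add for the outer recurrence
    curStep += step2; // add for the nested recurrence
  }
  return iv;
}
```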
2024-09-19 09:39:35 +02:00
Ganesh
02e4186d0b
[X86] AMD Zen 5 Initial enablement (#107964)
This patch adds the basic skeleton enablement for AMD next-gen Zen 5 CPUs.
2024-09-13 17:45:33 +01:00
Nikita Popov
52b879594f
[LoopUnroll] Avoid undef values in test (NFC)
Avoid most of the code being optimized away as a result of
optimization improvements.
2024-09-03 12:10:29 +02:00
Nikita Popov
fe1a1eee2f
[Tests] Regenerate test checks (NFC) 2024-09-03 11:42:47 +02:00
Nikita Popov
9edd998e10
[LoopUnroll] Add test for #53205 (NFC) 2024-08-29 16:43:56 +02:00
Nikita Popov
fe182ddf1f
[LoopUnrollAnalyzer] Use constant folding API for loads
Use ConstantFoldLoadFromConst() instead of a partial re-implementation.
This makes the code slightly more generic by not depending on the
exact structure of the constant.
2024-08-28 11:53:25 +02:00
James Y Knight
dfeb3991fb
Remove the x86_mmx IR type. (#98505)
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.

This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.

This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.

Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)

Works towards issue #98272.
2024-07-25 09:19:22 -04:00
v01dXYZ
cff8d716bd
[SCEV] forgetValue: support (with-overflow-inst op0, op1) (#98015)
The use-def walk in forgetValue() was skipping instructions with
non-SCEVable types. However, SCEV may look past with.overflow
intrinsics returning aggregates.

Fixes #97586.
2024-07-09 09:14:33 +02:00