90 Commits

Author SHA1 Message Date
Justin Fargnoli
7c8a13ab79
[LoopPeel] change peelLoop's return type from bool to void (#177488) 2026-01-23 10:49:03 -08:00
Craig Topper
ef21740781
[LoopPeel] Check for onlyAccessesInaccessibleMemory instead of llvm.assume in peelToTurnInvariantLoadsDereferenceable. (#171910)
onlyAccessesInaccessibleMemory can't alias with a load. This allows us
to ignore more intrinsics than llvm.assume.

Follow up from #171547
2025-12-12 10:45:41 -08:00
Ramkumar Ramachandra
85fafd5db0
[SCEVExp] Get DL from SE, strip constructor arg (NFC) (#171823) 2025-12-11 14:26:47 +00:00
Craig Topper
ccc3835ffa
[LoopPeel] Ignore assume intrinsics for the mayWriteToMemory check in peelToTurnInvariantLoadsDereferenceable. (#171547)
llvm.assume intrinsics have the mayWriteToMemory property, but
won't prevent the load from becoming dereferenceable.
2025-12-10 13:14:19 -08:00
Craig Topper
2d98a366b7
[LoopPeel] Fix typo Derefencebale -> Dereferenceable. NFC (#170791)
Co-authored-by: Nikita Popov <github@npopov.com>
2025-12-05 18:17:36 +00:00
Joel E. Denny
21fedcbf89
[LoopPeel] Fix BFI when peeling last iteration without guard (#168250)
LoopPeel sometimes proves that, when reached, the original loop always
executes at least two iterations. LoopPeel then unconditionally executes
both the remaining loop's initial iteration and the peeled final
iteration. But that increases the latter's frequency above its frequency
in the original loop. To maintain the total frequency, this patch
compensates by decreasing the remaining loop's latch probability.

This is another step in issue #135812 and was discussed at
<https://github.com/llvm/llvm-project/pull/166858#discussion_r2528968542>.
2025-11-20 10:45:53 -05:00
Mircea Trofin
358e9a56af
[LP] Assign weights when peeling last iteration. (#166858) 2025-11-15 10:01:04 -08:00
Joel E. Denny
afb262855e
[LoopPeel] Fix branch weights' effect on block frequencies (#128785)
This patch implements the LoopPeel changes discussed in [[RFC] Fix Loop
Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).

In summary, a loop's latch block can have branch weight metadata that
encodes an estimated trip count that is derived from application profile
data. Initially, the loop body's block frequencies agree with the
estimated trip count, as expected. However, sometimes loop
transformations adjust those branch weights in a way that correctly
maintains the estimated trip count but that corrupts the block
frequencies. This patch addresses that problem in LoopPeel, which it
changes to:

- Maintain branch weights consistently with the original loop for the
sake of preserving the total frequency of the original loop body.
- Store the new estimated trip count in the
`llvm.loop.estimated_trip_count` metadata, introduced by PR #148758.
2025-10-02 16:07:55 +00:00
Ryotaro Kasuga
df69dfe688
[LoopPeel] Address followup comments on #121104 (#155221)
This is a follow-up PR for post-commit comments in #121104.

Details:

- Rename `mergeTwoCounter` to `mergeTwoCounters` (add trailing `s`).
- Avoid duplicated hash lookup.
- Use `///` instead of `//`.
- Fix typo.
2025-08-25 18:18:44 +09:00
Ryotaro Kasuga
2330fd2f73
[LoopPeel] Add new option to peeling loops to convert PHI into IV (#121104)
LoopPeel currently considers PHI nodes that become loop invariants
through peeling. However, in some cases, peeling transforms PHI nodes
into induction variables (IVs), potentially enabling further
optimizations such as loop vectorization. For example:

```c
// TSVC s292
int im = N-1;
for (int i=0; i<N; i++) {
  a[i] = b[i] + b[im];
  im = i;
}
```

In this case, peeling one iteration converts `im` into an IV, allowing
it to be handled by the loop vectorizer.
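
As a rough source-level illustration of the transformation described above (a hand-written sketch, not output of the pass; the function names `s292Original`, `s292Peeled`, and `checkEquivalent` are made up for this example), peeling the first iteration leaves a loop in which `im` is simply `i - 1`:

```cpp
#include <cassert>
#include <vector>

// Original form (TSVC s292): `im` lags one iteration behind `i`.
std::vector<int> s292Original(const std::vector<int> &b) {
  const int N = static_cast<int>(b.size());
  std::vector<int> a(N);
  int im = N - 1;
  for (int i = 0; i < N; i++) {
    a[i] = b[i] + b[im];
    im = i;
  }
  return a;
}

// Hand-peeled form: after the first iteration `im` always equals `i - 1`,
// so it behaves like an ordinary induction variable in the remaining loop.
std::vector<int> s292Peeled(const std::vector<int> &b) {
  const int N = static_cast<int>(b.size());
  std::vector<int> a(N);
  if (N > 0)
    a[0] = b[0] + b[N - 1]; // peeled first iteration, im == N - 1
  for (int i = 1; i < N; i++)
    a[i] = b[i] + b[i - 1]; // im == i - 1
  return a;
}

// Both forms are expected to compute the same array.
void checkEquivalent(const std::vector<int> &b) {
  assert(s292Original(b) == s292Peeled(b));
}
```

Calling `checkEquivalent` on a few inputs, including an empty vector, is a quick way to confirm that the two forms agree.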

This patch adds a new feature that peels loops in order to convert PHIs
into IVs. At the moment this feature is disabled by default.

Enabling it allows the above example to be vectorized. I have measured on
neoverse-v2 and observed a speedup of more than 60% (options: `-O3
-ffast-math -mcpu=neoverse-v2 -mllvm -enable-peeling-for-iv`).

This PR is taken over from #94900
Related #81851
2025-08-20 13:44:56 +00:00
Philip Reames
bb288de4e0
[LoopPeel] Support last iteration peeling of min/max intrinsics (#143598)
This isn't terribly useful at the moment because of the step=1
restriction but it should be functionally sound. This is mostly just
making sure the codepaths don't diverge as we make other changes.
2025-06-17 11:22:23 -07:00
Florian Hahn
e5ff7055be
[LoopPeel] Use loop guards when checking if last iter can be peeled. (#142605)
Apply loop guards to BTC before checking if the last iteration should be
peeled off. This also adds an assert to make sure applying the guards
does not pessimize the results. I checked on a large test set and it did
not trigger there, but it adds an additional guard to catch potential
cases where loop-guards pessimize results.

Peels ~15% more loops.

PR: https://github.com/llvm/llvm-project/pull/142605
2025-06-10 08:29:42 +01:00
Yingwei Zheng
4eac8daa38
[LoopPeel] Handle non-local instructions/arguments when updating exiting values (#142993)
Similar to
7e14161f49,
the exiting value may be a non-local instruction or an argument.

Closes https://github.com/llvm/llvm-project/issues/142895.
2025-06-06 12:56:28 +08:00
Florian Hahn
f98bdd94e6
Reapply "[LoopPeel] Remove known trip count restriction when peeling last. (#140792)"
This reverts commit 580454526b936f7a576ddbc9bb932cf9be376ec4.

The recommitted version contains an extra check to not peel if the
latch exit is controlled by a pointer induction.

Original message:
Remove the restriction that the loop must be known to execute at least 2
iterations when peeling the last iteration. If we cannot prove at least
2 iterations are executed, a check and branch to skip the peeled loop is
inserted.

PR: https://github.com/llvm/llvm-project/pull/140792
2025-05-28 13:02:03 +01:00
Florian Hahn
580454526b
Revert "[LoopPeel] Remove known trip count restriction when peeling last. (#140792)"
This reverts commit 24b97756decb7bf0e26dcf0e30a7a9aaf27f417c.
Also reverts ac9a466e39bf97ffeab127982aa7c405cb257551.

Building CMake triggers a crash with the patch, revert while I
investigate.
2025-05-27 21:25:32 +01:00
Florian Hahn
ac9a466e39
[LoopPeel] Insert new phis before first non-PHI when peeling last iter.
Make sure the new phis are inserted before any non-phi instructions.
This fixes a crash when dbg_value instructions are present in the
original exit block.
2025-05-27 10:46:28 +01:00
Florian Hahn
24b97756de
[LoopPeel] Remove known trip count restriction when peeling last. (#140792)
Remove the restriction that the loop must be known to execute at least 2
iterations when peeling the last iteration. If we cannot prove at least
2 iterations are executed, a check and branch to skip the peeled loop is
inserted.

PR: https://github.com/llvm/llvm-project/pull/140792
2025-05-26 20:08:02 +01:00
Florian Hahn
364d80e5c5
[LoopPeel] Make sure bound in exit condition is loop invariant.
Follow-up to post-commit comment for
https://github.com/llvm/llvm-project/pull/139551.

This should effectively be NFC, given the other existing restrictions.
2025-05-25 19:21:49 +01:00
Florian Hahn
f755e6644a
[LoopPeel] Make sure AddRec is for correct loop when peeling last iter.
Follow-up to post-commit comment for
https://github.com/llvm/llvm-project/pull/139551.

This should effectively be NFC, given the other existing restrictions.
2025-05-25 12:05:06 +01:00
Florian Hahn
a0a2a1e095
[LoopPeel] Make sure exit condition has a single use when peeling last.
Update the check in canPeelLastIteration to make sure the exiting
condition has a single use. When peeling the last iteration, we adjust
the condition in the loop body to be true one iteration early, which
would be incorrect for other users.

Fixes https://github.com/llvm/llvm-project/issues/140444.
2025-05-18 11:47:12 +01:00
Florian Hahn
7e14161f49
[LoopPeel] Handle constants when updating exit values when peeling last.
Account for constant values when updating exit values after peeling an
iteration from the end. This can happen if the inner loop gets unrolled
and simplified.

Fixes https://github.com/llvm/llvm-project/issues/140442.
2025-05-18 10:17:21 +01:00
Florian Hahn
3fcfce4c5e
Reapply "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"
This reverts the revert commit bf92b127d2637948f53d11a187e865aa10e2e74c.

This adds missing initialization of PeelLast in gatherPeelingPreferences.

Original message:
Generalize countToEliminateCompares to also consider peeling off the
last iteration if it eliminates a compare.

At the moment, codegen for peeling off the last iteration is quite
restrictive and callers have to make sure that the exit condition can be
adjusted when peeling and that the loop executes at least 2 iterations.

Both will be relaxed in follow-ups.

PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-17 10:51:05 +01:00
Florian Hahn
bf92b127d2
Revert "[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)"
This reverts commit bb10c3ba7f77d40a7fbfd4ac815015d3a4ae476a.

Also reverts 4f663cca15f2b53c2bc6a84d1b1f5bd81679356d:
  Revert "[LoopPeel] Make sure PeelLast is always initialized."

Revert for now to bring msan bots back to green

 https://lab.llvm.org/buildbot/#/builders/164/builds/9992
 https://lab.llvm.org/buildbot/#/builders/94/builds/7158
2025-05-16 08:33:12 +01:00
Florian Hahn
4f663cca15
[LoopPeel] Make sure PeelLast is always initialized.
Make sure PeelLast is initialized on all paths.

Should fix MSan bootstrap failures
     https://lab.llvm.org/buildbot/#/builders/164/builds/9992
     https://lab.llvm.org/buildbot/#/builders/94/builds/7158

Fixup after https://github.com/llvm/llvm-project/pull/139551.
2025-05-15 22:33:10 +01:00
Florian Hahn
bb10c3ba7f
[LoopPeel] Implement initial peeling off the last loop iteration. (#139551)
Generalize countToEliminateCompares to also consider peeling off the
last iteration if it eliminates a compare.

At the moment, codegen for peeling off the last iteration is quite
restrictive and callers have to make sure that the exit condition can be
adjusted when peeling and that the loop executes at least 2 iterations.

Both will be relaxed in follow-ups.

PR: https://github.com/llvm/llvm-project/pull/139551
2025-05-15 19:15:48 +01:00
Kazu Hirata
73dc2afd2c
[Transforms] Use *Set::insert_range (NFC) (#132652)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(Elem);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
2025-03-23 19:42:53 -07:00
Ramkumar Ramachandra
80bdfcd411
[LoopUtils] Don't wrap in getLoopEstimatedTripCount (#129080)
getLoopEstimatedTripCount returns the trip count based on profiling
data, and its documentation says that it could return 0 when the trip
count is zero, but this is not the case: a valid trip count can never be
zero, and it returns 0 when the unsigned ExitCount is incremented by 1
and wraps. Some callers are careful about checking for this zero value
in an std::optional, but it makes for an API with footguns, as a
std::optional return value indicates that a non-nullopt value would be a
valid trip count. Fix this by explicitly returning std::nullopt when the
return value would wrap, and strip additional checks in callers. This
also fixes a minor bug in LoopVectorize.
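
The wrap-around described above can be sketched in isolation. This is a simplified model and not the LLVM function: `estimatedTripCountFromExitCount` and `tripCountOr` are hypothetical names, and the real code derives the exit count from branch weights first.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Hypothetical helper: the estimated trip count is the exit count plus one.
// If that addition would wrap to 0, report "unknown" instead.
std::optional<unsigned> estimatedTripCountFromExitCount(unsigned ExitCount) {
  if (ExitCount == std::numeric_limits<unsigned>::max())
    return std::nullopt; // ExitCount + 1 would wrap around to 0
  return ExitCount + 1;
}

// Hypothetical caller: any engaged value is a valid, non-zero trip count,
// so no extra check for 0 is needed.
unsigned tripCountOr(unsigned ExitCount, unsigned Fallback) {
  std::optional<unsigned> TC = estimatedTripCountFromExitCount(ExitCount);
  assert((!TC || *TC > 0) && "an engaged result is never zero");
  return TC.value_or(Fallback);
}
```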
2025-03-04 08:43:08 +00:00
Ramkumar Ramachandra
4a0d53a0b0
PatternMatch: migrate to CmpPredicate (#118534)
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.

This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
2024-12-13 14:18:33 +00:00
Kazu Hirata
91b2ac640e
[Transforms] Avoid repeated hash lookups (NFC) (#112654) 2024-10-17 07:45:02 -07:00
Nikita Popov
5bcc82d433 [LoopPeel] Fix LCSSA phi node invalidation
In the test case, the BECount of the second loop uses %load,
but we only have an LCSSA phi node for %add, so that is what
gets invalidated. Use the forgetLcssaPhiWithNewPredecessor()
API instead, which will invalidate the roots of the expression
instead.

Fixes https://github.com/llvm/llvm-project/issues/109333.
2024-09-20 17:01:41 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Paul Kirth
294f3ce5dd
Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" #95136 (#95281)

Reverts #95060, and relands #86609, with the unintended code generation
changes addressed.

This patch implements the changes to LLVM IR discussed in
https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
llvm.expect* intrinsics from those added via other methods, e.g. from
profiles or inserted by the compiler.

One of the major motivations is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case of ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
2024-06-12 12:52:28 -07:00
Paul Kirth
607afa0b63
Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060)
Reverts llvm/llvm-project#86609

This change causes compile-time regressions for stage2 builds
(https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u).
It also introduced unintended changes to `.text` which should be
addressed before relanding.
2024-06-11 08:06:06 +02:00
Paul Kirth
c5978f1eb5
[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609)
This patch implements the changes to LLVM IR discussed in

https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032

In this patch, we add an optional field to MD_prof metadata nodes for
branch weights, which can be used to distinguish weights added from
`llvm.expect*` intrinsics from those added via other methods, e.g.
from profiles or inserted by the compiler.

One of the major motivations is for use with MisExpect diagnostics,
which need to know if branch_weight metadata originates from an
llvm.expect intrinsic. Without that information, we end up checking
branch weights multiple times in the case of ThinLTO + SampleProfiling,
leading to some inaccuracy in how we report MisExpect related
diagnostics to users.

Since we change the format of MD_prof metadata in a fundamental way, we
need to update code handling branch weights in a number of places.

We also update the lang ref for branch weights to reflect the change.
2024-06-10 11:27:21 -07:00
Sergey Kachkov
f34dedbf44
[LoopPeel] Support min/max intrinsics in loop peeling (#93162)
This patch adds processing of min/max intrinsics in LoopPeel in a
similar way to what was done for conditional statements: for
min/max(IterVal, BoundVal) we peel iterations where IterVal < BoundVal
for monotonically increasing IterVal; for monotonically decreasing
IterVal we peel iterations where IterVal > BoundVal (strict comparison
predicates are used to minimize the number of peeled iterations).
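
A source-level sketch of the increasing case may help. It is hand-written under the assumption that `i` starts at 0 and steps by 1. The names `maxLoopOriginal` and `maxLoopPeeled` are hypothetical, and the peeled iterations are written as a short loop here rather than as straight-line copies.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Original loop: max(i, Bound) with a monotonically increasing i.
std::vector<int> maxLoopOriginal(int N, int Bound) {
  std::vector<int> Out;
  for (int i = 0; i < N; ++i)
    Out.push_back(std::max(i, Bound));
  return Out;
}

// Hand-peeled form: the iterations with i < Bound are split off.
// In the remaining loop max(i, Bound) is always i, so the max disappears.
std::vector<int> maxLoopPeeled(int N, int Bound) {
  std::vector<int> Out;
  int i = 0;
  for (; i < N && i < Bound; ++i) // peeled iterations: max(i, Bound) == Bound
    Out.push_back(Bound);
  for (; i < N; ++i) {
    assert(std::max(i, Bound) == i); // the compare is no longer needed
    Out.push_back(i);
  }
  return Out;
}
```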
2024-05-31 13:58:10 +03:00
Joshua Cao
5602636835
[LoopPeel] Peel iterations based on and, or conditions (#73413)
For example, this allows us to peel this loop with an `and`:
```
for (int i = 0; i < N; ++i) {
  if (i % 2 == 0 && i < 3) // can peel based on || as well
    f1();
  f2();
}
```
into:
```
for (int i = 0; i < 3; ++i) { // peel three iterations
  if (i % 2 == 0)
    f1();
  f2();
}
for (int i = 3; i < N; ++i)
  f2();
```
2023-12-02 11:24:02 -08:00
Matthias Braun
cb4627d150
Add setBranchWeights convenience function. NFC (#72446)
Add `setBranchWeights` convenience function to ProfDataUtils.h and use
it where appropriate.
2023-11-16 10:55:19 -08:00
Aleksandr Popov
e8d5db206c
[LoopPeeling] Fix weights updating of peeled off branches (#70094)
In https://reviews.llvm.org/D64235 a new algorithm has been introduced
for updating the branch weights of latch blocks and their copies.

It increases the probability of going to the exit block for each next
peel iteration, calculating weights by (F - I * E, E), where:
- F is the weight of the edge from latch to header.
- E is the weight of the edge from latch to exit.
- I is the index of the peeled iteration.

E.g: Let's say the latch branch weights are (100,300) and the estimated
trip count is 4. If we peel off all 4 iterations the weights of the
copied branches will be:
0: (100,300)
1: (100,200)
2: (100,100)
3: (100,1)

https://godbolt.org/z/93KnoEsT6

So we make the original loop almost unreachable from the 3rd peeled copy
according to the profile data. But that's only true if the profiling
data is accurate.
An underestimated trip count can lead to performance issues with the
register allocator, which may decide to spill intervals inside the loop
assuming it's unreachable.

Since we don't know how accurate the profiling data is, it seems better
to set neutral 1/1 weights on the last peeled latch branch. After this
change, the weights in the example above will look like this:
0: (100,300)
1: (100,200)
2: (100,100)
3: (100,100)
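
The two listings above can be reproduced with a small model. This is a sketch, not LLVM's implementation: `peeledLatchWeights` is a hypothetical name, the pairs are emitted in the order used by the listings (exit weight first), and the floor of 1 in the "before" mode is inferred from the `(100,1)` entry.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// F: latch->header weight, E: latch->exit weight (as defined above).
// Returns one (E, backedge weight) pair per peeled iteration.
// NeutralFloor == false models the old behaviour (floor of 1),
// NeutralFloor == true models the new one (never below a 1:1 ratio).
std::vector<std::pair<uint32_t, uint32_t>>
peeledLatchWeights(uint32_t F, uint32_t E, unsigned PeelCount,
                   bool NeutralFloor) {
  assert(E > 0 && "this sketch assumes a non-zero exit weight");
  std::vector<std::pair<uint32_t, uint32_t>> Weights;
  const uint32_t Floor = NeutralFloor ? E : 1;
  for (unsigned I = 0; I < PeelCount; ++I) {
    const uint64_t Sub = uint64_t(I) * E; // I * E, computed without overflow
    // F - I * E, clamped so it never drops below the chosen floor.
    const uint32_t Back = (F > Sub + Floor) ? uint32_t(F - Sub) : Floor;
    Weights.emplace_back(E, Back);
  }
  return Weights;
}
```

With `F = 300`, `E = 100`, and four peeled iterations, the old mode ends in `(100,1)` and the new mode ends in `(100,100)`, matching the two listings.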

Co-authored-by: Aleksandr Popov <apopov@azul.com>
2023-10-31 14:02:42 +01:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Nikita Popov
a6705053c3 [LoopPeel] Clear dispositions after peeling
Block dispositions of values defined inside the loop may change
during peeling, so clear them. We already do this for other kinds
of unrolling.

Differential Revision: https://reviews.llvm.org/D153762
2023-07-19 10:39:59 +02:00
Joshua Cao
849d01bf3d [LoopUnroll] Peel iterations based on select conditions
This also allows us to peel loops with a `select`:
```
for (int i = 0; i <= N; ++i)
  f3(i == 0 ? a : b); // select instruction
```
into:
```
f3(a); // peel one iteration
for (int i = 1; i <= N; ++i)
  f3(b);
```

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151052
2023-05-24 00:57:57 -07:00
Anna Thomas
05b060b0b0 [LoopPeel] Expose ValueMap of last peeled iteration. NFC
The value map of last peeled iteration is computed within peelLoop API.
This patch exposes it for callers of peelLoop.
While this is not currently used by upstream passes, we have a usecase
downstream which benefits from this API update. Future users of peelLoop
can also use the ValueMap if needed.

Similar value maps are exposed by other loop utilities such as loop
cloning.

Differential Revision: https://reviews.llvm.org/D138228
2022-12-19 09:55:29 -05:00
Kazu Hirata
6eb0b0a045 Don't include Optional.h
These files no longer use llvm::Optional.
2022-12-14 21:16:22 -08:00
Vasileios Porpodas
dc891846b8 [NFC] Cleanup: Replace Function::getBasicBlockList().splice() with Function::splice()
This is part of a series of patches that aim at making Function::getBasicBlockList() private.

Differential Revision: https://reviews.llvm.org/D139984
2022-12-14 15:34:19 -08:00
Fangrui Song
c178ed33bd Transforms/Utils: llvm::Optional => std::optional 2022-12-12 08:29:05 +00:00
Jamie Schmeiser
2b6683fd5f Expand loop peeling phi computation to handle binary ops and casts
Summary:
Expand the capabilities of the code for computing how many peels are
needed to make phis determined.  A cast gets the peel count for the
value being cast while a binary op gets the maximum of the operands.
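
The two rules can be illustrated on a toy expression tree. Everything in this sketch is hypothetical (the `Node` type, `peelsToDetermine`, and the depth cut-off). The rules for invariant values and phis are assumptions added so the example is complete; only the cast and binary-op rules come from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Toy expression node; not an LLVM type.
struct Node {
  enum Kind { Invariant, Phi, Cast, BinOp, Unknown } K;
  const Node *Op0 = nullptr; // Phi: backedge value, Cast: operand, BinOp: LHS
  const Node *Op1 = nullptr; // BinOp: RHS
};

// Number of peels after which the value is determined, or std::nullopt.
std::optional<unsigned> peelsToDetermine(const Node &N, unsigned Depth = 0) {
  if (Depth > 8)
    return std::nullopt; // assumed cut-off so cyclic chains terminate
  switch (N.K) {
  case Node::Invariant:
    return 0u; // assumption: invariant values need no peeling
  case Node::Phi: {
    assert(N.Op0 && "phi needs a backedge value");
    auto In = peelsToDetermine(*N.Op0, Depth + 1);
    if (!In)
      return std::nullopt;
    return *In + 1; // assumption: one more peel than its backedge value
  }
  case Node::Cast:
    assert(N.Op0 && "cast needs an operand");
    return peelsToDetermine(*N.Op0, Depth + 1); // same as the cast operand
  case Node::BinOp: {
    assert(N.Op0 && N.Op1 && "binary op needs two operands");
    auto L = peelsToDetermine(*N.Op0, Depth + 1);
    auto R = peelsToDetermine(*N.Op1, Depth + 1);
    if (!L || !R)
      return std::nullopt;
    return std::max(*L, *R); // maximum of the two operands
  }
  case Node::Unknown:
    break;
  }
  return std::nullopt;
}
```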

Respond to review comments: remove redundant asserts.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: mkazantsev (Max Kazantsev), syzaara (Zaara Syeda)
Differential Revision: https://reviews.llvm.org/D138719
2022-12-05 12:10:53 -05:00
Kazu Hirata
343de6856e [Transforms] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:37 -08:00
Kazu Hirata
88988c50f8 [Utils] Use std::optional in LoopPeel.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-26 17:53:17 -08:00
Jamie Schmeiser
be1ff1fe58 [NFC] Refactor loop peeling code for calculating phi invariance.
Summary:
Refactor loop peeling code by moving code for calculating phi invariance
into a separate class that does the calculation.  Redescribe and rework
the algorithm in preparation for adding increased functionality.  Add
test case that does not exhibit peeling that will be subsequently supported.

Author: Jamie Schmeiser <schmeise@ca.ibm.com>
Reviewed By: mkazantsev (Max Kazantsev)
Differential Revision: https://reviews.llvm.org/D138232
2022-11-25 09:07:14 -05:00
Vasileios Porpodas
af4e856fa7 [NFC] Replaced BB->getInstList().{erase(),pop_front(),pop_back()} with eraseFromParent().
Differential Revision: https://reviews.llvm.org/D138617
2022-11-23 22:47:46 -08:00