200 Commits

Author SHA1 Message Date
Ryotaro Kasuga
b75530ff03
[LoopInterchange] Consider forward/backward dependency in vectorize heuristic (#133672)
The vectorization heuristic of LoopInterchange attempts to move a
vectorizable loop to the innermost position. Before this patch, a loop
was deemed vectorizable if there are no loop-carried dependencies
induced by the loop.
This patch extends the vectorization heuristic by introducing the
concept of forward and backward dependencies, inspired by
LoopAccessAnalysis. Specifically, an additional element is appended to
each direction vector to indicate whether it represents a forward
dependency (`<`) or not (`*`). Among these, only the forward
dependencies (i.e., those whose last element is `<`) affect the
vectorization heuristic. Accordingly, the check is conservative, and
dependencies are considered forward only when this can be proven.
Currently, we only support perfectly nested loops whose body consists of
a single basic block. For other cases, dependencies are pessimistically
treated as non-forward.
2025-07-25 22:37:20 +09:00
Sjoerd Meijer
a9f8143072
[LoopInterchange] Ignore the cost-model, force interchange if legal (#148858)
This is and has been proven useful for testing purposes, to get more
test coverage.
2025-07-18 13:42:25 +01:00
Kazu Hirata
1c6422c4bd [Scalar] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Scalar/LoopInterchange.cpp:863:20: error: unused
  variable 'OpCode' [-Werror,-Wunused-variable]
2025-07-15 08:00:46 -07:00
Ryotaro Kasuga
b3c293c5b9
[LoopInterchange] Drop nuw/nsw flags from reduction ops when interchanging (#148612)
Before this patch, when a reduction exists in the loop, the legality
check of LoopInterchange only verified if there exists a
non-reassociative floating-point instruction in the reduction
calculation. However, it is insufficient, because reordering integer
reductions can also lead to incorrect transformations. Consider the
following example:

```c
int A[2][2] = {
  { INT_MAX, INT_MAX },
  { INT_MIN, INT_MIN },
};

int sum = 0;
for (int i = 0; i < 2; i++)
  for (int j = 0; j < 2; j++)
    sum += A[j][i];
```

To make this exchange legal, we must drop nuw/nsw flags from the
instructions involved in the reduction operations.

This patch extends the legality check to correctly handle such cases. In
particular, for integer addition and multiplication, it verifies that
the nsw and nuw flags are set on involved instructions, and drop them
when the transformation actually performed. This patch also introduces
explicit checks for the kind of reduction and permits only those that
are known to be safe for interchange. Consequently, some "unknown"
reductions (at the moment, `FindFirst*` and `FindLast*`) are rejected.

Fix #148228
2025-07-15 22:04:16 +09:00
Ryotaro Kasuga
29487759e3
[LoopInterchange] Defer CacheCost calculation until needed (#146874)
LoopInterchange currently stop and exit its process when the number of
loads/stores in the loop is greater than `MaxMemInstrCount`. This
prevents excessive querying to DependenceAnalysis. However, computing
`CacheCost` also involves DependenceAnalysis queries, and their number
can grow to `O(N^2)` in the worst case, where `N` is the number of
loads/stores in the loop. Therefore, we should also avoid calculating it
if the loads/stores count exceeds `MaxMemInstrCount`.
This patch defers the calculation of `CacheCost` until it is actually
needed to reduce compile time. This avoids computing `CacheCost` when
the number of loads/stores is large. Additionally, since this patch
delays its calculation as much as possible, it is also effective in
other scenarios, e.g., when there are no legal loop pairs to exchange.
2025-07-08 17:44:34 +09:00
Ramkumar Ramachandra
be9502b095
[LoopInterchange] Modernize loops (NFC) (#146105) 2025-06-28 11:41:05 +01:00
Ramkumar Ramachandra
7613c2412a
[LoopInterchange] Use ArrayRef in more places (NFC) (#146077) 2025-06-27 17:22:24 +01:00
Kazu Hirata
2a35414e98
[Transforms] Use range-based for loops (NFC) (#145252)
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-06-25 10:08:26 -07:00
Ryotaro Kasuga
1e5f7f64b0
[LoopInterchange] Handle confused dependence correctly (#140709)
This patch fixes the handling of a confused `Dependence` object. Such an
object doesn’t contain any information about dependencies, so we must
process it conservatively. However, it was converted into a direction
vector like `[I I ... I]`. As a result, it was treated as if there are
no loop-carried dependencies, which can lead to illegal loop exchanges.

Fixes #140238
2025-06-05 16:44:02 +09:00
Kazu Hirata
8876790763
[Scalar] Use llvm::count (NFC) (#141445) 2025-05-26 09:13:25 -07:00
Ryotaro Kasuga
e99ca74dc1
[LoopInterchange] Relax the legality check to accept more patterns (#139690)
When proving the legality of exchanging two loops, it doesn't need to
check the elements of the direction vectors associated with the loops
outside of the two target loops. Before this patch, the legality check
looked at all elements of a direction vector to calculate the
lexicographically order of the vector, which may reject some legal
exchanges. For example, if a direction vector is `[* < =]`, it is safe
to swap the last two loops because the corresponding subsequence of the
vector (`[< =]`) is lexicographically positive for both before and after
the exchange. However, the its order is unknown if we don't drop the
prefix since the first element is `*`. This patch improves the logic of
legality check to ignore such unrelated prefixes of direction vectors.
2025-05-13 21:14:14 +09:00
Ryotaro Kasuga
6dc6ca3302
[LoopInterchange] Skip legality check if surrounding loops already guarantee dependency (NFC) (#139254)
The legality check in LoopInterchange allows two loops to be exchanged
if all direction vectors are lexicographically positive (or zero) for
both before and after the swap. The current implementation performs this
routine naively. However, if a direction vector is lexicographically
positive due to an element corresponding to a loop that is outside the
given two loops (i.e., if there is an element `<` before the loops we
are trying to interchange), then obviously it is also positive after
exchanging them. For example, for a direction vector `[< < >]`, swapping
the last two elements doesn't make it lexicographically negative because
the first element is `<`.
This patch adds a code to skip legality check if surrounding loops
already guarantee that the direction vector is lexicographically
positive. Note that this is only a small improvement on its own, but
it's necessary to relax the legality check I'm working on.

Split off from #118267

---------

Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>
2025-05-13 10:19:17 +09:00
Kazu Hirata
6ab7cb7899
[Transforms] Remove unused local variables (NFC) (#138442) 2025-05-04 00:35:22 -07:00
Kazu Hirata
b2ba53172e
[Transforms] Construct SmallVector with iterator ranges (NFC) (#136259) 2025-04-18 10:27:05 -07:00
Ryotaro Kasuga
91f3965be4
[LoopInterchange] Fix the vectorizable check for a loop (#133667)
In the profitability check for vectorization, the dependency matrix was
not handled correctly. This can result to make a wrong decision: It may
say "this loop can be vectorized" when in fact it cannot. The root cause
of this is that the check process early returns when it finds '=' or 'I'
in the dependency matrix. To make sure that we can actually vectorize
the loop, we need to check all the rows of the matrix. This patch fixes
the process of checking whether we can vectorize the loop or not. Now it
won't make a wrong decision for a loop that cannot be vectorized.

Related: #131130
2025-04-03 16:21:19 +09:00
Ryotaro Kasuga
528e408b94
[LoopInterchange] Add an option to control the cost heuristics applied (#133664)
LoopInterchange has several heuristic functions to determine if
exchanging two loops is profitable or not. Whether or not to use each
heuristic and the order in which to use them were fixed, but #125830
allows them to be changed internally at will. This patch adds a new
option to control them via the compiler option.

The previous patch also added an option to prioritize the vectorization
heuristic. This patch also removes it to avoid conflicts between it and
the newly introduced one, e.g., both
`-loop-interchange-prioritize-vectorization=1` and
`-loop-interchange-profitabilities='cache,vectorization'` are specified.
2025-04-02 15:41:40 +09:00
Ryotaro Kasuga
281028e37c
[LoopInterchange] Prevent from undoing its own transformation (#127473)
LoopInterchange uses the bubble-sort fashion algorithm to sort the loops
in a loop nest. For two loops `A` and `B`, it calls `isProfitable(A, B)`
to determine whether these loops should be exchanged. The result of
`isProfitable(A, B)` should be conservative, that is, it should return
true only when we are sure exchanging `A` and `B` will improve
performance. If it's not conservative enough, it's possible that a
subsequent `isProfitable(B, A)` will also return true, in which case
LoopInterchange will undo its previous transformation. To avoid such
cases, `isProfitable(B, A)` must not return true if `isProfitable(A, B)`
returned true in the past. However, the current implementation can be in
such a situation. This patch resolves it by correcting the handling of
two loops that have the same cache cost.

This resolve the problem mentioned in
https://github.com/llvm/llvm-project/pull/118267#issuecomment-2510759354.
2025-03-21 21:42:49 +09:00
Ryotaro Kasuga
17b202fc17
[LoopInterchange] Add an option to prioritize vectorization (#131988)
The LoopInterchange cost-model consists of several decision rules. They
are called one by one, and if some rule can determine the profitability,
then the subsequent rules aren't called. In the current implementation,
the rule for `CacheCostAnalysis` is called first, and if it fails to
determine the profitability, then the rule for vectorization is called.
However, there are cases where interchanging loops for vectorization
makes the code faster even if such exchanges are detrimental to the
cache. For example, exchanging the inner two loops in the following
example looks about x3 faster in my local (compiled with `-O3
-mcpu=neoverse-v2 -mllvm -cache-line-size=64`), even though it's
rejected by the rule based on cache cost. (NOTE: LoopInterchange cannot
exchange these loops due to legality checks. This should also be
improved.)

```c
__attribute__((aligned(64))) float aa[256][256],bb[256][256],cc[256][256],
                                   dd[256][256],ee[256][256],ff[256][256];

// Alternative of TSVC s231 with more array accesses than the original.
void s231_alternative() {
  for (int nl = 0; nl < 100*(100000/256); nl++) {
    for (int i = 0; i < 256; ++i) {
      for (int j = 1; j < 256; j++) {
        aa[j][i] = aa[j-1][i] + bb[j][i] + cc[i][j]
                 + dd[i][j] + ee[i][j] + ff[i][j];
      }
    }
  }
}
```

This patch introduces a new option to prioritize the vectorization rule
over the cache cost rule.

Related issue: #131130

---------

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-03-21 17:03:06 +09:00
Kazu Hirata
a3b0189f6d
[Transforms] Avoid repeated hash lookups (NFC) (#130890) 2025-03-12 08:48:27 -07:00
Ryotaro Kasuga
aa37a698d4
[LoopInterchange] Move some processes to another function (NFC) (#129514)
Some post-processing involved in exchanging a pair of loops has been
done separately from `processLoop`, which is a main function that does
the transformation. It's better to consolidate these processes into the
same function. This patch is a preparation of #127474.
2025-03-04 17:24:04 +09:00
Alireza Torabian
3c74430320
[DependenceAnalysis][NFC] Removing PossiblyLoopIndependent parameter (#124615)
Parameter PossiblyLoopIndependent has lost its intended purpose. This
flag is always set to true in all cases when depends() is called, hence
we want to reconsider the utility of this variable and remove it from
the function signature entirely. This is an NFC patch.
2025-02-11 16:23:28 -05:00
Madhur Amilkanthwar
0074a462f1
[LoopInterchange] Hoist isComputableLoopNest() in the control flow (#124247)
The profiling of the LLVM Test-suite reveals that a significant portion,
specifically 14,090 out of 139,323, loop nests were identified as
non-viable candidates for transformation, leading to the transform
exiting from isComputableLoopNest() without any action.

More importantly, dependence information was computed for these loop
nests before reaching the function isComputableLoopNest(), which does
not require DI and relies solely on scalar evolution (SE).

To enhance compile-time efficiency, this patch moves the call to
isComputableLoopNest() earlier in the control-flow, thereby avoiding
unnecessary dependence calculations.

The impact of this change is evident on the compile-time-tracker, with
the overall geometric mean improvement recorded at 0.11%, while the
lencode benchmark gets a more substantial benefit of 0.44%.
This improvement can be tracked in the isc-ln-exp-2 branch under my
repo.
2025-02-05 13:50:17 +05:30
Ryotaro Kasuga
690f251063
[LoopInterchange] Handle LE and GE correctly (#124901)
LoopInterchange have converted `DVEntry::LE` and `DVEntry::GE` in
direction vectors to '<' and '>' respectively. This handling is
incorrect because the information about the '=' it lost. This leads to
miscompilation in some cases. To resolve this issue, convert them to '*'
instead.

Resolve #123920
2025-01-29 19:30:54 +09:00
Jeremy Morse
6292a808b3
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.

This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).

---------

Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2025-01-24 13:27:56 +00:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Madhur Amilkanthwar
d15f3e828d
[LoopInterchange] Constrain LI within supported loop nest depth (#118656)
This patch is an extension to #115128.

After profiling LLVM test-suite, I see a lot of loop nest of depth more
than `MaxLoopNestDepth` which is 10. Early exit for them would save
compile-time as it would avoid computing DependenceInfo and CacheCost.

Please see 'bound-max-depth' branch on compile-time-tracker.
2025-01-23 10:41:54 +05:30
Madhur Amilkanthwar
5d281a480e
[LoopInterchange] Constrain number of load/stores in a loop (#118973)
In the current state of the code, the transform computes entries for the
dependency matrix until `MaxMemInstrCount` which is 100. After 99th
entry, it terminates and thus overall wastes compile-time.

It would be nice if we can compute total number of entries upfront and
early exit if the number of entries > 100. However, computing the number
of entries is not always possible as it depends on two factors:
1. Number of load-store pairs in a loop.
2. Number of common loop levels for each of the pair.

This patch constrains the whole computation on the number of loads and
stores instructions in the loop.

In another approach, I experimented with computing 1 and constraining
the number of pairs, but that did not lead to any additional benefit in
terms of compile time. However, when other issues are fixed, I can
revisit this approach.
2025-01-21 10:49:19 +05:30
Sjoerd Meijer
456ec1c2f4
[LoopInterchange] Remove 'S' Scalar Dependencies (#119345)
We are not handling 'S' scalar dependencies correctly and have at least
the following miscompiles related to that:

[LoopInterchange] incorrect handling of scalar dependencies and dependence vectors starting with ">" #54176
[LoopInterchange] Interchange breaks program correctness #46867
[LoopInterchange] Loops should not interchanged due to dependencies #47259
[LoopInterchange] Loops should not interchanged due to control flow #47401

This patch does no longer insert the "S" dependency/direction into the
dependency matrix, so a dependency is never "S". We seem to have
forgotten what the exact meaning is of this dependency type, and don't
see why it should be treated differently.

We prefer correctness over incorrect and more aggressive results. I.e.,
this prevents the miscompiles at the expense of handling less cases,
i.e. making interchange more pessimistic. However, some of the cases
that are now rejected for dependence analysis reasons, were rejected
before too but for other reasons (e.g. profitability). So at least for
the llvm regression tests, the number of regression are very reasonable.
This should be a stopgap. We would like to get interchange enabled by
default and thus prefer correctness over unsafe transforms, and later
see if we can get solve the regressions.
2025-01-20 13:04:58 +00:00
Sjoerd Meijer
cac6f21149
[LoopInterchange] Make the entries of the Dependency Matrix unique (#116195)
The entries in the dependency matrix can contain a lot of duplicates,
which is unnecessary and results in more checks that we can avoid, and
this patch adds that.
2024-11-19 11:40:53 +00:00
Madhur Amilkanthwar
1eaa17975d
[LoopInterchange] Bail out early if minimum loop nest is not met (#115128)
This patch bails out early if minimum depth
is not met. As it stands today, the pass computes
CacheCost before it attempts to do the transform.
This is not needed if minimum depth is not met.
This handles basic cases where depth is typically 1.

As the patch avoids unnecessary computation, it is aimed to improve
compile-time.
2024-11-19 09:58:20 +05:30
Kazu Hirata
94f9cbbe49
[Scalar] Remove unused includes (NFC) (#114645)
Identified with misc-include-cleaner.
2024-11-02 08:32:26 -07:00
Kazu Hirata
b7146aed5b
[Transforms] Construct SmallVector with ArrayRef (NFC) (#101851) 2024-08-03 15:33:08 -07:00
Victor Toni
900be9013f
Fix typos (#88565) 2024-04-12 14:33:35 -07:00
Jeremy Morse
4427407a29 [NFC][RemoveDIs] Create a new spelling of the moveBefore method
As outlined in my proposal of how to get rid of debug intrinsics, this
patch adds a moveBefore method that signals the caller /intends/ the order
of moved instructions is to stay the same. This semantic difference has an
effect on debug-info, as it signals whether debug-info needs to move with
instructions or not.

The patch just replaces a few calls to moveBefore with calls to
moveBeforePreserving -- and the latter just calls the former, so it's all
NFC right now. A future patch will add an implementation of
moveBeforePreserving that takes action to correctly preserve debug-info,
but that's tightly coupled with our non-instruction debug-info
representation that's still being reviewed.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D156369
2023-09-07 18:37:57 +01:00
Nikita Popov
143ed21b26 Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)"
This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7.

In preparation for reverting a dependent revision.
2023-06-05 16:45:38 +02:00
Nikita Popov
d5c56c5162 [SCEVExpander] Remember phi nodes inserted by LCSSA construction
SCEVExpander keeps track of all instructions it inserted. However,
it currently misses some phi nodes created during LCSSA construction.
Fix this by collecting these into another argument.

This also removes the IRBuilder argument, which was added for
essentially the same purpose, but only handles the root LCSSA nodes,
not those inserted by SSAUpdater.

This was reported as a regression on D149344, but the reduced test
case also reproduces without it.

Differential Revision: https://reviews.llvm.org/D150681
2023-05-25 09:34:19 +02:00
Nikita Popov
5362a0d859 [LCSSA] Remove unused ScalarEvolution argument (NFC)
After D149435, LCSSA formation no longer needs access to
ScalarEvolution, so remove the argument from the utilities.
2023-05-02 12:17:05 +02:00
Bjorn Pettersson
a20f7efbc5 Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in
passes that no longer has support for the legacy PM.
2023-04-17 13:54:19 +02:00
Kazu Hirata
7b014a0732 [Scalar] Use range-based for loops (NFC) 2023-04-16 09:05:20 -07:00
Kazu Hirata
c8f9555c4d [Transforms] Use *{Set,Map}::contains (NFC) 2023-03-14 00:24:30 -07:00
Craig Topper
fac4c476b9 [LoopInterchange] Remove unused RecurrenceDescriptor object. NFC 2023-03-03 22:15:32 -08:00
Fangrui Song
21ccddd2ce [LoopInterchange] Remove legacy pass (unused in the pipeline)
Following recent changes to remove non-core legacy passes.
2023-02-15 15:09:29 -08:00
Ram-NK
ee7188c8b2 [LoopInterchange] Correcting the profitability check
Before D135808, There would be endless loop interchange posibility (no
proper priority was there in profitability check. Any profitable check
may leads to loop-interchange). With this patch, there  is no endless
interchange (priority in profitable check is defined. Order of decision
is 'Cache cost' check, 'InstrOrderCost', 'Vectorization'). Corrected the
dependency checking inside isProfitableForVectorization(), corrected the
checking of bad order loops in isProfitablePerInstrOrderCost().

Reviewed By: Meinersbur, bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D135808
2023-01-16 14:36:06 -05:00
Joshua Cao
b2b4d95827 [NFC][LoopFlatten][LoopInterchange] Do not explicitly forget subloops
We don't need to explicitly forget subloops because forgetting parent
loops will automatically forget their subloops

Differential Revision: https://reviews.llvm.org/D141029
2023-01-12 08:08:11 -08:00
Vasileios Porpodas
bebca2b6d5 [NFC] Cleanup: Replaces BB->getInstList().splice() with BB->splice().
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D138979
2022-12-01 15:37:51 -08:00
Mengxuan Cai
cd58333a62 [LoopInterchange] Refactor and rewrite validDepInterchange()
The current code of validDepInterchange() enumerates cases that are
legal for interchange. This could be simplified by checking
lexicographically order of the swapped direction matrix.

Reviewed By: congzhe, Meinersbur, bmahjour

Differential Revision: https://reviews.llvm.org/D137461
2022-11-17 13:41:02 -05:00
Congzhe Cao
75b33d6bd5 [LoopInterchange] Check phis in all subloops
This is the bugfix to the miscompile mentioned in
https://reviews.llvm.org/D132055#3814831. The IR
that reproduced the bug is added as the test case in
this patch.

What this patch does is that, during legality phase
instead of checking the phi nodes only in `InnerLoop`
and `OuterLoop`, we check phi nodes in all subloops
of the `OuterLoop`. Suppose if the loop nest is triply
nested, and `InnerLoop` and `OuterLoop` is the middle
loop and the outermost loop respectively, we'll check
phi nodes in the innermost loop as well, in addition to
the ones in the middle and outermost loops.

Reviewed By: Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D134930
2022-11-04 00:20:52 -04:00
Ram-NK
a58b6acf1f [NFC][LoopInterchange] Clean up of irrelevent dependency checking with
isOuterMostDepPositive()

The function isOuterMostDepPositive() is checked after negative dependence
vectors are normalized to be non-negative, so there will not be any negative
dependency ('>' as the outermost non-equal sign) after normalization. And
therefore the check in isOuterMostDepPositive() is irrelevent and redundant.

Reviewed By: congzhe

Differential Revision: https://reviews.llvm.org/D132982
2022-10-04 14:54:08 -04:00
Congzhe Cao
22c91df52c [LoopInterchange][PR57148] Ensure the correct form of IR after transformation
This is a bugfix patch that resolves the following two bugs in loop interchange:

1. PR57148 which is an assertion error due to of loss of LCSSA form after interchange,
   as referred to test1() in pr57148.ll.
2. Use before def for the outermost loop induction variables after interchange,
   as referred to test2() in pr57148.ll.

The fix in this patch is that:

1. In cases where the LCSSA form is not maintained after interchange, we update the IR
   to the LCSSA form again.
2. We split the phi nodes in the inner loop header into a separate basic block to avoid
   the situation where use of the outer indvar appears before its def after interchange.
   Previously we already did this for innermost loops, now we do it for non-innermost
   loops (e.g., middle loops) as well.

Reviewed By: bmahjour, Meinersbur, #loopoptwg

Differential Revision: https://reviews.llvm.org/D132055
2022-09-22 00:20:53 -04:00
Congzhe Cao
6782d71680 [LoopPassManager] Ensure to construct loop nests with the outermost loop
This patch is to resolve the bug reported and discussed in
https://reviews.llvm.org/D124926#3718761 and https://reviews.llvm.org/D124926#3719876.

The problem is that loop interchange is a loopnest pass under the new pass manager,
but the loop nest may not be constructed correctly by the loop pass manager after
running loop interchange and before running the next pass, which might cause problems
when it continues running the next pass.

The reason that the loop nest is constructed incorrectly is that the outermost
loop might have changed after interchange, and what was the original outermost
loop is not the current outermost loop anymore. Constructing the loop nest based
on the original outermost loop would generate an invalid loop nest.

The fix in this patch is that, in the loop pass manager before running each loopnest
pass, we re-cosntruct the loop nest based on the current outermost loop, if LPMUpdater
notifies the loop pass manager that the previous loop nest has been invalidated by passes
like loop interchange.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D132199
2022-09-21 23:59:26 -04:00