15 Commits

Author SHA1 Message Date
Arthur Eubanks
f3a928e233 [opt] Don't translate legacy -analysis flag to require<analysis>
Tests relying on this should explicitly use -passes='require<analysis>,foo'.
2022-10-07 14:54:34 -07:00
Congzhe Cao
b941857b40 [LoopInterchange] New cost model for loop interchange
This is another attempt to land this patch.

The patch proposed to use a new cost model for loop interchange,
which is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of
loops [loop0, loop1, loop2, ...] where loop0 should be replaced as
the outermost loop, loop1 should be placed one more level inside, and
loop2 one more level inside, etc. What loop cache analysis does is not
only more comprehensive than the current cost model, it is also a "one-shot"
query which means that we only need to query it once during the entire
loop interchange pass, which is better than the current cost model where
we query it every time we check whether it is profitable to interchange
two loops. Thus complexity is reduced, especially after D120386 where we
do more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some
corrections. One change that applies to all tests is that we added an option
`-cache-line-size=64` to the RUN lines. This is ensure that loop
cache analysis receives a valid number of cache line size for correct
analysis. Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but
keep it as fall-back in case the new cost model did not run successfully.
This is because currently we have some limitations in delinearization, which
sometimes makes loop cache analysis bail out. The longer term goal is to
enhance delinearization and eventually remove the legacy cost model
compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-28 00:08:37 -04:00
Evgenii Stepanov
878309cc54 Revert "[LoopInterchange] New cost model for loop interchange"
llvm/lib/Analysis/LoopCacheAnalysis.cpp:702:30: runtime error: signed
integer overflow: 6148914691236517209 * 100 cannot be represented in
type 'long'

https://lab.llvm.org/buildbot/#/builders/5/builds/25185

This reverts commit 1b24fe34b06cd9f2337313f513a8b19f9a37c5de.
2022-06-23 16:10:53 -07:00
Congzhe Cao
1b24fe34b0 [LoopInterchange] New cost model for loop interchange
This is the second attempt to land this patch.

The patch proposed to use a new cost model for loop interchange,
which is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of
loops [loop0, loop1, loop2, ...] where loop0 should be replaced as the
outermost loop, loop1 should be placed one more level inside, and loop2
one more level inside, etc. What loop cache analysis does is not only more
comprehensive than the current cost model, it is also a "one-shot" query
which means that we only need to query it once during the entire loop
interchange pass, which is better than the current cost model where we
query it every time we check whether it is profitable to interchange two
loops. Thus complexity is reduced, especially after D120386 where we do
more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some corrections.
One change that applies to all tests is that we added an option
`-cache-line-size=64` to the RUN lines. This is ensure that loop cache
analysis receives a valid number of cache line size for correct analysis.
Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but keep it
as fall-back in case the new cost model did not run successfully. This is
because currently we have some limitations in delinearization, which sometimes
makes loop cache analysis bail out. The longer term goal is to enhance
delinearization and eventually remove the legacy cost model compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-23 16:34:57 -04:00
Daniil Suchkov
f1940a5895 Revert "[LoopInterchange] New cost model for loop interchange"
Reverting the commit due to numerous buildbot failures.

This reverts commit 006334470d8d1b5d8f630890336fcb45795749d1.
2022-06-03 00:52:08 +00:00
Congzhe Cao
006334470d [LoopInterchange] New cost model for loop interchange
This patch proposed to use a new cost model for loop interchange, which
is obtained from loop cache analysis.

Given a loopnest, what loop cache analysis returns is a vector of loops
[loop0, loop1, loop2, ...] where loop0 should be replaced as the outermost
loop, loop1 should be placed one more level inside, and loop2 one more level
inside, etc. What loop cache analysis does is not only more comprehensive than
the current cost model, it is also a "one-shot" query which means that we only
need to query it once during the entire loop interchange pass, which is better
than the current cost model where we query it every time we check whether it is
profitable to interchange two loops. Thus complexity is reduced, especially after
D120386 where we do more interchanges to get the globally optimal loop access pattern.

Updates made to test cases are mostly minor changes and some corrections.
Test coverage for loop interchange is not reduced.

Currently we did not completely remove the legacy cost model, but keep it as
fall-back in case the new cost model did not run successfully. This is because
currently we have some limitations in delinearization, which sometimes makes
loop cache analysis bail out. The longer term goal is to enhance delinearization
and eventually remove the legacy cost model compeletely.

Reviewed By: bmahjour, #loopoptwg

Differential Revision: https://reviews.llvm.org/D124926
2022-06-02 19:07:14 -04:00
Congzhe Cao
1ef04326ec [LoopInterchange] Support loop interchange with floating point reductions
Enabled loop interchange support for floating point reductions
if it is allowed to reorder floating point operations.

Previously when we encouter a floating point PHI node in the
outer loop exit block, we bailed out since we could not detect
floating point reductions in the early days. Now we remove this
limiation since we are able to detect floating point reductions.

Reviewed By: #loopoptwg, Meinersbur

Differential Revision: https://reviews.llvm.org/D117450
2022-02-06 17:04:47 -05:00
Congzhe Cao
3f8be15f29 [LoopInterchange] Handle lcssa PHIs with multiple predecessors
This is a bugfix in the transformation phase.

If the original outer loop header branches to both the inner loop
(header) and the outer loop latch, and if there is an lcssa PHI
node outside the loop nest, then after interchange the new outer latch
will have an lcssa PHI node inserted which has two predecessors, i.e.,
the original outer header and the original outer latch. Currently
the transformation assumes it has only one predecessor (the original
outer latch) and crashes, since the inserted lcssa PHI node does
not take both predecessors as incoming BBs.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D100792
2021-05-11 21:30:54 -04:00
Fangrui Song
f31811f2dc [BasicAA] Rename deprecated -basicaa to -basic-aa
Follow-up to D82607
Revert an accidental change (empty.ll) of D82683
2020-06-26 20:41:37 -07:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Florian Hahn
a684a99441 [LoopInterchange] Support reductions across inner and outer loop.
This patch adds logic to detect reductions across the inner and outer
loop by following the incoming values of PHI nodes in the outer loop. If
the incoming values take part in a reduction in the inner loop or come
from outside the outer loop, we found a reduction spanning across inner
and outer loop.

With this change, ~10% more loops are interchanged in the LLVM
test-suite + SPEC2006.

Fixes https://bugs.llvm.org/show_bug.cgi?id=30472

Reviewers: mcrosier, efriedma, karthikthecool, davide, hfinkel, dmgreen

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43245

llvm-svn: 346438
2018-11-08 20:44:19 +00:00
Florian Hahn
c8bd6ea35e [LoopInterchange] Remove support for inner-only reductions.
Inner-loop only reductions require additional checks to make sure they
form a load-phi-store cycle across inner and outer loop. Otherwise the
reduction value is not properly preserved. This patch disables
interchanging such loops for now, as it causes miscompiles in some
cases and it seems to apply only for a tiny amount of loops. Across the
test-suite, SPEC2000 and SPEC2006, 61 instead of 62 loops are
interchange with inner loop reduction support disabled. With
-loop-interchange-threshold=-1000, 3256 instead of 3267.

See the discussion and history of D53027 for an outline of how such legality
checks could look like.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D53027

llvm-svn: 345877
2018-11-01 19:25:00 +00:00
Florian Hahn
6feb637124 [LoopInterchange] Preserve LCSSA.
This patch extends LoopInterchange to move LCSSA to the right place
after interchanging. This is required for LoopInterchange to become a
function pass.

An alternative to the manual moving of the PHIs, we could also re-form
the LCSSA phis for a set of interchanged loops, but that's more
expensive.

Reviewers: efriedma, mcrosier, davide

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D52154

llvm-svn: 343132
2018-09-26 19:34:25 +00:00
Florian Hahn
f3fea0f11f [LoopInterchange] Allow some loops with PHI nodes in the exit block.
We currently support LCSSA PHI nodes in the outer loop exit, if their
incoming values do not come from the outer loop latch or if the
outer loop latch has a single predecessor. In that case, the outer loop latch
will be executed only if the inner loop gets executed. If we have multiple
predecessors for the outer loop latch, it may be executed even if the inner
loop does not get executed.

This is a first step to support the case described in
https://bugs.llvm.org/show_bug.cgi?id=30472

Reviewers: efriedma, karthikthecool, mcrosier

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D43237

llvm-svn: 331037
2018-04-27 13:52:51 +00:00