60 Commits

Author SHA1 Message Date
Lee Wei
abb9f9fa06
[llvm] Remove br i1 undef from some regression tests [NFC] (#117112)
This PR removes tests with `br i1 undef` under
`llvm/tests/Transforms/Loop*, Lower*`.
2024-11-21 08:06:56 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Ramkumar Ramachandra
7eea55fd4b
LoopLoadElim: re-org tests after invalid #96656 (#97598)
After pr96656.ll were added to LAA and LoopVersioning, it was decided
that the bug is in a caller of LoopVersioning, not in LAA or
LoopVersioning itself. The new candidate was LoopLoadElim, but #96656
has since been marked invalid. Hence, re-organize the added tests to
avoid confusion, and the testcase from the investigation to
LoopLoadElim.
2024-09-30 15:46:34 +01:00
Florian Hahn
e949b54a5b
[LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (#93499)
Update LAA to use PSE::getSymbolicMaxBackedgeTakenCount which returns
the minimum of the countable exits.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if any uncomputable exits leave the loop,
as long as we prove that there are no dependences given the minimum of
the countable exits. The same should apply also for generating runtime
checks.

Note that this shifts the responsiblity of checking whether all exit
counts are computable or handling early-exits to the users of LAA.

Depends on https://github.com/llvm/llvm-project/pull/93498

PR: https://github.com/llvm/llvm-project/pull/93499
2024-06-04 22:23:30 +01:00
Shan Huang
d0e2808f80
[DebugInfo][LoopLoadElim] Fix missing debug location updates (#91839) 2024-05-17 10:56:05 +01:00
Florian Hahn
977c0a6d29
[LAA] Add tests with non-constant strides & distances.
Add a number of LAA test cases with both forward and backward
dependences with non-constant strides and dependence distances.

This includes test coverage for
https://github.com/llvm/llvm-project/issues/87336

Also includes a LoopLoadElimination test to make sure the pass does not
crash on non-constant dependence distances.
2024-04-08 19:18:38 +01:00
Florian Hahn
06bb8c9f20
[VPlan] Explicitly handle scalar pointer inductions. (#83068)
Add a new PtrAdd opcode to VPInstruction that corresponds to
IRBuilder::CreatePtrAdd, which creates a GEP with source element type
i8.

This is then used to model scalarizing VPWidenPointerInductionRecipe by
introducing scalar-steps to model the index increment followed by a
PtrAdd.

Note that PtrAdd needs to be able to generate code for only the first
lane or for all lanes. This may warrant introducing a separate recipe
for scalarizing that can be created without relying on the underlying
IR.

Depends on https://github.com/llvm/llvm-project/pull/80271

PR: https://github.com/llvm/llvm-project/pull/83068
2024-03-26 16:01:57 +01:00
Nikita Popov
2d69827c5c [Transforms] Convert tests to opaque pointers (NFC) 2024-02-05 11:57:34 +01:00
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
Igor Kirillov
50dfc9e35d [LoopLoadElimination] Add support for stride equal to -1
This patch allows us to gain all the benefits provided by
LoopLoadElimination pass to descending loops.

Differential Revision: https://reviews.llvm.org/D151448
2023-06-01 12:10:53 +00:00
Krzysztof Drewniak
f0415f2a45 Re-land "[AMDGPU] Define data layout entries for buffers""
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.

This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.

Differential Revision: https://reviews.llvm.org/D149776
2023-05-03 19:43:56 +00:00
Krzysztof Drewniak
3f2fbe92d0 Revert "[AMDGPU] Define data layout entries for buffers"
This reverts commit f9c1ede2543b37fabe9f2d8f8fed5073c475d850.

Differential Revision: https://reviews.llvm.org/D149758
2023-05-03 16:11:00 +00:00
Krzysztof Drewniak
f9c1ede254 [AMDGPU] Define data layout entries for buffers
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.

The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.

The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. ptr
addrspace(8). These pointers are 128-bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.

Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.

This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.

Depends on D143437

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D145441
2023-05-03 15:25:58 +00:00
Philip Reames
78ae870f11 {tests] Rerun autogen to reduce a diff [nfc] 2023-03-31 12:47:08 -07:00
Nikita Popov
055fb7795a [Transforms] Convert some tests to opaque pointers (NFC)
These are all tests where conversion worked automatically, and
required no manual fixup.
2023-01-05 12:43:45 +01:00
Bjorn Pettersson
3528e63d89 [test] Remove duplicate RUN lines in Transform tests 2022-12-08 11:47:16 +01:00
Roman Lebedev
971bc46f8a
[NFC] Port all LoopLoadElim tests to -passes= syntax 2022-12-08 02:38:46 +03:00
Arthur Eubanks
f3a928e233 [opt] Don't translate legacy -analysis flag to require<analysis>
Tests relying on this should explicitly use -passes='require<analysis>,foo'.
2022-10-07 14:54:34 -07:00
Florian Hahn
4f827318e3
[LoopVersioning,LLE] Clear LoopAccessInfoManager after making changes.
Loop versioning changes the control-flow, which may impact SCEVs cached
by for other loops in LoopAccessInfoManager. Clear the manager after
making changes.

Fixes #57825.

Depends on D134609.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D134611
2022-10-04 21:35:42 +01:00
Florian Hahn
21e5bd4d3b
[LoopVersioning,LLE] Add -S option to runlines. 2022-10-04 14:28:04 +01:00
Florian Hahn
473bb59a2f
[LoopVersioning] Add tests where versioning requires LAA invalidation.
Additional test cases for #57825.
2022-09-24 20:33:49 +01:00
Florian Hahn
623c4a7a55
[LoopVersioning] Invalidate SCEV for phi if new values are added.
After 20d798bd47ec5191d, SCEV looks through PHIs with a single incoming
value. This means adding a new incoming value may change the SCEV for a
phi. Add missing invalidation when an existing PHI is reused during
LoopVersioning. New incoming values will be added later from the
versioned loop.

Similar issues have been fixed by also adding missing invalidation.

Fixes #57825.

Note that the test case unfortunately requires running loop-vectorize
followed by loop-load-elimination, which does the actual versioning. I
don't think it is possible to reproduce the failure without that
combination.
2022-09-23 11:53:29 +01:00
Sebastian Peryt
99c9b37d11 [NFC][1/n] Remove -enable-new-pm=0 flags from lit tests
This is the first patch in a series intended for removing flag
-enable-new-pm=0 from lit tests. This is part of a bigger
effort of completely removing legacy code related to legacy
pass manager in favor of currently default new pass manager.

In this patch flag has been removed only from tests where no significant
change has been required because checks has been duplicated for
both PMs.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D134150
2022-09-19 09:57:37 -07:00
Jolanta Jensen
958abe864a [LoopLoadElim] Add stores with matching sizes as load-store candidates
We are not building up a proper list of load-store candidates because
we are throwing away stores where the type don't match the load.
This patch adds stores with matching store sizes as candidates.
Author of the original patch: David Sherwood.

Differential Revision: https://reviews.llvm.org/D130233
2022-09-02 13:11:25 +01:00
Jolanta Jensen
92c4172756 [NFC][LoopLoadElim] Extending type-mismatch testing
Added IR for int-pointer type mismatch and int-vector
type mismatch. Regenerated CHECK lines using
the update_test_checks.py script.

Differential Revision: https://reviews.llvm.org/D132239
2022-08-30 16:44:19 +01:00
Chang-Sun Lin Jr
7ee30a0e24 [NFC][LAA] Match-up type sizes for possible extensions, based on actual bit-size rather than rounded-up byte size.
Differential Revision: https://reviews.llvm.org/D119200
2022-04-22 23:16:20 -07:00
Philip Reames
7a04526117 Autogen a couple of predicated SCEV tests 2022-02-11 13:56:35 -08:00
Arthur Eubanks
1bdc6eacba [LoopLoadElim] Support opaque pointers
With typed pointers the pointer operand type checks the address space
and the load/store type. With opaque pointers we have to check the
load/store type separately.
2022-02-09 09:22:21 -08:00
Florian Hahn
d8276208be
[LAA] Remove overeager assertion for aggregate types.
0a00d64 turned an early exit here into an assertion, but the assertion
can be triggered, as PR52920 shows.

The later code is agnostic to the accessed type, so just drop the
assert. The patch also adds tests for LAA directly and
loop-load-elimination to show the behavior is sane.
2022-01-04 15:20:35 +00:00
Max Kazantsev
16370e02a7 [IndVars] Provide eliminateIVComparison with context
We can prove more predicates when we have a context when eliminating ICmp.
As first (and very obvious) approximation we can use the ICmp instruction itself,
though in the future we are going to use a common dominator of all its users.
Need some refactoring before that.

Observed ~0.5% negative compile time impact.

Differential Revision: https://reviews.llvm.org/D98697
Reviewed By: lebedev.ri
2021-03-19 12:28:22 +07:00
Max Kazantsev
e3c759bd58 [LoopLoadElim] Pass ScalarEvolution in old pass manager. PR49141
Loop canonicalization may end up deleting blocks from CFG. And
Scalar Evolution may still keep cached referenced to those blocks
unless updated properly.
2021-02-15 18:08:23 +07:00
Florian Hahn
890079ef18
[LoopLoadElim] Add tests with uncomputable BTCs. 2021-01-01 13:57:02 +00:00
Max Kazantsev
664e1da485 [LoopLoadElim] Make sure all loops are in simplify form. PR48150
LoopLoadElim may end up expanding an AddRec from a loop
which is not the current loop. This loop may not be in simplify
form. We figure it out after the no-return point, so cannot bail
in this case.

AddRec requires simplify form to expand. The only way to ensure
this does not crash is to simplify all loops beforehand.

The issue only exists in new PM. Old PM requests LoopSimplify
required pass and it simplifies all loops before the opt begins.

Differential Revision: https://reviews.llvm.org/D91525
Reviewed By: asbirlea, aeubanks
2020-11-26 10:51:11 +07:00
Max Kazantsev
813781a923 [Test] Add Check statement 2020-11-12 10:47:34 +07:00
Max Kazantsev
c2a7d9f317 [Test] Add failing test for PR48150 2020-11-11 19:11:32 +07:00
Arthur Eubanks
da450240fd [test][NewPM] Fix LoopLoadElim tests under NPM 2020-11-09 12:24:03 -08:00
Max Kazantsev
c413a8a8ec [LoopLoadElim] Filter away candidates that stop being AddRecs after loop versioning. PR47457
The test in PR47457 demonstrates a situation when candidate load's pointer's SCEV
is no loger a SCEVAddRec after loop versioning. The code there assumes that it is
always a SCEVAddRec and crashes otherwise.

This patch makes sure that we do not consider candidates for which this requirement
is broken after the versioning.

Differential Revision: https://reviews.llvm.org/D87355
Reviewed By: asbirlea
2020-09-10 13:30:31 +07:00
Max Kazantsev
37a7c0a007 [Test] Add failing test for pr47457 2020-09-09 15:45:35 +07:00
Florian Hahn
2062b3707c [LAA] Avoid adding pointers to the checks if they are not needed.
Currently we skip alias sets with only reads or a single write and no
reads, but still add the pointers to the list of pointers in RtCheck.

This can lead to cases where we try to access a pointer that does not
exist when grouping checks.  In most cases, the way we access
PositionMap masked that, as the value would default to index 0.

But in the example in PR46854 it causes a crash.

This patch updates the logic to avoid adding pointers for alias sets
that do not need any checks. It makes things slightly more verbose, by
first checking the numbers of reads/writes and bailing out early if we don't
need checks for the alias set.

I think this makes the logic a bit simpler to follow.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D84608
2020-07-30 19:21:14 +01:00
Fangrui Song
f31811f2dc [BasicAA] Rename deprecated -basicaa to -basic-aa
Follow-up to D82607
Revert an accidental change (empty.ll) of D82683
2020-06-26 20:41:37 -07:00
Max Kazantsev
4e87823026 [LoopLoadElim] Fix crash by always checking simplify form
Loop simplify form should always be checked because logic of
propagateStoredValueToLoadUsers relies on it (in particular, it
requires preheader).

Reviewed By: Fedor Sergeev, Florian Hahn
Differential Revision: https://reviews.llvm.org/D77775
2020-04-10 09:23:28 +07:00
Max Kazantsev
7adb9e06fd [LoopLoadElim] Add test showing that LoopLoadElim doesn't work correctly with new PM 2020-04-08 17:32:03 +07:00
Matt Arsenault
86325be3d7 LoopLoadElim: Respect convergent
llvm-svn: 363162
2019-06-12 13:50:47 +00:00
Eric Christopher
cee313d288 Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552
2019-04-17 04:52:47 +00:00
Eric Christopher
a863435128 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

llvm-svn: 358546
2019-04-17 02:12:23 +00:00
Hiroshi Yamauchi
09e539fcae [PGO] Profile guided code size optimization.
Summary:
Enable some of the existing size optimizations for cold code under PGO.

A ~5% code size saving in big internal app under PGO.

The way it gets BFI/PSI is discussed in the RFC thread

http://lists.llvm.org/pipermail/llvm-dev/2019-March/130894.html 

Note it doesn't currently touch loop passes.

Reviewers: davidxl, eraman

Reviewed By: eraman

Subscribers: mgorny, javed.absar, smeenai, mehdi_amini, eraman, zzheng, steven_wu, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59514

llvm-svn: 358422
2019-04-15 16:49:00 +00:00
Chandler Carruth
baabda9317 [PM] Port LoopLoadElimination to the new pass manager and wire it into
the main pipeline.

This is a very straight forward port. Nothing weird or surprising.

This brings the number of missing passes from the new PM's pipeline down
to three.

llvm-svn: 293249
2017-01-27 01:32:26 +00:00
Mehdi Amini
27d224fbbb Fix LoopLoadElimination to keep original alignment on the inital hoisted store
This is fixing a bug where Loop Vectorization is widening a load but
with a lower alignment. Hoisting the load without propagating the alignment
will allow inst-combine to later deduce a higher alignment that what the pointer
actually is.

Differential Revision: https://reviews.llvm.org/D28408

llvm-svn: 291281
2017-01-06 21:06:51 +00:00
Adam Nemet
bd861acf29 [LLE] Don't hoist conditionally executed loads
If the load is conditional we can't hoist its 0-iteration instance to
the preheader because that would make it unconditional.  Thus we would
access a memory location that the original loop did not access.

llvm-svn: 273991
2016-06-28 04:02:47 +00:00
Adam Nemet
a9f09c6245 [LAA] Enable symbolic stride speculation for all LAA clients
This is a functional change for LLE and LDist.  The other clients (LV,
LVerLICM) already had this explicitly enabled.

The temporary boolean parameter to LAA is removed that allowed turning
off speculation of symbolic strides.  This makes LAA's caching interface
LAA::getInfo only take the loop as the parameter.  This makes the
interface more friendly to the new Pass Manager.

The flag -enable-mem-access-versioning is moved from LV to a LAA which
now allows turning off speculation globally.

llvm-svn: 273064
2016-06-17 22:35:41 +00:00