2883 Commits

Author SHA1 Message Date
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
David Sherwood
776ef9d1be
[LoopVectorize][NFC] Regenerate some early exit test CHECK lines (#124900) 2025-01-29 09:48:55 +00:00
David Sherwood
c836b8956d
[LoopVectorize][NFC] Disable output for tests that don't need it (#124747)
There are a lot of tests that do not depend upon the IR output
for validation, relying instead on the debug output. For these
tests we can add the -disable-output command line argument.
2025-01-29 08:09:50 +00:00
Nicholas Guy
cdea38f91a
Reland "[LoopVectorizer] Add support for chaining partial reductions #120272" (#124282)
Change `getScaledReduction` to take an existing vector, rather than
creating and returning a new one each call.
Rename `getScaledReduction` to `getScaledReductions` to more accurately
reflect what it's now doing.

---------

Co-authored-by: Karlo Basioli <68535415+basioli-k@users.noreply.github.com>
2025-01-28 10:40:35 +00:00
Florian Hahn
713482fccf [VPlan] Use State.get to extract lane mask for BranchOnMask.
Simplifies the code slightly and avoids redundant extracts/broadcasts if
the operand is live-in or already scalar.
2025-01-27 21:35:36 +00:00
Florian Hahn
09a29fcc8d
[VPlan] Don't collect live-ins in collectUsersInExitBlocks. (NFC) (#123819)
Live-ins don't need to be handled, other than adding to the exit phi
recipe. Do that early and assert that otherwise the exit value is
defined in the vector loop region.

This should enable simply skipping other exit values that do not need
further fixing, e.g. if handling the exit value from the early exit
directly in handleUncountableEarlyExit.

PR: https://github.com/llvm/llvm-project/pull/123819
2025-01-27 16:12:07 +00:00
David Sherwood
b7286dbef9
Reland "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop #96752" (#123616)
The last attempt failed a sanitiser build because we were
creating a reference to a null Predicates pointer in
isDereferenceableAndAlignedInLoop. This was exposed by
the unit test IsDerefReadOnlyLoop in
unittests/Analysis/LoadsTest.cpp. I fixed this by falling
back on getConstantMaxBackedgeTakenCount if Predicates is
null - see line 316 in llvm/lib/Analysis/Loads.cpp. There
are no other changes.
2025-01-27 11:59:38 +00:00
Florian Hahn
81d38da65e [LV] Add more tests for narrowing interleave groups for AArch64.
Add additional tests for
https://github.com/llvm/llvm-project/pull/106441.
2025-01-26 13:52:18 +00:00
Florian Hahn
6383a12e3b
[VPlan] Refactor HCFG builder to preserve original vector latch (NFC).
Update HCFG builder to preserve the original latch block of the initial
VPlan, ensuring there is always a latch.

It also skips creating the BranchOnCond for the latch of the top-level
loop, instead of removing it later. Exiting via the latch is controlled
by later recipes.

This further unifies HCFG construction and prepares for use to also
build an initial VPlan (VPlan0) for inner loops.
2025-01-25 13:32:01 +00:00
Elvis Wang
aff1242b8e
[LV] Align debug location of the widen-phi to the original phi. (#120338)
This patch align the debug location of the widen-phi to the debug
location of original phi.

Split from: #120054
2025-01-24 17:49:54 +08:00
David Blaikie
42043c423f Reapply "Verifier: Add check for DICompositeType elements being null"
This remove some erroneous debug info from tests that should address the
test failures that showed up when the this was previously committed.

This reverts commit 6716ce8b641f0e42e2343e1694ee578b027be0c4.
2025-01-23 22:29:30 +00:00
Vitaly Buka
0e213834df
Revert "[LoopVectorizer] Add support for chaining partial reductions (#120272)" (#124198)
Introduced stack buffer overflow, see #120272.

`getScaledReduction` can return empty vector, and there is not check for
that.

This reverts commit c9b7303b9b18129c4ee6b56aaa2a0a9f59be2d09.
This reverts commit caf0540b91b0fee31353dc7049ae836e0f814cff.
2025-01-23 14:00:33 -08:00
Nicholas Guy
caf0540b91
[LoopVectorizer] Add support for chaining partial reductions (#120272)
Chaining partial reductions, where multiple partial reductions share an
accumulator, allow for more values to be combined together as part of
the reduction without discarding the semantics of the partial reduction
itself.
2025-01-23 17:24:57 +00:00
Nicholas Guy
26b61e143b
[LoopVectorizer] Propagate underlying instruction to the cloned instances of VPPartialReductionRecipes (#123638) 2025-01-23 14:57:31 +00:00
Florian Hahn
3418cd082a
[LV] Add test showing cost-model difference after 9491f75e1d9.
Reduced test case from
https://lab.llvm.org/buildbot/#/builders/143/builds/4847.
2025-01-21 21:37:51 +00:00
Florian Hahn
6c787ff6cf
Revert "[LV]: Teach LV to recursively (de)interleave. (#122989)"
This reverts commit 9491f75e1d912b277247450d1c7b6d56f7faf885.

This triggers an assert when building with SVE enabled.
https://lab.llvm.org/buildbot/#/builders/143/builds/4795
2025-01-21 21:36:16 +00:00
David Green
8552c49046
[AArch64] Enable UseFixedOverScalableIfEqualCost for more Cortex-x cpus. (#122807)
For similar reasons for fixed-width being prefered to scalable for
Neoverse V2, this patch enables the UseFixedOverScalableIfEqualCost
feature when using -mcpu=cortex-x2, x3, x4 and x925 that are similar to
Neoverse V2.
2025-01-20 15:05:15 +00:00
Mel Chen
84c89d0aa4
[LV][EVL] Address post-commit comments for 9720be9. (NFC) (#123311) 2025-01-20 14:20:40 +08:00
Florian Hahn
2c87133c62
Reapply "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts the revert commit 58326f1d5b5b379590af92dd129b2f3b3e96af46.

The build failure in sanitizer stage2 builds has been fixed with
0d39fe6f5bb3edf0bddec09a8c6417377390aeac.

Original commit message:
Model updating IV users directly in VPlan, replace fixupIVUsers.

Now simple extracts are created for all phis in the exit block during
initial VPlan construction. A later VPlan transform
(optimizeInductionExitUsers) replaces extracts of inductions with
their pre-computed values if possible.

This completes the transition towards modeling all live-outs directly in
VPlan.

There are a few follow-ups:
* emit extracts initially also for resume phis, and optimize them
   tougher with IV exit users
* support for VPlans with multiple exits in optimizeInductionExitUsers.

Depends on https://github.com/llvm/llvm-project/pull/110004,
https://github.com/llvm/llvm-project/pull/109975 and
https://github.com/llvm/llvm-project/pull/112145.
2025-01-19 19:32:03 +00:00
Florian Hahn
8d90473c3e
[LV] Add tests with outisde IV users where vector region can e removed.
Tests for crash caused by initial version of
https://github.com/llvm/llvm-project/pull/112147.
2025-01-19 19:15:45 +00:00
Florian Hahn
58326f1d5b
Revert "[VPlan] Update final IV exit value via VPlan. (#112147)"
This reverts commit c2d15ac4d4432788557e77c15ce572ac655a8fec.

Causes build failures on PPC stage2 & fuchsia bots
    https://lab.llvm.org/buildbot/#/builders/168/builds/7650
    https://lab.llvm.org/buildbot/#/builders/11/builds/11248
2025-01-18 13:40:33 +00:00
Florian Hahn
c2d15ac4d4
[VPlan] Update final IV exit value via VPlan. (#112147)
Model updating IV users directly in VPlan, replace fixupIVUsers.

Now simple extracts are created for all phis in the exit block during
initial VPlan construction. A later VPlan transform 
(optimizeInductionExitUsers) replaces extracts of inductions with 
their pre-computed values if possible.

This completes the transition towards modeling all live-outs directly in
VPlan.

There are a few follow-ups:
* emit extracts initially also for resume phis, and optimize them 
   tougher with IV exit users
* support for VPlans with multiple exits in optimizeInductionExitUsers.


Depends on https://github.com/llvm/llvm-project/pull/110004,
https://github.com/llvm/llvm-project/pull/109975 and
https://github.com/llvm/llvm-project/pull/112145.
2025-01-18 13:22:34 +00:00
Florian Hahn
22637a877a
[Loads] Respect UseDerefAtPointSemantics in isDerefAndAlignedPointer. (#123196)
If a pointer gets freed, it may not be dereferenceable any longer, even
though there is a dominating dereferenceable assumption. As first step,
only consider assumptions if the pointer value cannot be freed if
UseDerefAtPointSemantics is used.

PR: https://github.com/llvm/llvm-project/pull/123196
2025-01-17 12:52:24 +00:00
Hassnaa Hamdi
9491f75e1d
Reland: [LV]: Teach LV to recursively (de)interleave. (#122989)
This commit relands the changes from "[LV]: Teach LV to recursively
(de)interleave. #89018"

Reason for revert:
- The patch exposed a bug in the IA pass, the bug is now fixed and landed by commit: #122643
2025-01-17 10:34:57 +00:00
Mel Chen
9720be95d6
[LV][EVL] Disable fixed-order recurrence idiom with EVL tail folding. (#122458)
The currently llvm.splice may occurs unexpected behavior if the evl of
the second-to-last iteration is not VF*UF.

Issue #122461
2025-01-17 16:55:35 +08:00
Luke Lau
e83e0c300d [LV] Add test case for #119173. NFC
This showcases a miscompile involving a widened reduction-phi.
2025-01-17 09:45:42 +08:00
Florian Hahn
4481030a03 [Loads] Use use-dereferenceable-at-point-semantics=1 in test.
Update the test to use use-dereferenceable-at-point-semantics=1.
Existing tests are updated with the nofree attribute and a new one has
been added showing that the dereferenceable assumption is used after the
pointer may be freed.
2025-01-16 12:49:50 +00:00
Nikita Popov
b0c4aed4f1 [LoopVectorize] Regenerate test checks (NFC)
Add a prefix to avoid conflicts, otherwise the test becomes
invalid on regeneration.
2025-01-16 10:21:24 +01:00
David Sherwood
a00938eedd
Revert "[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752)" (#123057)
This reverts commit bfedf6460c2cad6e6f966b457d8d27084579dcd8.
2025-01-15 13:56:42 +00:00
David Sherwood
bfedf6460c
[LoopVectorize] Add support for reverse loops in isDereferenceableAndAlignedInLoop (#96752)
Currently when we encounter a negative step in the induction
variable isDereferenceableAndAlignedInLoop bails out because
the element size is signed greater than the step. This patch
adds support for negative steps in cases where we detect the
start address for the load is of the form base + offset. In
this case the address decrements in each iteration so we need
to calculate the access size differently. I have done this by
caling getStartAndEndForAccess from LoopAccessAnalysis.cpp.

The motivation for this patch comes from PR #88385 where a
reviewer requested reusing isDereferenceableAndAlignedInLoop,
but that PR itself does support reverse loops.

The changed test in LoopVectorize/X86/load-deref-pred.ll now
passes because previously we were calculating the total access
size incorrectly, whereas now it is 412 bytes and fits
perfectly into the alloca.
2025-01-15 12:47:43 +00:00
LiqinWeng
0294dab79e
[LV][VPlan] Add fast flags for selectRecipe (#121023)
Change the inheritance of class VPWidenSelectRecipe to class
VPRecipeWithIRFlags, which allows recipe of the select to pass the
fastmath flags.The patch of #119847 will add the fastmath flag to for
recipe
2025-01-15 10:10:11 +08:00
Florian Hahn
1de3dc7d23 [LV] Bail out early if BTC+1 wraps.
Currently we fail to detect the case where BTC + 1 wraps, i.e. the
vector trip count is 0, In those cases, the minimum iteration count
check will fail, and the vector code will never be executed.

Explicitly check for this condition in computeMaxVF and avoid trying to
vectorize alltogether.

Note that a number of tests needed to be updated, because the vector
loop would never be executed given the input IR.

Fixes https://github.com/llvm/llvm-project/issues/122558.
2025-01-14 22:07:38 +00:00
Luke Lau
f925e54554
[VPlan] Fix mutating whilst iterating over users in EVL transform (#122885)
This fixes a miscompilation extracted from 525.x264_r, where we were
failing to update the runtime VF of a VPReverseVectorPointerRecipe.

We were removing a use of VF whilst iterating over the users() iterator,
which messed up the iterator in-flight and caused us to miss some
recipes. This fixes it by copying the users into a SmallVector first.

Fixes #122681
Fixes #122682
2025-01-14 22:17:51 +08:00
Jay Foad
e87f94a6a8
[llvm-project] Fix typos mutli and mutliple. NFC. (#122880) 2025-01-14 11:59:41 +00:00
Mel Chen
418f5cd6c2
[LV][EVL] Pre-commit test case for fixed-order recurrence with EVL tail folding. (NFC) (#122456)
This test case is from
SingleSource/UnitTests/Vectorizer/recurrences.test.
Pre-commit for #122458
2025-01-13 21:15:03 +08:00
Mel Chen
3397950f2d
[LV] Fix FindLastIV reduction for epilogue vectorization. (#120395)
Following 0e528ac404e13ed2d952a2d83aaf8383293c851e, this patch adjusts
the resume value of VPReductionPHIRecipe for FindLastIV reductions.
Replacing the resume value with:

  ResumeValue = ResumeValue == StartValue ? SentinelValue : ResumeValue;

This addressed the correctness issue when the start value might not be
less than the minimum value of a monotonically increasing induction
variable.

Thanks Florian Hahn for the help.

---------

Co-authored-by: Florian Hahn <flo@fhahn.com>
2025-01-13 20:58:38 +08:00
Sam Tebbs
795e35a653
Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744)
This relands the reverted #120721 with a fix for cases where neither
reduction operand are the reduction phi. Only
63114239cc8d26225a0ef9920baacfc7cc00fc58 and
63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted
PR.

---------

Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
2025-01-13 11:20:35 +00:00
Florian Hahn
8df64ed777 [LV] Don't consider IV increments uniform if exit value is used outside.
In some cases, there might be a chain of uniform instructions producing
the exit value. To generate correct code in all cases, consider the IV
increment not uniform, if there are users outside the loop.

Instead, let VPlan narrow the IV, if possible using the logic from
3ff1d01985752.

Test case from #122602 verified with Alive2:
    https://alive2.llvm.org/ce/z/bA4EGj

Fixes https://github.com/llvm/llvm-project/issues/122496.
Fixes https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 22:03:21 +00:00
Florian Hahn
f5a35a31bf [LV] Add test cases with incorrect IV live-outs.
Add test cases for https://github.com/llvm/llvm-project/issues/122496
and https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 20:55:20 +00:00
Florian Hahn
3ff1d01985 Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67.

Re-applies commit with typos fixed.
2025-01-12 20:10:28 +00:00
Florian Hahn
0ebb3ac7c9 Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac.

Typo breaking the build
2025-01-12 19:37:45 +00:00
Florian Hahn
1afba19913 [VPlan] Try to narrow wide and replicating recipes to uniform recipes.
Use the existing VPlan-based analysis to identify recipes that only have
their first lane demanded and transform them to uniform recpliate
recipes. This simplifies the generated code in some places and prepares
for fixing https://github.com/llvm/llvm-project/issues/122496.
2025-01-12 19:32:01 +00:00
Florian Hahn
44058e5b5f [LV] Precommit tests for #106441.
Tests for https://github.com/llvm/llvm-project/pull/106441
from https://github.com/llvm/llvm-project/issues/82936.
2025-01-10 18:49:44 +00:00
Florian Hahn
b6cda338ab
[Loads] Also consider getPointerAlignment when checking assumptions. (#120916)
Also use getPointerAlignment when trying to use alignment and
dereferenceable assumptions. This catches cases where dereferencable is
known via the assumption but alignment is known via getPointerAlignment
(e.g. via argument attribute or align of 1)

PR: https://github.com/llvm/llvm-project/pull/120916
2025-01-09 18:19:39 +00:00
Florian Hahn
b0697dc1de
[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994)
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update to only check isVectorizableEarlyExitLoop
if there isn't a single exit block.

PR: https://github.com/llvm/llvm-project/pull/121994
2025-01-09 12:05:19 +00:00
Benjamin Maxwell
f88ef1bd1b
[LV] Teach LoopVectorizationLegality about struct vector calls (#119221)
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.

This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.

```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```

Note: The tests require the VFABI changes from #119000 to pass.
2025-01-09 09:27:29 +00:00
Luke Lau
f0d5104c94
[VPlan] Handle some VPInstructions in may{Read,Write}FromMemory (#120058)
This just copies the same conservative definition from mayWriteToMemory,
and enables more VPInstructions to be hoisted out in LICM.

I think this should give more accurate costs, and I was able to build
llvm-test-suite without the legacy-vplan cost model assertion going off.
2025-01-08 15:17:26 +08:00
Florian Hahn
0eaa69eb23
[VPlan] Handle VPExpandSCEVRecipe in isUniformAfterVectorization.
VPExpandSCEVRecipes must be placed in the entry and are alway uniform.
This fixes a crash by always identifying them as uniform, even if the
main vector loop region has been removed.

Fixes https://github.com/llvm/llvm-project/issues/121897.
2025-01-07 21:35:09 +00:00
Florian Hahn
ea14bdb035
[LV] Add test showing debug output for loops with uncountable BTCs.
Currently we print an early-exit related related debug message, even
though there's no early exit.
2025-01-07 20:27:30 +00:00
Florian Mayer
ef391dbc29
[LV] Drop incorrect inbounds for reverse vector pointer when folding tail (#120730)
When folding the tail, we may compute an address that we don't in the
original scalar loop and it may not be inbounds. Drop Inbounds in that
case.
2025-01-07 06:14:01 -08:00