2847 Commits

Author SHA1 Message Date
Sam Tebbs
795e35a653
Reland "[LoopVectorizer] Add support for partial reductions" with non-phi operand fix. (#121744)
This relands the reverted #120721 with a fix for cases where neither
reduction operand are the reduction phi. Only
63114239cc8d26225a0ef9920baacfc7cc00fc58 and
63114239cc8d26225a0ef9920baacfc7cc00fc58 are new on top of the reverted
PR.

---------

Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
2025-01-13 11:20:35 +00:00
Florian Hahn
8df64ed777 [LV] Don't consider IV increments uniform if exit value is used outside.
In some cases, there might be a chain of uniform instructions producing
the exit value. To generate correct code in all cases, consider the IV
increment not uniform, if there are users outside the loop.

Instead, let VPlan narrow the IV, if possible using the logic from
3ff1d01985752.

Test case from #122602 verified with Alive2:
    https://alive2.llvm.org/ce/z/bA4EGj

Fixes https://github.com/llvm/llvm-project/issues/122496.
Fixes https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 22:03:21 +00:00
Florian Hahn
f5a35a31bf [LV] Add test cases with incorrect IV live-outs.
Add test cases for https://github.com/llvm/llvm-project/issues/122496
and https://github.com/llvm/llvm-project/issues/122602.
2025-01-12 20:55:20 +00:00
Florian Hahn
3ff1d01985 Recommit "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 0ebb3ac7c92c4c1c44e7f3d17832d75ec5a42a67.

Re-applies commit with typos fixed.
2025-01-12 20:10:28 +00:00
Florian Hahn
0ebb3ac7c9 Revert "[VPlan] Try to narrow wide and replicating recipes to uniform recipes."
This reverts commit 1afba19913253dda865a8e57b37b9f4dabead1ac.

Typo breaking the build
2025-01-12 19:37:45 +00:00
Florian Hahn
1afba19913 [VPlan] Try to narrow wide and replicating recipes to uniform recipes.
Use the existing VPlan-based analysis to identify recipes that only have
their first lane demanded and transform them to uniform recpliate
recipes. This simplifies the generated code in some places and prepares
for fixing https://github.com/llvm/llvm-project/issues/122496.
2025-01-12 19:32:01 +00:00
Florian Hahn
44058e5b5f [LV] Precommit tests for #106441.
Tests for https://github.com/llvm/llvm-project/pull/106441
from https://github.com/llvm/llvm-project/issues/82936.
2025-01-10 18:49:44 +00:00
Florian Hahn
b6cda338ab
[Loads] Also consider getPointerAlignment when checking assumptions. (#120916)
Also use getPointerAlignment when trying to use alignment and
dereferenceable assumptions. This catches cases where dereferencable is
known via the assumption but alignment is known via getPointerAlignment
(e.g. via argument attribute or align of 1)

PR: https://github.com/llvm/llvm-project/pull/120916
2025-01-09 18:19:39 +00:00
Florian Hahn
b0697dc1de
[LV] Only check isVectorizableEarlyExitLoop with multiple exits. (#121994)
Currently we emit early-exit related debug messages/remarks even when
there is a single exit. Update to only check isVectorizableEarlyExitLoop
if there isn't a single exit block.

PR: https://github.com/llvm/llvm-project/pull/121994
2025-01-09 12:05:19 +00:00
Benjamin Maxwell
f88ef1bd1b
[LV] Teach LoopVectorizationLegality about struct vector calls (#119221)
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.

This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.

```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```

Note: The tests require the VFABI changes from #119000 to pass.
2025-01-09 09:27:29 +00:00
Luke Lau
f0d5104c94
[VPlan] Handle some VPInstructions in may{Read,Write}FromMemory (#120058)
This just copies the same conservative definition from mayWriteToMemory,
and enables more VPInstructions to be hoisted out in LICM.

I think this should give more accurate costs, and I was able to build
llvm-test-suite without the legacy-vplan cost model assertion going off.
2025-01-08 15:17:26 +08:00
Florian Hahn
0eaa69eb23
[VPlan] Handle VPExpandSCEVRecipe in isUniformAfterVectorization.
VPExpandSCEVRecipes must be placed in the entry and are alway uniform.
This fixes a crash by always identifying them as uniform, even if the
main vector loop region has been removed.

Fixes https://github.com/llvm/llvm-project/issues/121897.
2025-01-07 21:35:09 +00:00
Florian Hahn
ea14bdb035
[LV] Add test showing debug output for loops with uncountable BTCs.
Currently we print an early-exit related related debug message, even
though there's no early exit.
2025-01-07 20:27:30 +00:00
Florian Mayer
ef391dbc29
[LV] Drop incorrect inbounds for reverse vector pointer when folding tail (#120730)
When folding the tail, we may compute an address that we don't in the
original scalar loop and it may not be inbounds. Drop Inbounds in that
case.
2025-01-07 06:14:01 -08:00
Florian Hahn
f9369cc602
[VPlan] Make sure last IV increment value is available if needed.
Legalize extract-from-ends using uniform VPReplicateRecipe of wide
inductions to use regular VPReplicateRecipe, so the correct end value
is available.

Fixes https://github.com/llvm/llvm-project/issues/121745.
2025-01-06 22:40:41 +00:00
Florian Hahn
d0c00cf078
[LV] Add test case for #121745.
Test for https://github.com/llvm/llvm-project/issues/121745.
2025-01-06 22:28:44 +00:00
David Sherwood
a3fff3a14d
[LoopVectorize][NFC] Fix arith-fp-frem-costs.ll test to use new vplan cost model (#120742) 2025-01-06 10:26:51 +00:00
Florian Hahn
f4230b4332
[VPlan] Add and use debug location for VPScalarCastRecipe.
Update the recipe it always take a debug location and set it.
2025-01-05 20:08:51 +00:00
Florian Hahn
747f7f38bd
[LV] Add test with conditional load from invariant addr and assumption.
This adds missing test coverage for isDereferenceableAndAlignedInLoop,
related to https://github.com/llvm/llvm-project/pull/96752.
2025-01-05 19:31:54 +00:00
Florian Hahn
f48884ded8
[VPlan] Remove loop region in optimizeForVFAndUF. (#108378)
Update optimizeForVFAndUF to completely remove the vector loop region
when possible. At the moment, we cannot remove the region if it contains

* widened IVs: the recipe is needed to generate the step vector
* reductions: ComputeReductionResults requires the reduction phi recipe
for codegen.

Both cases can be addressed by more explicit modeling.

The patch also includes a number of updates to allow executing VPlans
without a vector loop region.

Depends on https://github.com/llvm/llvm-project/pull/110004
2025-01-05 15:50:42 +00:00
Florian Hahn
df4a615c98
[VPlan] Convert induction increment check to be VPlan-based.
Check the VPlan directly to determine if a VPValue is an optimiziable IV
or IV use instead of checking the underlying IR instructions.

Split off from https://github.com/llvm/llvm-project/pull/112147. This
refactoring enables moving IV end value creation from the legacy
fixupIVUsers to a VPlan-based transform.

There is one case we now won't optimize, that is IVs with subtracts and
non-constant steps. But as this is a minor optimization and doesn't
impact correctness, the benefits of performing the check in VPlan should
outweigh the missed case.
2025-01-05 11:16:01 +00:00
Luke Lau
7700695739
[VPlan] Fix crash with EVL tail folding intrinsic with no corresponding VP (#121542)
This fixes a crash when building SPEC CPU 2017 with EVL tail folding
when widening @llvm.log10 intrinsics.

@llvm.log10 and some other intrinsics don't have a corresponding VP
intrinsic, so this fixes the crash by removing the assert and bailing
instead.
2025-01-05 11:41:56 +08:00
Florian Hahn
b95cce9904
[VPlan] Update wide induction inc recipes to use same step as Wide IV.
Update wide induction increments to use the same step as the corresponding
wide induction. This enables detecting induction increments directly in
VPlan and removes redundant splats.
2025-01-04 20:04:59 +00:00
Florian Hahn
4a7c0b8afe
[LV] Add X86-specific induction step tests.
Adds additional test coverage for induction codegen.
2025-01-04 15:09:04 +00:00
Florian Hahn
47ac7fa861
[LV] Add tests with wide inductions and live-in step.
Also regenerate check lines and simplify existing tests and names.
2025-01-04 14:50:04 +00:00
Florian Mayer
62b5cf0410
[Vectorizer] precommit test for miscompilation (#120731)
we generate GEPs that are out of bounds but mark them as "inbound"
2025-01-03 06:37:45 -08:00
John Brawn
073e65a8e5
[LoopVectorize] Make needsExtract notice scalarized instructions (#119720)
LoopVectorizationCostModel::needsExtract should recognise instructions
that have been widened by scalarizing as scalar instructions, and thus
not needing an extract when used by later scalarized instructions.

This fixes an incorrect cost calculation in computePredInstDiscount,
where we are adding a scalarization overhead cost when we shouldn't,
though I haven't come up with a test case where it makes a difference.
It will make a difference when the cost model switches to using the cost
kind TCK_CodeSize for optsize, as not doing this causes the test
LoopVectorize/X86/small-size.ll to get worse.
2025-01-02 14:31:36 +00:00
Florian Hahn
3026ecaff5
[LV] Also verify loops in vector loop removal tests.
Also verify loop info in tests added in 7d6ec3b9680.
2024-12-31 20:11:23 +00:00
Florian Hahn
7d6ec3b968
[LV] Add more tests for vector loop removal.
Add missing test coverage of loops where the vector loop region can be
removed that include replicate recipes as well as nested loops.

Extra test coverage for https://github.com/llvm/llvm-project/pull/108378.
2024-12-31 20:08:54 +00:00
Muhammad Omair Javaid
332d2647ff Revert "[LV]: Teach LV to recursively (de)interleave. (#89018)"
This reverts commit ccfe0de0e1e37ed369c9bf89dd0188ba0afb2e9a.

This breaks LLVM build on AArch64 SVE Linux buildbots
https://lab.llvm.org/buildbot/#/builders/143/builds/4462
https://lab.llvm.org/buildbot/#/builders/17/builds/4902
https://lab.llvm.org/buildbot/#/builders/4/builds/4399
https://lab.llvm.org/buildbot/#/builders/41/builds/4299
2024-12-31 03:12:24 +05:00
Florian Hahn
b20b6e9ea9
[LV] Check IR generated for both interleaving and vectorizing in test.
Currently the tests would in some cases would only check the vectorized
IR, but not the interleaved IR, if they are different.
2024-12-30 19:57:26 +00:00
Florian Hahn
c2be48a6ce
[LV] Add additional tests with induction users.
Adds test coverage of post-inc IV users with different opcodes.
2024-12-30 17:36:48 +00:00
Florian Hahn
7f3428d3ed
[VPlan] Compute induction end values in VPlan. (#112145)
Use createDerivedIV to compute IV end values directly in VPlan, instead
of creating them up-front.

This allows updating IV users outside the loop as follow-up.

Depends on https://github.com/llvm/llvm-project/pull/110004 and
https://github.com/llvm/llvm-project/pull/109975.

PR: https://github.com/llvm/llvm-project/pull/112145
2024-12-29 19:05:08 +00:00
Zequan Wu
4d8f9594b2 Revert "Reland "[LoopVectorizer] Add support for partial reductions" (#120721)"
This reverts commit c858bf620c3ab2a4db53e84b9365b553c3ad1aa6 as it casuse optimization crash on -O2, see https://github.com/llvm/llvm-project/pull/120721#issuecomment-2563192057
2024-12-27 11:51:54 -08:00
Hassnaa Hamdi
ccfe0de0e1
[LV]: Teach LV to recursively (de)interleave. (#89018)
Currently available intrinsics are only ld2/st2, which don't support interleaving factor > 2.
This patch teaches the LV to use ld2/st2 recursively to support high
interleaving factors.
2024-12-27 12:42:07 +00:00
Elvis Wang
47e1c87a61
[VPlan] Set debug location for VPReduction/VPWidenIntrinsicRecipe. (#120054)
This patch add missing debug location for
VPReduction/VPWidenIntrinsicRecipe.
2024-12-27 10:37:21 +08:00
Florian Hahn
2d038caeeb
[VPlan] Remove stray space when printing VPWidenCastRecipe.
printFlags() already takes care of printing a single space if there are
no flags. Remove the extra space when printing a recipe without flags.
2024-12-24 20:23:48 +00:00
Sam Tebbs
c858bf620c
Reland "[LoopVectorizer] Add support for partial reductions" (#120721)
This re-lands the reverted #92418 

When the VF is small enough so that dividing the VF by the scaling
factor results in 1, the reduction phi execution thinks the VF is scalar
and sets the reduction's output as a scalar value, tripping assertions
expecting a vector value. The latest commit in this PR fixes that by
using `State.VF` in the scalar check, rather than the divided VF.

---------

Co-authored-by: Nicholas Guy <nicholas.guy@arm.com>
2024-12-24 12:08:17 +00:00
Florian Hahn
db2307d2d7
[LV] Add tests with dereferenceable assumptions.
Add a number of tests with dereferenceable assumptions and different
alignment info.
2024-12-22 16:32:40 +00:00
LiqinWeng
86fa35ce7e
[LV][VPlan] Use opcode to retrieve the VPID of the CallRecipe, rather than underlying instruction (#120816)
This patch may cause the flags in the CallRecipe to be lost after EVL
transformation, and it has been addressed in the patch: #119847
2024-12-22 10:28:20 +08:00
Florian Hahn
bc23ef3feb
[LV] Add test showing incorrect debug location for scalar casts. 2024-12-21 22:30:36 +00:00
Florian Hahn
9b496deb90
[VPlan] Set and use debug location for VPPredInstPHIRecipe.
Update the recipe it always set its debug location and use it during IR
generation.
2024-12-21 21:57:47 +00:00
Florian Hahn
df8efbdbbf
[SCEV] Remove existing predicates implied by newly added ones. (#118185)
When adding a new predicate to a union predicate, some of the existing
predicates may be implied by the new predicate. Remove any existing
predicates that are already implied by the new predicate.

Depends on https://github.com/llvm/llvm-project/pull/118184 to show the
main benefit.

PR: https://github.com/llvm/llvm-project/pull/118185
2024-12-20 20:49:37 +00:00
David Sherwood
5845298f94
[LoopVectorize] Teach some X86 cost model tests to use new vplan costs (#120738)
I've only fixed up the tests where I was able to use a simple sed script
to replace the text. Even after this patch lands, there are still over
50 tests that need updating in X86/CostModel!
2024-12-20 15:08:08 +00:00
Florian Hahn
5f096fd221
Revert "[LoopVectorizer] Add support for partial reductions (#92418)"
This reverts commit 060d62b48aeb5080ffcae1dc56e41a06c6f56701.

It looks like this is triggering an assertion when build llvm-test-suite
on ARM64 macOS.

Reproducer from MultiSource/Benchmarks/Ptrdist/bc/number.c

    target datalayout = "e-m:o-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-n32:64-S128-Fn32"
    target triple = "arm64-apple-macosx15.0.0"

    define void @test(i64 %idx.neg, i8 %0) #0 {
    entry:
      br label %while.body

    while.body:                                       ; preds = %while.body, %entry
      %n1ptr.0.idx131 = phi i64 [ %n1ptr.0.add, %while.body ], [ %idx.neg, %entry ]
      %n2ptr.0.idx130 = phi i64 [ %n2ptr.0.add, %while.body ], [ 0, %entry ]
      %sum.1129 = phi i64 [ %add99, %while.body ], [ 0, %entry ]
      %n1ptr.0.add = add i64 %n1ptr.0.idx131, 1
      %conv = sext i8 %0 to i64
      %n2ptr.0.add = add i64 %n2ptr.0.idx130, 1
      %1 = load i8, ptr null, align 1
      %conv97 = sext i8 %1 to i64
      %mul = mul i64 %conv97, %conv
      %add99 = add i64 %mul, %sum.1129
      %cmp94 = icmp ugt i64 %n1ptr.0.idx131, 0
      %cmp95 = icmp ne i64 %n2ptr.0.idx130, -1
      %2 = and i1 %cmp94, %cmp95
      br i1 %2, label %while.body, label %while.end.loopexit

    while.end.loopexit:                               ; preds = %while.body
      %add99.lcssa = phi i64 [ %add99, %while.body ]
      ret void
    }

    attributes #0 = { "target-cpu"="apple-m1" }

> opt -p loop-vectorize
Assertion failed: ((VF.isScalar() || V->getType()->isVectorTy()) && "scalar values must be stored as (0, 0)"), function set, file VPlan.h, line 284.
2024-12-19 21:46:51 +00:00
Nicholas Guy
060d62b48a
[LoopVectorizer] Add support for partial reductions (#92418)
Following on from https://github.com/llvm/llvm-project/pull/94499, this
patch adds support to the Loop Vectorizer to emit the partial reduction
intrinsics where they may be beneficial for the target.

---------

Co-authored-by: Samuel Tebbs <samuel.tebbs@arm.com>
2024-12-19 11:42:40 +00:00
David Sherwood
c18fda02e1
[LoopVectorize] Use new single string variant of reportVectorizationFailure (#120414) 2024-12-19 10:07:13 +00:00
Alexander Kornienko
23a239267e
Revert "[InstCombine] Infer nuw for gep inbounds from base of object" (#120460)
Reverts llvm/llvm-project#119225 due to the lack of sanitizer support,
large potential of breaking code containing latent UB, non-trivial
localization and investigation, and what seems to be a bad interaction
with msan (a test is in the works).

Related discussions:
https://github.com/llvm/llvm-project/pull/119225#issuecomment-2551904822
https://github.com/llvm/llvm-project/pull/118472#issuecomment-2549986255
2024-12-18 19:06:34 +01:00
Florian Hahn
0e8d022ffe
[VPlan] Handle exit phis with multiple operands in addUsersInExitBlocks. (#120260)
Currently the addUsersInExitBlocks incorrectly assumes exit phis only
have a single operand, which may not be the case for loops with early
exits when they share a common exit block.

Also further relax the assertion in fixupIVUsers to allow exit values if
they come from theloop latch/middle.block.

PR: https://github.com/llvm/llvm-project/pull/120260
2024-12-18 14:47:16 +00:00
Florian Hahn
3e02038948
[LV] Fixup check lines after 13107cb09441. 2024-12-18 09:37:30 +00:00