2121 Commits

Author SHA1 Message Date
zhongyunde
4d2723bd00 [ValueTracking] Support vscale assumes for isKnownToBeAPowerOfTwo
This patch is separated from D154953, as suggested in a review comment,
to see which tests are affected by this change alone.
Depends on the related LangRef update in D155193.

Reviewed By: paulwalker-arm, nikic, david-arm
Differential Revision: https://reviews.llvm.org/D155350
2023-07-15 19:42:58 +08:00
Anna Thomas
dfaf4587e4 Precommit follow-up testcase for interleaved miscompile
Follow-up testcase for PR63602.

Suggested by Ayal in D154309, more complete fix coming up which should
handle this testcase as well.
2023-07-14 16:04:56 -04:00
Maciej Gabka
5b0e19a7ab [TLI][AArch64] Add mappings to vectorized functions from ArmPL
Arm Performance Libraries contain a math library that provides
vectorized versions of common math functions.
This patch makes it usable from clang and llvm via -fveclib=ArmPL or
-vector-library=ArmPL, so loops with such calls can be vectorized.
The executable needs to be linked with the amath library.

Arm Performance Libraries are available at:
https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries

Reviewed by: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D154508
2023-07-12 12:53:18 +00:00
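As a rough illustration of the kind of loop the ArmPL commit above targets, here is a minimal C++ sketch. The function name `sines` and the file name in the comment are hypothetical; only the `-fveclib=ArmPL` flag and the need to link the amath library come from the commit message, and whether a given loop is actually vectorized depends on the target and compiler version.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// A loop of independent calls to a common math function. With a vector
// library mapping available, the vectorizer may replace the scalar calls
// with calls to a vectorized variant.
// Assumed build line (file name is hypothetical; how the amath library is
// linked depends on the local ArmPL installation):
//   clang++ -O3 -fveclib=ArmPL sines.cpp
std::vector<double> sines(const std::vector<double> &In) {
  std::vector<double> Out(In.size());
  for (std::size_t I = 0; I < In.size(); ++I)
    Out[I] = std::sin(In[I]);
  return Out;
}
```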
Nikita Popov
edb2fc6dab [llvm] Remove explicit -opaque-pointers flag from tests (NFC)
Opaque pointers mode is enabled by default, no need to explicitly
enable it.
2023-07-12 14:35:55 +02:00
Mel Chen
0158d86ab3 [LV] Change the test cases to ensure that the trip count is not zero. (NFC)
Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D154415
2023-07-11 19:12:59 -07:00
Florian Hahn
d7e79bd7d4
[LV] Check if ops can safely be truncated in computeMinimumValueSizes.
Update computeMinimumValueSizes to check if an instruction's operands
can safely be truncated.

If more than MinBW bits are demanded for the operand or if the
operand is a constant and cannot be safely truncated, it is not safe to
evaluate the instruction in the narrower MinBW. Skip those cases.

Fixes https://github.com/llvm/llvm-project/issues/47927

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154717
2023-07-11 20:18:55 +01:00
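The operand check described in the commit above can be sketched in a few lines. This is not the LLVM implementation: `OperandInfo`, `fitsInBits` and `operandSafeToTruncate` are hypothetical names, and the sketch only mirrors the two conditions named in the message (too many demanded bits, or a constant that does not survive truncation).

```cpp
#include <cstdint>

// Hypothetical summary of one operand; not an LLVM type.
struct OperandInfo {
  unsigned DemandedBits;       // number of low bits that users actually need
  bool IsConstant;             // whether the operand is a constant
  std::uint64_t ConstantValue; // only meaningful when IsConstant is true
};

// True if V is unchanged when truncated to Bits bits (unsigned view).
bool fitsInBits(std::uint64_t V, unsigned Bits) {
  return Bits >= 64 || (V >> Bits) == 0;
}

// An instruction may be evaluated in MinBW bits only if every operand
// passes this check; otherwise the narrowing is skipped.
bool operandSafeToTruncate(const OperandInfo &Op, unsigned MinBW) {
  if (Op.DemandedBits > MinBW)
    return false; // users need bits that truncation would drop
  if (Op.IsConstant && !fitsInBits(Op.ConstantValue, MinBW))
    return false; // the constant would change value when narrowed
  return true;
}
```

For example, a shift amount of 300 cannot be represented in 8 bits, so an instruction using it would not be narrowed to 8 bits under this check.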
Florian Hahn
1739200654
[LV] Add trunc test variants with shl and ashr.
Add extra tests for D154717 where narrowing results in poison.
2023-07-10 21:04:19 +01:00
Florian Hahn
14ec3f4b06
[LV] Skip VFs > # iterations remaining for epilogue vectorization.
If a candidate VF for epilogue vectorization is greater than the number of
remaining iterations, the epilogue loop would be dead. Skip such factors.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154264
2023-07-07 21:43:51 +01:00
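A minimal sketch of the filtering rule in the commit above, assuming the candidate VFs are listed in order of preference and that the number of remaining iterations is known. `pickEpilogueVF` is a hypothetical helper, not a function in LoopVectorize; it returns 0 when no candidate is usable.

```cpp
#include <cassert>
#include <vector>

// Returns the first candidate VF that does not exceed the number of
// iterations left for the epilogue, or 0 if every candidate is too large.
unsigned pickEpilogueVF(const std::vector<unsigned> &Candidates,
                        unsigned RemainingIters) {
  for (unsigned VF : Candidates) {
    assert(VF > 0 && "vectorization factors are positive");
    if (VF > RemainingIters)
      continue; // the epilogue vector loop would never execute
    return VF;
  }
  return 0;
}
```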
Florian Hahn
aee851fd0e
Revert "[LV] Skip VFs < iterations remaining for epilogue vectorization."
This reverts commit 7cc0be01a0068946ea3613dc2cb45c81b0f45860.

The title of the commit is incorrect, revert to fix the commit message.
2023-07-07 21:41:24 +01:00
Florian Hahn
7cc0be01a0
[LV] Skip VFs < iterations remaining for epilogue vectorization.
If a candidate VF for epilogue vectorization is less than the number of
remaining iterations, the epilogue loop would be dead. Skip such factors.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154264
2023-07-07 20:33:42 +01:00
Luke Lau
b9af086292 [RISCV] Update loop vectorizer interleaved access test output
02bb33c3ce7a83d47244ae16c8b4c625aba187a2 changed it so it no longer unrolls the
loop.
2023-07-07 15:38:04 +01:00
Nikita Popov
a5e253d659 [LoopVectorize] Regenerate test checks (NFC)
2023-07-07 14:42:31 +02:00
Florian Hahn
4d847bf4d0
[LV] Do not add load to group if it moves across conflicting store.
This patch prevents invalid load groups from being formed, where a load
needs to be moved across a conflicting store.

Once we hit a store that conflicts with a load with an existing
interleave group, we need to stop adding earlier loads to the group, as
this would force hoisting the previous stores in the group across the
conflicting load.

To detect such cases, add a new CompletedLoadGroups set, which is used
to keep track of load groups to which no earlier loads can be added.

Fixes https://github.com/llvm/llvm-project/issues/63602

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D154309
2023-07-07 11:06:30 +01:00
Florian Hahn
6b289304f6
[LV] Add test case for incorrect shift truncation.
Test for https://github.com/llvm/llvm-project/issues/47927
2023-07-06 15:23:17 +01:00
Florian Hahn
a0fcf84a8c
[LV] Consider if scalar epilogue is required in getMaximizedVFForTarget.
When a scalar epilogue is required, at least one iteration of the scalar loop
has to execute. Adjust ConstTripCount accordingly to avoid picking a max VF
that results in a dead vector loop.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154261
2023-07-06 13:35:35 +01:00
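The trip-count adjustment in the commit above can be illustrated with a simplified sketch. It assumes a known constant trip count and a power-of-two `MaxVF`, and it ignores tail folding and scalable vectors; `powerOf2Floor` and `clampMaxVF` are hypothetical helpers, not the code in getMaximizedVFForTarget.

```cpp
#include <cassert>

// Largest power of two that is <= N, for N >= 1.
unsigned powerOf2Floor(unsigned N) {
  assert(N >= 1);
  unsigned P = 1;
  while (P <= N / 2)
    P *= 2;
  return P;
}

// Clamp MaxVF so the vector loop can run at least once. When a scalar
// epilogue is required, one iteration is reserved for the scalar loop.
// A result of 1 means no vector loop would be worthwhile.
unsigned clampMaxVF(unsigned MaxVF, unsigned ConstTripCount,
                    bool RequiresScalarEpilogue) {
  assert(MaxVF > 0);
  unsigned VectorIters = ConstTripCount;
  if (RequiresScalarEpilogue && VectorIters > 0)
    --VectorIters; // at least one iteration must stay in the scalar loop
  if (VectorIters == 0)
    return 1;
  if (VectorIters >= MaxVF)
    return MaxVF;
  return powerOf2Floor(VectorIters);
}
```

With a trip count of 8 and a required scalar epilogue, only 7 iterations are available to the vector loop, so a VF of 8 would leave it dead and the sketch picks 4 instead.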
Florian Hahn
1746ac42ca
[LV] Forget SCEVs for exit phis after vectorization.
After vectorization, the exit blocks of the original loop will have additional
predecessors. Invalidate SCEVs for the exit phis in case SE looked through
single-entry phis.

Fixes https://github.com/llvm/llvm-project/issues/63368
Fixes https://github.com/llvm/llvm-project/issues/63669
2023-07-04 21:28:03 +01:00
Florian Hahn
8a25dc3787
[LV] Regenerate check lines to reduce diff.
Regenerate checks to avoid unnecessary changes in D154264.
2023-07-04 14:01:05 +01:00
Evgeniy Brevnov
d7329653d0 [VPlan] Allow sinking of instructions with no defs
We started seeing a new failure after D142886. It looks like it enabled new cases, and we hit an assert:
assert(Current->getNumDefinedValues() == 1 &&
           "only recipes with a single defined value expected");

When sinking instructions for a first-order recurrence, we hit this assert if an instruction doesn't have a single def. If the instruction doesn't produce any new def, there are no new users and nothing to sink.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D151204
2023-07-04 16:53:06 +07:00
Florian Hahn
e561edaaa5
[LV] Prepare tests for D154261.
Update trip count of test in
pr56319-vector-exit-cond-optimization-epilogue-vectorization.ll to
make sure epilogue vectorization will still trigger after D154261,
checking for the original issue.

Move the original test to limit-vf-by-tripcount.ll for testing new
functionality of D154261.
2023-07-03 17:49:36 +01:00
Florian Hahn
c14b0a7c55
[LV] Check for vector instruction in main vector loop.
Update the test to check for the vectorization call in the main vector
loop, not the dead epilogue vector loop as it does currently.
2023-07-03 14:16:47 +01:00
Florian Hahn
6954cb5425
[LV] Add test case for #63602.
2023-07-02 22:17:16 +01:00
Nikita Popov
bb3763e497 Revert "[SimplifyCFG] Allow dropping block that only contains ephemeral values"
This reverts commit 20f0c68fd83a0147a8ec1722bd2e848180610288.

https://reviews.llvm.org/D153966#4464594 reports an optimization
regression in Rust.

Additionally this change has caused an unexpected 0.3% compile-time
regression.
2023-06-30 21:24:05 +02:00
Nikita Popov
20f0c68fd8 [SimplifyCFG] Allow dropping block that only contains ephemeral values
Perform the TryToSimplifyUncondBranchFromEmptyBlock() transform if
the block is empty except for ephemeral values. The ephemeral values
will be dropped in that case.

This makes sure that assumes don't block this transform, as reported
in https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609.

Differential Revision: https://reviews.llvm.org/D153966
2023-06-30 15:24:01 +02:00
Florian Hahn
9078a9942d
[LV] Add additional tests with dead vector epilogues.
2023-06-30 12:17:57 +01:00
Igor Kirillov
17bde328d6 [LV] Add mask support for vectorizing interleaved groups
This patch extends LoopVectorize to handle the vectorization of interleaved
memory accesses with scalable vectors when a mask is required and/or
predicated tail folding is enabled.

Differential Revision: https://reviews.llvm.org/D152258
2023-06-29 17:50:56 +00:00
Michael Platings
54c79fa53c [test] Replace aarch64-*-eabi with aarch64
Also replace aarch64_be-*-eabi with aarch64_be

Using "eabi" for aarch64 targets is a common mistake, and the Clang driver warns about it.
We want to avoid it elsewhere as well. Just use the common "aarch64" without
other triple components.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D153943
2023-06-29 09:06:00 +01:00
Igor Kirillov
7049393a58 [LV] Precommit masked interleaved access tests
Precommit for D152258.

Differential Revision: https://reviews.llvm.org/D153443
2023-06-28 09:23:23 +00:00
Fangrui Song
ebbfdca586 [test] Replace aarch64-arm-none-eabi with aarch64
Similar to 02e9441d6ca73314afa1973a234dce1e390da1da, but for llvm/test and one
lld/test/ELF test.
2023-06-27 19:36:27 -07:00
Florian Hahn
dc9f69e483
[LV] Add test with reduction start values that are/may be poison/undef.
Test cases for #62565.
2023-06-22 20:15:23 +01:00
Anna Thomas
ec146cb7c0 [LV] Add support for minimum/maximum intrinsics
{mini|maxi}mum intrinsics are different from {min|max}num intrinsics in
the propagation of NaN and signed zero. Also, the minnum/maxnum
intrinsics require the presence of nsz flags to be valid reductions in
vectorizer. In this regard, we introduce a new recurrence kind and also
add support for identifying reduction patterns using these intrinsics.

The reduction intrinsics and lowering were introduced here: 26bfbec5d2.

There are tests added which show how this interacts across chains of
min/max patterns.

Differential Revision: https://reviews.llvm.org/D151482
2023-06-20 13:17:28 -04:00
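The difference in NaN and signed-zero handling mentioned in the commit above can be shown with a small C++ sketch. `minimumLike` and `minnumLike` are hypothetical helpers written for illustration; they model the scalar behaviour of the two intrinsic families and are not code from the patch.

```cpp
#include <cmath>
#include <limits>

// Models llvm.minimum: a NaN in either operand propagates to the result,
// and -0.0 is treated as smaller than +0.0.
double minimumLike(double A, double B) {
  if (std::isnan(A) || std::isnan(B))
    return std::numeric_limits<double>::quiet_NaN();
  if (A == 0.0 && B == 0.0)
    return std::signbit(A) ? A : B; // prefer the negative zero
  return A < B ? A : B;
}

// Models llvm.minnum via std::fmin: if exactly one operand is NaN, the
// other operand is returned, so a NaN can be silently dropped.
double minnumLike(double A, double B) { return std::fmin(A, B); }
```

Because the two families disagree on inputs such as (NaN, 1.0), a reduction over one cannot simply reuse the recurrence kind of the other.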
Florian Hahn
0a246a0c72
[LV] Use VPValues when creating GEP with all invariant indices.
Update VPWidenGEPRecipe::execute to use the VPValue operands of the
recipe when creating the GEP instruction.

Fixes #63340.
2023-06-16 16:14:01 +01:00
Florian Hahn
ea6ca9cb2b
[LV] Fix crash when stride isn't a constant.
In some cases, the stride may not be a constant. Just skip those cases
for now. This should only happen for cases where LV interleaves only; if
the loop is vectorized, the stride needs to be versioned to a constant.
2023-06-14 16:53:34 +01:00
Simon Pilgrim
4cbedaeff5 [LoopVectorize][X86] Regenerate slm-no-vectorize.ll
2023-06-13 14:15:37 +01:00
Florian Hahn
d209084720
[VPlan] Replace versioned stride with constant during VPlan opts.
After constructing the initial VPlan, replace VPValues for versioned
strides with their constant counterparts.

Differential Revision: https://reviews.llvm.org/D147783
2023-06-13 08:26:55 +01:00
Nikita Popov
2b7c347c7f [LoopVectorize] Convert test to opaque pointers (NFC)
I'm keeping the bitcast in the input here, because without it
we end up introducing a stride 1 assumption and end up testing
a different case.
2023-06-12 14:49:45 +02:00
Nikita Popov
9929f9533d [LoopVectorize] Convert test to opaque pointers (NFC)
2023-06-12 14:31:54 +02:00
Nikita Popov
aa92ae5924 [LoopVectorize] Regenerate test checks (NFC)
2023-06-12 14:31:54 +02:00
Nikita Popov
9cf67f6ea0 [LoopVectorize] Convert most tests to opaque pointers (NFC)
The unsized-pointee-crash.ll and zero-sized-pointee-crash.ll tests
have been removed, because these issues are not relevant for opaque
pointers.
2023-06-12 13:10:22 +02:00
Graham Hunter
95bfb1902d [LV][AArch64] Allow (limited) interleaving for scalable vectors
This patch uses the (de)interleaving intrinsics introduced in
D141924 to handle vectorization of interleaving groups with a
factor of 2 for scalable vectors.

Reviewed By: fhahn, reames

Differential Revision: https://reviews.llvm.org/D145163
2023-06-09 11:42:10 +01:00
Florian Hahn
c317a88767
[LV] Add tests for reasoning about SCEV predicates.
Add extra tests with cases where SCEV predicates can be proven to always
be false. The test in pointer-induction.ll has been adjusted to avoid
the induction always wrapping.
2023-06-08 21:13:06 +01:00
Florian Hahn
f5f6daf00f
[LV] Extend test coverage for loops with accesses with clamped indexes.
Extend test coverage ahead of upcoming patches.
2023-06-08 12:10:04 +01:00
Florian Hahn
123f807e5b
[LV] Remove UB caused by undef from pr37248.ll (NFC).
Also generate full check lines.
2023-06-08 11:58:58 +01:00
zhongyunde
df19d87227 [LV] Add option to tune the cost model, NFC
For Neon, the default non-constant stride cost is conservative,
and it is a local variable, which makes it inconvenient
to tune the loop vectorizer.
So add an option, similar to SVEGatherOverhead introduced in D115143.
Fixes https://github.com/llvm/llvm-project/issues/63082.

Reviewed By: dmgreen, fhahn
Differential Revision: https://reviews.llvm.org/D152253
2023-06-07 22:08:29 +08:00
Florian Hahn
8f781b96e2
Revert "[VPlan] Mark recurrence recipes as not having side-effects."
This reverts commit 02369b75fdd7b5fc5d9b47f1b60587c225918511.

At the moment, live-outs used *only* for the resume values in the scalar
loop are not modeled in VPlan yet. This means first-order recurrence
recipes could be removed, when a scalar epilogue is required and the
only use of a FOR is outside the loop.

Keep treating recurrence recipes as having side-effects for now, to
avoid them being removed.

Fixes #62954.
2023-06-06 11:35:26 +02:00
Florian Hahn
f47084ecfb
[LV] Use force-vector-width for X86 recurrence test.
This makes sure that all tests that can be vectorized in the file are
vectorized.
2023-06-06 11:27:35 +02:00
Florian Hahn
4c51a45e80
[LV] Add test for #62954.
2023-06-06 11:20:22 +02:00
Florian Hahn
3b912e269a
[LV] Bail out on loop-variant steps when rewriting SCEV exprs.
If the step is not loop-invariant, we cannot create a modified AddRec,
as the start needs to be loop-invariant. Mark those cases as
CannotAnalyze and bail out, to fix a crash.
2023-06-01 16:14:02 +01:00
Florian Hahn
572cfa3fde
[LV] Use SCEV for uniformity analysis across VF
This patch uses SCEV to check if a value is uniform across a given VF.

The basic idea is to construct SCEVs where the AddRecs of the loop are
adjusted to reflect the version in the vectorized loop (Step multiplied
by VF). We construct a SCEV for the value of the vector lane 0
(offset 0) and compare it to the expressions for lanes 1 to the last
vector lane (VF - 1). If they are equal, consider the expression uniform.

While re-writing expressions, we also need to catch expressions for which
we cannot determine uniformity (e.g. SCEVUnknown).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D148841
2023-05-31 16:01:00 +01:00
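The lane comparison described in the commit above is symbolic: it compares SCEV expressions. As a rough intuition only, the same idea can be sketched numerically by evaluating an index expression for every lane and comparing against lane 0. `isUniformAcrossVF` below is a hypothetical brute-force stand-in and does not use SCEV at all.

```cpp
#include <cassert>
#include <functional>

// Brute-force stand-in for the symbolic check: for each vector iteration,
// every lane must compute the same value as lane 0.
// Expr maps a scalar iteration number to the value being tested.
bool isUniformAcrossVF(const std::function<long long(long long)> &Expr,
                       unsigned VF, unsigned NumVectorIters) {
  assert(VF > 0);
  for (unsigned Part = 0; Part < NumVectorIters; ++Part) {
    // Scalar iteration handled by lane 0 of this vector iteration.
    long long Base = static_cast<long long>(Part) * static_cast<long long>(VF);
    long long Lane0 = Expr(Base);
    for (unsigned Lane = 1; Lane < VF; ++Lane)
      if (Expr(Base + Lane) != Lane0)
        return false; // some lane differs, so the value is not uniform
  }
  return true;
}
```

For instance, an index of the form `I / 4` is the same for all lanes when VF is 4, while the plain induction value `I` differs in every lane.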
Florian Hahn
8098f2577e
[LV] Use Legal::isUniform to detect uniform pointers.
Update collectLoopUniforms to identify uniform pointers using
Legal::isUniform. This is more powerful and brings pointer
classification here in sync with setCostBasedWideningDecision
which uses isUniformMemOp. The existing mismatch in reasoning
can cause crashes due to D134460, which is fixed by this patch.

Fixes https://github.com/llvm/llvm-project/issues/60831.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D150991
2023-05-30 16:42:55 +01:00
Florian Hahn
fcc135a8d6
[LV] Remove dead CHECK lines after 280656eae95a9cbf.
Those check lines were left over after adding new run lines in
280656eae95a9cbf.
2023-05-29 19:23:52 +01:00