2518 Commits

Author SHA1 Message Date
Florian Hahn
88e9c56990
[LV] Don't adjust name of recurrence phi in scalar loop (NFC).
Adjusting the name of the recurrence phi in the scalar loop is a bit
inconsistent, as we do not adjust any other names in the scalar loops
(including other phis).

Remove this adjustment in preparation for
https://github.com/llvm/llvm-project/pull/94760/ and as discussed there.
2024-07-10 18:37:35 +01:00
Florian Hahn
b841e2eca3
Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92.

A number of crashes have been fixed by separate fixes, including
https://github.com/llvm/llvm-project/pull/96622. This version of the
PR also pre-computes the costs for branches (except the latch) instead
of computing their costs as part of costing of replicate regions, as
there may not be a direct correspondence between original branches and
number of replicate regions.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-07-10 14:22:21 +01:00
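The selection step described in b841e2eca3 (compute a cost for every VPlan and VF, then pick the most profitable pair) can be pictured with a minimal sketch. Everything here is hypothetical and for illustration only: `Plan`, `cost_by_vf` and `get_best_plan` are not LLVM names, and comparing candidates by cost per lane (cost / VF) is an assumption, not the comparison LoopVectorize actually performs.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    # Hypothetical stand-in for a VPlan: a label plus the total cost of one
    # vector iteration for each candidate VF.
    name: str
    cost_by_vf: dict = field(default_factory=dict)


def get_best_plan(plans, scalar_cost):
    """Return (plan, vf) with the lowest cost per lane.

    Returns (None, 1) when no candidate beats the scalar loop.
    Assumption: profitability is judged purely by cost / VF.
    """
    best = (None, 1)
    best_per_lane = float(scalar_cost)
    for plan in plans:
        for vf, cost in plan.cost_by_vf.items():
            per_lane = cost / vf
            if per_lane < best_per_lane:
                best, best_per_lane = (plan, vf), per_lane
    return best


plans = [Plan("plan-a", {2: 10, 4: 14}), Plan("plan-b", {8: 40})]
best_plan, best_vf = get_best_plan(plans, scalar_cost=6)
```

With these made-up costs, `plan-a` at VF 4 has the lowest cost per lane (3.5), so it is selected over both other candidates and the scalar loop.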
Florian Hahn
ef89e3efa9
[VPlan] Collect ephemeral values for VPlan.
Port collectEphemeralValues to VPlan as collectEphemeralRecipesForVPlan,
use it in willGenerateVectors. This fixes a regression caused by
29b8b72117 for loops where the only vector values are ephemeral.
2024-07-09 21:34:49 +01:00
Florian Hahn
7346e7cc47
[VPlan] Update HCFG builder after 72937203dd3b to fix leak.
Update buildPlainCFG to re-use the vector and latch VPBBs created as
part of the initial skeleton in 72937203dd3b.

This should fix the leak sanitizer failure discovered by
https://lab.llvm.org/buildbot/#/builders/52/builds/619.
2024-07-09 15:28:43 +01:00
Florian Hahn
0577cdaa32
[LV] Split checking if tail-folding is possible, collecting masked ops. (#77612)
Introduce new canFoldTail helper which only checks if tail-folding is
possible, but without modifying MaskedOps.

Just because tail-folding is possible doesn't mean the tail will be
folded; that's up to the cost-model to decide. Separating the check if
tail-folding is possible and preparing for tail-folding makes sure that
MaskedOps is only populated when tail-folding is actually selected.

PR: https://github.com/llvm/llvm-project/pull/77612
2024-07-08 16:34:42 +01:00
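The separation described in 0577cdaa32 (a side-effect-free "is tail-folding possible?" query, distinct from the step that populates MaskedOps) might look like the sketch below. The class and method names are hypothetical and only mirror the commit message; they are not the LoopVectorize API.

```python
class Legality:
    """Hypothetical model of the legality state touched by tail-folding."""

    def __init__(self, ops_needing_mask, foldable):
        self.ops_needing_mask = set(ops_needing_mask)
        self.foldable = foldable
        self.masked_ops = set()

    def can_fold_tail(self):
        # Pure query: answers the question without modifying masked_ops.
        return self.foldable

    def prepare_to_fold_tail(self):
        # Mutating step: only called once tail-folding is actually selected.
        if not self.can_fold_tail():
            raise ValueError("tail-folding is not possible for this loop")
        self.masked_ops.update(self.ops_needing_mask)


legal = Legality({"store"}, foldable=True)
possible = legal.can_fold_tail()
masked_after_query = set(legal.masked_ops)  # snapshot taken before preparing
legal.prepare_to_fold_tail()
```

The design point is that the cost model can call the query as often as it likes; the masked-op set stays empty until folding is committed to.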
Florian Hahn
27ccc8835e
[LV] Add tests with ephemeral values that are widened.
Add tests with loops with ephemeral values that are widened.
After 29b8b72117, @ephemeral_load_and_compare_another_load_used_outside
is vectorized even though the only vector values that are generated are
ephemeral.
2024-07-08 13:15:39 +01:00
Florian Hahn
29b8b72117
[LV] Move check if any vector insts will be generated to VPlan. (#96622)
This patch moves the check if any vector instructions will be generated
from getInstructionCost to be based on VPlan. This simplifies
getInstructionCost, is more accurate as we check the final result and
also allows us to exit early once we visit a recipe that generates
vector instructions.

The helper can then be re-used by the VPlan-based cost model to match
the legacy selectVectorizationFactor behavior, thus fixing a crash and
paving the way to recommit
https://github.com/llvm/llvm-project/pull/92555.

PR: https://github.com/llvm/llvm-project/pull/96622
2024-07-07 20:08:01 +01:00
Florian Hahn
ac03ae30cf
[LV] Preserve LAA in LoopVectorize (NFCI).
LoopVectorize already always preserves DT, LI and SCEV. If any changes
get made to the CFG, cached LAA info for loops is cleared.

LoopAccessAnalysis also implements ::invalidate to clear the analysis if
SE, DT or LI gets invalidated. Hence it should be safe to preserve LAA
and save a small amount of compile-time.
2024-07-05 21:41:31 +01:00
Florian Hahn
959ff45bda
[LV] Regenerate test checks for zero_unroll.ll (NFC).
Regenerate test checks to better show impact of
https://github.com/llvm/llvm-project/pull/96622.
2024-07-05 11:37:13 +01:00
Florian Hahn
99d6c6d936
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.

Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.

After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.

This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.

Depends on https://github.com/llvm/llvm-project/pull/92525

PR: https://github.com/llvm/llvm-project/pull/92651
2024-07-05 10:08:42 +01:00
Florian Hahn
2b3b405b09
[LV] Don't vectorize first-order recurrence with VF <vscale x 1 x ..>
The assertion added as part of https://github.com/llvm/llvm-project/pull/93395
surfaced cases where first-order recurrences are vectorized with
<vscale x 1 x ..>. If vscale is 1, then we are unable to extract the
penultimate value (second to last lane). Previously this case got
mis-compiled, trying to extract from an invalid lane (-1)
https://llvm.godbolt.org/z/3adzYYcf9.

Fixes https://github.com/llvm/llvm-project/issues/97452.
2024-07-04 11:44:51 +01:00
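The invalid lane mentioned in 2b3b405b09 follows from simple index arithmetic. The sketch below is an illustration only, with a hypothetical helper name: it computes the index of the second-to-last lane for a `<vscale x N x ..>` vector.

```python
def penultimate_lane(min_lanes, vscale):
    """Index of the second-to-last lane of <vscale x min_lanes x ..>."""
    total_lanes = min_lanes * vscale
    return total_lanes - 2


lane_vscale1 = penultimate_lane(1, 1)  # single lane: no penultimate lane exists
lane_vscale2 = penultimate_lane(1, 2)  # two lanes: lane 0 is the penultimate one
```

For `<vscale x 1 x ..>` with vscale equal to 1 the result is -1, which is the invalid lane the commit message refers to.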
Noah Goldstein
7c96469ea8 [ValueTracking] Extend LHS/RHS with matching operand to work without constants.
Previously we only handled the `L0 == R0` case if both `L1` and `R1`
were constant.

We can get more out of the analysis using general constant ranges
instead.

For example, `X u> Y` implies `X != 0`.

In general, any strict comparison on `X` implies that `X` is not equal
to the boundary value for the sign and constant ranges with/without
sign bits can be useful in deducing implications.

Closes #85557
2024-07-03 20:18:51 +08:00
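The implication quoted in 7c96469ea8 can be checked exhaustively at a small bit width. This is a brute-force sanity check under the assumption of 8-bit integers, not how ValueTracking derives the fact; the signed case (`X s< Y` implies `X != SMAX`) is added here as one reading of the "boundary value for the sign" remark.

```python
BITS = 8
VALUES = range(1 << BITS)
SMAX = (1 << (BITS - 1)) - 1


def to_signed(v):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return v - (1 << BITS) if v >= (1 << (BITS - 1)) else v


# X u> Y should imply X != 0 (0 is the unsigned minimum).
unsigned_counterexamples = [
    (x, y) for x in VALUES for y in VALUES if x > y and x == 0
]

# X s< Y should imply X != SMAX (SMAX is the signed maximum).
signed_counterexamples = [
    (x, y)
    for x in VALUES
    for y in VALUES
    if to_signed(x) < to_signed(y) and to_signed(x) == SMAX
]
```

Both lists come out empty, as expected: a value that is strictly greater (or strictly less) than something cannot sit at the corresponding boundary.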
Philip Reames
46f42d4db9 Revert "[test] Autogenerate a test in advance of an upcoming change"
This reverts commit a8e1c3e1239604ac787b6a2d39b5278ddec8aa8a.  Appears
to be causing at least one bot failure.
2024-07-01 09:52:56 -07:00
Philip Reames
a8e1c3e123 [test] Autogenerate a test in advance of an upcoming change 2024-07-01 09:02:54 -07:00
Noah Goldstein
2632680006 [InstCombine] Canonicalize (gep <not i8> p, (div exact X, C))
If C % sizeof(gep_element_type) is zero, we can canonicalize to `i8` via:
    `(gep i8 p, (div exact X, C / (sizeof(gep_element_type))))`

Closes #96898
2024-07-01 22:22:35 +08:00
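The byte-offset equivalence behind 2632680006 can be spelled out numerically. The sketch below assumes plain integer arithmetic with a positive `C` and ignores wrapping and poison semantics; the function names are hypothetical.

```python
def offset_typed_gep(x, c, elem_size):
    """Byte offset of (gep T p, (div exact x, c)) with sizeof(T) == elem_size."""
    assert x % c == 0  # 'exact' means the division has no remainder
    return (x // c) * elem_size


def offset_i8_gep(x, c, elem_size):
    """Byte offset of (gep i8 p, (div exact x, c / elem_size))."""
    assert c % elem_size == 0  # precondition stated in the commit message
    new_c = c // elem_size
    assert x % new_c == 0
    return x // new_c  # for i8 the index is already the byte offset


mismatches = [
    (x, c, size)
    for size in (2, 4, 8)
    for c in (size, 2 * size, 6 * size)
    for x in (k * c for k in range(-8, 9))
    if offset_typed_gep(x, c, size) != offset_i8_gep(x, c, size)
]
```

Writing `C = k * sizeof(T)`, both forms reduce to `X / k`, which is why no mismatches are found.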
Nikita Popov
77eb056830
[InstCombine] Simplify select using KnownBits of condition (#95923)
Simplify the arms of a select based on the KnownBits implied by its condition.
For now this only handles the case where the select arm folds to a constant,
but this can be generalized to handle other patterns by using
SimplifyDemandedBits instead (in that case we would also have to limit to
non-undef conditions).

This is implemented by adding a new member to SimplifyQuery that can be used
to inject an additional condition. The affected values are pre-computed and
we don't call computeKnownBits() if the select arms don't contain affected
values. This reduces the cost in some pathological cases.
2024-07-01 09:26:01 +02:00
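One illustrative instance of the fold described in 77eb056830, chosen here as an example rather than taken from the patch: if the condition is `(x & 3) == 0`, the low two bits of `x` are known zero on the true arm, so a true arm of `x & 1` folds to the constant 0. The check below brute-forces this over 8-bit values.

```python
def select(cond, true_val, false_val):
    """Model of the IR select instruction."""
    return true_val if cond else false_val


VALUES = range(256)

# Compare the original select with the version whose true arm is folded to 0.
fold_mismatches = [
    (x, y)
    for x in VALUES
    for y in VALUES
    if select((x & 3) == 0, x & 1, y) != select((x & 3) == 0, 0, y)
]
```

The list is empty, so replacing the arm with the constant preserves the result for every input in this range.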
Florian Hahn
06079233f8
[VPlan] Return std::nullopt early if plans are empty.
Fixes a crash caused by abf5969.
2024-06-27 12:25:59 +01:00
Kolya Panchenko
49e5cd2acc
[LV][NFC] Marked functions as const. Added LLVM_DEBUG. (#96681) 2024-06-26 17:38:18 -04:00
David Green
352a836176
[InstCombine] Canonicalize non-i8 gep of mul to i8 (#96606)
This is a small canonicalization for `gep i32, p, (mul x, C)` -> `gep
i8, p, (mul x, C*4)`, so that the mul can combine both of the constant
multiplications, and we take a small step towards canonicalizing more
geps to i8.

It currently doesn't attempt to check for multiple uses on the mul, but
that should be possible if it sounds better. Let me know what you think
of the idea in general.
2024-06-26 14:25:54 +01:00
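The canonicalization in 352a836176 rests on the two forms computing the same byte offset. A small sketch, assuming a 64-bit index width with wrapping arithmetic and ignoring `inbounds` and no-wrap flags (function names are hypothetical):

```python
MASK = (1 << 64) - 1  # assumed 64-bit index width


def offset_i32_gep(x, c):
    """Byte offset of gep i32, p, (mul x, c): the index is scaled by 4."""
    index = (x * c) & MASK
    return (index * 4) & MASK


def offset_i8_gep(x, c):
    """Byte offset of gep i8, p, (mul x, c*4): the index is the byte offset."""
    return (x * ((c * 4) & MASK)) & MASK


samples = [0, 1, 3, 12345, (1 << 62) + 7, MASK]
gep_mismatches = [
    (x, c)
    for x in samples
    for c in samples
    if offset_i32_gep(x, c) != offset_i8_gep(x, c)
]
```

Since multiplication is associative modulo 2^64, the offsets agree even when the intermediate products wrap, and the second form leaves a single constant multiplication.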
Florian Hahn
8681bb8bed
[LV] Add additional test coverage for cost modeling.
Add missing tests uncovered by
https://github.com/llvm/llvm-project/pull/92555.

Includes test for https://github.com/llvm/llvm-project/issues/96294 and
https://github.com/llvm/llvm-project/issues/96328
2024-06-26 10:18:01 +01:00
David Sherwood
ec9ce89a08
[LoopVectorize] Fix build issue caused by #95920 (#96647) 2024-06-25 15:51:32 +01:00
David Sherwood
2dd4167a09
[LoopVectorize][AArch64] Add limited support for scalable vectorisation of i1 types (#95920)
Previously isElementTypeLegalForScalableVector returned false for i1
types, which also prevented vectorisation of loops with i1 reductions.
This is overkill - we only need to disable vectorisation for loads
and/or stores of i1 types. I've added i1 as a legal type, but changed
the cost model to return an invalid cost for loads and stores.
2024-06-25 15:04:24 +01:00
Nikita Popov
abc8c4be3b [LoopVectorize] Generate test checks (NFC) 2024-06-25 14:46:12 +02:00
Ramkumar Ramachandra
0f111ba790
LoopInfo: introduce Loop::getLocStr; unify debug output (#93051)
Introduce a Loop::getLocStr stolen from LoopVectorize's static function
getDebugLocString in order to have uniform debug output headers across
LoopVectorize, LoopAccessAnalysis, and LoopDistribute. The motivation
for this change is to have UpdateTestChecks recognize the headers and
automatically generate CHECK lines for debug output, with minimal
special-casing.
2024-06-25 13:12:15 +01:00
Florian Hahn
a2e915704f
[LV] Make create-induction-resume.ll more robust by adding store.
Without the store, the vector loop body is empty. Add a store to avoid
that, while not impacting the induction resume values that are created.
2024-06-25 11:14:13 +01:00
Nikita Popov
eeb0884e66 [LoopUnroll] Use poison instead of undef for preheader value 2024-06-25 12:09:58 +02:00
Sander de Smalen
738533c84a
[AArch64] Consider streaming mode in TTI interfaces for vectorization. (#96305)
At the moment, vectorization is only enabled in streaming(-compatible)
mode when enabled through an option. But the interfaces should check
more than just 'hasSVE()', because a function with +sme in streaming
mode should also vectorize with the option enabled.

Additionally, a streaming-compatible function should only be able to use
fixed-length autovec if SVE is available, otherwise the vector code will
be scalarised by the backend.
2024-06-24 11:06:16 +01:00
Florian Hahn
abf5969f76
[VPlan] Don't compute costs if there are no vector VPlans.
In some cases, no vector VPlans can be constructed due to failing VPlan
legality checks (e.g. unable to perform sinking for first order
recurrences or plans being incompatible with EVL).

There's no need to compute costs in those cases, so check directly if
there are no vector plans.
2024-06-24 08:38:31 +01:00
Florian Hahn
f0c674f680
[LV] Add test showing cost is computed when there are no vector plans.
Add test showing unnecessary cost computations, as no vector VPlans are
generated.
2024-06-24 08:08:56 +01:00
Florian Hahn
f1f3c34b47
Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)""
This reverts commit 242cc200ccb24e22eaf54aed7b0b0c84cfc54c0b and
eea150c84053035163f307b46549a2997a343ce9, as it is causing a build bot
failure and there have been a number of crashes reported at
https://github.com/llvm/llvm-project/pull/92555
2024-06-21 19:54:21 +01:00
Sander de Smalen
747f9dacfe [AArch64] NFC: Precommit new RUN lines to test sme-vectorize.ll 2024-06-21 13:29:21 +00:00
Florian Hahn
eea150c840
[VPlan] Include IV phi and backedge cost in VPlan cost computation.
In WebAssembly, costs != 0 are assigned to backedge and induction
phis, so make sure we include those costs in the VPlan-based cost model.

This fixes a downstream crash with WebAssembly after 242cc200ccb
(https://github.com/llvm/llvm-project/pull/92555)
2024-06-20 20:44:17 +01:00
Florian Hahn
242cc200cc
Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92.

Extra tests for crashes discovered when building Chromium have been
added in fb86cb7ec157689e, 3be7312f81ad2.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-20 17:32:52 +01:00
Florian Hahn
c07be08df5
[LV] Add tail folding test with scalarized store and wide header mask.
Add additional test with scalarized store which caused crashes with
earlier versions of https://github.com/llvm/llvm-project/pull/92555.
2024-06-20 17:24:59 +01:00
Florian Hahn
3808ba78de
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.

Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and therefore between the
terminator and the compare it uses). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.


PR: https://github.com/llvm/llvm-project/pull/95816
2024-06-20 13:42:20 +01:00
Florian Hahn
ffc51b966e
[LV] Remove loads from null from pr73894.ll test.
Load from null is UB, load from pointer arg instead.
2024-06-20 10:57:25 +01:00
Florian Hahn
b9702bb12f
[LV] Consider insts feeding interleave group pointers free.
For interleave groups, we only generate a pointer for the start of the
interleave group (the instruction at the insert position). The other
addresses for other members are already considered free, but so are
their operands, if they are only used in address computations for
other interleave group members.
2024-06-19 17:06:52 +01:00
Philip Reames
cb76896d6e
[SCEVExpander] Recognize urem idiom during expansion (#96005)
If we have a urem expression, emitting it as a urem is significantly
better than letting the full expansion kick in. We have the risk of a
udiv or mul which could have previously been shared, but losing that
seems like a reasonable tradeoff for being able to round trip a urem w/o
modification.
2024-06-19 08:40:04 -07:00
Florian Hahn
3be7312f81
[LV] Add more masked store cost tests with different masks.
Add additional masked store tests which caused crashes with earlier
versions of https://github.com/llvm/llvm-project/pull/92555.
2024-06-19 15:34:03 +01:00
Florian Hahn
fb86cb7ec1
[LV] Add extra tests for interleave-group, reduction store costing.
Add extra cost model tests exposed by VPlan cost-model transition,
causing revert in 6f538f6a2d3224efda985e9eb09012fa4275ea92
2024-06-18 14:35:51 +01:00
Florian Hahn
7c0c9d640d
[LV] Add tests with multiple conditions feeding exit branches.
Test cases for the recent buildbot failures:
        https://lab.llvm.org/buildbot/#/builders/17/builds/47
        https://lab.llvm.org/buildbot/#/builders/168/builds/37
2024-06-15 21:31:37 +01:00
Farzon Lotfi
6355fb45a5
[CodeGen] Support vectors across all backends (#95518)
Add a default f16 type promotion
2024-06-14 17:18:20 -04:00
Florian Hahn
40a72f8cc4
[VPlan] Support extracting any lane of uniform value.
If the value we are extracting a lane from is uniform, only the first
lane will be set. Return lane 0 for any requested lane.

This fixes a crash when trying to extract the last lane for a
first-order recurrence resume value.

Fixes https://github.com/llvm/llvm-project/issues/95520.
2024-06-14 22:16:52 +01:00
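The lane handling described in 40a72f8cc4 can be sketched as follows. The list-based model and the function name are hypothetical; the only behavior taken from the commit message is that a uniform value has just its first lane set and that any requested lane maps to lane 0.

```python
def extract_lane(lanes, lane, is_uniform):
    """Return the value for `lane`, redirecting to lane 0 for uniform values."""
    if is_uniform:
        # Only the first lane of a uniform value is populated.
        return lanes[0]
    return lanes[lane]


uniform_value = [7, None, None, None]  # only lane 0 is set
wide_value = [10, 11, 12, 13]

last_of_uniform = extract_lane(uniform_value, 3, is_uniform=True)
last_of_wide = extract_lane(wide_value, 3, is_uniform=False)
```

Asking for the last lane of the uniform value yields the lane-0 value instead of reading an unset lane.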
Arthur Eubanks
6f538f6a2d Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)""
This reverts commit 90fd99c0795711e1cf762a02b29b0a702f86a264.
This reverts commit 43e6f46936e177e47de6627a74b047ba27561b44.

Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.
2024-06-14 17:47:08 +00:00
Stephen Tozer
094572701d
[RemoveDIs] Print IR with debug records by default (#91724)
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records. 

If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical enough that
they can be done by script; see the migration guide for how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.

For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
2024-06-14 15:07:27 +01:00
Florian Hahn
90fd99c079
Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 46080abe9b136821eda2a1a27d8a13ceac349f8c.

Extra tests have been added in 52d29eb287.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-14 12:33:48 +01:00
Florian Hahn
52d29eb287
[LV] Add extra cost model tests with truncated inductions.
Extra test cases that caused revert of
https://github.com/llvm/llvm-project/pull/92555
2024-06-13 20:42:53 +01:00
Jay Foad
d4a0154902
[llvm-project] Fix typo "seperate" (#95373) 2024-06-13 20:20:27 +01:00
Arthur Eubanks
46080abe9b Revert "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 00798354c553d48d27006a2b06a904bd6013e31b.

Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.
2024-06-13 16:37:21 +00:00
Florian Hahn
00798354c5
[VPlan] First step towards VPlan cost modeling. (#92555)
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to 
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and 
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.


PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-13 14:26:18 +01:00