6091 Commits

Author SHA1 Message Date
vporpo
47d9473e49
[SandboxVec][BottomUpVec] Fix ownership of Legality (#143018)
Fix the ownership of `Legality` member variable of BottomUpVec. It
should get created in runOnFunction() and get destroyed when the
function returns.
2025-06-06 12:21:25 -07:00
Florian Hahn
01b9828a66
[VPlan] Remove unneeded friend classes from VPValue (NFC).
None of the removed classes makes use of the friendship relationship.
2025-06-05 21:40:21 +01:00
Vasileios Porpodas
79861d2db7 Reapply "[SandboxVec] Add a simple pack reuse pass (#141848)"
This reverts commit 31abf0774232735ad7a7d45e531497305bf99fae.
2025-06-05 09:14:17 -07:00
Florian Hahn
2e337349f4
[VPlan] Remove unnecessary DomTreeUpdater flush (NFC).
The current version does not need the explicit flush at this point.
2025-06-05 08:17:42 +01:00
Vasileios Porpodas
31abf07742 Revert "[SandboxVec] Add a simple pack reuse pass (#141848)"
This reverts commit 1268352656f81ea173860a8002aadb88844137e7.
2025-06-04 14:24:49 -07:00
vporpo
1268352656
[SandboxVec] Add a simple pack reuse pass (#141848)
This patch implements a simple pass that tries to de-duplicate packs. If
there are two packing patterns inserting the exact same values in the
exact same order, then we will keep the top-most one of them. Even
though such patterns may be optimized away by subsequent passes it is
still useful to do this within the vectorizer because otherwise the cost
estimation may be off, making the vectorizer over conservative.
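
The de-duplication idea can be sketched outside LLVM roughly as follows. This is an illustration only, not the pass itself: `Pack` and `dedupPacks` are hypothetical names, values are modeled as plain integers, and packs are assumed to be listed top-most first.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical model: a pack is the ordered list of values it inserts.
using Pack = std::vector<int>;

// For each pack (listed top-most first), return the index of the pack to use
// in its place: the first pack that inserts the exact same values in the
// exact same order.
std::vector<std::size_t> dedupPacks(const std::vector<Pack> &Packs) {
  std::map<Pack, std::size_t> FirstSeen;
  std::vector<std::size_t> ReuseIdx;
  ReuseIdx.reserve(Packs.size());
  for (std::size_t I = 0; I < Packs.size(); ++I) {
    // emplace does not overwrite, so the top-most occurrence is kept.
    auto It = FirstSeen.emplace(Packs[I], I).first;
    ReuseIdx.push_back(It->second);
  }
  return ReuseIdx;
}
```

A pack whose returned index differs from its own position is a duplicate of an earlier pack. Note that `{1, 2}` and `{2, 1}` are distinct here because order matters.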
2025-06-04 14:12:06 -07:00
Ramkumar Ramachandra
b40e4ceaa6
[ValueTracking] Make Depth last default arg (NFC) (#142384)
Having a finite Depth (or recursion limit) for computeKnownBits is very
limiting, but is currently a load-bearing necessity, as all KnownBits
are recomputed on each call and there is no caching. As a prerequisite
for an effort to remove the recursion limit altogether, either using a
clever caching technique, or writing an easily-invalidable KnownBits
analysis, make the Depth argument in APIs in ValueTracking uniformly the
last argument with a default value. This would aid in removing the
argument when the time comes, as many callers that currently pass 0
explicitly are now updated to omit the argument altogether.
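
The shape of the change can be illustrated with a toy analysis. Everything here is an assumption made for illustration: `Node`, `knownZeroBits` and the `MaxDepth` value are hypothetical and are not the ValueTracking API.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR value: a constant leaf (no operands), or
// an AND of two operands.
struct Node {
  unsigned Const;
  const Node *LHS;
  const Node *RHS;
};

constexpr unsigned MaxDepth = 6; // toy recursion limit

// Toy analysis returning the bits known to be zero. Depth is the last
// parameter and defaults to 0, so top-level callers can simply omit it.
unsigned knownZeroBits(const Node &N, unsigned Depth = 0) {
  if (!N.LHS || !N.RHS)
    return ~N.Const; // constant leaf: every unset bit is known zero
  if (Depth >= MaxDepth)
    return 0; // limit reached: nothing is known
  // For AND, a bit is zero if it is zero in either operand.
  return knownZeroBits(*N.LHS, Depth + 1) | knownZeroBits(*N.RHS, Depth + 1);
}
```

Only the recursive calls mention `Depth`; a caller writes `knownZeroBits(N)`, so dropping the parameter later would not touch those call sites.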
2025-06-03 17:12:24 +01:00
Ramkumar Ramachandra
6716d4eaa8
[LV] Prefer DenseMap::lookup over find (NFC) (#141809)
Apart from the stylistic improvement, lookup has the nice property of
returning a default-constructed object on failure-to-find, while find
returns the end iterator, which cannot be dereferenced.
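
The difference can be sketched on a standard container. `lookupOrDefault` below is a hypothetical helper that mimics the lookup behaviour described above on `std::unordered_map`; it is not the DenseMap implementation.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: return the mapped value, or a default-constructed one
// when the key is absent, so callers never touch an end iterator.
template <typename MapT>
typename MapT::mapped_type lookupOrDefault(const MapT &M,
                                           const typename MapT::key_type &K) {
  auto It = M.find(K);
  return It == M.end() ? typename MapT::mapped_type() : It->second;
}
```

With `find`, every caller has to compare against `end()` before dereferencing. The helper folds that check into one place, at the cost of not distinguishing a missing key from a stored default value.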
2025-06-03 14:37:19 +01:00
Florian Hahn
5520ab3d50
[VPlan] Add ComputeAnyOfResult VPInstruction (NFC) (#141932)
Add a dedicated opcode for any-of reduction, similar to
https://github.com/llvm/llvm-project/pull/132689 and
https://github.com/llvm/llvm-project/pull/132690.

The patch also explicitly adds the start value to not require
RecurrenceDescriptor during execute. It also allows freezing the start
value to make it poison-safe.

PR: https://github.com/llvm/llvm-project/pull/141932
2025-06-03 14:33:53 +01:00
Luke Lau
ddfeecf4c5
[VPlan] Convert to concrete recipes before dissolving loop regions. NFCI (#141999)
After updating #118638 on tip of tree, expanding
VPWidenIntOrFpInductionRecipes fails because it needs the loop region to
get the latch to insert the increment into:

    VPBasicBlock *ExitingBB =
        Plan->getVectorLoopRegion()->getExitingBasicBlock();
    Builder.setInsertPoint(ExitingBB,
                           ExitingBB->getTerminator()->getIterator());
    auto *Next = Builder.createNaryOp(AddOp, {Prev, Inc}, Flags,
                                      WidenIVR->getDebugLoc(), "vec.ind.next");

However after #117506, the region is dissolved so it doesn't work.

This shuffles the dissolveLoopRegions steps to be after
convertToConcreteRecipes so we can use the region when expanding
VPWidenIntOrFpInductionRecipes
2025-06-03 12:05:13 +01:00
Florian Hahn
2eab83f618
[VPlan] Remove CanonicalIV when dissolving loop regions (NFC). (#142372)
Directly replace the canonical IV when we dissolve the containing
region. That ensures that it won't get removed before the region gets
removed, which would result in an invalid region.

This removes the current ordering constraint between
convertToConcreteRecipes and dissolving regions.

PR: https://github.com/llvm/llvm-project/pull/142372
2025-06-03 10:05:28 +01:00
Florian Hahn
11713e86b0
[LV] Move VPlan-based calculateRegisterUsage to VPlanAnalysis (NFC). (#135673)
Move VPlan-based calculateRegisterUsage from LoopVectorize
to VPlanAnalysis.cpp. It is a VPlan-based analysis and this helps
to reduce the size of LoopVectorize.

PR: https://github.com/llvm/llvm-project/pull/135673
2025-06-02 17:40:50 +01:00
Ramkumar Ramachandra
b8c4eea3d8
[VPlan] Simplify PredPHI LiveIn -> LiveIn (#142271)
5f39be5 ([VPlan] Use InstSimplifyFolder instead of TargetFolder) updated
simplifyRecipe to fold live-ins to Values that are not necessarily
Constant, but forgot to update the corresponding PredPHI folder, which
still folds PredPHI constant -> constant. Update it to fold PredPHI
LiveIn -> LiveIn.

Fixes #141968.
2025-06-02 14:56:35 +01:00
Florian Hahn
3b474bc510
[VPlan] Use VPSingleDef in simplifyRecipe (NFC).
All simplifications are applied to VPSingleDefRecipes. Check for them
early to skip unnecessary work and remove a number of getVPSingleValue
calls.
2025-06-01 15:32:02 +01:00
Florian Hahn
33bbce5e34
[VPlan] Get plan once in simplifyRecipe (NFC).
Also check once if the plan is unrolled at the end, to make it easier to
add more transforms that apply after unrolling.
2025-06-01 12:46:08 +01:00
Kazu Hirata
c0bf51e3ad [Vectorize] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:1865:17: error:
  unused variable 'Preds' [-Werror,-Wunused-variable]
2025-05-31 14:40:19 -07:00
Florian Hahn
0f00a96fed
[VPlan] Simplify branch on False in VPlan transform (NFC). (#140409)
Simplify branch on false, starting with the branch from the middle block
to the scalar preheader. Initially this helps simplify the initial
VPlan construction.

Depends on https://github.com/llvm/llvm-project/pull/140405.

PR: https://github.com/llvm/llvm-project/pull/140409
2025-05-31 20:32:45 +01:00
Jon Roelofs
798058fca5
[Remarks] Remove an upcast footgun. NFC (#142191)
CodeRegions were previously passed as Value*, but then immediately
upcast to BasicBlock. Let's keep the type information around until the
use cases for non-BasicBlock code regions actually materialize.
2025-05-31 11:07:54 -07:00
Ramkumar Ramachandra
f057a593a7
[VPlan] Improve code in VPWidenCallRecipe (NFC) (#141926)
Use operands() instead of {op_begin(), op_end()}. Also rename
arg_operands to args to match CallBase.
2025-05-31 15:41:20 +02:00
Ramkumar Ramachandra
07ba406cbd
[VPlan] Improve code in VPWidenIntrinsic (NFC) (#141936)
Use operands() instead of {op_begin(), op_end()}.
2025-05-31 15:20:28 +02:00
Florian Hahn
78eafb14f7
[VPlan] Add getIndexFor(Predecessor|Successor) helpers (NFC).
Move code to get the index of a predecessor and successor to helpers in
VPBlockBase, to avoid duplication and enable future reuse.
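
A rough sketch of what such helpers might look like on a simplified block type. `Block` and `indexIn` are hypothetical and this is not the VPBlockBase interface; only the helper names follow the commit title.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified block type.
struct Block {
  std::vector<const Block *> Predecessors;
  std::vector<const Block *> Successors;

  std::size_t getIndexForPredecessor(const Block *Pred) const {
    return indexIn(Predecessors, Pred);
  }
  std::size_t getIndexForSuccessor(const Block *Succ) const {
    return indexIn(Successors, Succ);
  }

private:
  // Shared lookup, so the find-and-subtract pattern is written only once.
  static std::size_t indexIn(const std::vector<const Block *> &Blocks,
                             const Block *B) {
    auto It = std::find(Blocks.begin(), Blocks.end(), B);
    assert(It != Blocks.end() && "block is not connected");
    return static_cast<std::size_t>(It - Blocks.begin());
  }
};
```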

Split off from https://github.com/llvm/llvm-project/pull/140409.
2025-05-31 12:53:05 +01:00
Florian Hahn
c3cce7caf8
[VPlan] Remove unused VPUser constructors (NFC).
Now all users construct VPUsers using VPUser(ArrayRef<VPValue *>).
Remove the other unused constructors.
2025-05-31 12:20:32 +01:00
Florian Hahn
e4ef651695
[VPlan] Simplify VPReductionPHIRecipe::execute (NFC).
Simplify VPReductionPHIRecipe::execute by handling the simple cases
first, by directly using State.get() on the appropriate start value.
2025-05-30 15:56:44 +01:00
Florian Hahn
10bd4cd9cd
[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405)
Use regular VPPhi instead of a separate opcode for resume phis. This
removes an unneeded specialized opcode and unifies the code
(verification, printing, updating when CFG is changed).

Depends on https://github.com/llvm/llvm-project/pull/140132.

PR: https://github.com/llvm/llvm-project/pull/140405
2025-05-30 12:50:08 +01:00
Florian Hahn
417e43ad43
[LV] Set PhiTy once in adjustRecipesForReductions (NFC). 2025-05-30 08:33:15 +01:00
Alexey Bataev
cb648ba970 [SLP]Check if the user node has instructions, used only outside
Gather nodes with parents whose scalar instructions are used only
outside are generated before the whole tree vectorization. Need to
teach isGatherShuffledSingleRegisterEntry to check that such nodes are
emitted first and that they cannot depend on other nodes, which are
emitted later.

Fixes #141628
2025-05-29 10:09:49 -07:00
Florian Hahn
9ea4924720
[VPlan] Use EMIT-SCALAR for single-scalar VPPhis (NFC).
Follow-up to https://github.com/llvm/llvm-project/pull/141428, to also
use EMIT-SCALAR for VPPhis that are single scalars.
2025-05-29 11:20:07 +01:00
Florian Hahn
5b85e4b08d
[VPlan] Use EMIT-SCALAR when printing single-scalar VPInstructions. (#141428)
By using EMIT-SCALAR when printing, it is clear in the debug output
that those VPInstructions only produce a single scalar.

Split off in preparation for
https://github.com/llvm/llvm-project/pull/140623.

PR: https://github.com/llvm/llvm-project/pull/141428
2025-05-29 09:29:06 +01:00
Ramkumar Ramachandra
663aea2601
[LV] Clean up unused template args of min/max (NFC) (#141778) 2025-05-29 09:57:22 +02:00
Elvis Wang
332fe08f1d
[VPlan] Implement VPlan-based cost model for VPReduction, VPExtendedReduction and VPMulAccumulateReduction. (#113903)
This patch implements the VPlan-based cost model for VPReduction,
VPExtendedReduction and VPMulAccumulateReduction.

With this patch, we can calculate the reduction cost by the VPlan-based
cost model, so the reduction costs in `precomputeCost()` are removed.

Ref: Original instruction based implementation:
https://reviews.llvm.org/D93476
2025-05-29 11:15:16 +08:00
Florian Hahn
249301c779
[LoopUtils] Pass sentinel value directly to createFindLastIVRed (NFC).
Now that there is only a single FindLastIV recurrence kind, simply pass
the sentinel value instead of the full recurrence descriptor to tighten
the interface.
2025-05-28 22:00:11 +01:00
Florian Hahn
0d7b34bfc1
[LoopUtils] Pass start value directly to createAnyOfReduction (NFC).
Now that there is only a single AnyOf recurrence kind, simply pass the
start value instead of the full recurrence descriptor, to tighten the
interface.
2025-05-28 21:28:02 +01:00
Florian Hahn
440a8adb86
[VPlan] Use VPIRFlags to manage FMFs for ComputeReductionResult (NFC).
Manage fast-math flags using VPIRFlags from VPInstruction, in line
with other VPInstructions. With this change, we now print the correct
flags for ComputeReductionResult, other than that NFC.
2025-05-28 20:54:58 +01:00
vporpo
9c6a442f29
[SandboxVec] Add TransactionAlwaysRevert pass (#141688)
This patch adds a region pass that reverts the IR state unconditionally.
This is used for testing.
2025-05-28 10:57:58 -07:00
Luke Lau
2e7489c8c8 [VectorCombine] Fix build on gcc-7.5
Hopefully this fixes the build failure at
https://lab.llvm.org/buildbot/#/builders/116/builds/13423. gcc-14
seems to be able to deduce the type and compile this fine, but for
gcc-7 we need to avoid the Use/Value mismatch I guess.
2025-05-28 10:55:38 +01:00
Ramkumar Ramachandra
5f39be5917
[VPlan] Use InstSimplifyFolder instead of TargetFolder (#141222)
For more powerful folding with operands that are not necessarily
all-constant, use InstSimplifyFolder instead of TargetFolder in
tryToConstantFold, and rename the function tryToFoldLiveIns.
2025-05-28 11:00:14 +02:00
Luke Lau
2b9ded64b0
[VectorCombine] Support nary operands and intrinsics in scalarizeOpOrCmp (#138406)
This adds support for unary operands, and unary + ternary intrinsics in
scalarizeOpOrCmp (FKA scalarizeBinOpOrCmp).

The motivation behind this is to scalarize more intrinsics in
VectorCombine rather than in DAGCombine, so we can sink splats across
basic blocks: see https://github.com/llvm/llvm-project/pull/137786

The main change required is to generalize the existing VecC0/VecC1 rules
across n-ary ops:

- An operand can either be a constant vector or an insert of a scalar
into a constant vector
- If it's an insert, the index needs to be static and in bounds
- If it's an insert, all indices need to be the same across all operands
- If all the operands are constant vectors, bail as it will get constant
folded anyway
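
The four rules above can be sketched as a standalone check. This is a simplified model under stated assumptions: `Operand`, `Bail` and `commonInsertIndex` are hypothetical names, and the real transform operates on LLVM IR values rather than on this struct.

```cpp
#include <cassert>
#include <vector>

// Hypothetical operand model: either a constant vector, or an insert of a
// scalar into a constant vector.
struct Operand {
  bool IsInsert;      // false: plain constant vector
  bool IndexIsStatic; // meaningful only for inserts
  unsigned Index;     // meaningful only for inserts with a static index
};

constexpr int Bail = -1;

// Returns the common insert index if the operands satisfy the rules, or Bail
// if the transform should give up.
int commonInsertIndex(const std::vector<Operand> &Ops, unsigned NumElts) {
  int Common = Bail;
  for (const Operand &Op : Ops) {
    if (!Op.IsInsert)
      continue; // constant vector operands are always acceptable
    if (!Op.IndexIsStatic || Op.Index >= NumElts)
      return Bail; // the index must be static and in bounds
    if (Common != Bail && static_cast<unsigned>(Common) != Op.Index)
      return Bail; // all inserts must use the same index
    Common = static_cast<int>(Op.Index);
  }
  // If no insert was seen, every operand is a constant vector: bail, since
  // constant folding handles that case anyway.
  return Common;
}
```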
2025-05-28 09:45:54 +01:00
Ramkumar Ramachandra
a8edb6a548
[VPlan] Improve cast code in VPlanRecipes (NFC) (#141240) 2025-05-27 22:31:46 +02:00
Florian Hahn
ad58ea3ba8
[VPlan] Bail out before constructing VPlan0 if MinVF > MaxVF.
This reduces the cases where we need to create initial VPlans
unnecessarily after 567b3172da2d52f5df70a37f3de06b7000b25968.

buildVPlansWithVPRecipes is called with MinVF > MaxVF if the target does
not support scalable vectors.

Recovers some of the compile-time impact
http://llvm-compile-time-tracker.com/compare.php?from=3033f202f6707937cd28c2473479db134993f96f&to=1a0b9e5834f7fd4abf058864e656f8e26b7a26ff&stat=instructions:u
2025-05-27 21:19:11 +01:00
Luke Lau
97f6076ded
[VectorCombine][X86] Use updated getVectorInstrCost hook (#137823)
This addresses a TODO where previously scalarizeBinopOrCmp
conservatively bailed if one of the operands was a load.

getVectorInstrCost was updated to take in values in
https://reviews.llvm.org/D140498 so we can pass in the scalar value to
be inserted, which should return an accurate cost for a gather.

To prevent regressions on x86 this tries to constant fold NewVecC up
front so we can pass it into TTI and get a more accurate cost.

We want to remove this restriction on RISC-V since this is always
profitable whether or not the scalar is a load.
2025-05-27 16:27:28 +01:00
Florian Hahn
d56deea1e4
[VPlan] Connect Entry to scalar preheader during initial construction. (#140132)
Update initial construction to connect the Plan's entry to the scalar
preheader during initial construction. This moves a small part of the
skeleton creation out of ILV and will also enable replacing
VPInstruction::ResumePhi with regular VPPhi recipes.

Resume phis need 2 incoming values to start with, the second being the
bypass value from the scalar ph (and used to replicate the incoming
value for other bypass blocks). Adding the extra edge ensures the
incoming values for resume phis match the incoming blocks.

PR: https://github.com/llvm/llvm-project/pull/140132
2025-05-27 16:07:56 +01:00
Kazu Hirata
89308de4b0
[llvm] Value-initialize values with *Map::try_emplace (NFC) (#141522)
try_emplace value-initializes values, so we do not need to pass
nullptr to try_emplace when the value types are raw pointers or
std::unique_ptr<T>.
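
The same property holds for the standard library's `std::map::try_emplace` (C++17), which makes for a self-contained illustration. The sketch below uses `std::map` rather than LLVM's map types, and `Info` and `getOrCreate` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct Info {
  int Uses = 0;
};

// Returns the cached Info for Name, creating it on first use.
Info &getOrCreate(std::map<std::string, std::unique_ptr<Info>> &Cache,
                  const std::string &Name) {
  // No explicit nullptr: with no mapped-value arguments, try_emplace
  // value-initializes the std::unique_ptr, leaving it null.
  auto Result = Cache.try_emplace(Name);
  if (Result.second) // newly inserted, so fill in the null slot
    Result.first->second = std::make_unique<Info>();
  return *Result.first->second;
}
```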
2025-05-26 15:13:02 -07:00
Florian Hahn
567b3172da
[VPlan] Construct initial VPlan once and pass clones to tryToBuildVPlan (NFC). (#141363)
Update to only build an initial, plain-CFG VPlan once, and then
transform & optimize clones.

This requires changes to ::clone() for VPInstruction and
VPWidenPHIRecipe to allow for proper cloning of the recipes in the
initial VPlan.

PR: https://github.com/llvm/llvm-project/pull/141363
2025-05-26 13:42:47 +01:00
Simon Pilgrim
63eb00483f VPlanRecipes.cpp - fix "not all control paths return a value" MSVC warning 2025-05-25 15:16:01 +01:00
Florian Hahn
c0506a11f4
[VPlan] Separate out logic to manage IR flags to VPIRFlags (NFC). (#140621)
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument).

This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.

PR: https://github.com/llvm/llvm-project/pull/140621
2025-05-25 11:13:11 +01:00
Florian Hahn
dcef154b5c
[VPlan] Replace VPRegionBlock with explicit CFG before execute (NFCI). (#117506)
Building on top of https://github.com/llvm/llvm-project/pull/114305,
replace VPRegionBlocks with explicit CFG before executing.

This brings the final VPlan closer to the IR that is generated and
helps to simplify codegen.

It will also enable further simplifications of phi handling during
execution and transformations that do not have to preserve the 
canonical IV required by loop regions. This for example could include
replacing the canonical IV with an EVL based phi while completely
removing the original canonical IV.

PR: https://github.com/llvm/llvm-project/pull/117506
2025-05-24 19:17:16 +01:00
Kazu Hirata
0ef8ef66cc
[Transforms] Remove unused includes (NFC) (#141357)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-24 09:37:43 -07:00
Florian Hahn
e089d48944
[VPlan] VPWidenGEPRecipe uses first lane of invariant indices (NFC)
Update VPWidenGEPRecipe::onlyFirstLaneUsed to return true for indices
that are defined outside the loop regions, if the base pointer is not
invariant.
2025-05-24 17:32:05 +01:00
Ramkumar Ramachandra
1e4841881e
[VPlan] Strip dead includes in VPRecipeBuilder (NFC) (#141239) 2025-05-24 16:19:17 +01:00
Alexey Bataev
aa452b65fc [SLP]Restore insertion points after gathers vectorization
Restore insertion points after gathers vectorization to avoid a crash in
a root node vectorization.

Fixes #141265
2025-05-24 07:25:20 -07:00