310 Commits

Author SHA1 Message Date
Florian Hahn
177f27d220
[VPlan] Add incoming_[blocks,values] iterators to VPPhiAccessors (NFC) (#138472)
Add 3 new iterator ranges to VPPhiAccessors

* incoming_values(): returns a range over the incoming
  values of a phi
* incoming_blocks(): returns a range over the incoming
  blocks of a phi
* incoming_values_and_blocks(): returns a range over pairs of
  incoming values and blocks.
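
A minimal sketch of what three such accessors could look like on a toy phi
class. ToyPhi, ToyValue, ToyBlock and sumIncomingFrom are hypothetical
stand-ins, not the VPlan classes, and the pair accessor builds a vector
eagerly where a real implementation may well return a lazy range:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the VPlan classes.
struct ToyBlock {
  std::string Name;
};
struct ToyValue {
  int Id;
};

class ToyPhi {
  std::vector<ToyValue *> Values; // incoming value I ...
  std::vector<ToyBlock *> Blocks; // ... arrives from incoming block I

public:
  void addIncoming(ToyValue *V, ToyBlock *B) {
    Values.push_back(V);
    Blocks.push_back(B);
  }

  // Range over the incoming values only.
  const std::vector<ToyValue *> &incoming_values() const { return Values; }

  // Range over the incoming blocks only.
  const std::vector<ToyBlock *> &incoming_blocks() const { return Blocks; }

  // Range over (value, block) pairs, built eagerly in this sketch.
  std::vector<std::pair<ToyValue *, ToyBlock *>>
  incoming_values_and_blocks() const {
    assert(Values.size() == Blocks.size() && "values and blocks out of sync");
    std::vector<std::pair<ToyValue *, ToyBlock *>> Pairs;
    for (std::size_t I = 0; I < Values.size(); ++I)
      Pairs.emplace_back(Values[I], Blocks[I]);
    return Pairs;
  }
};

// Sums the ids of all values that flow in from the block named From.
inline int sumIncomingFrom(const ToyPhi &Phi, const std::string &From) {
  int Sum = 0;
  for (const auto &P : Phi.incoming_values_and_blocks())
    if (P.second->Name == From)
      Sum += P.first->Id;
  return Sum;
}
```

Iterating pairs keeps each value next to the block it arrives from, which
is what a caller such as a verifier or printer typically needs.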

Depends on https://github.com/llvm/llvm-project/pull/124838.

PR: https://github.com/llvm/llvm-project/pull/138472
2025-08-14 16:47:04 +01:00
Florian Hahn
06fd0f9d65
[VPlan] Move initial skeleton construction earlier (NFC). (#150848)
Split up the not clearly named prepareForVectorization transform into
buildVPlan0, which adds the vector preheader, middle and scalar
preheader blocks, as well as the canonical induction recipes and sets
the trip count. The new transform is run directly after building the
plain CFG VPlan initially.

The remaining code handling early exits and adding the branch in the
middle block is renamed to handleEarlyExitsAndAddMiddleCheck and still
runs at the original position.

With the code movement, we only have to add the skeleton once to the
initial VPlan, and cloning will take care of the rest. It will also
enable moving other construction steps to work directly on VPlan0, like
adding resume phis.

PR: https://github.com/llvm/llvm-project/pull/150848
2025-08-09 20:54:42 +01:00
Florian Hahn
e80e7e717e
[VPlan] Use scalar VPPhi instead of VPWidenPHIRecipe in createPlainCFG. (#150847)
The initial VPlan closely reflects the original scalar loop, so using
VPWidenPHIRecipe here is premature. Widened phi recipes should only be
introduced together with other widened recipes.

PR: https://github.com/llvm/llvm-project/pull/150847
2025-08-06 14:43:03 +01:00
Florian Hahn
2ae996cbbe
[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)
This patch extends the logic added in
https://github.com/llvm/llvm-project/pull/128061 to support
dereferenceability information from assumptions as well.

Unfortunately both assumption cache and the dominator tree need to be
threaded through multiple layers to make them available where needed.

PR: https://github.com/llvm/llvm-project/pull/147047
2025-08-01 14:18:07 +01:00
Luke Lau
253a9f2c52
[VPlan] Delete IR instruction after test. NFC
This fixes a LeakSanitizer failure on the sanitizer buildbots:
https://lab.llvm.org/buildbot/#/builders/52/builds/10088
2025-08-01 11:59:11 +08:00
Luke Lau
3e579d93ab
[VPlan] Fix unit test without LLVM_ENABLE_DUMP. NFC
Without dumping, the faulty recipe isn't printed, so account for that
like in the other tests. Fixes the buildbot failure at
https://lab.llvm.org/buildbot/#/builders/2/builds/30229
2025-08-01 00:17:47 +08:00
Luke Lau
08c5944222
[VPlan] Fix header phi VPInstruction verification. NFC (#151472)
Noticed this when checking the invariant that all phis in the header
block must be header phis. I think there's a missing set of parentheses
here, since otherwise it only calls cast<VPInstruction> when RecipeI isn't a
VPInstruction.
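
The sketch below illustrates the kind of precedence problem described
here. It is a reconstruction under assumptions: the toy types (Recipe,
HeaderPhi, Inst) and the helper isNonPhi are hypothetical, and the exact
expression in the verifier is not quoted in this message.

```cpp
#include <cassert>

// Toy recipe hierarchy for illustration only; NOT the VPlan classes.
struct Recipe {
  virtual ~Recipe() = default;
};
struct HeaderPhi : Recipe {};
struct Inst : Recipe {
  enum Opcode { Add, Phi };
  Opcode Kind;
  explicit Inst(Opcode K) : Kind(K) {}
};

// One buggy shape that fits the description (kept as a comment because it
// would dereference a failed cast):
//
//   !isHeaderPhi(R) && !isInst(R) && castToInst(R)->Kind == Inst::Phi
//
// The negation binds to the type test alone, so the cast is only reached
// for recipes that are NOT an Inst.

// Parenthesized form: the type test guards the access to Kind, and the
// negation applies to the whole "is a phi instruction" condition.
inline bool isNonPhi(const Recipe &R) {
  const bool IsHeaderPhi = dynamic_cast<const HeaderPhi *>(&R) != nullptr;
  const Inst *I = dynamic_cast<const Inst *>(&R);
  return !IsHeaderPhi && !(I && I->Kind == Inst::Phi);
}
```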
2025-07-31 23:09:20 +08:00
Florian Hahn
d1f2a661f4
[VPlan] Pass debug location explicitly to VPBlendRecipe (NFC).
This enables creating VPBlendRecipes without underlying PHINode.
2025-07-27 09:12:26 +01:00
vporpo
1eacdddc0c
[SandboxVec][SeedCollector][NFC] Replace cl::opt flags with constructor args (#143206)
The `SeedCollector` class gets two new arguments: `CollectStores` and
`CollectLoads`. These replace the `sbvec-collect-seeds` cl::opt flag.
This is done to help with reusing the SeedCollector class in a future
pass. The cl::opt flag is moved to the seed collection pass:
Passes/SeedCollection.cpp
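
A small sketch of the pattern, assuming a much simpler collector than the
real one. ToySeedCollector and ToyInstr are hypothetical; only the argument
names CollectStores and CollectLoads come from the message above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction: a kind ("load", "store", ...) and an id.
struct ToyInstr {
  std::string Kind;
  int Id;
};

class ToySeedCollector {
  std::vector<int> StoreSeeds;
  std::vector<int> LoadSeeds;

public:
  // The two booleans are supplied by the caller instead of being read from
  // a global command-line flag, so other passes can reuse the class with
  // their own settings.
  ToySeedCollector(const std::vector<ToyInstr> &Block, bool CollectStores,
                   bool CollectLoads) {
    for (const ToyInstr &I : Block) {
      if (CollectStores && I.Kind == "store")
        StoreSeeds.push_back(I.Id);
      else if (CollectLoads && I.Kind == "load")
        LoadSeeds.push_back(I.Id);
    }
  }

  const std::vector<int> &getStoreSeeds() const { return StoreSeeds; }
  const std::vector<int> &getLoadSeeds() const { return LoadSeeds; }
};
```

In this arrangement the pass that owns the command-line option would parse
it once and forward the two booleans to the constructor.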
2025-06-27 12:27:25 -07:00
Matt Arsenault
c91cbafad2
TargetLibraryInfo: Delete default TargetLibraryInfoImpl constructor (#145826)
It should not be possible to construct one without a triple. It would
also be nice to delete TargetLibraryInfoWrapperPass, but that is more
difficult.
2025-06-26 16:12:36 +09:00
Florian Hahn
c3e25e7fc4
[VPlan] Add VPInst::getNumOperandsForOpcode, use to verify in ctor (NFC) (#142284)
Add a new getNumOperandsForOpcode helper to determine the number of
operands from the opcode. For now, it is used to verify the number of
operands at VPInstruction construction.

It returns -1 for a few opcodes where the number of operands cannot be
determined (GEP, Switch, PHI, Call).
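
As a rough illustration of the idea, here is a toy version with a made-up
opcode enum. ToyOpcode, getNumOperandsForToyOpcode, hasExpectedOperandCount
and ToyInstruction are hypothetical names, and the operand counts apply to
this toy enum only, not to the real opcode table.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical opcode set for illustration only.
enum class ToyOpcode { Not, Add, Select, Call, Phi };

// Returns the fixed operand count for Op, or -1 when the count varies per
// instance and therefore cannot be derived from the opcode alone.
inline int getNumOperandsForToyOpcode(ToyOpcode Op) {
  switch (Op) {
  case ToyOpcode::Not:
    return 1;
  case ToyOpcode::Add:
    return 2;
  case ToyOpcode::Select:
    return 3;
  case ToyOpcode::Call:
  case ToyOpcode::Phi:
    return -1;
  }
  return -1; // Not reached for valid enumerators.
}

// True if NumOperands is acceptable for Op; opcodes with an unknown count
// accept any number of operands.
inline bool hasExpectedOperandCount(ToyOpcode Op, std::size_t NumOperands) {
  const int Expected = getNumOperandsForToyOpcode(Op);
  return Expected == -1 || static_cast<std::size_t>(Expected) == NumOperands;
}

struct ToyInstruction {
  ToyOpcode Op;
  std::vector<int> Operands;

  // Verifies the operand count once, at construction time.
  ToyInstruction(ToyOpcode Opcode, std::vector<int> Ops)
      : Op(Opcode), Operands(std::move(Ops)) {
    assert(hasExpectedOperandCount(Op, Operands.size()) &&
           "unexpected number of operands for opcode");
  }
};
```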

This can also be used in a follow-up to determine if a VPInstruction is
masked based on the number of arguments.

PR: https://github.com/llvm/llvm-project/pull/142284
2025-06-24 20:39:35 +01:00
Vasileios Porpodas
79861d2db7
Reapply "[SandboxVec] Add a simple pack reuse pass (#141848)"
This reverts commit 31abf0774232735ad7a7d45e531497305bf99fae.
2025-06-05 09:14:17 -07:00
Vasileios Porpodas
31abf07742
Revert "[SandboxVec] Add a simple pack reuse pass (#141848)"
This reverts commit 1268352656f81ea173860a8002aadb88844137e7.
2025-06-04 14:24:49 -07:00
vporpo
1268352656
[SandboxVec] Add a simple pack reuse pass (#141848)
This patch implements a simple pass that tries to de-duplicate packs. If
there are two packing patterns inserting the exact same values in the
exact same order, then we will keep the top-most one of them. Even
though such patterns may be optimized away by subsequent passes it is
still useful to do this within the vectorizer because otherwise the cost
estimation may be off, making the vectorizer overly conservative.
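
The de-duplication idea can be sketched on plain data. This is a toy model,
not the pass itself: ToyPack and findReusablePacks are hypothetical, a pack
is reduced to the ordered list of value ids it inserts, and packs are
assumed to be listed in top-to-bottom program order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// A toy "pack": the ordered list of scalar value ids inserted into a vector.
using ToyPack = std::vector<int>;

// For every pack, returns the index of the pack that should be used in its
// place: the earliest pack that inserts the same values in the same order.
// A pack with no earlier duplicate maps to its own index.
inline std::vector<std::size_t>
findReusablePacks(const std::vector<ToyPack> &Packs) {
  std::map<ToyPack, std::size_t> FirstSeen;
  std::vector<std::size_t> Canonical;
  Canonical.reserve(Packs.size());
  for (std::size_t I = 0; I < Packs.size(); ++I) {
    // emplace keeps the existing entry if an equal pack was seen before.
    auto Result = FirstSeen.emplace(Packs[I], I);
    Canonical.push_back(Result.first->second);
  }
  return Canonical;
}
```

Order matters in the key: {1, 2} and {2, 1} produce different vectors and
are therefore not treated as duplicates.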
2025-06-04 14:12:06 -07:00
Florian Hahn
10bd4cd9cd
[VPlan] Remove ResumePhi opcode, use regular PHI instead (NFC). (#140405)
Use regular VPPhi instead of a separate opcode for resume phis. This
removes an unneeded specialized opcode and unifies the code
(verification, printing, updating when CFG is changed).

Depends on https://github.com/llvm/llvm-project/pull/140132.

PR: https://github.com/llvm/llvm-project/pull/140405
2025-05-30 12:50:08 +01:00
Florian Hahn
9ea4924720
[VPlan] Use EMIT-SCALAR for single-scalar VPPhis (NFC).
Follow-up to https://github.com/llvm/llvm-project/pull/141428, to also
use EMIT-SCALAR for VPPhis that are single scalars.
2025-05-29 11:20:07 +01:00
Florian Hahn
d56deea1e4
[VPlan] Connect Entry to scalar preheader during initial construction. (#140132)
Update initial construction to connect the Plan's entry to the scalar
preheader during initial construction. This moves a small part of the
skeleton creation out of ILV and will also enable replacing
VPInstruction::ResumePhi with regular VPPhi recipes.

Resume phis need 2 incoming values to start with, the second being the
bypass value from the scalar ph (and used to replicate the incoming
value for other bypass blocks). Adding the extra edge ensures the
incoming values for resume phis match the incoming blocks.

PR: https://github.com/llvm/llvm-project/pull/140132
2025-05-27 16:07:56 +01:00
Florian Hahn
c0506a11f4
[VPlan] Separate out logic to manage IR flags to VPIRFlags (NFC). (#140621)
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument).

This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.

PR: https://github.com/llvm/llvm-project/pull/140621
2025-05-25 11:13:11 +01:00
Florian Hahn
672e9263cb
Reapply "[VPlan] Support cloning initial VPlan (NFC)."
This reverts commit 204252e2df80876702616518a5154dccacf3ebac.

Recommit with a fix for the leak in a unit test.
2025-05-23 21:22:31 +01:00
Florian Hahn
95ba5508e5
Reapply "[VPlan] Move predication to VPlanTransform (NFC). (#128420)"
This reverts commit 793bb6b257fa4d9f4af169a4366cab3da01f2e1f.

The recommitted version contains a fix to make sure only the original
phis are processed in convertPhisToBlends by collecting them in a vector
first. This fixes a crash when no mask is needed, because there is only
a single incoming value.
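
The general "snapshot first, then mutate" pattern can be sketched as
follows. This is an illustration under assumptions, not the actual
transform: ToyRecipe and convertToyPhisToBlends are hypothetical, and a
block is modelled as a std::list so that iterators to untouched elements
stay valid across insertions and erasures.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy recipe: a kind ("phi", "blend", ...) and a number of incoming values.
struct ToyRecipe {
  std::string Kind;
  int NumIncoming;
};

// Replaces each phi that was present on entry. Returns the number of
// blends created.
inline int convertToyPhisToBlends(std::list<ToyRecipe> &Block) {
  // Snapshot the original phis before changing the block, so recipes
  // created or removed below cannot affect which ones are visited.
  std::vector<std::list<ToyRecipe>::iterator> OriginalPhis;
  for (auto It = Block.begin(); It != Block.end(); ++It)
    if (It->Kind == "phi")
      OriginalPhis.push_back(It);

  int NumBlends = 0;
  for (auto It : OriginalPhis) {
    if (It->NumIncoming == 1) {
      // Single incoming value: no mask and no blend needed. The toy
      // version simply drops the phi.
      Block.erase(It);
      continue;
    }
    Block.insert(It, ToyRecipe{"blend", It->NumIncoming});
    Block.erase(It);
    ++NumBlends;
  }
  return NumBlends;
}
```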

Original message:
This patch moves the logic to predicate and linearize a VPlan to a
dedicated VPlan transform. It mostly ports the existing logic directly.

There are a number of follow-ups planned in the near future to
further improve on the implementation:
* Edge and block masks are cached in VPPredicator, but the block masks
are still made available to VPRecipeBuilder, so they can be accessed
during recipe construction. As a follow-up, this should be replaced by
adding mask operands to all VPInstructions that need them and use that
during recipe construction.
* The mask caching in a map also means that this map needs updating each
time a new recipe replaces a VPInstruction; this would also be handled
by adding mask operands.

PR: https://github.com/llvm/llvm-project/pull/128420
2025-05-22 08:16:15 +01:00
Florian Hahn
793bb6b257
Revert "[VPlan] Move predication to VPlanTransform (NFC). (#128420)"
This reverts commit b263c08e1a0b54a871915930aa9a1a6ba205b099.

Looks like this triggers a crash in one of the Fortran tests. Reverting
while I investigate
    https://lab.llvm.org/buildbot/#/builders/41/builds/6825
2025-05-21 19:24:21 +01:00
Florian Hahn
b263c08e1a
[VPlan] Move predication to VPlanTransform (NFC). (#128420)
This patch moves the logic to predicate and linearize a VPlan to a
dedicated VPlan transform. It mostly ports the existing logic directly.

There are a number of follow-ups planned in the near future to
further improve on the implementation:
* Edge and block masks are cached in VPPredicator, but the block masks
are still made available to VPRecipeBuilder, so they can be accessed
during recipe construction. As a follow-up, this should be replaced by
adding mask operands to all VPInstructions that need them and use that
during recipe construction.
* The mask caching in a map also means that this map needs updating each
time a new recipe replaces a VPInstruction; this would also be handled
by adding mask operands.


PR: https://github.com/llvm/llvm-project/pull/128420
2025-05-21 15:47:33 +01:00
Florian Hahn
204252e2df
Revert "[VPlan] Support cloning initial VPlan (NFC)."
This reverts commit 5fa985e751c8f890fff31e190473aeeb6f7a9fc5.

Revert as this seems to introduce a call to a pure virtual function on a
few configs, e.g.
    https://lab.llvm.org/buildbot/#/builders/169/builds/11535
2025-05-18 22:03:00 +01:00
Florian Hahn
5fa985e751
[VPlan] Support cloning initial VPlan (NFC).
Support cloning VPlans as they are created by the initial buildVPlan,
i.e. scalar header not yet connected and no trip-count set. This is not
used yet but will be in follow-up changes.

Also add a unit test for cloning & printing.
2025-05-18 19:37:17 +01:00
Jonathan Thackray
6d942c5c16
[llvm] Fix test breakage in Vectorize/VPlanVerifierTest.cpp (#140079)
Fix test breakage in Vectorize/VPlanVerifierTest.cpp introduced in
change 849990479 (typo).
2025-05-15 16:22:25 +01:00
Florian Hahn
849990479f
[VPlan] Update check line in verifier unit test w/o assertions.
Should fix failures with assertions disabled, including
https://lab.llvm.org/buildbot/#/builders/2/builds/24015.
2025-05-15 12:36:12 +01:00
Florian Hahn
8bbe0d050a
[VPlan] Verify dominance for incoming values of phi-like recipes. (#124838)
Update the verifier to verify dominance for incoming values for phi-like
recipes. The defining recipe must dominate the incoming block for the
incoming value.
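
A block-level sketch of that rule, assuming a precomputed immediate
dominator map. IDomMap, dominates, ToyIncoming and verifyPhiIncoming are
hypothetical helpers; the real verifier works on recipes and VPlan's own
dominator tree.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Immediate-dominator map for a toy CFG; the entry block maps to itself.
using IDomMap = std::map<std::string, std::string>;

// True if block A dominates block B. Every block dominates itself.
inline bool dominates(const IDomMap &IDom, const std::string &A,
                      std::string B) {
  while (true) {
    if (A == B)
      return true;
    auto It = IDom.find(B);
    // Stop at the entry block or at a block missing from the map.
    if (It == IDom.end() || It->second == B)
      return false;
    B = It->second; // Walk up the dominator tree.
  }
}

struct ToyIncoming {
  std::string DefBlock;      // Block containing the defining recipe.
  std::string IncomingBlock; // Predecessor the value flows in from.
};

// Checks that each incoming value is defined in a block that dominates the
// corresponding incoming block.
inline bool verifyPhiIncoming(const IDomMap &IDom,
                              const std::vector<ToyIncoming> &Incoming) {
  for (const ToyIncoming &In : Incoming)
    if (!dominates(IDom, In.DefBlock, In.IncomingBlock))
      return false;
  return true;
}
```

Note that the check is against the incoming block, not the block holding
the phi: in a diamond, a value defined in one arm is a valid incoming value
from that arm even though the arm does not dominate the merge block.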

Builds on top of https://github.com/llvm/llvm-project/pull/138472 to
retrieve incoming values & corresponding blocks for phi-like recipes.

PR: https://github.com/llvm/llvm-project/pull/124838
2025-05-15 12:20:54 +01:00
Jessica Clarke
864f0ff4ef
[clang][IR] Overload @llvm.thread.pointer to support non-AS0 targets (#132489)
Thread-local globals live, by default, in the default globals address
space, which may not be 0, so we need to overload @llvm.thread.pointer
to support other address spaces, and use the default globals address
space in Clang.
2025-05-14 21:51:56 +01:00
Graham Hunter
5b9246517f
[LV] Fix ScalarIVSteps vplan pattern matcher, remove m_CanonicalIV() (#138298)
783a846 changed VPScalarIVStepsRecipe to take 3 arguments (adding
VF explicitly) instead of 2, but didn't change the corresponding
pattern matcher.

This matcher was only used in vputils::isHeaderMask, and no test
ever reached that function with a ScalarIVSteps recipe for the
value being matched -- it was always a WideCanonicalIV. So the
matcher bailed out immediately before checking arguments and
asserting that the number of arguments in the recipe was the
same as that provided by the matcher.

Since the constructors for ScalarIVSteps take 3 values, we should
be safe to update the matcher and guard it with a dedicated gtest.

m_CanonicalIV() on the other hand is removed; as a phi recipe it
may not have a consistent number of arguments to match, only
requiring one (the start value) when being constructed with the
assumption that a second incoming value is added for the backedge
later. In order to keep the matcher we would need to add multiple
matchers with different numbers of arguments for it depending on
what phase of vplan construction we were in, and ensure that we
never reorder matcher usage vs. vplan transformation. Since the
main IR PatternMatch.h doesn't contain any matchers for PHI nodes,
I think we can just remove it and match via m_Specific() using the
VPValue we get from Plan.getCanonicalIV().
2025-05-14 15:01:03 +01:00
Florian Hahn
ba2dacd276
[VPlan] Print use and definition in verifier on violation.
Improves the error message when a use comes before the def by including
the use and def, when print utilities are available.
2025-05-13 09:52:02 +01:00
Florian Hahn
2f55123cbb
[VPlan] Handle early exit before forming regions. (NFC) (#138393)
Move early-exit handling up front to original VPlan construction, before
introducing early exits.

This builds on https://github.com/llvm/llvm-project/pull/137709, which
adds exiting edges to the original VPlan, instead of adding exit blocks
later.

This retains the exit conditions early, and means we can handle early
exits before forming regions, without the reliance on VPRecipeBuilder.

Once we retain all exits initially, handling early exits before region
construction ensures the regions are valid; otherwise we would leave
edges exiting the region from elsewhere than the latch.

Removing the reliance on VPRecipeBuilder removes the dependence on
mapping IR BBs to VPBBs and unblocks predication as VPlan transform:
https://github.com/llvm/llvm-project/pull/128420.

Depends on https://github.com/llvm/llvm-project/pull/137709 (included in
PR).

PR: https://github.com/llvm/llvm-project/pull/138393
2025-05-12 12:53:20 +01:00
Florian Hahn
cfde685e22
[VPlan] Sink VPB2IRBB lookups to VPRecipeBuilder (NFC).
This allows migrating some more code to be based on VPBBs in
VPRecipeBuilder, in preparation for
https://github.com/llvm/llvm-project/pull/128420.
2025-05-10 22:00:58 +01:00
Florian Hahn
e854c381c6
[VPlan] Manage noalias/alias_scope metadata in VPlan. (#136450)
Use VPIRMetadata added in
https://github.com/llvm/llvm-project/pull/135272
to also manage no-alias metadata added by versioning.

Note that this means we have to build the no-alias metadata up-front
once. If it is not used, it will be discarded automatically.

This also fixes a case where incorrect metadata was added to wide
loads/stores that got converted from an interleave group.

Compile-time impact is neutral:

https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u
2025-05-09 11:19:12 +01:00
Florian Hahn
d06d43a9e8
[VPlan] Add printPhiOperands to VPPhiAccessors, use for wide phis.
(NFC modulo debug output changes)

Add generic helper to print phi operands (incoming values) together with
their incoming blocks.

As more and more transforms are added, keeping the incoming blocks of
phis becomes more important. Print incoming blocks via VPPhiAccessors, to
make debugging easier.
2025-05-08 20:56:48 +01:00
Florian Hahn
339dc9500b
[VPlan] Retain exit conditions and edges in initial VPlan (NFC). (#137709)
Update initial VPlan construction to include exit conditions and edges.

The loop region is now first constructed without entry/exiting. Those
are set after inserting the region in the CFG, to preserve the original
predecessor/successor order of blocks.

For now, all early exits are disconnected before forming the regions,
but a follow-up will update uncountable exit handling to also happen
here. This is required to enable VPlan predication and remove the
dependence on any IR BBs
(https://github.com/llvm/llvm-project/pull/128420).

PR: https://github.com/llvm/llvm-project/pull/137709
2025-05-08 18:10:52 +01:00
Florian Hahn
aadf35cb41
[VPlan] Verify number preds and operands matches for VPIRPhis. (NFC)
Extend the verifier to ensure the number of predecessors and operands
match for VPIRPhis.
2025-05-05 15:32:02 +01:00
Florian Hahn
edb690dc5b
Reapply "[VPlan] Add canonical IV during construction (NFC)."
This reverts commit d431921677ae923d189ff2d6f188f676a2964ed8.

Missing gtests have been updated.

Original message:

This addresses an existing TODO and simply moves the current code to add
canonical IV recipes to the initial skeleton construction, at the same
place where the corresponding region will be introduced.
2025-05-03 10:54:59 +01:00
Florian Hahn
d431921677
Revert "[VPlan] Add canonical IV during construction (NFC)."
This reverts commit e17122fffa8d233fcf9f717354ecda46173f1b8d.

Revert as this seems to break some unit tests on some bots.
2025-04-29 22:55:11 +01:00
Florian Hahn
e17122fffa
[VPlan] Add canonical IV during construction (NFC).
This addresses an existing TODO and simply moves the current code to add
canonical IV recipes to the initial skeleton construction, at the same
place where the corresponding region will be introduced.
2025-04-29 22:38:59 +01:00
Florian Hahn
d2ce88a939
[VPlan] Create initial skeleton before creating regions. (NFC)
Move out the logic to prepare for vectorization to a separate transform,
before creating loop regions. This was discussed as follow-up
in https://github.com/llvm/llvm-project/pull/136455.

This just moves the existing code around slightly and will simplify
follow-up patches to include the exiting edges during initial VPlan
construction.
2025-04-28 21:51:32 +01:00
Florian Hahn
826f237cb4
[VPlan] Don't added separate vector latch block (NFC).
Simplify initial VPlan construction by not creating a separate
vector.latch block, which isn't needed and will get folded away later.
This has been suggested as independent clean-up multiple times.
2025-04-26 22:03:18 +01:00
Kazu Hirata
e9487fed29
[llvm] Construct SmallVector with iterator ranges (NFC) (#136460)
2025-04-19 19:07:10 -07:00
Florian Hahn
e232d28eff
[VPlan] Move plain CFG construction to VPlanConstruction. (NFC)
Follow-up as discussed in https://github.com/llvm/llvm-project/pull/129402.

After bc03d6cce257, the VPlanHCFGBuilder doesn't actually build a HCFG
any longer. Move what remains directly into VPlanConstruction.cpp.
2025-04-18 21:52:05 +01:00
Florian Hahn
bc03d6cce2
[VPlan] Introduce all loop regions as VPlan transform. (NFC) (#129402)
Further simplify VPlan CFG builder by moving introduction of inner
regions to a VPlan transform, building on
https://github.com/llvm/llvm-project/pull/128419.

The HCFG builder now only constructs plain CFGs. I will move it to
VPlanConstruction as follow-up.

Depends on https://github.com/llvm/llvm-project/pull/128419.

PR: https://github.com/llvm/llvm-project/pull/129402
2025-04-16 13:30:45 +02:00
Florian Hahn
380defd4b3
[VPlan] Update VPInterleaveRecipe to take debug loc directly as arg (NFC)
2025-04-02 22:46:38 +01:00
Florian Hahn
783a846507
[VPlan] Add VF as operand to VPScalarIVStepsRecipe.
Similarly to other recipes, update VPScalarIVStepsRecipe to also take
the runtime VF as argument. This removes some unnecessary runtime VF
computations for scalable vectors. It will also allow dropping the
UF == 1 restriction for narrowing interleave groups required in
577631f0a528.
2025-03-28 21:48:59 +00:00
Luke Lau
6a8606e99e
[VPlan] Only store RecurKind + FastMathFlags in VPReductionRecipe. NFCI (#131300)
VPReductionRecipes take a RecurrenceDescriptor, but only use the
RecurKind and FastMathFlags in it when executing. This patch makes the
recipe more lightweight by stripping it to only take the latter two.

The motivation for this is to simplify an upcoming patch to support
in-loop AnyOf reductions. For an in-loop AnyOf reduction we want to
create an Or reduction, and by using RecurKind we can create an
arbitrary reduction without needing a full RecurrenceDescriptor.
2025-03-24 19:18:54 +08:00
Florian Hahn
2e13ec561c
[VPlan] Bail out on non-intrinsic calls in VPlanNativePath.
Update initial VPlan-construction in VPlanNativePath in line with the
inner loop path, in that it bails out when encountering constructs it
cannot handle, like non-intrinsic calls.

Fixes https://github.com/llvm/llvm-project/issues/131071.
2025-03-19 21:35:15 +00:00
Elvis Wang
ed19620b8c
[VPlan] Make VPReductionRecipe a VPRecipeWithIRFlags. NFC (#130881)
This patch changes the parent of the VPReductionRecipe from
VPSingleDefRecipe to VPRecipeWithIRFlags and also handles printing,
getting, dropping and controlling flags through VPRecipeWithIRFlags. This
removes the dependency on the underlying instruction.

This patch also adds a new function `setFastMathFlags()` to
VPRecipeWithIRFlags because the entire reduction chain may contain
multiple instructions, and the underlying instruction may not contain
the corresponding flags for this reduction.

Split from #113903.
2025-03-18 10:08:23 +08:00
Florian Hahn
fd267082ee
[VPlan] Refactor VPlan creation, add transform introducing region (NFC). (#128419)
Create an empty VPlan first, then let the HCFG builder create a plain
CFG for the top-level loop (w/o a top-level region). The top-level
region is introduced by a separate VPlan-transform. This is instead of
creating the vector loop region before building the VPlan CFG for the
input loop.

This simplifies the HCFG builder (which should probably be renamed) and
moves along the roadmap ('buildLoop') outlined in [1].

As follow-up, I plan to also preserve the exit branches in the initial
VPlan out of the CFG builder, including connections to the exit blocks.

The conversion from plain CFG with potentially multiple exits to a
single entry/exit region will be done as VPlan transform in a follow-up.

This is needed to enable VPlan-based predication. Currently early exit
support relies on building the block-in masks on the original CFG,
because exiting branches and conditions aren't preserved in the VPlan.
So in order to switch to VPlan-based predication, we will have to
preserve them in the initial plain CFG, so the exit conditions are
available explicitly when we convert to single entry/exit regions.

Another follow-up is updating the outer loop handling to also introduce
VPRegionBlocks for nested loops as transform. Currently the existing
logic in the builder will take care of creating VPRegionBlocks for
nested loops, but not the top-level loop.

[1]
https://llvm.org/devmtg/2023-10/slides/techtalks/Hahn-VPlan-StatusUpdateAndRoadmap.pdf

PR: https://github.com/llvm/llvm-project/pull/128419
2025-03-09 15:05:35 +00:00