636 Commits

Author SHA1 Message Date
Florian Hahn
40304d8fef
Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)" (#188589)
This reverts commit e30f9c19464bcf1bf1e9f69b63884fb78ad2d05d.

Re-land, now that the reported crash causing the revert has been fixed
as part of 77fb84889 (#187504).

Original message:

Replace manual region dissolution code in
simplifyBranchConditionForVFAndUF with using general
removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates
a (BranchOnCond true) or updates BranchOnTwoConds.

The loop then gets automatically removed by running removeBranchOnConst.

This removes a bunch of special logic to handle header phi replacements
and CFG updates. With the new code, there's no restriction on what kind
of header phi recipes the loop contains.

Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is
technically unrelated, but I could not find an independent test that
would be impacted.

The code to deal with epilogue resume values now needs updating, because
we may simplify a reduction directly to the start value.

PR: https://github.com/llvm/llvm-project/pull/181252
2026-03-26 10:14:10 +00:00
Florian Hahn
86c1510418
[VPlan] Remove isVector guard in getCostForRecipeWithOpcode. (#188126)
The legacy cost model computes and passes RHSInfo both when widening and
replicating. Match behavior in VPlan-based cost model.

The added test shows that we now compute the same cost as the legacy
cost model.

Without this change, the test added in
llvm/test/Transforms/LoopVectorize/AArch64/predicated-costs.ll would
crash with https://github.com/llvm/llvm-project/pull/187056.

PR: https://github.com/llvm/llvm-project/pull/188126
2026-03-25 09:59:13 +00:00
Florian Hahn
77fb848894
Reapply "[LV] Simplify and unify resume value handling for epilogue vec." (#187504)
This reverts commit cdaf29f84dd0abbd1f961982799059c92d76625b.

This version skips removeBranchOnConst when vectorizing the epilogue, as
it may trigger folds that remove the resume phi used as resume value
from the epilogue.

This fixes https://github.com/llvm/llvm-project/issues/187323.

Original message:
This patch tries to drastically simplify resume value handling for the
scalar loop when vectorizing the epilogue.

It uses a simpler, uniform approach for updating all resume values in
the scalar loop:

1. Create ResumeForEpilogue recipes for all scalar resume phis in the
main loop (the epilogue plan will have exactly the same scalar resume
phis, in exactly the same order)
2. Update ::execute for ResumeForEpilogue to set the underlying value
when executing. This is not super clean, but allows easy lookup of the
generated IR value when we update the resume phis in the epilogue. Once
we connect the 2 plans together explicitly, this can be removed.
3. Use the list of ResumeForEpilogue VPInstructions from the main loop
to update the resume/bypass values from the epilogue.

This simplifies the code quite a bit, makes it more robust (should fix
https://github.com/llvm/llvm-project/issues/179407) and also fixes a
mis-compile in the existing tests (see change in
llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-sub-epilogue-vec.ll,
where previously we would incorrectly resume using the start value when
the epilogue iteration check failed).

In some cases we get simpler code due to additional CSE; in others, the
induction end value computations get moved from the epilogue iteration
check to the vector preheader. We could try to sink the instructions as
cleanup, but it is probably not worth the trouble.

Fixes https://github.com/llvm/llvm-project/issues/179407.

PR for recommit https://github.com/llvm/llvm-project/pull/188134
2026-03-23 22:09:40 +00:00
Florian Hahn
cdaf29f84d
Revert "[LV] Simplify and unify resume value handling for epilogue vec." (#187504)
Reverts llvm/llvm-project#185969

This is suspected to cause a miscompile in 549.fotonik3d_r from SPEC 2017 FP
2026-03-19 14:38:37 +00:00
Florian Hahn
13a093b2b2
[VPlan] Compute cost for predicated loads/stores to invariant address. (#181572)
Update VPReplicateRecipe::computeCost to compute the cost for stores to
invariant addresses only masked by the header mask.

This matches the legacy cost model logic, but it is slightly odd that
the legacy cost model only seems to do this for stores predicated by the
header mask (i.e. tail-folding and not executed conditionally
otherwise). This is probably something we want to re-evaluate
eventually.

PR: https://github.com/llvm/llvm-project/pull/181572
2026-03-18 16:21:02 +00:00
Luke Lau
bf46a95f2c
[VPlan] Use target's index type for {First,Last}ActiveLane instead of i64 (#186361)
Fixes #186005

On RV32 with zve32x, i.e. no legal 64 bit types either scalar or vector,
@llvm.cttz.elts.i64 cannot be lowered and so returns an illegal cost for
scalable VFs. However VPInstruction::FirstActiveLane and
VPInstruction::LastActiveLane always use a hardcoded i64 type.

This causes a legacy/VPlan cost model mismatch in the live-out.ll test,
and in early-exit-live-out.ll prevents the scalable VF from being
chosen.

This PR teaches the two VPInstructions to use the target's index type,
i.e. the width of a pointer in the default address space, so it will
generate a 32 bit cttz.elts on RV32. This should be large enough to hold
the maximum number of elements in a vector, because a vector any bigger
than that could not be addressed in memory.
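
As a rough illustration of the idea, the sketch below models a
count-trailing-zero-elements style query whose result type is a template
parameter. The helper name firstActiveLane and the IndexTy parameter are
hypothetical and are not LLVM API; IndexTy stands in for the target's
index type (for example a 32-bit integer on RV32).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper, not LLVM API: returns the index of the first set lane
// in Mask, or the number of lanes if no lane is set. IndexTy models the
// target's index type, e.g. std::uint32_t on a 32-bit target.
template <typename IndexTy>
IndexTy firstActiveLane(const std::vector<bool> &Mask) {
  // The lane count itself must be representable in the index type.
  assert(Mask.size() <= std::numeric_limits<IndexTy>::max() &&
         "lane count must fit in the index type");
  const IndexTy NumLanes = static_cast<IndexTy>(Mask.size());
  for (IndexTy Lane = 0; Lane < NumLanes; ++Lane)
    if (Mask[Lane])
      return Lane;
  return NumLanes;
}
```

Calling firstActiveLane<std::uint32_t> keeps the whole computation in 32
bits, which is the property the change relies on for targets without
legal 64 bit types.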

I considered using the canonical IV type but I don't think that will
work since the canonical IV can be i64 on RV32, and it causes
regressions due to extra zexting on 64-bit targets with a 32-bit IV.
2026-03-18 15:01:21 +00:00
Ramkumar Ramachandra
f7763570e5
[VPlan] Improve code in VPlanRecipes using VPlanPatternMatch (NFC) (#187130) 2026-03-18 09:41:04 +00:00
David Sherwood
6f966fb5da
[LV] Add select instruction to VPReplicateRecipe::computeCost (#186825)
I've added the Instruction::Select opcode to the existing list of
opcodes that call getCostForRecipeWithOpcode. There are currently 5
tests that ask for the cost of the select:

  Transforms/LoopVectorize/AArch64/widen-gep-all-indices-invariant.ll
  Transforms/LoopVectorize/first-order-recurrence-with-uniform-ops.ll
  Transforms/LoopVectorize/narrow-to-single-scalar.ll
  Transforms/LoopVectorize/replicate_fneg.ll
  Transforms/LoopVectorize/single-scalar-cast-minbw.ll

The fact they all pass with this change is hopefully proof enough that
the costs are correct.
2026-03-17 09:38:13 +00:00
Florian Hahn
013f2542a2
[LV] Simplify and unify resume value handling for epilogue vec. (#185969)
This patch tries to drastically simplify resume value handling for the
scalar loop when vectorizing the epilogue.

It uses a simpler, uniform approach for updating all resume values in
the scalar loop:

1. Create ResumeForEpilogue recipes for all scalar resume phis in the
main loop (the epilogue plan will have exactly the same scalar resume
phis, in exactly the same order)
2. Update ::execute for ResumeForEpilogue to set the underlying value
when executing. This is not super clean, but allows easy lookup of the
generated IR value when we update the resume phis in the epilogue. Once
we connect the 2 plans together explicitly, this can be removed.
3. Use the list of ResumeForEpilogue VPInstructions from the main loop
to update the resume/bypass values from the epilogue.

This simplifies the code quite a bit, makes it more robust (should fix
https://github.com/llvm/llvm-project/issues/179407) and also fixes a
mis-compile in the existing tests (see change in
llvm/test/Transforms/LoopVectorize/AArch64/partial-reduce-sub-epilogue-vec.ll,
where previously we would incorrectly resume using the start value when
the epilogue iteration check failed).

In some cases we get simpler code due to additional CSE; in others, the
induction end value computations get moved from the epilogue iteration
check to the vector preheader. We could try to sink the instructions as
cleanup, but it is probably not worth the trouble.

Fixes https://github.com/llvm/llvm-project/issues/179407.
2026-03-16 21:21:59 +00:00
Ramkumar Ramachandra
616bf5abd1
[VPlan] Introduce VPlan::getDataLayout (NFC) (#186418) 2026-03-13 16:17:04 +00:00
Luke Lau
776589a3b5
[VPlan] Handle FindLast in VPIRFlags::printFlags (#185857)
Noticed this when -vplan-print-after-all crashed on a find-last
reduction. We don't yet return an opcode for it because there's no
in-loop reduction.
2026-03-11 21:14:27 +08:00
Alexis Engelke
4fd826d1f9
[IR] Split Br into UncondBr and CondBr (#184027)
BranchInst currently represents both unconditional and conditional
branches. However, these are quite different operations that are often
handled separately. Therefore, split them into separate opcodes and
classes to allow distinguishing these operations in the type system.
Additionally, this also slightly improves compile-time performance.
2026-03-11 12:31:10 +00:00
Aiden Grossman
e30f9c1946
Revert "Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)""
This reverts commit 6aa115bba55054b0dc81ebfc049e8c7a29e614b2.

This is causing crashes. See #185345 for details.
2026-03-09 04:24:01 +00:00
Florian Hahn
6aa115bba5
Reapply "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)"
This reverts commit d7e037c8383e66e5c07897f144f6d8ef47258682.

Recommit with a small fix to properly handle ordered reductions when
connecting the epilogue.

Original message:

Replace manual region dissolution code in
simplifyBranchConditionForVFAndUF with using general
removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates
a (BranchOnCond true) or updates BranchOnTwoConds.

The loop then gets automatically removed by running removeBranchOnConst.

This removes a bunch of special logic to handle header phi replacements
and CFG updates. With the new code, there's no restriction on what kind
of header phi recipes the loop contains.

Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is
technically unrelated, but I could not find an independent test that
would be impacted.

The code to deal with epilogue resume values now needs updating, because
we may simplify a reduction directly to the start value.

PR: https://github.com/llvm/llvm-project/pull/181252
2026-03-08 11:13:40 +00:00
Benjamin Maxwell
03c34bb59e
[LV] Support interleaving with FindLast reductions (#184099)
This extends the existing support to work with arbitrary interleave
factors. The main change here is reworking the ExtractLastActive
VPInstruction to take a variable number of arguments and handling it in
unrollRecipeByUF and VPInstruction::generate.

The select condition for all mask/data values in a find-last recurrence
is true if the mask for any part is true. Because of this, the masks
for inactive parts will be updated to all-false when the parts with
active lanes are updated. This ensures the mask/data for the last active
element always corresponds to the greatest part with an active lane.

This means finding the last element in the middle block simply requires
chaining the `extract.last.active` to forward the result from the last
active part through any inactive parts ahead of it.
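
The chaining described above can be sketched with a simplified scalar
model. The Part struct and both function names are hypothetical, chosen
for illustration only; they model one mask/data pair per unrolled part
and are not the VPlan implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one unrolled part: a data vector and its mask.
struct Part {
  std::vector<int> Data;
  std::vector<bool> Mask;
};

// Simplified stand-in for an extract.last.active operation: returns the data
// element at the last lane whose mask bit is set, or Passthru if no lane is
// active in this part.
int extractLastActive(const Part &P, int Passthru) {
  assert(P.Data.size() == P.Mask.size() && "one mask bit per data lane");
  for (std::size_t Lane = P.Mask.size(); Lane-- > 0;)
    if (P.Mask[Lane])
      return P.Data[Lane];
  return Passthru;
}

// Chains the extraction over all parts in order. An inactive part forwards
// the result of the parts before it, so the greatest part with an active
// lane determines the final value.
int extractLastActiveAcrossParts(const std::vector<Part> &Parts, int Default) {
  int Result = Default;
  for (const Part &P : Parts)
    Result = extractLastActive(P, Result);
  return Result;
}
```

With two parts where only the first has an active lane, the second part
simply passes the first part's result through.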
2026-03-06 15:30:58 +00:00
Florian Hahn
17aaa0e590
[VPlan] Use bitfield to store Cmp predicates and GEP wrap flags. (NFC) (#181571)
Instead of storing CmpInst::Predicate/GepNoWrapFlags, only store their
raw bitfield values. This reduces the size of VPIRFlags from 12 to 3
bytes.
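
A minimal sketch of the general technique, assuming a made-up Pred enum
and made-up struct names (this is not the actual VPIRFlags layout):
storing only the raw value in a narrow bitfield and converting back on
access lets several fields share a byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate enum for illustration; not CmpInst::Predicate.
enum class Pred : std::uint32_t { EQ = 0, NE = 1, ULT = 2, UGT = 3 };

// Stores the enum directly, so the member takes the enum's full width.
struct UnpackedFlags {
  std::uint8_t Kind;
  Pred P;
};

// Stores only the raw predicate value in a bitfield and converts on access.
struct PackedFlags {
  std::uint8_t Kind : 2;
  std::uint8_t PredBits : 6;

  void setPred(Pred P) {
    assert(static_cast<std::uint32_t>(P) < 64 && "value must fit in 6 bits");
    PredBits = static_cast<std::uint8_t>(P);
  }
  Pred getPred() const { return static_cast<Pred>(PredBits); }
};
```

The exact sizes are implementation-defined, but on common ABIs the packed
form is smaller than the form that stores the enum member directly.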

PR: https://github.com/llvm/llvm-project/pull/181571
2026-03-03 19:46:30 +00:00
Benjamin Maxwell
74c0ee7e72
[TTI] Remove TargetLibraryInfo from IntrinsicCostAttributes (NFC) (#183764)
This is a remnant from when `sincos` costs used the vector mappings from
`TargetLibraryInfo::getVectorMappingInfo`.
2026-03-01 10:16:16 +00:00
Florian Hahn
73d655a598
[VPlan] Support unrolling/cloning masked VPInstructions.
Account for masked VPInstruction when verifying the operands in the
constructor. Fixes a crash when trying to unroll VPlans for predicated
early exits.
2026-02-27 22:14:45 +00:00
Florian Hahn
d7e037c838
Revert "[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)"
This reverts commit 9c53215d213189d1f62e8f6ee7ba73a089ac2269.

Appears to cause crashes with ordered reductions; reverting while I
investigate.
2026-02-27 21:29:41 +00:00
Florian Hahn
9c53215d21
[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)
Replace manual region dissolution code in
simplifyBranchConditionForVFAndUF with using general
removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates
a (BranchOnCond true) or updates BranchOnTwoConds.

The loop then gets automatically removed by running removeBranchOnConst.

This removes a bunch of special logic to handle header phi replacements
and CFG updates. With the new code, there's no restriction on what kind
of header phi recipes the loop contains.

Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is
technically unrelated, but I could not find an independent test that
would be impacted.

The code to deal with epilogue resume values now needs updating, because
we may simplify a reduction directly to the start value.

PR: https://github.com/llvm/llvm-project/pull/181252
2026-02-27 16:49:54 +00:00
Florian Hahn
d5e501725e
Reapply "[VPlan] Use VPInstructionWithType for Load in VPlan0 (NFC)"
This reverts commit 97835516393311d681d1ff6bec67e1093f94890e.

Unit tests have been updated
2026-02-26 22:39:33 +00:00
Aiden Grossman
9783551639
Revert "[VPlan] Use VPInstructionWithType for Load in VPlan0 (NFC)"
This reverts commit 2576ee1fd93fb87699650734ffafdb8092062d59.

This was causing test failures when running check-llvm-unit.
2026-02-26 22:35:10 +00:00
Florian Hahn
2576ee1fd9
[VPlan] Use VPInstructionWithType for Load in VPlan0 (NFC)
VPInstructionWithType directly allows modeling the loaded type.
2026-02-26 22:08:09 +00:00
Florian Hahn
32b8b9ba1e
[VPlan] Simplify ExitingIVValue and use for tail-folded IVs. (#182507)
Now that we have ExitingIVValue, we can also use it for tail-folded
loops; the only difference is that we have to compute the end value with
the original trip count instead of the vector trip count.
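
As a worked sketch of that difference, the function below computes an
induction end value as Start + Step * Count. The function name and
parameters are hypothetical, and it assumes the vector trip count is the
original trip count rounded down to a multiple of VF * UF, which ignores
cases such as a required scalar epilogue.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM API: end value of an induction variable
// after the vector loop.
std::int64_t exitingIVValue(std::int64_t Start, std::int64_t Step,
                            std::int64_t TripCount, std::int64_t VFxUF,
                            bool TailFolded) {
  assert(TripCount >= 0 && VFxUF > 0 && "invalid trip count or VF * UF");
  // Assumption: the vector loop covers the largest multiple of VF * UF that
  // does not exceed the original trip count.
  const std::int64_t VectorTripCount = TripCount - TripCount % VFxUF;
  // A tail-folded loop executes every original iteration under a mask, so it
  // uses the original trip count instead.
  const std::int64_t Count = TailFolded ? TripCount : VectorTripCount;
  return Start + Step * Count;
}
```

For a trip count of 10 and VF * UF of 4, an induction starting at 0 with
step 1 ends at 8 without tail folding and at 10 with tail folding.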

This allows removing the induction increment operand only used when
tail-folding.

PR: https://github.com/llvm/llvm-project/pull/182507
2026-02-26 11:48:04 +00:00
Florian Hahn
bf4705c05b
[VPlan] Supported conditionally executed single early exits. (#182395)
Add support for a single early exit that is executed conditionally. This
requires that the mask from any non-exiting control flow is combined with
the early exit condition.

To do so, introduce a MaskedCond VPInstruction, which is inserted as
user of the early-exit condition, at the point of the early-exit branch.
The VPInstruction will get masked automatically if needed by the
predicator, ensuring that we properly account for it when checking
whether the early exit has been taken.

Note that this does not allow for instructions that require predication
after the early exit. This requires additional work in progress:
https://github.com/llvm/llvm-project/pull/172454

As an alternative to MaskedCond, we could also predicate before handling
early exiting blocks: https://github.com/llvm/llvm-project/pull/181830

PR: https://github.com/llvm/llvm-project/pull/182395
2026-02-25 14:28:04 +00:00
Luke Lau
ff88b83fed
[VPlan] Handle extracts for middle blocks also used by early exiting blocks. NFC (#181789)
Currently createExtractsForLiveOuts only handles creating extracts when
the middle block has one predecessor, but if an early exit exits to the
same block as the latch then it might have multiple predecessors.

This handles the latter case to avoid the need to handle it in
VPlanTransforms::handleUncountableEarlyExits. Addresses the comment in
https://github.com/llvm/llvm-project/pull/174864#discussion_r2794153217
2026-02-23 04:03:49 +00:00
Luke Lau
6a5375fbce
[VPlan] Plumb recurrence FMFs through VPReductionPHIRecipe via VPIRFlags. NFC (#181694)
In order to be able to create selects for reduction phis through tail
folding in foldTailByMasking (#176143), make VPReductionPHIRecipe an
instance of VPIRFlags and plumb the FMFs from the original RdxDesc.

This allows us to remove more uses of the RecurrenceDescriptor in
addReductionResultComputation, which should help untie it from
LoopVectorizationLegality.
2026-02-19 11:23:47 +00:00
Benjamin Maxwell
867272d52a
[LV] Pass symbolic VF to CalculateTripCountMinusVF and CanonicalIVIncrementForPart (NFC) (#180542)
This makes it easier to update the runtime VF per VPlan.
2026-02-18 08:58:47 +00:00
Shih-Po Hung
97fa3e5936
[NFC][VPlan] Rename VPEVLBasedIVPHIRecipe to VPCurrentIterationPHIRecipe (#177114)
This is groundwork for #151300, which aims to support first-faulting
loads in non-tail-folded early-exit loops.
Per #175900, we need a variable-length stepping transform that can be
shared between EVL and non-EVL loops.
The idea is to have an EVL-independent counter and transform for
tracking the cumulative number of processed elements.

This patch renames the existing counter (VPEVLBasedIVPHIRecipe) and
transform (canonicalizeEVLLoops) to be EVL-independent:
- Rename VPEVLBasedIVPHIRecipe to VPCurrentIterationPHIRecipe to
  reflect its general purpose of tracking processed element count.
- Rename canonicalizeEVLLoops to convertToVariableLengthStep.

This is NFC.
2026-02-18 07:04:58 +00:00
Ramkumar Ramachandra
2b7c1f9d82
[VPlan] Directly unroll VectorEndPointerRecipe (#172372)
Directly unroll VectorEndPointerRecipe following 0636225b ([VPlan]
Directly unroll VectorPointerRecipe, #168886). It allows us to leverage
existing VPlan simplifications to optimize.

Co-authored-by: Luke Lau <luke@igalia.com>
Co-authored-by: Florian Hahn <flo@fhahn.com>
2026-02-16 09:59:55 +00:00
Florian Hahn
f3a816598d
[VPlan] Add VPSymbolicValue for UF. (NFC)
Add a symbolic unroll factor (UF) to VPlan, similar to VF and VFxUF. It
gets replaced with the concrete UF during plan execution, in the same way
as the symbolic VF. This is a preparatory change that allows transforms
to use the symbolic UF before the concrete UF is determined.

Note that the old getUF that returns the concrete UF after unrolling has
been renamed to getConcreteUF.

Split off from the re-commit of 8d29d093096
(https://github.com/llvm/llvm-project/pull/149706) as suggested.
2026-02-15 15:24:35 +00:00
Florian Hahn
b3dcf485d2
[VPlan] Compute NumPredStores for VPReplicateRecipe costs in VPlan.
Compute the number of predicated stores directly in VPlan instead of
using CM.useEmulatedMaskMemRefHack(), which will only account for the
number of predicated stores for the last VF the legacy cost model
considered.

Fixes https://github.com/llvm/llvm-project/issues/181183
2026-02-13 21:16:53 +00:00
Florian Hahn
ede1a9626b
[LV] Vectorize early exit loops with multiple exits. (#174864)
Building on top of the recent changes to introduce BranchOnTwoConds,
this patch adds support for vectorizing loops with multiple early exits,
all dominating a countable latch. The early exits must form a
dominance chain, so we can simply check which early exit has been taken
in dominance order.

Currently LoopVectorizationLegality ensures that all exits other than
the latch must be uncountable. handleUncountableEarlyExits now collects
those uncountable exits and processes each exit.

In the vector region, we compute if any exit has been taken, by taking
the OR of all early exit conditions (EarlyExitConds) and checking if
there's any active lane.

If the early exit is taken, we exit the loop and compute which early
exit has been taken. The first taken early exit is the one where its exit
condition is true in the first active lane of EarlyExitConds.

We create a chain of dispatch blocks outside the loop to check this for
the early exit blocks ordered by dominance.
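
The selection logic described above can be modelled with plain boolean
vectors. This is a simplified sketch under the assumption that the
conditions are given in dominance order with one lane mask per early
exit; the LaneMask alias and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using LaneMask = std::vector<bool>;

// Returns the index (in dominance order) of the early exit that is taken, or
// -1 if no lane takes any early exit in this vector iteration.
int firstTakenEarlyExit(const std::vector<LaneMask> &EarlyExitConds) {
  if (EarlyExitConds.empty())
    return -1;
  const std::size_t VF = EarlyExitConds.front().size();

  // OR all early exit conditions together, lane by lane.
  LaneMask AnyExit(VF, false);
  for (const LaneMask &Cond : EarlyExitConds) {
    assert(Cond.size() == VF && "all conditions must have VF lanes");
    for (std::size_t Lane = 0; Lane < VF; ++Lane)
      AnyExit[Lane] = AnyExit[Lane] || Cond[Lane];
  }

  // Find the first active lane of the combined condition.
  std::size_t First = 0;
  while (First < VF && !AnyExit[First])
    ++First;
  if (First == VF)
    return -1; // No early exit taken.

  // Dispatch: check the exits in dominance order at that lane.
  for (std::size_t Exit = 0; Exit < EarlyExitConds.size(); ++Exit)
    if (EarlyExitConds[Exit][First])
      return static_cast<int>(Exit);
  return -1; // Not reached: some condition is set at lane First.
}
```

Note that the dominating exit does not automatically win: if a later exit
fires in an earlier lane, that lane corresponds to an earlier scalar
iteration, so the later exit is the one taken.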

Depends on https://github.com/llvm/llvm-project/pull/174016.

PR: https://github.com/llvm/llvm-project/pull/174864
2026-02-13 16:44:23 +00:00
Ramkumar Ramachandra
2223b931c5
[VPlan] Introduce m_c_Logical(And|Or) (#180048) 2026-02-12 13:14:08 +00:00
Benjamin Maxwell
f22a178b13
Reland "[LV] Support conditional scalar assignments of masked operations" (#180708)
This patch extends the support added in #158088 to loops where the
assignment is non-speculatable (e.g. a conditional load or divide).

For example, the following loop can now be vectorized:

```
int simple_csa_int_load(
  int* a, int* b, int default_val, int N, int threshold)
{
  int result = default_val;
  for (int i = 0; i < N; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```

It does this by extending the recurrence matching from only looking for
selects, to include phis where all operands are the header phi, except
for one which can be an arbitrary value outside the recurrence.

---

Reverts llvm/llvm-project#180275 (original PR: #178862)

Additional type legalization for `ISD::VECTOR_FIND_LAST_ACTIVE` was
added in #180290, which should resolve the backend crashes on x86.
2026-02-10 09:57:48 +00:00
Vishruth Thimmaiah
84f4b1e52d
Reland "[LoopVectorize] Support vectorization of overflow intrinsics" (#180526)
Enables support for marking overflow intrinsics `uadd`, `sadd`, `usub`,
`ssub`, `umul` and `smul` as trivially vectorizable.
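
For intuition, a widened unsigned add with overflow produces one sum and
one overflow bit per lane. The sketch below models this for 32-bit lanes;
the struct and function names are hypothetical, and it illustrates the
lane-wise semantics rather than the code the vectorizer emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: per-lane sum and per-lane overflow bit.
struct WideUAddResult {
  std::vector<std::uint32_t> Sum;
  std::vector<bool> Overflow;
};

// Lane-wise model of an unsigned add with overflow on 32-bit lanes.
WideUAddResult wideUAddWithOverflow(const std::vector<std::uint32_t> &A,
                                    const std::vector<std::uint32_t> &B) {
  assert(A.size() == B.size() && "operands must have the same lane count");
  WideUAddResult R;
  for (std::size_t Lane = 0; Lane < A.size(); ++Lane) {
    // Unsigned addition wraps modulo 2^32.
    const std::uint32_t S = A[Lane] + B[Lane];
    R.Sum.push_back(S);
    // The addition wrapped exactly when the sum is smaller than an operand.
    R.Overflow.push_back(S < A[Lane]);
  }
  return R;
}
```

Each lane is independent of the others, which is what makes this kind of
operation a natural candidate for widening.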

Fixes #174617

---

This patch is a reland of #174835.

Reverts #179819
2026-02-09 15:32:04 +00:00
David Sherwood
44031ae79f
[LV] Fix issue in VPFirstOrderRecurrencePHIRecipe::usesFirstLaneOnly (#179977)
In some cases we decide to vectorise loops with first-order recurrences
using VF=1, IC>1. We then attempt to unroll a VPlan in replicateByVF,
however when trying to erase the list of values from the parent we
trigger the following assert:

```
virtual llvm::VPRecipeValue::~VPRecipeValue(): Assertion `Users.empty()
  && "trying to delete a VPRecipeValue with remaining users"' failed.
```

The problem seems to stem from this code:

```
  DefR->replaceUsesWithIf(LaneDefs[0], [DefR](VPUser &U, unsigned) {
    return U.usesFirstLaneOnly(DefR);
  });
```

since usesFirstLaneOnly returns false and we fail to replace uses of
DefR with LaneDefs[0]. Upon inspection the only VPUser objects that
return false are VPInstruction::FirstOrderRecurrenceSplice and
VPFirstOrderRecurrencePHIRecipe. Since the values are all scalar it's
simply not possible for us to be using anything other than the first
lane. I've fixed this by bailing out of replicateByVF early for plans with
only a scalar VF.

Fixes https://github.com/llvm/llvm-project/issues/179671
2026-02-09 13:42:26 +00:00
Luke Lau
8cd86ff284
[VPlan] Propagate FastMathFlags from phis to blends (#180226)
If a phi has fast math flags, we can propagate it to the widened select.
To do this, this patch makes VPPhi and VPBlendRecipe subclasses of
VPRecipeWithIRFlags, and propagates it through PlainCFGBuilder and
VPPredicator.

Alive2 proofs for some of the FMFs (it looks like it can't reason about
the full "fast" set yet):
nnan: https://alive2.llvm.org/ce/z/f0bRd4
nsz: https://alive2.llvm.org/ce/z/u9P96T

The actual motivation for this is to eventually be able to move the special
casing for tail folding in
LoopVectorizationPlanner::addReductionResultComputation into the CFG in
#176143, which requires passing through FMFs.
2026-02-09 19:38:58 +08:00
Florian Hahn
6324ee32c1
[VPlan] Use PredBB's terminator as insert point for VPIRPhi extracts.
Use PredBB's terminator as insert point in VPIRPhi::execute to make sure
the extracts are placed after any possibly sunk instructions.

Fixes https://github.com/llvm/llvm-project/issues/180363.
2026-02-08 20:36:36 +00:00
Florian Hahn
7509cad693
[VPlan] Support masked VPInsts, use for predication (NFC) (#142285)
Add support for mask operands to most VPInstructions, using
getNumOperandsForOpcode.

This allows VPlan predication to predicate VPInstructions directly. The
mask will then be dropped or handled when creating wide recipes.

Depends on https://github.com/llvm/llvm-project/pull/142284.
Depends on https://github.com/llvm/llvm-project/pull/168784.

PR: https://github.com/llvm/llvm-project/pull/142285
2026-02-08 18:23:36 +00:00
Florian Hahn
3c5b05427d
[VPlan] Pass underlying instr to getMemoryOpCost in ::computeCost.
Pass underlying instruction to getMemoryOpCost in
VPReplicateRecipe::computeCost if UsedByLoadStoreAddress is true.
Some targets use the underlying instruction to improve costs,
and this is needed to match the legacy cost model.

Fixes https://github.com/llvm/llvm-project/issues/177780.
Fixes https://github.com/llvm/llvm-project/issues/177772.
2026-02-08 16:15:39 +00:00
Florian Hahn
3192fe2c7b
[VPlan] Fall back to legacy cost model if PtrSCEV is nullptr.
There are some cases when PtrSCEV can be nullptr. Fall back to legacy
cost model, to not call isLoopInvariant with nullptr.

Fixes a crash after 0c4f8094939d2.
2026-02-08 11:55:12 +00:00
Florian Hahn
0c4f809493
[VPlan] Compute predicated load/store costs in VPlan. (NFC) (#179129)
Update VPReplicateRecipe::computeCost to compute predicated load/store
costs directly, unless the pointer is uniform. In that case, the legacy
cost model uses a different logic, which will be migrated separately.

PR: https://github.com/llvm/llvm-project/pull/179129
2026-02-07 20:02:54 +00:00
Kewen Meng
703c2762d3
Revert "[LV] Support conditional scalar assignments of masked operations" (#180275)
Reverts llvm/llvm-project#178862

Revert to unblock bot:
https://lab.llvm.org/buildbot/#/builders/206/builds/13225
2026-02-06 13:24:40 -08:00
Florian Hahn
fdce0ea708
[VPlan] Add ExitingIVValue VPInstruction. (#175651)
Add a new VPInstruction opcode to compute the exiting value of an
induction variable after vectorization. This replaces the pattern of
extracting the last lane from the last part of the induction backedge
value when applicable.

This allows us to always use the pre-computed IV end value. It will also
allow unifying end value creation for both induction resume and exit
values.

PR: https://github.com/llvm/llvm-project/pull/175651
2026-02-06 12:27:31 +00:00
Benjamin Maxwell
4f90eb6427
[LV] Support conditional scalar assignments of masked operations (#178862)
This patch extends the support added in #158088 to loops where the
assignment is non-speculatable (e.g. a conditional load or divide).

For example, the following loop can now be vectorized:

```
int simple_csa_int_load(
  int* a, int* b, int default_val, int N, int threshold)
{
  int result = default_val;
  for (int i = 0; i < N; ++i)
    if (a[i] > threshold)
      result = b[i];
  return result;
}
```

It does this by extending the recurrence matching from only looking for
selects, to include phis where all operands are the header phi, except
for one which can be an arbitrary value outside the recurrence.
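
One way to picture a vectorized form of this loop is the blocked scalar
model below. It is a simplified sketch, not the code the vectorizer
generates: the function name, the fixed VF of 4 and the use of
std::vector inputs are all assumptions made for illustration. It carries
a mask/data pair across blocks, only reads b in lanes where the condition
holds, and extracts the last active lane after the loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4; // Illustrative vectorization factor.

int csaLoadBlocked(const std::vector<int> &A, const std::vector<int> &B,
                   int DefaultVal, int Threshold) {
  assert(A.size() == B.size() && "inputs must have the same length");
  std::array<int, VF> Data{};  // Carried data vector.
  std::array<bool, VF> Mask{}; // Carried mask, initially all-false.
  std::size_t I = 0;
  for (; I + VF <= A.size(); I += VF) {
    std::array<bool, VF> Cond{};
    bool AnyActive = false;
    for (std::size_t L = 0; L < VF; ++L) {
      Cond[L] = A[I + L] > Threshold;
      AnyActive = AnyActive || Cond[L];
    }
    if (!AnyActive)
      continue; // Keep the previously carried mask and data.
    for (std::size_t L = 0; L < VF; ++L) {
      Mask[L] = Cond[L];
      if (Cond[L])
        Data[L] = B[I + L]; // Masked load: inactive lanes never read B.
    }
  }
  // Extract the last active lane, falling back to the default value.
  int Result = DefaultVal;
  for (std::size_t L = VF; L-- > 0;)
    if (Mask[L]) {
      Result = Data[L];
      break;
    }
  // Scalar remainder for the final iterations.
  for (; I < A.size(); ++I)
    if (A[I] > Threshold)
      Result = B[I];
  return Result;
}
```

The carried mask is replaced wholesale whenever any lane of a block is
active, so after the loop its last active lane identifies the most recent
assignment.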
2026-02-06 11:43:06 +00:00
Alexander Kornienko
7165353506
Revert "[LoopVectorize] Support vectorization of overflow intrinsics" (#179819)
Reverts llvm/llvm-project#174835, which causes clang crashes.

See
https://github.com/llvm/llvm-project/pull/174835#issuecomment-3844233831
and https://github.com/llvm/llvm-project/issues/179671 for details.
2026-02-05 15:41:49 +01:00
Florian Hahn
8240cf337a
[VPlan] Always set flags for overflowing ops etc via VPIRFlags. (#179138)
Enforce that all VPInstructions set the correct OpType of the VPIRFlags.
Flag mis-matches (e.g. VPInstruction Add without `OverflowingBinOp`
being set) can cause crashes (e.g. in CSE) or potentially mis-compiles.

Add a few helpers in VPBuilder to create common instructions with
correct flags.

PR: https://github.com/llvm/llvm-project/pull/179138
2026-02-03 12:33:23 +00:00
Ramkumar Ramachandra
a19cbc4b77
[VPlan] Rename VectorEndPointer's IndexedTy to SourceElementTy (NFC) (#178856)
For consistency with IR terminology.
2026-02-01 11:30:26 +00:00
Florian Hahn
abfd56293c
[VPlan] Mark VPActiveLaneMaskPHIRecipe as readnone. (#177886)
VPActiveLaneMaskPHIRecipe does not have side-effects and also does
not access memory. Mark accordingly. This allows hoisting of some
invariant loads out of loops and also removing unused phi recipes in the
future.

In
llvm/test/Transforms/LoopVectorize/AArch64/conditional-branches-cost.ll,
the hoisting makes vectorization profitable.

PR: https://github.com/llvm/llvm-project/pull/177886
2026-01-30 16:12:30 +00:00