79 Commits

Author SHA1 Message Date
Florian Hahn
50b9ca4dda
[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510)
After https://github.com/llvm/llvm-project/pull/153643, there may be a
BranchOnCond with constant condition in the entry block.

Simplify those in removeBranchOnConst. This removes a number of
redundant conditional branch from entry blocks.

In some cases, it may also make the original scalar loop unreachable,
because we know it will never execute. In that case, we need to remove
the loop from LoopInfo, because all unreachable blocks may dominate each
other, making LoopInfo invalid. In those cases, we can also completely
remove the loop, for which I'll share a follow-up patch.

Depends on https://github.com/llvm/llvm-project/pull/153643.

PR: https://github.com/llvm/llvm-project/pull/154510
2025-09-18 19:25:05 +01:00
Ramkumar Ramachandra
d8fd511480
[VPlan] Introduce CSE pass (#151872)
Introduce a simple common-subexpression-elimination pass at the
VPlan-level, running late during the execution of the VPlan. The
long-term vision is to get rid of the legacy non-VPlan-based cse routine
in LV, but this patch doesn't yet fully subsume it.
2025-09-02 12:23:29 +01:00
Florian Hahn
89ae085859
[VPlan] Remove VPVectorPointer for part 0 after unrolling. (#149735)
VPVectorPointer for part 0 is just the pointer operand. Simplify it
after unrolling. This removes a large number of redundant GEPs with
index 0.

PR: https://github.com/llvm/llvm-project/pull/149735
2025-07-27 13:53:26 +01:00
Florian Hahn
fa3ec0c17c
[VPlan] Materialize constant vector trip counts before final opts. (#142309)
Materialize constant vector trip counts before ::execute, if the trip
count can be computed as Original (TC / (VF * UF)) * (VF * UF). For now
this excludes when the tail is folded or scalar epilogues are required.

This enables removing a number of redundant branches from the middle
block.

For now this is also only done when not vectorizing the epilogue, as the
simplification complicates stitching the 2 plans together.

PR: https://github.com/llvm/llvm-project/pull/142309
2025-07-26 17:16:36 +01:00
Florian Hahn
dfca6c0d3b
[VPlan] Remove no-op SCALAR-STEPS after unrolling. (#123655)
After unrolling, there may be additional simplifications that can be
applied. One example is removing SCALAR-STEPS for the first part where
only the first lane is demanded.

This removes redundant adds of 0 from a large number of tests (~200),
many which I am still working on updating.

In preparation for removing redundant WideIV steps added in
https://github.com/llvm/llvm-project/pull/119284.

PR: https://github.com/llvm/llvm-project/pull/123655
2025-03-25 12:57:24 +00:00
Florian Hahn
4277c21059
[VPlan] Introduce explicit broadcasts for live-ins. (#124644)
Add a new VPInstruction::Broadcast opcode and use it to materialize
explicit broadcasts of live-ins. The initial patch only materlizes the
broadcasts if the vector preheader dominates all uses that need it.
Later patches will pick the best valid insert point, thus retiring
implicit hoisting of broadcasts from VPTransformsState::get().

PR: https://github.com/llvm/llvm-project/pull/124644
2025-02-26 13:57:51 +00:00
Florian Hahn
32c4493d5f
[VPlan] Add incoming values for all predecessor to ResumePHI (NFCI).
Follow-up as discussed when using VPInstruction::ResumePhi for all resume
values (#112147). This patch explicitly adds incoming values for each
predecessor in VPlan. This simplifies codegen and allows transformations
adjusting the predecessors of blocks with

NFC modulo incoming block order in phis.
2025-02-09 11:20:20 +00:00
Florian Hahn
7f3428d3ed
[VPlan] Compute induction end values in VPlan. (#112145)
Use createDerivedIV to compute IV end values directly in VPlan, instead
of creating them up-front.

This allows updating IV users outside the loop as follow-up.

Depends on https://github.com/llvm/llvm-project/pull/110004 and
https://github.com/llvm/llvm-project/pull/109975.

PR: https://github.com/llvm/llvm-project/pull/112145
2024-12-29 19:05:08 +00:00
Florian Hahn
4ad0fdd163
[VPlan] Remove reverse() of predecessors from VPInstruction::generate.
This was originally done to reduce the diff for the change. Remove it
and update the remaining tests. NFC modulo reordering of incoming
values.

Clean up after https://github.com/llvm/llvm-project/pull/114292.
2024-12-17 20:44:32 +00:00
Florian Hahn
7f7f540a48
Reapply "[VPlan] Update scalar induction resume values in VPlan. (#110577)"
This reverts commit f09b16e2671cbcdf7cb7dc7ed705db092a9deda1.

The crash when building llvm-test-suite with stage2 should have been
fixed by 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.
2024-12-06 19:41:51 +00:00
Nikita Popov
f09b16e267 Revert "[VPlan] Update scalar induction resume values in VPlan. (#110577)"
This reverts commit 0678e2058364ec10b94560d27ec7138dfa003287.
This reverts commit 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.

Causes crashes in llvm-test-suite when using stage 2 clang.
2024-12-06 18:01:42 +01:00
Florian Hahn
0678e20583
[VPlan] Update scalar induction resume values in VPlan. (#110577)
Updated ILV.createInductionResumeValues (now createInductionResumeVPValue)
to directly update the VPIRInstructions wrapping the original phis with the 
created resume values.

This is the first step towards modeling them completely in VPlan.
Subsequent patches will move creation of the resume values completely
into VPlan.

Depends on https://github.com/llvm/llvm-project/pull/109975.

PR: https://github.com/llvm/llvm-project/pull/110577
2024-12-06 12:26:19 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Florian Hahn
99741ac285
[VPlan] Introduce explicit ExtractFromEnd recipes for live-outs. (#100658)
Introduce explicit ExtractFromEnd recipes to extract the final values
for live-outs instead of implicitly extracting in VPLiveOut::fixPhi.

This is a follow-up to the recent changes of modeling extracts for
recurrences and consolidates live-out extract creation for fixed-order
recurrences at a single place: addLiveOutsForFirstOrderRecurrences.

It is also in preparation of replacing VPLiveOut with VPIRInstructions
wrapping the original scalar phis.

PR: https://github.com/llvm/llvm-project/pull/100658
2024-08-21 10:06:44 +02:00
Florian Hahn
9a5a8731e7
[VPlan] Introduce ResumePhi VPInstruction, use to create phi for FOR. (#94760)
This patch introduces a new ResumePhi VPInstruction which creates a phi
in a leaf block of a VPlan. The first use is to create the phi node for
fixed-order recurrence resume values in the scalar preheader.

The VPInstruction takes 2 operands: 1) the incoming value from the
middle-block and a default value to be used for all other incoming
blocks.

In follow-up changes, it will also be used to create phis for reduction
and induction resume values.

Depends on https://github.com/llvm/llvm-project/pull/92651

PR: https://github.com/llvm/llvm-project/pull/94760
2024-07-11 16:08:04 +01:00
Florian Hahn
99d6c6d936
[VPlan] Model branch cond to enter scalar epilogue in VPlan. (#92651)
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.

Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.

After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.

This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.

Depends on https://github.com/llvm/llvm-project/pull/92525

PR: https://github.com/llvm/llvm-project/pull/92651
2024-07-05 10:08:42 +01:00
Florian Hahn
3808ba78de
[VPlan] Model middle block via VPIRBasicBlock. (#95816)
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.

Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and in between the compare
to used by the terminator). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.


PR: https://github.com/llvm/llvm-project/pull/95816
2024-06-20 13:42:20 +01:00
Florian Hahn
40a72f8cc4
[VPlan] Support extracting any lane of uniform value.
If the value we are extracting a lane from is uniform, only the first
lane will be set. Return lane 0 for any requested lane.

This fixes a crash when trying to extract the last lane for a
first-order recurrence resume value.

Fixes https://github.com/llvm/llvm-project/issues/95520.
2024-06-14 22:16:52 +01:00
Florian Hahn
05e1b5340b
[VPlan] Model FOR resume value extraction in VPlan. (#93396)
This patch uses the ExtractFromEnd VPInstruction opcode
to extract the value of a FOR to be used as resume value for the ph in
the scalar loop.

It adds a new live-out that temporarily wraps the FOR phi in the scalar
loop. fixFixedOrderRecurrence will process live outs for fixed order
recurrence phis by creating a new phi node in the scalar preheader, 
using the generated value for the live-out as incoming value from the
middle block and the original start value as incoming value for the
other edge. Creation of the phi in the preheader, as well as updating
the phi in the scalar loop will also be moved to VPlan in the future,
eventually retiring fixFixedOrderRecurrence

Depends on https://github.com/llvm/llvm-project/pull/93395

PR: https://github.com/llvm/llvm-project/pull/93396
2024-06-05 11:18:06 +01:00
Fangrui Song
ef9090fcb5 [test] Fix check prefixes 2024-05-13 14:01:00 -07:00
Florian Hahn
96e83d3705
[LV] Use IRBuilder to create and optimize middle-block compare.
Split off from D150398 to avoid builder-related diff changes there.
Using IRBuilder to create ICmps simplifies the result if both operands
are constants.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D158332
2023-08-29 11:42:18 +01:00
Florian Hahn
35af27c30a
[VPlan] Only create extracts for recurrence exits if there are live-outs.
Move the code to collect live-out earlier and only generate extracts for
exit values if there are any live-outs that use them.

Depends on D147472.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D147567
2023-04-10 21:08:34 +01:00
Florian Hahn
c83fdc905a
[LV] Perform recurrence sinking directly on VPlan.
This patch updates LV to sink recipes directly using the VPlan use
chains. The initial patch only moves sinking to be purely VPlan-based.
Follow-up patches will move legality checks to VPlan as well.

At the moment, there's a single test failure remaining.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D142589
2023-02-08 15:49:29 +00:00
Paul Walker
eae26b6640 [IRBuilder] Use canonical i64 type for insertelement index used by vector splats.
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer so we may as
well use the prefered form from creation.

NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.

Differential Revision: https://reviews.llvm.org/D140983
2023-01-11 14:08:06 +00:00
Florian Hahn
f5c766ba14
[LV] Convert a few tests to use opaque pointers (NFC). 2022-12-27 23:01:41 +00:00
Bjorn Pettersson
2e14900db9 [test][NewPM] Use -passes=loop-vectorize instead of -loop-vectorize
Update a bunch of loop-vectorize regression tests to use the new PM
syntax (opt -passes=loop-vectorize) instead of the deprecated legacy
PM syntax (opt -loop-vectorize).
2022-04-28 16:46:00 +02:00
Dávid Bolvanský
872f7000fc Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"
This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.
2022-04-04 01:15:30 +02:00
Dávid Bolvanský
a113a582b1 [NFCI] Regenerate LoopVectorize test checks 2022-04-03 21:56:24 +02:00
Florian Hahn
95f76bff1c
[LV] Create & use VPScalarIVSteps for all scalar users.
This patch is a follow-up to D115953. It updates optimizeInductions
to also introduce new VPScalarIVStepsRecipes if an IV has both vector
and scalar uses.

It updates all uses that only need scalar values to use the newly
created recipe for the scalar steps.

This completes untangling of VPWidenIntOrFpInductionRecipe
code-generation. Now the recipe *only* creates the widened vector
values, as it says on the tin.

The code to genereate IR has been moved directly to
VPWidenIntOrFpInductionRecipe::execute.

Note that the recipe has been updated to hold a reference to
ScalarEvolution, which is needed to expand the step, until we can place
the corresponding SCEV expansion in the pre-header.

Depends on D120827.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D120828
2022-03-13 17:15:24 +00:00
Florian Hahn
139215af8e
[IVDescriptor] Find original 'Previous' for first-order recurrences.
This patch extends first-order recurrence handling to support cases
where we already sunk an instruction for a different recurrence, but
LastPrev comes before Previous.

To handle those cases correctly, we need to find the earliest entry for
the sink-after chain, because this is references the Previous from the
original recurrence. This is needed to ensure we use the correct
instruction as sink point.

Depends on D118558.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D118642
2022-03-03 16:41:26 +00:00
Florian Hahn
470b5c7f0d
[LV] Add test with multiple use of a FOR chained together.
Additional test coverage for D118642.
2022-03-01 14:18:23 +00:00
Florian Hahn
5c7ae10cec
[LV] Add store to test to make sure the loop is not dead.
Add an extra store to the test, to make sure the operations in the loop
cannot be optimized away after D118051.
2022-02-20 15:05:29 +00:00
zhongyunde
b2f5164deb [IVDescriptors] Support FOR where we have multiple sink pointed
Handles the case where Previous doesn't come before LastPrev incorrectly.
Fix https://github.com/llvm/llvm-project/issues/53483

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D118558
2022-02-14 09:30:35 +08:00
Florian Hahn
9474c3009e
[LV] Move unrelated tests from first-order-recurrence-chains.ll 2022-02-11 09:15:42 +00:00
Florian Hahn
02ee3fbff8
[LV] Add additional complex first order recurrence test.
Add a new test case with 2 first-order recurrences, which share a user.
2022-01-31 19:54:14 +00:00
Florian Hahn
8f12175fed
[VPlan] Use VPlan to check if only the first lane is used.
This removes the remaining dependence on LoopVectorizationCostModel from
buildScalarSteps and is required so it can be moved out of ILV.

It also improves allows us to remove a few unneeded instructions.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D116554
2022-01-30 13:07:29 +00:00
Florian Hahn
7e68061305
[IRBuilder] Migrate add-folding to value-based FoldAdd.
Depends on D116935.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D116968
2022-01-12 09:24:46 +00:00
Florian Hahn
f395a4f8d5
[SCEVExpand] Only create required predicate checks.
Currently generateOverflowCheck always creates code for Step being
negative and positive, followed by a select at the end depending on
Step's sign.

This patch updates the code to only create either the checks for step
being positive or negative, if the sign is known.

Follow-up to D116696.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D116747
2022-01-07 14:49:02 +00:00
Florian Hahn
86d113a8b8
[SCEVExpand] Do not create redundant 'or false' for pred expansion.
This patch updates SCEVExpander::expandUnionPredicate to not create
redundant 'or false, x' instructions. While those are trivially
foldable, they can be easily avoided and hinder code that checks the
size/cost of the generated checks before further folds.

I am planning on look into a few other similar improvements to code
generated by SCEVExpander.

I remember a while ago @lebedev.ri working on doing some trivial folds
like that in IRBuilder itself, but there where concerns that such
changes may subtly break existing code.

Reviewed By: reames, lebedev.ri

Differential Revision: https://reviews.llvm.org/D116696
2022-01-06 11:52:19 +00:00
Roman Lebedev
b291597112
Revert rest of IRBuilderBase's short-circuiting folds
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.

Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.

This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88.
This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b.
This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2.
This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.
2021-10-28 02:15:14 +03:00
Roman Lebedev
101aaf62ef
Revert "[NFC] IRBuilderBase::CreateAdd(): place constant onto RHS"
Clang OpenMP codegen tests are failing,
will recommit afterwards.

This reverts commit 4723c9b3c6c46632a5d66e65d198899894b1e2c5.
2021-10-27 22:21:37 +03:00
Roman Lebedev
42712698fd
Revert "[IR] IRBuilderBase::CreateAdd(): short-circuit x + 0 --> x"
Clang OpenMP codegen tests are failing.

This reverts commit 288f1f8abe5835180a0021f142043ee261ab3846.
This reverts commit cb90e5356ac1594e95fed8e208d6e0e9b6a87db1.
2021-10-27 22:21:37 +03:00
Roman Lebedev
cb90e5356a
[IR] IRBuilderBase::CreateAdd(): short-circuit x + 0 --> x
There's precedent for that in `CreateOr()`/`CreateAnd()`.

The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 21:34:38 +03:00
Roman Lebedev
4723c9b3c6
[NFC] IRBuilderBase::CreateAdd(): place constant onto RHS 2021-10-27 21:34:38 +03:00
Roman Lebedev
156f10c840
[IR] SCEVExpander::generateOverflowCheck(): short-circuit umul_with_overflow-by-one
It's a no-op, no overflow happens ever: https://alive2.llvm.org/ce/z/Zw89rZ

While generally i don't like such hacks,
we have a very good reason to do this: here we are expanding
a run-time correctness check for the vectorization,
and said `umul_with_overflow` will not be optimized out
before we query the cost of the checks we've generated.

Which means, the cost of run-time checks would be artificially inflated,
and after https://reviews.llvm.org/D109368 that will affect
the minimal trip count for which these checks are even evaluated.
And if they aren't even evaluated, then the vectorized code
certainly won't be run.

We could consider doing this in IRBuilder,  but then we'd need to
also teach `CreateExtractValue()` to look into chain of `insertvalue`'s,
and i'm not sure there's precedent for that.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 19:45:55 +03:00
Roman Lebedev
f3df87d57e
[IR] IRBuilderBase::CreateOr(): fix short-circuiting for constant on LHS
There is no guarantee that the constant is on RHS here,
we have to handle both cases.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 18:01:06 +03:00
Roman Lebedev
ab1dbcecd6
[IR] IRBuilderBase::CreateSelect(): if cond is a constant i1, short-circuit
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.

There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.

Refs. https://reviews.llvm.org/D109368#3089809
2021-10-27 18:01:05 +03:00
Florian Hahn
7a1e73f0b9
Recommit "[VPlan] Add recipe for first-order rec phis, make splicing explicit."
This reverts the revert commit b1777b04dc4b1a9fee0e7effa7e177892ab32ef0.

The patch originally got reverted due to a crash:
https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2

The underlying issue was that we were not using the stored values from
the modified memory recipes, but the out-of-date values directly from
the IR (accessed via the VPlan). This should be fixed in d995d6376. A
reduced version of the reproducer has been added in 93664503be6b.
2021-07-26 15:50:30 +01:00
Nico Weber
b1777b04dc Revert "[VPlan] Add recipe for first-order rec phis, make splicing explicit."
Makes clang crash: https://reviews.llvm.org/D105008#2903350
This reverts commit d2a73fb44ea0b8c981e4b923f811f18793fc4770.

Also revert a minor formatting follow-up:
This reverts commit 82834a673246f27a541ffcc57e0eb65b008102ef.
2021-07-25 17:39:28 -04:00
Florian Hahn
d2a73fb44e
[VPlan] Add recipe for first-order rec phis, make splicing explicit.
This patch adds a VPFirstOrderRecurrencePHIRecipe, to further untangle
VPWidenPHIRecipe into distinct recipes for distinct use cases/lowering.
See D104989 for a new recipe for reduction phis.

This patch also introduces a new `FirstOrderRecurrenceSplice`
VPInstruction opcode, which is used to make the forming of the vector
recurrence value explicit in VPlan. This more accurately models def-uses
in VPlan and also simplifies code-generation. Now, the vector recurrence
values are created at the right place during VPlan-codegeneration,
rather than during post-VPlan fixups.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D105008
2021-07-20 16:14:17 +02:00