After https://github.com/llvm/llvm-project/pull/153643, there may be a
BranchOnCond with constant condition in the entry block.
Simplify those in removeBranchOnConst. This removes a number of
redundant conditional branch from entry blocks.
In some cases, it may also make the original scalar loop unreachable,
because we know it will never execute. In that case, we need to remove
the loop from LoopInfo, because all unreachable blocks may dominate each
other, making LoopInfo invalid. In those cases, we can also completely
remove the loop, for which I'll share a follow-up patch.
Depends on https://github.com/llvm/llvm-project/pull/153643.
PR: https://github.com/llvm/llvm-project/pull/154510
Introduce a simple common-subexpression-elimination pass at the
VPlan-level, running late during the execution of the VPlan. The
long-term vision is to get rid of the legacy non-VPlan-based cse routine
in LV, but this patch doesn't yet fully subsume it.
VPVectorPointer for part 0 is just the pointer operand. Simplify it
after unrolling. This removes a large number of redundant GEPs with
index 0.
PR: https://github.com/llvm/llvm-project/pull/149735
Materialize constant vector trip counts before ::execute, if the trip
count can be computed as Original (TC / (VF * UF)) * (VF * UF). For now
this excludes when the tail is folded or scalar epilogues are required.
This enables removing a number of redundant branches from the middle
block.
For now this is also only done when not vectorizing the epilogue, as the
simplification complicates stitching the 2 plans together.
PR: https://github.com/llvm/llvm-project/pull/142309
After unrolling, there may be additional simplifications that can be
applied. One example is removing SCALAR-STEPS for the first part where
only the first lane is demanded.
This removes redundant adds of 0 from a large number of tests (~200),
many which I am still working on updating.
In preparation for removing redundant WideIV steps added in
https://github.com/llvm/llvm-project/pull/119284.
PR: https://github.com/llvm/llvm-project/pull/123655
Add a new VPInstruction::Broadcast opcode and use it to materialize
explicit broadcasts of live-ins. The initial patch only materlizes the
broadcasts if the vector preheader dominates all uses that need it.
Later patches will pick the best valid insert point, thus retiring
implicit hoisting of broadcasts from VPTransformsState::get().
PR: https://github.com/llvm/llvm-project/pull/124644
Follow-up as discussed when using VPInstruction::ResumePhi for all resume
values (#112147). This patch explicitly adds incoming values for each
predecessor in VPlan. This simplifies codegen and allows transformations
adjusting the predecessors of blocks with
NFC modulo incoming block order in phis.
This was originally done to reduce the diff for the change. Remove it
and update the remaining tests. NFC modulo reordering of incoming
values.
Clean up after https://github.com/llvm/llvm-project/pull/114292.
This reverts commit f09b16e2671cbcdf7cb7dc7ed705db092a9deda1.
The crash when building llvm-test-suite with stage2 should have been
fixed by 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.
This reverts commit 0678e2058364ec10b94560d27ec7138dfa003287.
This reverts commit 1091fad31a83d5ab87eb6fa11fe3bdb3f0d152ea.
Causes crashes in llvm-test-suite when using stage 2 clang.
Updated ILV.createInductionResumeValues (now createInductionResumeVPValue)
to directly update the VPIRInstructions wrapping the original phis with the
created resume values.
This is the first step towards modeling them completely in VPlan.
Subsequent patches will move creation of the resume values completely
into VPlan.
Depends on https://github.com/llvm/llvm-project/pull/109975.
PR: https://github.com/llvm/llvm-project/pull/110577
Introduce explicit ExtractFromEnd recipes to extract the final values
for live-outs instead of implicitly extracting in VPLiveOut::fixPhi.
This is a follow-up to the recent changes of modeling extracts for
recurrences and consolidates live-out extract creation for fixed-order
recurrences at a single place: addLiveOutsForFirstOrderRecurrences.
It is also in preparation of replacing VPLiveOut with VPIRInstructions
wrapping the original scalar phis.
PR: https://github.com/llvm/llvm-project/pull/100658
This patch introduces a new ResumePhi VPInstruction which creates a phi
in a leaf block of a VPlan. The first use is to create the phi node for
fixed-order recurrence resume values in the scalar preheader.
The VPInstruction takes 2 operands: 1) the incoming value from the
middle-block and a default value to be used for all other incoming
blocks.
In follow-up changes, it will also be used to create phis for reduction
and induction resume values.
Depends on https://github.com/llvm/llvm-project/pull/92651
PR: https://github.com/llvm/llvm-project/pull/94760
This patch moves branch condition creation to enter the scalar epilogue
loop to VPlan. Modeling the branch in the middle block also requires
modeling the successor blocks. This is done using the recently
introduced VPIRBasicBlock.
Note that the middle.block is still created as part of the skeleton and
then patched in during VPlan execution. Unfortunately the skeleton needs
to create the middle.block early on, as it is also used for induction
resume value creation and is also needed to properly update the
dominator tree during skeleton creation.
After this patch lands, I plan to move induction resume value and phi
node creation in the scalar preheader to VPlan. Once that is done, we
should be able to create the middle.block in VPlan directly.
This is a re-worked version based on the earlier
https://reviews.llvm.org/D150398 and the main change is the use of
VPIRBasicBlock.
Depends on https://github.com/llvm/llvm-project/pull/92525
PR: https://github.com/llvm/llvm-project/pull/92651
Use VPIRBasicBlock to wrap the middle block and implement patching up
branches in predecessors in VPIRBasicBlock::execute. The IR middle block
is only created after skeleton creation. Initially a regular
VPBasicBlock is created, which will later be replaced by a
VPIRBasicBlock once the middle IR basic block has been created.
Note that this slightly changes the order of instructions created in the
middle block; code generated by recipe execution in the middle block
will now be inserted before the terminator (and in between the compare
to used by the terminator). The original order will be restored in
https://github.com/llvm/llvm-project/pull/92651.
PR: https://github.com/llvm/llvm-project/pull/95816
If the value we are extracting a lane from is uniform, only the first
lane will be set. Return lane 0 for any requested lane.
This fixes a crash when trying to extract the last lane for a
first-order recurrence resume value.
Fixes https://github.com/llvm/llvm-project/issues/95520.
This patch uses the ExtractFromEnd VPInstruction opcode
to extract the value of a FOR to be used as resume value for the ph in
the scalar loop.
It adds a new live-out that temporarily wraps the FOR phi in the scalar
loop. fixFixedOrderRecurrence will process live outs for fixed order
recurrence phis by creating a new phi node in the scalar preheader,
using the generated value for the live-out as incoming value from the
middle block and the original start value as incoming value for the
other edge. Creation of the phi in the preheader, as well as updating
the phi in the scalar loop will also be moved to VPlan in the future,
eventually retiring fixFixedOrderRecurrence
Depends on https://github.com/llvm/llvm-project/pull/93395
PR: https://github.com/llvm/llvm-project/pull/93396
Split off from D150398 to avoid builder-related diff changes there.
Using IRBuilder to create ICmps simplifies the result if both operands
are constants.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D158332
Move the code to collect live-out earlier and only generate extracts for
exit values if there are any live-outs that use them.
Depends on D147472.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D147567
This patch updates LV to sink recipes directly using the VPlan use
chains. The initial patch only moves sinking to be purely VPlan-based.
Follow-up patches will move legality checks to VPlan as well.
At the moment, there's a single test failure remaining.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D142589
Instcombine prefers this canonical form (see getPreferredVectorIndex),
as does IRBuilder when passing the index as an integer so we may as
well use the prefered form from creation.
NOTE: All test changes are mechanical with nothing else expected
beyond a change of index type from i32 to i64.
Differential Revision: https://reviews.llvm.org/D140983
Update a bunch of loop-vectorize regression tests to use the new PM
syntax (opt -passes=loop-vectorize) instead of the deprecated legacy
PM syntax (opt -loop-vectorize).
This patch is a follow-up to D115953. It updates optimizeInductions
to also introduce new VPScalarIVStepsRecipes if an IV has both vector
and scalar uses.
It updates all uses that only need scalar values to use the newly
created recipe for the scalar steps.
This completes untangling of VPWidenIntOrFpInductionRecipe
code-generation. Now the recipe *only* creates the widened vector
values, as it says on the tin.
The code to genereate IR has been moved directly to
VPWidenIntOrFpInductionRecipe::execute.
Note that the recipe has been updated to hold a reference to
ScalarEvolution, which is needed to expand the step, until we can place
the corresponding SCEV expansion in the pre-header.
Depends on D120827.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D120828
This patch extends first-order recurrence handling to support cases
where we already sunk an instruction for a different recurrence, but
LastPrev comes before Previous.
To handle those cases correctly, we need to find the earliest entry for
the sink-after chain, because this is references the Previous from the
original recurrence. This is needed to ensure we use the correct
instruction as sink point.
Depends on D118558.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D118642
This removes the remaining dependence on LoopVectorizationCostModel from
buildScalarSteps and is required so it can be moved out of ILV.
It also improves allows us to remove a few unneeded instructions.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D116554
Currently generateOverflowCheck always creates code for Step being
negative and positive, followed by a select at the end depending on
Step's sign.
This patch updates the code to only create either the checks for step
being positive or negative, if the sign is known.
Follow-up to D116696.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D116747
This patch updates SCEVExpander::expandUnionPredicate to not create
redundant 'or false, x' instructions. While those are trivially
foldable, they can be easily avoided and hinder code that checks the
size/cost of the generated checks before further folds.
I am planning on look into a few other similar improvements to code
generated by SCEVExpander.
I remember a while ago @lebedev.ri working on doing some trivial folds
like that in IRBuilder itself, but there where concerns that such
changes may subtly break existing code.
Reviewed By: reames, lebedev.ri
Differential Revision: https://reviews.llvm.org/D116696
Upon further investigation and discussion,
this is actually the opposite direction from what we should be taking,
and this direction wouldn't solve the motivational problem anyway.
Additionally, some more (polly) tests have escaped being updated.
So, let's just take a step back here.
This reverts commit f3190dedeef9da2109ea57e4cb372f295ff53b88.
This reverts commit 749581d21f2b3f53e4fca4eb8728c942d646893b.
This reverts commit f3df87d57e096143670e0fd396e81d43393a2dd2.
This reverts commit ab1dbcecd6f0969976fafd62af34730436ad5944.
Clang OpenMP codegen tests are failing.
This reverts commit 288f1f8abe5835180a0021f142043ee261ab3846.
This reverts commit cb90e5356ac1594e95fed8e208d6e0e9b6a87db1.
There's precedent for that in `CreateOr()`/`CreateAnd()`.
The motivation here is to avoid bloating the run-time check's IR
in `SCEVExpander::generateOverflowCheck()`.
Refs. https://reviews.llvm.org/D109368#3089809
It's a no-op, no overflow happens ever: https://alive2.llvm.org/ce/z/Zw89rZ
While generally i don't like such hacks,
we have a very good reason to do this: here we are expanding
a run-time correctness check for the vectorization,
and said `umul_with_overflow` will not be optimized out
before we query the cost of the checks we've generated.
Which means, the cost of run-time checks would be artificially inflated,
and after https://reviews.llvm.org/D109368 that will affect
the minimal trip count for which these checks are even evaluated.
And if they aren't even evaluated, then the vectorized code
certainly won't be run.
We could consider doing this in IRBuilder, but then we'd need to
also teach `CreateExtractValue()` to look into chain of `insertvalue`'s,
and i'm not sure there's precedent for that.
Refs. https://reviews.llvm.org/D109368#3089809
While we could emit such a tautological `select`,
it will stick around until the next instsimplify invocation,
which may happen after we count the cost of this redundant `select`.
Which is precisely what happens with loop vectorization legality checks,
and that artificially increases the cost of said checks,
which is bad.
There is prior art for this in `IRBuilderBase::CreateAnd()`/`IRBuilderBase::CreateOr()`.
Refs. https://reviews.llvm.org/D109368#3089809
This reverts the revert commit b1777b04dc4b1a9fee0e7effa7e177892ab32ef0.
The patch originally got reverted due to a crash:
https://bugs.chromium.org/p/chromium/issues/detail?id=1232798#c2
The underlying issue was that we were not using the stored values from
the modified memory recipes, but the out-of-date values directly from
the IR (accessed via the VPlan). This should be fixed in d995d6376. A
reduced version of the reproducer has been added in 93664503be6b.
Makes clang crash: https://reviews.llvm.org/D105008#2903350
This reverts commit d2a73fb44ea0b8c981e4b923f811f18793fc4770.
Also revert a minor formatting follow-up:
This reverts commit 82834a673246f27a541ffcc57e0eb65b008102ef.
This patch adds a VPFirstOrderRecurrencePHIRecipe, to further untangle
VPWidenPHIRecipe into distinct recipes for distinct use cases/lowering.
See D104989 for a new recipe for reduction phis.
This patch also introduces a new `FirstOrderRecurrenceSplice`
VPInstruction opcode, which is used to make the forming of the vector
recurrence value explicit in VPlan. This more accurately models def-uses
in VPlan and also simplifies code-generation. Now, the vector recurrence
values are created at the right place during VPlan-codegeneration,
rather than during post-VPlan fixups.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D105008