Add a new getNumOperandsForOpcode helper to determine the number of
operands from the opcode. For now, it is used to verify the number
operands at VPInstruction construction.
It returns -1 for a few opcodes where the number of operands cannot be
determined (GEP, Switch, PHI, Call).
This can also be used in a follow-up to determine if a VPInstruction is
masked based on the number of arguments.
PR: https://github.com/llvm/llvm-project/pull/142284
This patch moves the logic to manage IR flags to a separate VPIRFlags
class. For now, VPRecipeWithIRFlags is the only class that inherits
VPIRFlags. The new class allows for simpler passing of flags when
constructing recipes, simplifying the constructors for various recipes
(VPInstruction in particular, which now just has 2 constructors, one
taking an extra VPIRFlags argument.
This mirrors the approach taken for VPIRMetadata and makes it easier to
extend in the future. The patch also adds a unified flagsValidForOpcode
to check if the flags in a VPIRFlags match the provided opcode.
PR: https://github.com/llvm/llvm-project/pull/140621
Support cloning VPlans as they are created by the initial buildVPlan,
i.e. scalar header not yet connected and no trip-count set. This is not
used yet but will in follow-up changes/
Also add a unit test for cloning & printing.
Thread-local globals live, by default, in the default globals address
space, which may not be 0, so we need to overload @llvm.thread.pointer
to support other address spaces, and use the default globals address
space in Clang.
Similarly to other recipes, update VPScalarIVStepsRecipe to also take
the runtime VF as argument. This removes some unnecessary runtime VF
computations for scalable vectors. It will also allow dropping the
UF == 1 restriction for narrowing interleave groups required in
577631f0a528.
VPReductionRecipes take a RecurrenceDescriptor, but only use the
RecurKind and FastMathFlags in it when executing. This patch makes the
recipe more lightweight by stripping it to only take the latter two.
The motiviation for this is to simplify an upcoming patch to support
in-loop AnyOf reductions. For an in-loop AnyOf reduction we want to
create an Or reduction, and by using RecurKind we can create an
arbitrary reduction without needing a full RecurrenceDescriptor.
This patch change the parent of the VPReductionRecipe from
VPSingleDefRecipe to VPRecipeWithIRFlags and also print/get/drop/control
flags by the VPRecipeWithIRFlags. This will remove the dependency of the
underlying instruction.
This patch also add a new function `setFastMathFlags()` to the
VPRecipeWithIRFlags because the entire reduction chain may contains
multiple instructions. And the underlying instruction may not contains
the corresponding flags for this reduction.
Split from #113903.
Update VPWidenPHIRecipe to use the predecessors in VPlan to determine
the incoming blocks instead of tracking them separately. This brings
VPWidenPHIRecipe in line with the other phi recipes.
PR: https://github.com/llvm/llvm-project/pull/126388
This is extracted from #118638
After c7ebe4f we will crash in fixNonInductionPHIs if we use a
VPWidenPHIRecipe with the vector preheader as an incoming block, because
the phi will reference the old non-IRBB vector preheader.
This fixes this by updating VPBlockUtils::reassociateBlocks to update
any VPWidenPHIRecipes's incoming blocks.
This assumes that if the VPWidenPHIRecipe is in a VPRegionBlock, it's in
the entry block, and that we are replacing a VPBasicBlock with another
VPBasicBlock.
Nothing in VPlan.h directly depends on VPTransformState, VPCostContext,
VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate
header to reduce the size of widely used VPlan.h.
This is a first step towards more cleanly separating declarations in
VPlan.
Besides reducing VPlan.h's size, this also allows including additional
VPlan-related headers in VPlanHelpers.h for use there. An example is
using VPDominatorTree in VPTransformState
(https://github.com/llvm/llvm-project/pull/117138).
PR: https://github.com/llvm/llvm-project/pull/124104
Tighten access to constructors similar to ef1260acc0. VPValues should
either be constructed by constructors of recipes defining them or should
be live-ins created by VPlan (via getOrAddLiveIn).
16d19aaed moved to manage block creation via VPlan directly, with VPlan
owning the created blocks. Follow up to make the VPBlock constructors
private, to require creation via VPlan helpers and thus preventing
issues due to manually constructing blocks.
This just copies the same conservative definition from mayWriteToMemory,
and enables more VPInstructions to be hoisted out in LICM.
I think this should give more accurate costs, and I was able to build
llvm-test-suite without the legacy-vplan cost model assertion going off.
This patch changes the way blocks are managed by VPlan. Previously all
blocks reachable from entry would be cleaned up when a VPlan is
destroyed. With this patch, each VPlan keeps track of blocks created for
it in a list and this list is then used to delete all blocks in the list
when the VPlan is destroyed. To do so, block creation is funneled
through helpers in directly in VPlan.
The main advantage of doing so is it simplifies CFG transformations, as
those do not have to take care of deleting any blocks, just adjusting
the CFG. This helps to simplify
https://github.com/llvm/llvm-project/pull/108378 and
https://github.com/llvm/llvm-project/pull/106748.
This also simplifies handling of 'immutable' blocks a VPlan holds
references to, which at the moment only include the scalar header block.
PR: https://github.com/llvm/llvm-project/pull/120918
Update C++ unit tests to use VPlanTestBase to construct initial VPlan,
using a constructor that creates the VP blocks directly in the
constructor.
Split off from and in preparation for
https://github.com/llvm/llvm-project/pull/120918.
As a first step to move towards modeling the full skeleton in VPlan,
start by wrapping IR blocks created during legacy skeleton creation in
VPIRBasicBlocks and hook them into the VPlan. This means the skeleton
CFG is represented in VPlan, just before execute. This allows moving
parts of skeleton creation into recipes in the VPBBs gradually.
Note that this allows retiring some manual DT updates, as this will be
handled automatically during VPlan execution.
PR: https://github.com/llvm/llvm-project/pull/114292
Update VPlan to include the scalar loop header. This allows retiring
VPLiveOut, as the remaining live-outs can now be handled by adding
operands to the wrapped phis in the scalar loop header.
Note that the current version only includes the scalar loop header, no
other loop blocks and also does not wrap it in a region block.
PR: https://github.com/llvm/llvm-project/pull/109975
This allows calling ::dump() on various sub-classes of VPSingleDefRecipe
directly, as it resolves an ambigous name lookup.
Previously, calling VPWidenRecipe::dump() (and others), would result in
the following errors:
llvm/unittests/Transforms/Vectorize/VPlanTest.cpp:1284:19: error: member 'dump' found in multiple base classes of different types
1284 | WidenR->dump();
| ^
llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:434:8: note: member found by ambiguous name lookup
434 | void dump() const;
| ^
llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:108:8: note: member found by ambiguous name lookup
108 | void dump() const;
| ^
1 error generated.
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
This patch splits off intrinsic hanlding to a new
VPWidenIntrinsicRecipe. VPWidenIntrinsicRecipes only need access to the
intrinsic ID to widen and the scalar result type (in case the intrinsic
is overloaded on the result type). It does not need access to an
underlying IR call instruction or function.
This means VPWidenIntrinsicRecipe can be created easily without access
to underlying IR.
Unify logic for mayWriteToMemory and mayHaveSideEffects for
VPInstruction, with the later relying on the former. Also extend to
handle binary operators.
Split off from https://github.com/llvm/llvm-project/pull/106441
raw_string_ostream::flush() is essentially a no-op (also specified in docs).
Don't call it in tests that aren't meant to test 'raw_string_ostream' itself.
p.s. remove a few redundant calls to raw_string_ostream::str()
VPEVLBasedIVPHIRecipe inherits from VPSingleDefRecipe. Add
VPEVLBasedIVPHISC to VPSingleDefRecipe::classof to make isa/dyn_cast &
co work as expected.
Split off https://github.com/llvm/llvm-project/pull/67934.
Replace relying on the underling CallInst for looking up the called
function and its types by instead adding the called function as operand,
in line with how called functions are handled in CallInst.
Operand bundles, metadata and fast-math flags are optionally used if
there's an underlying CallInst.
This enables creating VPWidenCallRecipes without requiring an underlying
IR instruction.
This patch introduces a new VPWidenMemoryRecipe base class and distinct
sub-classes to model loads and stores.
This is a first step in an effort to simplify and modularize code
generation for widened loads and stores and enable adding further more
specialized memory recipes.
PR: https://github.com/llvm/llvm-project/pull/87411
Add extra tests for printing recipes not inserted in a VPlan yet, e.g.
when using a debugger.
Guard against regressions in changes to printing, i.e.
https://github.com/llvm/llvm-project/pull/81411.
VPBlendRecipe does not use the first mask operand. Removing it allows
VPlan-based DCE to remove unused mask computations.
This also fixes#87410, where unused Not VPInstructions are considered
having only their first lane demanded, but some of their operands
providing a vector value due to other users.
Fixes https://github.com/llvm/llvm-project/issues/87410
PR: https://github.com/llvm/llvm-project/pull/87770
Instead of using ILV.useOrderedReductions during ::execute, instead
store the information at recipe construction.
Another step towards making recipe'::execute independent of legacy ILV.
This patch implements cloning for VPlans and recipes. Cloning is used in
the epilogue vectorization path, to clone the VPlan for the main vector
loop. This means we won't re-use a VPlan when executing the VPlan for
the epilogue vector loop, which in turn will enable us to perform
optimizations based on UF & VF.