llvm-project

Author	SHA1	Message	Date
David Sherwood	4ed7bcb4a6	[VPlan][NFC] Add new getMiddleBlock interface to VPlan (#113558 ) This work is in preparation for PRs #112138 and #88385 where the middle block is not guaranteed to be the immediate successor to the region block. I've simply added new getMiddleBlock() interfaces to VPlan that for now just return cast<VPBasicBlock>(VectorRegion->getSingleSuccessor()). Once PR #112138 lands we'll need to do more work to discover the middle block.	2024-11-01 10:50:52 +00:00
Florian Hahn	b021464d35	[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975 ) Update VPlan to include the scalar loop header. This allows retiring VPLiveOut, as the remaining live-outs can now be handled by adding operands to the wrapped phis in the scalar loop header. Note that the current version only includes the scalar loop header, no other loop blocks and also does not wrap it in a region block. PR: https://github.com/llvm/llvm-project/pull/109975	2024-10-31 21:36:44 +01:00
Mel Chen	8420dbf2b9	[VPlan] Refine the constructor of VPWidenIntrinsicRecipe. nfc (#113890 ) Infers member MayReadFromMemory, MayWriteToMemory, and MayHaveSideEffects based on intrinsic attributes. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-10-30 12:22:28 +08:00
Florian Hahn	680901ed80	[VPlan] Implement VPHeaderPHIRecipe::computeCost. Fill out computeCost implementations for various header PHI recipes, matching the legacy cost model for now.	2024-10-29 21:04:31 +00:00
Shih-Po Hung	266ff98cba	[LV][VPlan] Use VF VPValue in VPVectorPointerRecipe (#110974 ) Refactors VPVectorPointerRecipe to use the VF VPValue to obtain the runtime VF, similar to #95305. Since only reverse vector pointers require the runtime VF, the patch sets VPUnrollPart::PartOpIndex to 1 for vector pointers and 2 for reverse vector pointers. As a result, the generation of reverse vector pointers is moved into a separate recipe.	2024-10-26 23:18:50 +08:00
Florian Hahn	ef217a0f6b	[VPlan] Introduce and use getVectorPreheader (NFC). Introduce a dedicated function to retrieve the vector preheader. This ensures the correct block is used, even if the skeleton is extended.	2024-10-23 21:01:52 -07:00
Kazu Hirata	38fca7b7db	[Vectorize] Simplify code with DenseMap::operator[] (NFC) (#113246 )	2024-10-21 21:35:38 -07:00
Elvis Wang	b3edc764f7	[VPlan] Implement VPWidenCastRecipe::computeCost(). (NFCI) (#111339 ) This patch implements `VPWidenCastRecipe::computeCost()` and skips cast recipes in the in-loop reduction.	2024-10-22 12:23:49 +08:00
Florian Hahn	1d9b3222f3	[VPlan] Implement VPWidenSelectRecipe::computeCost. Implement VPlan-based cost computation for VPWidenSelectRecipe.	2024-10-22 03:10:04 +01:00
Florian Hahn	b497010854	[VPlan] Use VPInstruction::Name when assigning names (NFCI). This slightly improves the printing of VPInstructions. NFC except debug output.	2024-10-18 05:52:35 +01:00
Florian Hahn	81bbe19383	[VPlan] Add VPSingleDefRecipe::dump() to resolve ambiguous lookup (NFC). This allows calling ::dump() on various sub-classes of VPSingleDefRecipe directly, as it resolves an ambiguous name lookup. Previously, calling VPWidenRecipe::dump() (and others), would result in the following errors: llvm/unittests/Transforms/Vectorize/VPlanTest.cpp:1284:19: error: member 'dump' found in multiple base classes of different types 1284 \| WidenR->dump(); \| ^ llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:434:8: note: member found by ambiguous name lookup 434 \| void dump() const; \| ^ llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:108:8: note: member found by ambiguous name lookup 108 \| void dump() const; \| ^ 1 error generated.	2024-10-17 05:31:29 +01:00
Florian Hahn	34cdd67c85	[VPlan] Use VPWidenIntrinsicRecipe to vp.select. (#110489 ) Use VPWidenIntrinsicRecipe (https://github.com/llvm/llvm-project/pull/110486) to create vp.select intrinsics. This potentially offers an alternative to duplicating EVL recipes for all existing recipes. There are some recipes that will need duplicates (at least at the moment), due to extra code-gen needs (e.g. widening loads and stores). But in cases the intrinsic can directly be used, creating the widened intrinsic directly would reduce the need to duplicate some recipes. PR: https://github.com/llvm/llvm-project/pull/110489	2024-10-15 21:48:15 +01:00
David Sherwood	175461a22a	[NFC][LoopVectorize] Make replaceVPBBWithIRVPBB more efficient (#111514 ) In replaceVPBBWithIRVPBB we spend time erasing and appending predecessors and successors from a list, when all we really have to do is replace the old with the new. Not only is this more efficient, but it also preserves the ordering of successors and predecessors. This is something which may become important for vectorising early exit loops (see PR #88385), since a VPIRInstruction is the wrapper for a live-out phi with extra operands that map to the incoming block according to the block's predecessor.	2024-10-15 14:11:55 +01:00
Elvis Wang	3c91a2f73e	[VPlan] Implement VPReductionRecipe::computeCost(). NFC (#107790 ) Implementation of `computeCost()` function for `VPReductionRecipe`. Note that `in-loop` and `any-of` reductions are not supported by VPlan-based cost model currently.	2024-10-15 15:44:37 +08:00
Florian Hahn	fa3258ecb8	[VPlan] Sink retrieving legacy costs to more specific computeCost impls. (#109708 ) Make legacy cost retrieval independent of getInstructionForCost by sinking it to more specific ::computeCost implementation (specifically VPInterleaveRecipe::computeCost and VPSingleDefRecipe::computeCost). Inline getInstructionForCost to VPRecipeBase::cost(), as it is now only used to decide which recipes to skip during cost computation and when to apply forced costs. PR: https://github.com/llvm/llvm-project/pull/109708	2024-10-09 13:58:58 +01:00
Florian Hahn	6fbbe152fa	[VPlan] Introduce VPWidenIntrinsicRecipe to separate from libcall. (#110486 ) This patch splits off intrinsic handling to a new VPWidenIntrinsicRecipe. VPWidenIntrinsicRecipes only need access to the intrinsic ID to widen and the scalar result type (in case the intrinsic is overloaded on the result type). It does not need access to an underlying IR call instruction or function. This means VPWidenIntrinsicRecipe can be created easily without access to underlying IR.	2024-10-08 22:37:20 +01:00
Florian Hahn	36fc291b6e	[VPlan] Implement VPBlendRecipe::computeCost. Implement VPBlendRecipe::computeCost. VPBlendRecipe is currently also used if only the first lane is used. This also requires pre-computing costs for forced scalars and instructions considered profitable to scalarize. For those, the cost will be computed separately in the legacy cost model. This will also be needed when implementing VPReplicateRecipe::computeCost.	2024-10-08 21:33:42 +01:00
Florian Hahn	7f74651837	[VPlan] Use pointer to member 0 as VPInterleaveRecipe's pointer arg. (#106431 ) Update VPInterleaveRecipe to always use the pointer to member 0 as pointer argument. This in many cases helps to remove unneeded index adjustments and simplifies VPInterleaveRecipe::execute. In some rare cases, the address of member 0 does not dominate the insert position of the interleave group. In those cases a PtrAdd VPInstruction is emitted to compute the address of member 0 based on the address of the insert position. Alternatively we could hoist the recipe computing the address of member 0.	2024-10-06 22:53:13 +01:00
Florian Hahn	0344123ffb	[VPlan] Manage FMFs for VPWidenCall via VPRecipeWithIRFlags. (NFC) Update VPWidenCallRecipe to manage fast-math flags directly via VPRecipeWithIRFlags. This addresses a TODO and allows adjusting the FMFs directly on the recipe. Also fixes printing for flags for VPWidenCallRecipe.	2024-10-01 13:20:34 +01:00
Florian Hahn	725eb6bb12	[VPlan] Move createVPIRBasicBlock helper to VPIRBasicBlock (NFC). Move the helper to VPIRBasicBlock to allow easier re-use outside VPlan.cpp	2024-09-30 22:12:09 +01:00
Graham Hunter	6f1a8c2da2	[LV] Vectorize histogram operations (#99851 ) This patch implements autovectorization support for the 'all-in-one' histogram intrinsic, which seems to have more support than the 'standalone' intrinsic. See https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788/ for an overview of the work and my notes on the tradeoffs between the two approaches.	2024-09-27 13:08:55 +01:00
Elvis Wang	a068b974b1	[VPlan] Implement VPWidenLoad/StoreEVLRecipe::computeCost(). (#109644 ) Currently the EVL recipes transfer the tail masking to the EVL. But in the legacy cost model, the mask exists and the instruction cost of the mask is calculated. To fix the difference between the VPlan-based cost model and the legacy cost model, we always calculate the instruction cost for the mask in the EVL recipes. Note that we should remove the mask cost in the EVL recipes when we don't need to compare to the legacy cost model. This patch also fixes #109468.	2024-09-26 07:10:25 +08:00
Florian Hahn	aae7ac6685	[VPlan] Remove VPIteration, update to use directly VPLane instead (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), only the lane part of VPIteration is used. Simplify the code by replacing remaining uses of VPIteration with VPLane directly.	2024-09-25 16:44:42 +01:00
Florian Hahn	3fbf6f8bb1	[LV] Remove more references of unrolled parts after 57f5d8f2fe. Continue to clean up some now stale references of unroll parts and related terminology as pointed out post-commit for 06c3a7d.	2024-09-24 15:50:31 +01:00
Florian Hahn	f76dae1586	[VPlan] Only store single scalar array per VPValue in VPTransState (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), VPTransformState only stores a single scalar vector per VPValue. Simplify the code by replacing the nested SmallVector in PerPartScalars with a single SmallVector and rename to VPV2Scalars for clarity.	2024-09-23 19:24:28 +01:00
Florian Hahn	57f5d8f2fe	[VPlan] Only store single vector per VPValue in VPTransformState. (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), VPTransformState only stores a single vector value per VPValue. Simplify the code by replacing the SmallVector in PerPartOutput with a single Value * and rename to VPV2Vector for clarity. Also remove the redundant Part argument from various accessors.	2024-09-23 11:28:24 +01:00
Florian Hahn	06c3a7d2d7	[VPlan] Remove unneeded State.UF after 8ec406757cb92 (NFC). State.UF is not needed any longer after 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842). Clean it up, simplifying ::execute of existing recipes.	2024-09-22 20:42:37 +01:00
Florian Hahn	8ec406757c	[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842 ) This patch implements explicit unrolling by UF as VPlan transform. In follow up patches this will allow simplifying VPTransformState (no need to store unrolled parts) as well as recipe execution (no need to generate code for multiple parts in each recipe). It also allows for more general optimizations (e.g. avoid generating code for recipes that are uniform across parts). It also unifies the logic dealing with unrolled parts in a single place, rather than spreading it out across multiple places (e.g. VPlan post processing for header-phi recipes previously.) In the initial implementation, a number of recipes still take the unrolled part as additional, optional argument, if their execution depends on the unrolled part. The computation for start/step values for scalable inductions changed slightly. Previously the step would be computed as scalar and then splatted, now vscale gets splatted and multiplied by the step in a vector mul. This has been split off https://github.com/llvm/llvm-project/pull/94339 which also includes changes to simplify VPTransformState and recipes' ::execute. The current version mostly leaves existing ::execute untouched and instead sets VPTransformState::UF to 1. A follow-up patch will clean up all references to VPTransformState::UF. Another follow-up patch will simplify VPTransformState to only store a single vector value per VPValue. PR: https://github.com/llvm/llvm-project/pull/95842	2024-09-21 19:47:37 +01:00
Florian Hahn	4eb9838409	[VPlan] Generalize VPValue::isDefinedOutsideLoopRegions. Update isDefinedOutsideLoopRegions to check if a recipe is defined outside any region. Split off already approved https://github.com/llvm/llvm-project/pull/95842 now that this can be tested separately after landing VPlan-based LICM https://github.com/llvm/llvm-project/issues/107501	2024-09-20 15:34:00 +01:00
Florian Hahn	256100489d	[VPlan] Rename isDefinedOutside[Vector]Regions -> [Loop] (NFC) Clarify name of helper, split off from https://github.com/llvm/llvm-project/pull/95842/files#r1765556732.	2024-09-19 11:20:31 +01:00
Florian Hahn	0d736e296c	[VPlan] Add getSCEVExprForVPValue util, use to get trip count SCEV (NFC) (#94464 ) Add a new getSCEVExprForVPValue utility which can be used to get a SCEV expression for a VPValue. The initial implementation only returns SCEVs for live-in IR values (by constructing a SCEV based on the live-in IR value) and VPExpandSCEVRecipe. This is enough to serve its first use, getting a SCEV for a VPlan's trip count, but will be extended in the future. It also removes createTripCountSCEV, as the new helper can be used to retrieve the SCEV from the VPlan. PR: https://github.com/llvm/llvm-project/pull/94464	2024-09-18 14:41:56 +01:00
David Sherwood	b29c5b66fd	[NFC][LoopVectorize] Don't pass LLVMContext to VPTypeAnalysis constructor (#108540 ) We already pass a Type object into the VPTypeAnalysis constructor, which can be used to obtain the context. While in the same area it also made sense to avoid passing the context into the VPTransformState and VPCostContext constructors.	2024-09-16 09:12:11 +01:00
Florian Hahn	f0c5caa814	[VPlan] Add VPIRInstruction, use for exit block live-outs. (#100735 ) Add a new VPIRInstruction recipe to wrap existing IR instructions not to be modified during execution, except for PHIs. For PHIs, a single VPValue operand is allowed, and it is used to add a new incoming value for the single predecessor VPBB. Except for PHIs, VPIRInstructions cannot have any operands. Depends on https://github.com/llvm/llvm-project/pull/100658. PR: https://github.com/llvm/llvm-project/pull/100735	2024-09-14 21:21:55 +01:00
Florian Hahn	a794ee4559	[VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (#95305 ) Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only generated if there are users of VF, to avoid unnecessary test changes. PR: https://github.com/llvm/llvm-project/pull/95305	2024-09-10 10:41:35 +01:00
Kolya Panchenko	00e40c9b5b	[LV] Support binary and unary operations with EVL-vectorization (#93854 ) The patch adds `VPWidenEVLRecipe` which represents `VPWidenRecipe` + EVL argument. The new recipe replaces `VPWidenRecipe` in `tryAddExplicitVectorLength` for each binary and unary operation. Follow-up patches will extend support for remaining cases, like `FCmp` and `ICmp`.	2024-09-06 11:41:36 -04:00
Elvis Wang	ed220e1571	[VPlan][NFC] Implement `VPWidenMemoryRecipe::computeCost()`. (#105614 ) In this patch, we implement the `computeCost()` function in `VPWidenMemoryRecipe`.	2024-09-04 09:46:02 +08:00
Florian Hahn	9ccf82543d	[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). (#106047 ) Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments. PR: https://github.com/llvm/llvm-project/pull/106731	2024-09-01 16:26:08 +01:00
Ramkumar Ramachandra	71ede8d831	VPlan: factor out VPlanUtils into its own file (NFC) (#105857 )	2024-08-28 13:54:41 +01:00
Paul Walker	6932f47cfd	[NFC][VPlan] Correct two typos in comments.	2024-08-22 12:17:50 +00:00
Paul Walker	4f075086e7	[LLVM][VPlan] Keep all VPBlend masks until VPlan transformation. (#104015 ) It's not possible to pick the best mask to remove when optimising VPBlend at construction and so this patch refactors the code to move the decision (and thus transformation) to VPlanTransforms. NOTE: This patch does not change the decision of which mask to pick. That will be done in a following PR to keep this patch as NFC from an output point of view.	2024-08-21 12:51:40 +01:00
Florian Hahn	1aa8a6f691	[VPlan] Compute cost for most opcodes in VPWidenRecipe (NFCI). (#98764 ) Implement VPWidenRecipe::computeCost for most cases (except UDiv,SDiv,URem,SRem which require additional logic). Note that this specializes `::computeCost` instead of `::cost`, as `VPRecipeBase::cost` is responsible for skipping cost-computations for pre-computed recipes for now. The most recent version of the VPlan-based cost model introduction has been committed on Jul 10 (b841e2eca3b5c8b) and we should probably give it at least a week in case additional mismatches surface. PR: https://github.com/llvm/llvm-project/pull/98764	2024-08-16 21:20:23 +02:00
Florian Hahn	5a42a677aa	[VPlan] Mark VPVectorPointer as only using the first part of the ptr. VPVectorPointerRecipe only uses the first part of the pointer operand, so mark it accordingly. Follow-up suggested as part of https://github.com/llvm/llvm-project/pull/99808.	2024-08-12 08:46:55 +01:00
Mel Chen	e3d9b01a36	[VPlan][NFC] Make VPValue pointer const. (#101334 )	2024-08-02 09:34:25 +08:00
Mel Chen	3834523f77	[LV][EVL] Refine the constructors of EVL recipe to use call by reference. NFC (#100088 )	2024-07-26 16:50:21 +08:00
Florian Hahn	a3092152ac	[VPlan] Don't create live-outs for induction increments. Follow up to fc9cd3272b5 to also skip creating live-outs for IV increments, as those are also generated independent of VPlan for now.	2024-07-25 21:34:55 +01:00
Florian Hahn	e6fdecd290	[VPlan] Drop references to Ingredient from VPWidenRecipe comments (NFC) VPWidenRecipe has been updated to use Opcode + operands instead of an Instruction 'ingredient'. Reword the comments.	2024-07-22 20:33:13 +01:00
Mel Chen	4eb30cfb34	[LV][EVL] Support in-loop reduction using tail folding with EVL. (#90184 ) Following from #87816, add VPReductionEVLRecipe to describe vector predication reduction. Address one of TODOs from #76172.	2024-07-16 16:15:24 +08:00
Graham Hunter	22a7f6dcc4	Revert "[LV] Autovectorization for the all-in-one histogram intrinsic" (#98493 ) Reverts llvm/llvm-project#91458 to deal with post-commit reviewer requests.	2024-07-11 16:39:30 +01:00
Florian Hahn	9a5a8731e7	[VPlan] Introduce ResumePhi VPInstruction, use to create phi for FOR. (#94760 ) This patch introduces a new ResumePhi VPInstruction which creates a phi in a leaf block of a VPlan. The first use is to create the phi node for fixed-order recurrence resume values in the scalar preheader. The VPInstruction takes 2 operands: 1) the incoming value from the middle-block and 2) a default value to be used for all other incoming blocks. In follow-up changes, it will also be used to create phis for reduction and induction resume values. Depends on https://github.com/llvm/llvm-project/pull/92651 PR: https://github.com/llvm/llvm-project/pull/94760	2024-07-11 16:08:04 +01:00
Graham Hunter	1860fd049e	[LV] Autovectorization for the all-in-one histogram intrinsic (#91458 ) This patch implements limited loop vectorization support for the 'all-in-one' histogram intrinsic. The feature is disabled by default, and when enabled will only vectorize if there are no other users of values in the gather-modify-scatter sequence.	2024-07-11 15:33:30 +01:00

461 Commits