llvm-project

Author	SHA1	Message	Date
Florian Hahn	81bbe19383	[VPlan] Add VPSingleDefRecipe::dump() to resolve ambiguous lookup (NFC). This allows calling ::dump() on various sub-classes of VPSingleDefRecipe directly, as it resolves an ambiguous name lookup. Previously, calling VPWidenRecipe::dump() (and others), would result in the following errors: llvm/unittests/Transforms/Vectorize/VPlanTest.cpp:1284:19: error: member 'dump' found in multiple base classes of different types 1284 | WidenR->dump(); | ^ llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:434:8: note: member found by ambiguous name lookup 434 | void dump() const; | ^ llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:108:8: note: member found by ambiguous name lookup 108 | void dump() const; | ^ 1 error generated.	2024-10-17 05:31:29 +01:00
Florian Hahn	3860e29e0e	[VPlan] Mark VPVectorPointerRecipe as not having sideeffects. VectorPointer doesn't read from memory or have any sideeffects. Mark it accordingly.	2024-10-16 06:10:19 +01:00
Florian Hahn	34cdd67c85	[VPlan] Use VPWidenIntrinsicRecipe to vp.select. (#110489 ) Use VPWidenIntrinsicRecipe (https://github.com/llvm/llvm-project/pull/110486) to create vp.select intrinsics. This potentially offers an alternative to duplicating EVL recipes for all existing recipes. There are some recipes that will need duplicates (at least at the moment), due to extra code-gen needs (e.g. widening loads and stores). But in cases the intrinsic can directly be used, creating the widened intrinsic directly would reduce the need to duplicate some recipes. PR: https://github.com/llvm/llvm-project/pull/110489	2024-10-15 21:48:15 +01:00
Florian Hahn	2a46e5d039	[VPlan] Implement VPInterleaveRecipe::computeCost. (#106067 ) Implement computing costs for VPInterleaveRecipe. PR: https://github.com/llvm/llvm-project/pull/106067	2024-10-15 20:50:28 +01:00
Elvis Wang	3c91a2f73e	[VPlan] Implement VPReductionRecipe::computeCost(). NFC (#107790 ) Implementation of `computeCost()` function for `VPReductionRecipe`. Note that `in-loop` and `any-of` reductions are not supported by VPlan-based cost model currently.	2024-10-15 15:44:37 +08:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Florian Hahn	fa3258ecb8	[VPlan] Sink retrieving legacy costs to more specific computeCost impls. (#109708 ) Make legacy cost retrieval independent of getInstructionForCost by sinking it to more specific ::computeCost implementation (specifically VPInterleaveRecipe::computeCost and VPSingleDefRecipe::computeCost). Inline getInstructionForCost to VPRecipeBase::cost(), as it is now only used to decide which recipes to skip during cost computation and when to apply forced costs. PR: https://github.com/llvm/llvm-project/pull/109708	2024-10-09 13:58:58 +01:00
Florian Hahn	01cbbc52dc	[VPlan] Request lane 0 for pointer arg in PtrAdd. After 7f74651, the pointer operand of a PtrAdd may be replicated. Instead of requesting a single scalar, request lane 0, which correctly handles the case when there is a scalar-per-lane. Fixes https://github.com/llvm/llvm-project/issues/111606.	2024-10-09 13:18:54 +01:00
Florian Hahn	6fbbe152fa	[VPlan] Introduce VPWidenIntrinsicRecipe to separate from libcall. (#110486 ) This patch splits off intrinsic handling to a new VPWidenIntrinsicRecipe. VPWidenIntrinsicRecipes only need access to the intrinsic ID to widen and the scalar result type (in case the intrinsic is overloaded on the result type). It does not need access to an underlying IR call instruction or function. This means VPWidenIntrinsicRecipe can be created easily without access to underlying IR.	2024-10-08 22:37:20 +01:00
Florian Hahn	36fc291b6e	[VPlan] Implement VPBlendRecipe::computeCost. Implement VPBlendRecipe::computeCost. VPBlendRecipe is currently also used if only the first lane is used. This also requires pre-computing costs for forced scalars and instructions considered profitable to scalarize. For those, the cost will be computed separately in the legacy cost model. This will also be needed when implementing VPReplicateRecipe::computeCost.	2024-10-08 21:33:42 +01:00
Florian Hahn	3829fd75c8	[VPlan] Remove redundant getVPSingleValue for VPSingleDefRecipes (NFC).	2024-10-08 20:31:41 +01:00
Florian Hahn	3ec6f805c5	[VPlan] Don't create GEP x, 0 for interleave group pointers. The GEP with offset 0 is redundant, remove it. This addresses a TODO from 7f74651837b (#106431).	2024-10-08 12:08:13 +01:00
Florian Hahn	7f74651837	[VPlan] Use pointer to member 0 as VPInterleaveRecipe's pointer arg. (#106431 ) Update VPInterleaveRecipe to always use the pointer to member 0 as pointer argument. This in many cases helps to remove unneeded index adjustments and simplifies VPInterleaveRecipe::execute. In some rare cases, the address of member 0 does not dominate the insert position of the interleave group. In those cases a PtrAdd VPInstruction is emitted to compute the address of member 0 based on the address of the insert position. Alternatively we could hoist the recipe computing the address of member 0.	2024-10-06 22:53:13 +01:00
Florian Hahn	68210c7c26	[VPlan] Only generate first lane for VPPredInstPHI if no others used. If only the first lane of the result is used, only generate the first lane. Fixes https://github.com/llvm/llvm-project/issues/111042.	2024-10-05 19:15:05 +01:00
Florian Hahn	0344123ffb	[VPlan] Manage FMFs for VPWidenCall via VPRecipeWithIRFlags. (NFC) Update VPWidenCallRecipe to manage fast-math flags directly via VPRecipeWithIRFlags. This addresses a TODO and allows adjusting the FMFs directly on the recipe. Also fixes printing for flags for VPWidenCallRecipe.	2024-10-01 13:20:34 +01:00
Mel Chen	f8373cb0f9	[LV] Reuse VPReplicateRecipe to handle scalar stores in exit block. (#106342 ) This patch separates the computation of the final reduction result and the intermediate stores of reduction. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-09-30 15:35:09 +08:00
Graham Hunter	6f1a8c2da2	[LV] Vectorize histogram operations (#99851 ) This patch implements autovectorization support for the 'all-in-one' histogram intrinsic, which seems to have more support than the 'standalone' intrinsic. See https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788/ for an overview of the work and my notes on the tradeoffs between the two approaches.	2024-09-27 13:08:55 +01:00
Youngsuk Kim	e177dd6fbb	[llvm] Replace uses of Type::getPointerTo() (NFC) (#110163 ) Replace uses of `Type::getPointerTo()` which is to be removed. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2024-09-26 16:38:50 -04:00
Florian Hahn	68ed1728bf	[VPlan] Unify mayWriteToMemory and mayHaveSideEffects logic for VPInst. Unify logic for mayWriteToMemory and mayHaveSideEffects for VPInstruction, with the latter relying on the former. Also extend to handle binary operators. Split off from https://github.com/llvm/llvm-project/pull/106441	2024-09-26 19:16:43 +01:00
Elvis Wang	a068b974b1	[VPlan] Implement VPWidenLoad/StoreEVLRecipe::computeCost(). (#109644 ) Currently the EVL recipes transfer the tail masking to the EVL. But in the legacy cost model, the mask exists and its instruction cost is calculated. To fix the difference between the VPlan-based cost model and the legacy cost model, we always calculate the instruction cost for the mask in the EVL recipes. Note that we should remove the mask cost in the EVL recipes when we don't need to compare to the legacy cost model. This patch also fixes #109468.	2024-09-26 07:10:25 +08:00
Florian Hahn	aae7ac6685	[VPlan] Remove VPIteration, update to use directly VPLane instead (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), only the lane part of VPIteration is used. Simplify the code by replacing remaining uses of VPIteration with VPLane directly.	2024-09-25 16:44:42 +01:00
Philip Reames	d288574363	[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824 ) This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704, and extends the costing API for compares and selects to provide information about the operands passed in an analogous manner. This allows us to model the cost of materializing the vector constant, as some select-of-constants are significantly more expensive than others when you account for the cost of materializing the constants involved. This is a stepping stone towards fixing https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch will be required to utilize the new API.	2024-09-25 07:25:57 -07:00
Alexey Bataev	60ed2361c0	[LV][EVL]Explicitly model AVL as sub, original TC, EVL_PHI. Patch explicitly models AVL as sub original TC, EVL_PHI instead of having it in EXPLICIT-VECTOR-LENGTH VPInstruction. Required for correct safe dependence distance support. Reviewers: fhahn, ayalz Reviewed By: ayalz Pull Request: https://github.com/llvm/llvm-project/pull/108869	2024-09-25 08:58:29 -04:00
Florian Hahn	3fbf6f8bb1	[LV] Remove more references of unrolled parts after 57f5d8f2fe. Continue to clean up some now stale references of unroll parts and related terminology as pointed out post-commit for 06c3a7d.	2024-09-24 15:50:31 +01:00
Florian Hahn	040bb37195	[VPlan] Fix incorrect argument for CreateBinOp after 06c3a7d2d764. 06c3a7d2d764 incorrectly updated CreateBinOp to pass the debug location, which gets interpreted as FPMath node. Remove the argument.	2024-09-24 11:18:50 +01:00
Florian Hahn	57f5d8f2fe	[VPlan] Only store single vector per VPValue in VPTransformState. (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), VPTransformState only stores a single vector value per VPValue. Simplify the code by replacing the SmallVector in PerPartOutput with a single Value * and rename to VPV2Vector for clarity. Also remove the redundant Part argument from various accessors.	2024-09-23 11:28:24 +01:00
Florian Hahn	06c3a7d2d7	[VPlan] Remove unneeded State.UF after 8ec406757cb92 (NFC). State.UF is not needed any longer after 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842). Clean it up, simplifying ::execute of existing recipes.	2024-09-22 20:42:37 +01:00
Florian Hahn	8ec406757c	[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842 ) This patch implements explicit unrolling by UF as VPlan transform. In follow up patches this will allow simplifying VPTransform state (no need to store unrolled parts) as well as recipe execution (no need to generate code for multiple parts in each recipe). It also allows for more general optimizations (e.g. avoid generating code for recipes that are uniform-across parts). It also unifies the logic dealing with unrolled parts in a single place, rather than spreading it out across multiple places (e.g. VPlan post processing for header-phi recipes previously.) In the initial implementation, a number of recipes still take the unrolled part as additional, optional argument, if their execution depends on the unrolled part. The computation for start/step values for scalable inductions changed slightly. Previously the step would be computed as scalar and then splatted, now vscale gets splatted and multiplied by the step in a vector mul. This has been split off https://github.com/llvm/llvm-project/pull/94339 which also includes changes to simplify VPTransformState and recipes' ::execute. The current version mostly leaves existing ::execute untouched and instead sets VPTransformState::UF to 1. A follow-up patch will clean up all references to VPTransformState::UF. Another follow-up patch will simplify VPTransformState to only store a single vector value per VPValue. PR: https://github.com/llvm/llvm-project/pull/95842	2024-09-21 19:47:37 +01:00
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
Florian Hahn	256100489d	[VPlan] Rename isDefinedOutside[Vector]Regions -> [Loop] (NFC) Clarify name of helper, split off from https://github.com/llvm/llvm-project/pull/95842/files#r1765556732.	2024-09-19 11:20:31 +01:00
Shih-Po Hung	ffcff2f465	[VPlan][NFC] Fix the value name of VECTOR_GEP (#107544 ) This patch passes the string `"vector.gep"` to CreateGEP instead of CreateMul.	2024-09-18 19:22:36 +08:00
LiqinWeng	a2994b2999	[LV][NFC] Unify printing for WidenEVLRecipe with other EVL recipes (#108177 )	2024-09-18 15:03:37 +08:00
Florian Hahn	f0c5caa814	[VPlan] Add VPIRInstruction, use for exit block live-outs. (#100735 ) Add a new VPIRInstruction recipe to wrap existing IR instructions not to be modified during execution, except for PHIs. For PHIs, a single VPValue operand is allowed, and it is used to add a new incoming value for the single predecessor VPBB. Except for PHIs, VPIRInstructions cannot have any operands. Depends on https://github.com/llvm/llvm-project/pull/100658. PR: https://github.com/llvm/llvm-project/pull/100735	2024-09-14 21:21:55 +01:00
Florian Hahn	a794ee4559	[VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (#95305 ) Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only generated if there are users of VF, to avoid unnecessary test changes. PR: https://github.com/llvm/llvm-project/pull/95305	2024-09-10 10:41:35 +01:00
Kazu Hirata	ce192b87b2	[Vectorize] Fix a warning This patch fixes: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:1278:12: error: unused variable 'Op0' [-Werror,-Wunused-variable]	2024-09-06 09:12:06 -07:00
Kolya Panchenko	00e40c9b5b	[LV] Support binary and unary operations with EVL-vectorization (#93854 ) The patch adds `VPWidenEVLRecipe` which represents `VPWidenRecipe` + EVL argument. The new recipe replaces `VPWidenRecipe` in `tryAddExplicitVectorLength` for each binary and unary operations. Follow up patches will extend support for remaining cases, like `FCmp` and `ICmp`	2024-09-06 11:41:36 -04:00
Philip Reames	3d9abfc9f8	Consolidate all IR logic for getting the identity value of a reduction [nfc] This change merges the three different places (at the IR layer) for finding the identity value of a reduction into a single copy. This depends on several prior commits which fix omissions and bugs in the distinct copies, but this patch itself should be fully non-functional. As the new comments and naming try to make clear, the identity value is a property of the @llvm.vector.reduce.* intrinsic, not of e.g. the recurrence descriptor. (We still provide an interface for clients using recurrence descriptors, but the implementation simply translates to the intrinsic which each corresponds to.) As a note, the getIntrinsicIdentity API does not support fminnum/fmaxnum or fminimum/fmaximum which is why we still need manual logic (but at least only one copy of manual logic) for those cases.	2024-09-04 08:23:21 -07:00
Elvis Wang	ed220e1571	[VPlan][NFC] Implement `VPWidenMemoryRecipe::computeCost()`. (#105614 ) In this patch, we implement the `computeCost()` function in `VPWidenMemoryRecipe`.	2024-09-04 09:46:02 +08:00
Philip Reames	3e8840ba71	Remove "Target" from createXReduction naming [nfc] Despite the stale comments, none of these actually use TTI, and they're solely generating standard LLVM IR.	2024-09-03 17:03:55 -07:00
Philip Reames	0b2f2537a5	[LV] Separate AnyOf recurrence from getRecurrenceIdentity [NFC] These recurrence types don't have a meaningful identity, and the routine was abused to return the start value instead. Out of the three callers to this routine, only one actually wants this behavior. This is a prep change for removing the routine entirely and commoning it with other copies of the same logic.	2024-09-03 09:46:30 -07:00
Florian Hahn	50a02e7c68	[VPlan] Pass intrinsic inst to TTI in VPWidenCallRecipe::computeCost. Follow-up to 9ccf825, adjust computeCost to also pass IntrinsicInst to TTI if available, as there are multiple places in TTI which use the IntrinsicInst. Fixes https://github.com/llvm/llvm-project/issues/107016.	2024-09-02 20:47:37 +01:00
Florian Hahn	b0de7fa466	[VPlan] Use op from underlying call in computeCost if needed. This fixes a divergence between legacy and VPlan-based cost model, e.g. if one of the operands has a first-order recurrence phi as operand.	2024-09-02 14:00:10 +01:00
Florian Hahn	9ccf82543d	[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). (#106047 ) Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments. PR: https://github.com/llvm/llvm-project/pull/106731	2024-09-01 16:26:08 +01:00
Philip Reames	c53008de89	[VPlan] Manually jumpthread a bit of reduction code for readability [nfc]	2024-08-30 12:46:49 -07:00
Ramkumar Ramachandra	71ede8d831	VPlan: factor out VPlanUtils into its own file (NFC) (#105857 )	2024-08-28 13:54:41 +01:00
Shao-Ce SUN	2f0d32692e	[NFC][VPlan] Trim extra spaces in `VPDerivedIVRecipe::print` during debugging (#106041 ) before: ``` EMIT vp<%3> = CANONICAL-INDUCTION ir<0>, vp<%8> vp<%4> = DERIVED-IV ir<%n> + vp<%3> * ir<-1> vp<%5> = SCALAR-STEPS vp<%4>, ir<-1> ``` after: ``` EMIT vp<%3> = CANONICAL-INDUCTION ir<0>, vp<%8> vp<%4> = DERIVED-IV ir<%n> + vp<%3> * ir<-1> vp<%5> = SCALAR-STEPS vp<%4>, ir<-1> ```	2024-08-26 21:23:05 +08:00
Florian Hahn	1fa6c99a09	[VPlan] Move EVL memory recipes to VPlanRecipes.cpp (NFC) Move VPWiden[Load|Store]EVLRecipe::execute to VPlanRecipes.cpp in line with other ::execute implementations that don't depend on anything defined in LoopVectorization.cpp	2024-08-22 18:30:49 +01:00
Paul Walker	4f075086e7	[LLVM][VPlan] Keep all VPBlend masks until VPlan transformation. (#104015 ) It's not possible to pick the best mask to remove when optimising VPBlend at construction and so this patch refactors the code to move the decision (and thus transformation) to VPlanTransforms. NOTE: This patch does not change the decision of which mask to pick. That will be done in a following PR to keep this patch as NFC from an output point of view.	2024-08-21 12:51:40 +01:00
Florian Hahn	99741ac285	[VPlan] Introduce explicit ExtractFromEnd recipes for live-outs. (#100658 ) Introduce explicit ExtractFromEnd recipes to extract the final values for live-outs instead of implicitly extracting in VPLiveOut::fixPhi. This is a follow-up to the recent changes of modeling extracts for recurrences and consolidates live-out extract creation for fixed-order recurrences at a single place: addLiveOutsForFirstOrderRecurrences. It is also in preparation of replacing VPLiveOut with VPIRInstructions wrapping the original scalar phis. PR: https://github.com/llvm/llvm-project/pull/100658	2024-08-21 10:06:44 +02:00
Florian Hahn	1aa8a6f691	[VPlan] Compute cost for most opcodes in VPWidenRecipe (NFCI). (#98764 ) Implement VPWidenRecipe::computeCost for most cases (except UDiv,SDiv,URem,SRem which require additional logic). Note that this specializes `::computeCost` instead of `::cost`, as `VPRecipeBase::cost` is responsible for skipping cost-computations for pre-computed recipes for now. The most recent version of the VPlan-based cost model introduction has been committed on Jul 10 (b841e2eca3b5c8b) and we should probably give it at least a week in case additional mismatches surface. PR: https://github.com/llvm/llvm-project/pull/98764	2024-08-16 21:20:23 +02:00


300 Commits