llvm-project

Author	SHA1	Message	Date
Florian Hahn	320038579d	[VPlan] Return cost of PHI for scalar VFs in computeCost for FORs. This fixes a crash when the VF is scalar. Fixes https://github.com/llvm/llvm-project/issues/116375.	2024-11-21 21:11:21 +00:00
Finn Plummer	8663b8777e	[NFC][VectorUtils][TargetTransformInfo] Add `isVectorIntrinsicWithOverloadTypeAtArg` api (#114849 ) This changes allows target intrinsics to specify and overwrite overloaded types. - Updates `ReplaceWithVecLib` to not provide TTI as there most probably won't be a use-case - Updates `SLPVectorizer` to use available TTI - Updates `VPTransformState` to pass down TTI - Updates `VPlanRecipe` to use passed-down TTI This change will let us add scalarization for `asdouble`: #114847	2024-11-21 11:04:25 -08:00
Noah Goldstein	8af5ae0648	[VPlan] Preserve IR flags when widening casts We have `nneg` for both `sext` and `uitofp`. Fixes #114856 Closes #115373	2024-11-08 17:21:05 -06:00
Kazu Hirata	aa825b74af	[Vectorize] Remove unused includes (NFC) (#114643 ) Identified with misc-include-cleaner.	2024-11-03 08:58:51 -08:00
Florian Hahn	af6ebb70d2	[VPlan] Implement computeCost for remaining VPSingleDefRecipes. Provide computeCost implementations for all remaining sub-classes of VPSingleDefRecipe. This pushes one of the last uses of getLegacyCost directly to its user: VPReplicateRecipe.	2024-11-02 19:38:31 +00:00
Florian Hahn	b021464d35	[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975 ) Update VPlan to include the scalar loop header. This allows retiring VPLiveOut, as the remaining live-outs can now be handled by adding operands to the wrapped phis in the scalar loop header. Note that the current version only includes the scalar loop header, no other loop blocks and also does not wrap it in a region block. PR: https://github.com/llvm/llvm-project/pull/109975	2024-10-31 21:36:44 +01:00
Florian Hahn	680901ed80	[VPlan] Implement VPHeaderPHIRecipe::computeCost. Fill out computeCost implementations for various header PHI recipes, matching the legacy cost model for now.	2024-10-29 21:04:31 +00:00
Shih-Po Hung	266ff98cba	[LV][VPlan] Use VF VPValue in VPVectorPointerRecipe (#110974 ) Refactors VPVectorPointerRecipe to use the VF VPValue to obtain the runtime VF, similar to #95305. Since only reverse vector pointers require the runtime VF, the patch sets VPUnrollPart::PartOpIndex to 1 for vector pointers and 2 for reverse vector pointers. As a result, the generation of reverse vector pointers is moved into a separate recipe.	2024-10-26 23:18:50 +08:00
Florian Hahn	e724226da7	[VPlan] Return cost of 0 for VPWidenCastRecipe without underlying value. In some cases, VPWidenCastRecipes are created but not considered in the legacy cost model, including truncates/extends when evaluating a reduction in a smaller type. Return 0 for such casts for now, to avoid divergences between VPlan and legacy cost models. Fixes https://github.com/llvm/llvm-project/issues/113526.	2024-10-25 21:25:44 +02:00
Elvis Wang	b3edc764f7	[VPlan] Implement VPWidenCastRecipe::computeCost(). (NFCI) (#111339 ) This patch implement `VPWidenCastRecipe::computeCost()` and skip cast recipies in the in-loop reduction.	2024-10-22 12:23:49 +08:00
Florian Hahn	a4819bd46d	[VPlan] Restore case accidentially dropped in 34cdd67c85. 34cdd67c85 accidentially dropped the case for VPWidenIntrinsicSC. Add it back again. Thanks @Mel-Chen for spotting this.	2024-10-21 20:33:58 -07:00
Florian Hahn	1d9b3222f3	[VPlan] Implement VPWidenSelectRecipe::computeCost. Implement VPlan-based cost computation for VPWidenSelectRecipe.	2024-10-22 03:10:04 +01:00
Florian Hahn	2a6b09e0d3	[LV] Use type from InsertPos for cost computation of interleave groups. Previously the legacy cost model would pick the type for the cost computation depending on the order of the members in the input IR. This is incompatible with the VPlan-based cost model (independent of original IR order) and also doesn't match code-gen, which uses the type of the insert position. Update the legacy cost model to use the type (and address space) from the Group's insert position. This brings the legacy cost model in line with the legacy cost model and fixes a divergence between both models. Note that the X86 cost model seems to assign different costs to groups with i64 and double types. Added a TODO to check. Fixes https://github.com/llvm/llvm-project/issues/112922.	2024-10-18 19:12:40 -07:00
Alexey Bataev	f148d5791b	[LV]Initial support for safe distance in predicated DataWithEVL vectorization mode. Enabled initial support for max safe distance in DataWithEVL mode. If max safe distance is required, need to emit special code: CMP = icmp ult AVL, MAX_SAFE_DISTANCE SAFE_AVL = select CMP, AVL, MAX_SAFE_DISTANCE EVL = call i32 @llvm.experimental.get.vector.length(i64 SAFE_AVL) while vectorize the loop in DataWithEVL tail folding mode. Reviewers: fhahn Reviewed By: fhahn Pull Request: https://github.com/llvm/llvm-project/pull/102897	2024-10-18 15:51:49 -04:00
Florian Hahn	81bbe19383	[VPlan] Add VPSingleDefRecipe::dump() to resolve ambigous lookup (NFC). This allows calling ::dump() on various sub-classes of VPSingleDefRecipe directly, as it resolves an ambigous name lookup. Previously, calling VPWidenRecipe::dump() (and others), would result in the following errors: llvm/unittests/Transforms/Vectorize/VPlanTest.cpp:1284:19: error: member 'dump' found in multiple base classes of different types 1284 \| WidenR->dump(); \| ^ llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:434:8: note: member found by ambiguous name lookup 434 \| void dump() const; \| ^ llvm/include/../lib/Transforms/Vectorize/VPlanValue.h:108:8: note: member found by ambiguous name lookup 108 \| void dump() const; \| ^ 1 error generated.	2024-10-17 05:31:29 +01:00
Florian Hahn	3860e29e0e	[VPlan] Mark VPVectorPointerRecipe as not having sideeffects. VectorPointer doesn't read from memory or have any sideeffects. Mark it accordingly.	2024-10-16 06:10:19 +01:00
Florian Hahn	34cdd67c85	[VPlan] Use VPWidenIntrinsicRecipe to vp.select. (#110489 ) Use VPWidenIntrinsicRecipe (https://github.com/llvm/llvm-project/pull/110486) to create vp.select intrinsics. This potentially offers an alternative to duplicating EVL recipes for all existing recipes. There are some recipes that will need duplicates (at least at the moment), due to extra code-gen needs (e.g. widening loads and stores). But in cases the intrinsic can directly be used, creating the widened intrinsic directly would reduce the need to duplicate some recipes. PR: https://github.com/llvm/llvm-project/pull/110489	2024-10-15 21:48:15 +01:00
Florian Hahn	2a46e5d039	[VPlan] Implement VPInterleaveRecipe::computeCost. (#106067 ) Implement computing costs for VPInterleaveRecipe. PR: https://github.com/llvm/llvm-project/pull/106067	2024-10-15 20:50:28 +01:00
Elvis Wang	3c91a2f73e	[VPlan] Implement VPReductionRecipe::computeCost(). NFC (#107790 ) Implementation of `computeCost()` function for `VPReductionRecipe`. Note that `in-loop` and `any-of` reductions are not supported by VPlan-based cost model currently.	2024-10-15 15:44:37 +08:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Florian Hahn	fa3258ecb8	[VPlan] Sink retrieving legacy costs to more specific computeCost impls. (#109708 ) Make legacy cost retrieval independent of getInstructionForCost by sinking it to more specific ::computeCost implementation (specifically VPInterleaveRecipe::computeCost and VPSingleDefRecipe::computeCost). Inline getInstructionForCost to VPRecipeBase::cost(), as it is now only used to decide which recipes to skip during cost computation and when to apply forced costs. PR: https://github.com/llvm/llvm-project/pull/109708	2024-10-09 13:58:58 +01:00
Florian Hahn	01cbbc52dc	[VPlan] Request lane 0 for pointer arg in PtrAdd. After 7f74651, the pointer operand may be replicated of a PtrAdd. Instead of requesting a single scalar, request lane 0, which correctly handles the case when there is a scalar-per-lane. Fixes https://github.com/llvm/llvm-project/issues/111606.	2024-10-09 13:18:54 +01:00
Florian Hahn	6fbbe152fa	[VPlan] Introduce VPWidenIntrinsicRecipe to separate from libcall. (#110486 ) This patch splits off intrinsic hanlding to a new VPWidenIntrinsicRecipe. VPWidenIntrinsicRecipes only need access to the intrinsic ID to widen and the scalar result type (in case the intrinsic is overloaded on the result type). It does not need access to an underlying IR call instruction or function. This means VPWidenIntrinsicRecipe can be created easily without access to underlying IR.	2024-10-08 22:37:20 +01:00
Florian Hahn	36fc291b6e	[VPlan] Implement VPBlendRecipe::computeCost. Implement VPBlendRecipe::computeCost. VPBlendRecipe is currently is also used if only the first lane is used. This also requires pre-computing costs for forced scalars and instructions considered profitable to scalarize. For those, the cost will be computed separately in the legacy cost model. This will also be needed when implementing VPReplicateRecipe::computeCost.	2024-10-08 21:33:42 +01:00
Florian Hahn	3829fd75c8	[VPlan] Remove redundant getVPSingleValue for VPSingleDefRecipes (NFC).	2024-10-08 20:31:41 +01:00
Florian Hahn	3ec6f805c5	[VPlan] Don't created GEP x, 0 for interleave group pointers. The GEP with offet 0 is redundant, remove it. This addresses a TODO from 7f74651837b ((#106431).	2024-10-08 12:08:13 +01:00
Florian Hahn	7f74651837	[VPlan] Use pointer to member 0 as VPInterleaveRecipe's pointer arg. (#106431 ) Update VPInterleaveRecipe to always use the pointer to member 0 as pointer argument. This in many cases helps to remove unneeded index adjustments and simplifies VPInterleaveRecipe::execute. In some rare cases, the address of member 0 does not dominate the insert position of the interleave group. In those cases a PtrAdd VPInstruction is emitted to compute the address of member 0 based on the address of the insert position. Alternatively we could hoist the recipe computing the address of member 0.	2024-10-06 22:53:13 +01:00
Florian Hahn	68210c7c26	[VPlan] Only generate first lane for VPPredInstPHI if no others used. IF only the first lane of the result is used, only generate the first lane. Fixes https://github.com/llvm/llvm-project/issues/111042.	2024-10-05 19:15:05 +01:00
Florian Hahn	0344123ffb	[VPlan] Manage FMFs for VPWidenCall via VPRecipeWithIRFlags. (NFC) Update VPWidenCallRecipe to manage fast-math flags directly via VPRecipeWithIRFlags. This addresses a TODO and allows adjusting the FMFs directly on the recipe. Also fixes printing for flags for VPWidenCallRecipe.	2024-10-01 13:20:34 +01:00
Mel Chen	f8373cb0f9	[LV] Reuse VPReplicateRecipe to handle scalar stores in exit block. (#106342 ) This patch separates the computation of the final reduction result and the intermediate stores of reduction. --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-09-30 15:35:09 +08:00
Graham Hunter	6f1a8c2da2	[LV] Vectorize histogram operations (#99851 ) This patch implements autovectorization support for the 'all-in-one' histogram intrinsic, which seems to have more support than the 'standalone' intrinsic. See https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788/ for an overview of the work and my notes on the tradeoffs between the two approaches.	2024-09-27 13:08:55 +01:00
Youngsuk Kim	e177dd6fbb	[llvm] Replace uses of Type::getPointerTo() (NFC) (#110163 ) Replace uses of `Type::getPointerTo()` which is to be removed. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2024-09-26 16:38:50 -04:00
Florian Hahn	68ed1728bf	[VPlan] Unify mayWriteToMemory and mayHaveSideEffects logic for VPInst. Unify logic for mayWriteToMemory and mayHaveSideEffects for VPInstruction, with the later relying on the former. Also extend to handle binary operators. Split off from https://github.com/llvm/llvm-project/pull/106441	2024-09-26 19:16:43 +01:00
Elvis Wang	a068b974b1	[VPlan] Implement VPWidenLoad/StoreEVLRecipe::computeCost(). (#109644 ) Currently the EVL recipes transfer the tail masking to the EVL. But in the legacy cost model, the mask exist and will calculate the instruction cost of the mask. To fix the difference between the VPlan-based cost model and the legacy cost model, we always calculate the instruction cost for the mask in the EVL recipes. Note that we should remove the mask cost in the EVL recipes when we don't need to compare to the legacy cost model. This patch also fixes #109468.	2024-09-26 07:10:25 +08:00
Florian Hahn	aae7ac6685	[VPlan] Remove VPIteration, update to use directly VPLane instead (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), only the lane part of VPIteration is used. Simplify the code by replacing remaining uses of VPIteration with VPLane directly.	2024-09-25 16:44:42 +01:00
Philip Reames	d288574363	[TTI][RISCV] Model cost of loading constants arms of selects and compares (#109824 ) This follows in the spirit of 7d82c99403f615f6236334e698720bf979959704, and extends the costing API for compares and selects to provide information about the operands passed in an analogous manner. This allows us to model the cost of materializing the vector constant, as some select-of-constants are significantly more expensive than others when you account for the cost of materializing the constants involved. This is a stepping stone towards fixing https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch will be required to utilize the new API.	2024-09-25 07:25:57 -07:00
Alexey Bataev	60ed2361c0	[LV][EVL]Explicitly model AVL as sub, original TC, EVL_PHI. Patch explicitly models AVL as sub original TC, EVL_PHI instead of having it in EXPLICIT-VECTOR-LENGTH VPInstruction. Required for correct safe dependence distance suport. Reviewers: fhahn, ayalz Reviewed By: ayalz Pull Request: https://github.com/llvm/llvm-project/pull/108869	2024-09-25 08:58:29 -04:00
Florian Hahn	3fbf6f8bb1	[LV] Remove more references of unrolled parts after 57f5d8f2fe. Continue to clean up some now stale references of unroll parts and related terminology as pointed out post-commit for 06c3a7d.	2024-09-24 15:50:31 +01:00
Florian Hahn	040bb37195	[VPlan] Fix incorrect argument for CreateBinOp after 06c3a7d2d764. 06c3a7d2d764 incorrectly updated CreateBinOp to pass the debug location, which gets interpreted as FPMath node. Remove the argument.	2024-09-24 11:18:50 +01:00
Florian Hahn	57f5d8f2fe	[VPlan] Only store single vector per VPValue in VPTransformState. (NFC) After 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842), VPTransformState only stores a single vector value per VPValue. Simplify the code by replacing the SmallVector in PerPartOutput with a single Value * and rename to VPV2Vector for clarity. Also remove the redundant Part argument from various accessors.	2024-09-23 11:28:24 +01:00
Florian Hahn	06c3a7d2d7	[VPlan] Remove unneeded State.UF after 8ec406757cb92 (NFC). State.UF is not needed any longer after 8ec406757cb92 (https://github.com/llvm/llvm-project/pull/95842). Clean it up, simplifying ::execute of existing recipes.	2024-09-22 20:42:37 +01:00
Florian Hahn	8ec406757c	[VPlan] Implement unrolling as VPlan-to-VPlan transform. (#95842 ) This patch implements explicit unrolling by UF as VPlan transform. In follow up patches this will allow simplifying VPTransform state (no need to store unrolled parts) as well as recipe execution (no need to generate code for multiple parts in an each recipe). It also allows for more general optimziations (e.g. avoid generating code for recipes that are uniform-across parts). It also unifies the logic dealing with unrolled parts in a single place, rather than spreading it out across multiple places (e.g. VPlan post processing for header-phi recipes previously.) In the initial implementation, a number of recipes still take the unrolled part as additional, optional argument, if their execution depends on the unrolled part. The computation for start/step values for scalable inductions changed slightly. Previously the step would be computed as scalar and then splatted, now vscale gets splatted and multiplied by the step in a vector mul. This has been split off https://github.com/llvm/llvm-project/pull/94339 which also includes changes to simplify VPTransfomState and recipes' ::execute. The current version mostly leaves existing ::execute untouched and instead sets VPTransfomState::UF to 1. A follow-up patch will clean up all references to VPTransformState::UF. Another follow-up patch will simplify VPTransformState to only store a single vector value per VPValue. PR: https://github.com/llvm/llvm-project/pull/95842	2024-09-21 19:47:37 +01:00
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
Florian Hahn	256100489d	[VPlan] Rename isDefinedOutside[Vector]Regions -> [Loop] (NFC) Clarify name of helper, split off from https://github.com/llvm/llvm-project/pull/95842/files#r1765556732.	2024-09-19 11:20:31 +01:00
Shih-Po Hung	ffcff2f465	[VPlan][NFC] Fix the value name of VECTOR_GEP (#107544 ) This patch passes the string `"vector.gep"` to CreateGEP instead of CreateMul.	2024-09-18 19:22:36 +08:00
LiqinWeng	a2994b2999	[LV][NFC] Unify printing for WidenEVLReicpe with other EVL recipes (#108177 )	2024-09-18 15:03:37 +08:00
Florian Hahn	f0c5caa814	[VPlan] Add VPIRInstruction, use for exit block live-outs. (#100735 ) Add a new VPIRInstruction recipe to wrap existing IR instructions not to be modified during execution, execept for PHIs. For PHIs, a single VPValue operand is allowed, and it is used to add a new incoming value for the single predecessor VPBB. Expect PHIs, VPIRInstructions cannot have any operands. Depends on https://github.com/llvm/llvm-project/pull/100658. PR: https://github.com/llvm/llvm-project/pull/100735	2024-09-14 21:21:55 +01:00
Florian Hahn	a794ee4559	[VPlan] Add VPValue for VF, use it for VPWidenIntOrFpInductionRecipe. (#95305 ) Similar to VFxUF, also add a VF VPValue to VPlan and use it to get the runtime VF in VPWidenIntOrFpInductionRecipe. Code for VF is only generated if there are users of VF, to avoid unnecessary test changes. PR: https://github.com/llvm/llvm-project/pull/95305	2024-09-10 10:41:35 +01:00
Kazu Hirata	ce192b87b2	[Vectorize] Fix a warning This patch fixes: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:1278:12: error: unused variable 'Op0' [-Werror,-Wunused-variable]	2024-09-06 09:12:06 -07:00
Kolya Panchenko	00e40c9b5b	[LV] Support binary and unary operations with EVL-vectorization (#93854 ) The patch adds `VPWidenEVLRecipe` which represents `VPWidenRecipe` + EVL argument. The new recipe replaces `VPWidenRecipe` in `tryAddExplicitVectorLength` for each binary and unary operations. Follow up patches will extend support for remaining cases, like `FCmp` and `ICmp`	2024-09-06 11:41:36 -04:00

1 2 3 4 5

214 Commits