llvm-project

Author	SHA1	Message	Date
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Florian Hahn	ab9c2b1c54	[VPlan] Restructure code for BranchOnCond codegen. (NFCI) Reorder code to exit early if the BranchOnCond isn't in an exiting block. This delays retrieving the parent region, which may not be present. Split off from https://github.com/llvm/llvm-project/pull/92651.	2024-06-23 20:11:37 +01:00
Florian Hahn	f1f3c34b47	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 242cc200ccb24e22eaf54aed7b0b0c84cfc54c0b and eea150c84053035163f307b46549a2997a343ce9, as it is causing a build bot failure and there have been a number of crashes reported at https://github.com/llvm/llvm-project/pull/92555	2024-06-21 19:54:21 +01:00
Florian Hahn	242cc200cc	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 6f538f6a2d3224efda985e9eb09012fa4275ea92. Extra tests for crashes discovered when building Chromium have been added in fb86cb7ec157689e, 3be7312f81ad2. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-20 17:32:52 +01:00
Florian Hahn	40a72f8cc4	[VPlan] Support extracting any lane of uniform value. If the value we are extracting a lane from is uniform, only the first lane will be set. Return lane 0 for any requested lane. This fixes a crash when trying to extract the last lane for a first-order recurrence resume value. Fixes https://github.com/llvm/llvm-project/issues/95520.	2024-06-14 22:16:52 +01:00
Arthur Eubanks	6f538f6a2d	Revert "Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )"" This reverts commit 90fd99c0795711e1cf762a02b29b0a702f86a264. This reverts commit 43e6f46936e177e47de6627a74b047ba27561b44. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-14 17:47:08 +00:00
Florian Hahn	90fd99c079	Recommit "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 46080abe9b136821eda2a1a27d8a13ceac349f8c. Extra tests have been added in 52d29eb287. Original message: This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-14 12:33:48 +01:00
Arthur Eubanks	46080abe9b	Revert "[VPlan] First step towards VPlan cost modeling. (#92555 )" This reverts commit 00798354c553d48d27006a2b06a904bd6013e31b. Causes crashes, see comments on https://github.com/llvm/llvm-project/pull/92555.	2024-06-13 16:37:21 +00:00
Florian Hahn	00798354c5	[VPlan] First step towards VPlan cost modeling. (#92555 ) This adds a new interface to compute the cost of recipes, VPBasicBlocks, VPRegionBlocks and VPlan, initially falling back to the legacy cost model for all recipes. Follow-up patches will gradually migrate recipes to compute their own costs step-by-step. It also adds getBestPlan function to LVP which computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. The VPlan selected by the VPlan cost model is executed and there is an assert to catch cases where the VPlan cost model and the legacy cost model disagree. Even though I checked a number of different build configurations on AArch64 and X86, there may be some differences that have been missed. Additional discussions and context can be found in @arcbbb's https://github.com/llvm/llvm-project/pull/67647 and https://github.com/llvm/llvm-project/pull/67934 which is an earlier version of the current PR. PR: https://github.com/llvm/llvm-project/pull/92555	2024-06-13 14:26:18 +01:00
Florian Hahn	2f4ebf8545	[VPlan] Handle more cases in VPInstruction::onlyFirstPartUsed. Handle binary ops and a few other instructions in onlyFirstPartUsed; they only use the first part if they themselves only have their first part used.	2024-06-09 13:19:44 +01:00
Florian Hahn	998c33e5fc	[VPlan] Mark FirstOrderRecurrenceSplice as not having side-effects. Now that FOR exit and resume value creation is explicitly modeled in VPlan (05e1b5340b0caf1, 07b330132c0b) it doesn't depend on the first order recurrence splice being preserved and it can now be marked as not having side-effects. This allows removal of first-order-recurrence-splice if the FOR is only used in the exit or as scalar ph resume value.	2024-06-08 21:40:30 +01:00
Florian Hahn	a43d999d14	[VPlan] Check if only first part is used for all per-part VPInsts. Apply the onlyFirstPartUsed logic generally to all per-part VPInstructions. Note that the test changes remove the second part of an unused first-order recurrence splice.	2024-06-08 20:31:54 +01:00
Florian Hahn	07b330132c	[VPlan] Model FOR extract of exit value in VPlan. (#93395 ) This patch introduces a new ExtractFromEnd VPInstruction opcode to extract the value of a FOR for users outside the loop (i.e. in the scalar loop's exits). This moves the first part of fixing first order recurrences to VPlan, and removes some additional code to patch up live-outs, which is now handled automatically. The majority of test changes is due to changes in the order of which the extracts are generated now. As we are now using VPTransformState to generate the extracts, we may be able to re-use existing extracts in the loop body in some cases. For scalable vectors, in some cases we now have to compute the runtime VF twice, as each extract is now independent, but those should be trivial to clean up for later passes (and in line with other places in the code that also liberally re-compute runtime VFs). PR: https://github.com/llvm/llvm-project/pull/93395	2024-06-03 20:20:30 +01:00
Florian Hahn	d187005cad	[VPlan] Update VPBlendRecipe codegen for first-lane only. Update VPBlendRecipe::execute to support generating code for first-lane only. This fixes a crash in the newly added test @test_not_first_lane_only_wide_compare_incoming_order_swapped.	2024-05-15 11:00:15 +01:00
Florian Hahn	632317e9ab	[VPlan] Add non-poison propagating LogicalAnd VPInstruction opcode. (#91897 ) Add a new opcode to model non-poison propagating logical AND operations used when generating edge masks. This follows the similar decision to model Not as dedicated opcode as well, to improve clarity. This also helps to simplify the matchers for https://github.com/llvm/llvm-project/pull/89386. PR: https://github.com/llvm/llvm-project/pull/91897	2024-05-14 09:42:49 +01:00
Florian Hahn	bccb7ed8ac	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit c6e01627acf859. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in bce3bfced5fe0b019 and an assertion has been added in c7209cbb8be7a3c65813. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-05-03 14:40:49 +01:00
Florian Hahn	a48ebb8276	[VPlan] Check type directly in ::isCanonical (NFC). Directly check the type of the wide induction matches the canonical induction. Refactor suggested in and in preparation for https://github.com/llvm/llvm-project/pull/89603	2024-05-03 13:12:33 +01:00
Florian Hahn	e846778e52	[VPlan] Make CallInst optional for VPWidenCallRecipe (NFCI). Replace relying on the underlying CallInst for looking up the called function and its types by instead adding the called function as operand, in line with how called functions are handled in CallInst. Operand bundles, metadata and fast-math flags are optionally used if there's an underlying CallInst. This enables creating VPWidenCallRecipes without requiring an underlying IR instruction.	2024-05-01 20:48:22 +01:00
Florian Hahn	e2a72fa583	[VPlan] Introduce recipes for VP loads and stores. (#87816 ) Introduce new subclasses of VPWidenMemoryRecipe for VP (vector-predicated) loads and stores to address multiple TODOs from https://github.com/llvm/llvm-project/pull/76172 Note that the introduction of the new recipes also improves code-gen for VP gather/scatters by removing the redundant header mask. With the new approach, it is not sufficient to look at users of the widened canonical IV to find all uses of the header mask. In some cases, a widened IV is used instead of separately widening the canonical IV. To handle that, first collect all VPValues representing header masks (by looking at users of both the canonical IV and widened inductions that are canonical) and then checking all users (recursively) of those header masks. Depends on https://github.com/llvm/llvm-project/pull/87411. PR: https://github.com/llvm/llvm-project/pull/87816	2024-04-19 09:44:23 +01:00
Florian Hahn	a9bafe91dd	[VPlan] Split VPWidenMemoryInstructionRecipe (NFCI). (#87411 ) This patch introduces a new VPWidenMemoryRecipe base class and distinct sub-classes to model loads and stores. This is a first step in an effort to simplify and modularize code generation for widened loads and stores and enable adding further more specialized memory recipes. PR: https://github.com/llvm/llvm-project/pull/87411	2024-04-17 11:00:58 +01:00
Arthur Eubanks	c6e01627ac	Revert "Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )"" This reverts commit c6e38b928c56f562aea68a8e90f02dbdf0eada85. Causes miscompiles, see comments on #78304.	2024-04-16 20:40:21 +00:00
Florian Hahn	c836983671	[VPlan] Remove unused first mask op from VPBlendRecipe. (#87770 ) VPBlendRecipe does not use the first mask operand. Removing it allows VPlan-based DCE to remove unused mask computations. This also fixes #87410, where unused Not VPInstructions are considered having only their first lane demanded, but some of their operands providing a vector value due to other users. Fixes https://github.com/llvm/llvm-project/issues/87410 PR: https://github.com/llvm/llvm-project/pull/87770	2024-04-09 11:14:05 +01:00
Florian Hahn	15d11a4de9	[VPlan] Track IsOrdered in VPReductionRecipe, remove use of ILV (NFCI). Instead of using ILV.useOrderedReductions during ::execute, store the information at recipe construction. Another step towards making recipes' ::execute independent of legacy ILV.	2024-04-07 20:33:22 +01:00
Florian Hahn	c6e38b928c	Reapply "[LV] Improve AnyOf reduction codegen. (#78304 )" This reverts the revert commit 589c7abb03448. This patch includes a fix for any-of reductions and epilogue vectorization. Extra test coverage for the issue that caused the revert has been added in 399ff08e29d. -------------------------------- Original commit message: Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-04-05 13:45:13 +01:00
Alexey Bataev	413a66f339	[LV, VP]VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172 ) This patch introduces generating VP intrinsics in the Loop Vectorizer. Currently the Loop Vectorizer supports vector predication in a very limited capacity via tail-folding and masked load/store/gather/scatter intrinsics. However, this does not let architectures with active vector length predication support take advantage of their capabilities. Architectures with general masked predication support also can only take advantage of predication on memory operations. By having a way for the Loop Vectorizer to generate Vector Predication intrinsics, which (will) provide a target-independent way to model predicated vector instructions, these architectures can make better use of their predication capabilities. Our first approach (implemented in this patch) builds on top of the existing tail-folding mechanism in the LV (just adds a new tail-folding mode using EVL), but instead of generating masked intrinsics for memory operations it generates VP intrinsics for load/store instructions. The patch adds a new VPlanTransforms to replace the wide header predicate compare with EVL and updates codegen for load/stores to use VP store/load with EVL. Another important part of this approach is how the Explicit Vector Length is computed. (VP intrinsics define this vector length parameter as Explicit Vector Length (EVL)). We use an experimental intrinsic `get_vector_length`, that can be lowered to architecture specific instruction(s) to compute EVL. Also, added a new recipe to emit instructions for computing EVL. Using VPlan in this way will eventually help build and compare VPlans corresponding to different strategies and alternatives. Differential Revision: https://reviews.llvm.org/D99750	2024-04-04 18:30:17 -04:00
Florian Hahn	16da9d5351	[VPlan] Remove redundant set of debug loc in VPInstruction (NFCI). Consistently use setDebugLocFrom and remove redundant setDebugLocFrom.	2024-04-02 10:43:34 +01:00
Florian Hahn	06bb8c9f20	[VPlan] Explicitly handle scalar pointer inductions. (#83068 ) Add a new PtrAdd opcode to VPInstruction that corresponds to IRBuilder::CreatePtrAdd, which creates a GEP with source element type i8. This is then used to model scalarizing VPWidenPointerInductionRecipe by introducing scalar-steps to model the index increment followed by a PtrAdd. Note that PtrAdd needs to be able to generate code for only the first lane or for all lanes. This may warrant introducing a separate recipe for scalarizing that can be created without relying on the underlying IR. Depends on https://github.com/llvm/llvm-project/pull/80271 PR: https://github.com/llvm/llvm-project/pull/83068	2024-03-26 16:01:57 +01:00
Florian Hahn	f0a8738401	[VPlan] Generate CalculateTripCountMinusVF for Part 0 only. (NFCI). The value produced by CalculateTripCountMinusVF VPInstructions is independent of the part. Only compute it for part 0 and use that for other parts.	2024-03-24 20:59:54 +00:00
Kirill Stoimenov	589c7abb03	Revert "[LV] Improve AnyOf reduction codegen. (#78304 )" Broke sanitizer bots: https://lab.llvm.org/buildbot/#/builders/74/builds/26697 This reverts commit 95fef1dfefd5467206e74c089d29806fcd82889b.	2024-03-14 14:57:01 +00:00
Florian Hahn	95fef1dfef	[LV] Improve AnyOf reduction codegen. (#78304 ) Update AnyOf reduction code generation to only keep track of the AnyOf property in a boolean vector in the loop, only selecting either the new or start value in the middle block. The patch incorporates feedback from https://reviews.llvm.org/D153697. This fixes the #62565, as now there aren't multiple uses of the start/new values. Fixes https://github.com/llvm/llvm-project/issues/62565 PR: https://github.com/llvm/llvm-project/pull/78304	2024-03-14 11:22:06 +00:00
Florian Hahn	9277a32305	[VPlan] Funnel recipe insert* through VPBasicBlock::insert (NFCI). This allows relying on VPBasicBlock::insert to make sure insertion is well formed, i.e. by updating the recipe's parent as well as other potential invariants in the future.	2024-03-11 10:56:40 +00:00
Cameron McInally	012d217174	[LV] Use scalar CMP for active-lane-mask with scalar VF (#83902 ) Instead of generating a <1 x i1> active lane mask intrinsic, generate the equivalent scalar ICMP instead. This allows us to avoid unnecessarily extracting the scalar part from the vector mask. Fixes llvm#73894.	2024-03-06 15:59:35 -05:00
Florian Hahn	911055e34f	[VPlan] Consistently use (Part, 0) for first lane scalar values (#80271 ) At the moment, some VPInstructions create only a single scalar value, but use VPTransformState's 'vector' storage for this value. Those values are effectively uniform-per-VF (or in some cases uniform-across-VF-and-UF). Using the vector/per-part storage doesn't interact well with other recipes that more accurately use (Part, Lane) to look up scalar values, and it prevents VPInstructions creating scalars from interacting with other recipes working with scalars. This PR tries to unify handling of scalars by using (Part, 0) for scalar values where only the first lane is demanded. This allows using VPInstructions with other recipes like VPScalarCastRecipe and is also needed when using VPInstructions in more cases outside the vector loop region to generate scalars. Depends on https://github.com/llvm/llvm-project/pull/80269	2024-02-26 19:06:43 +00:00
Florian Hahn	cd160a6e98	[VPlan] Do not add call results with void type to State (NFC). With vector libraries, we may vectorize calls with void return types. Do not add those values to the state; they can never be accessed.	2024-02-21 20:36:17 +00:00
Florian Hahn	47abbf4fe9	[VPlan] Update VPInst::onlyFirstLaneUsed to check users. (#80269 ) A VPInstruction only has its first lane used if all users use its first lane only. Use vputils::onlyFirstLaneUsed to continue checking the recipe's users to handle more cases. Besides allowing additional introduction of scalar steps when interleaving in some cases, this also enables using an Add VPInstruction to model the increment - as a follow up.	2024-02-03 16:19:10 +00:00
Florian Hahn	2906f3626b	[VPlan] Update ::onlyScalarsGenerated to take IsScalable bool (NFCI). Instead of passing in a full VF, just pass IsScalable as bool.	2024-02-03 14:51:14 +00:00
Graham Hunter	d4c0171423	[LV] Fix handling of interleaving linear args (#78725 ) Currently when interleaving vector calls with linear arguments, the Part is ignored and all vector calls use the initial value from the first lane of the current iteration. Fix this to extract from the correct part of the linear vector.	2024-01-26 11:30:35 +00:00
Florian Hahn	0ab539fd67	[VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (#78113 ) Add a new recipe to model scalar cast instructions, without relying on an underlying instruction. This allows creating scalar casts, without relying on an underlying instruction (like the current VPReplicateRecipe). The new recipe is used to explicitly model both truncating the induction step and the VPDerivedIVRecipe, thus simplifying both the recipe and code needed to introduce it. Truncating VPWidenIntOrFpInductionRecipes should also be modeled using the new recipe, as follow-up. PR: https://github.com/llvm/llvm-project/pull/78113	2024-01-26 11:13:05 +00:00
Florian Hahn	42fb1fac9e	[VPlan] Use DebugLoc from recipe in VPWidenCallRecipe (NFCI). Instead of using the debug location of the underlying instruction, use the debug location from the recipe. This removes an unneeded dependency of the underlying instruction.	2024-01-19 13:33:03 +00:00
Florian Hahn	abdb61f5fd	[VPlan] Introduce VPSingleDefRecipe. (#77023 ) This patch introduces a new common base class for recipes defining a single result VPValue. This has been discussed/mentioned at various previous reviews as potential follow-up and helps to replace various getVPSingleValue calls. PR: https://github.com/llvm/llvm-project/pull/77023	2024-01-19 10:27:53 +00:00
Florian Hahn	6011d6b2cc	[VPlan] Use start value of reduction phi to determine type (NFCI). Instead of accessing the underlying original IR value, check the type of the start value from the recipe directly.	2024-01-16 14:39:51 +00:00
Florian Hahn	51afb10174	[LV] Create block in mask up-front if needed. (#76635 ) At the moment, block and edge masks are created on demand, which means that they are inserted at the point where they are demanded and then cached. It is possible that the mask for a block is looked up later at a point that's not dominated by the point where the mask has been inserted. To avoid this, create masks up front on entry to the corresponding basic block and leave it to VPlan simplification to remove unneeded masks. Note that we need to create masks for all blocks, if any of the blocks in the loop needs predication, as computing the mask of a block depends on the masks of its predecessor. Needed for #76090. https://github.com/llvm/llvm-project/pull/76635	2024-01-09 10:50:08 +00:00
Florian Hahn	18ec3304a9	[VPlan] Manage InBounds via VPRecipeWithIRFlags for VectorPtrRecipe. As suggested as follow-up in https://github.com/llvm/llvm-project/pull/72164, manage inbounds via VPRecipeWithIRFlags. Note that in some cases we can now preserve inbounds in a few more cases.	2024-01-07 13:58:05 +00:00
Florian Hahn	3fb0d8dc80	Recommit "[VPlan] Mark Select VPInstructions as not having sideeffects." With #70253 landed, selects for reduction results are explicitly used by ComputeReductionResult and Selects can be marked as not having side-effects again. This reverts the revert commit 173032902c960d4d0d67b521d8c149553d8e8ba3.	2024-01-06 12:08:06 +00:00
Florian Hahn	241fe83704	[VPlan] Introduce ComputeReductionResult VPInstruction opcode. (#70253 ) This patch introduces a new ComputeReductionResult opcode to compute the final reduction result in the middle block. The code from fixReduction has been moved to ComputeReductionResult, after some earlier cleanup changes to model parts of fixReduction explicitly elsewhere as needed. The recipe may be broken down further in the future. Note that the phi nodes to merge the reduction result from the trip count check and the middle block, to be used as resume value for the scalar remainder loop are also generated based on ComputeReductionResult. Once we have a VPValue for the reduction result, this can also be modeled explicitly and moved out of the recipe.	2024-01-04 22:53:18 +00:00
Alexandros Lamprineas	e512df3ecc	[LV] Fix crash when vectorizing function calls with linear args. (#76274 ) llvm/lib/IR/Type.cpp:694: Assertion `isValidElementType(ElementType) && "Element type of a VectorType must be an integer, floating point, or pointer type."' failed. Stack dump: llvm::FixedVectorType::get(llvm::Type, unsigned int) llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&) llvm::VPBasicBlock::execute(llvm::VPTransformState) llvm::VPRegionBlock::execute(llvm::VPTransformState) llvm::VPlan::execute(llvm::VPTransformState) ... Happens with function calls of void return type.	2024-01-02 18:14:16 +00:00
Florian Hahn	f18536d642	[VPlan] Model address separately. (#72164 ) Move vector pointer generation to a separate VPVectorPointerRecipe. This untangles address computation from the memory recipes future and is also needed to enable explicit unrolling in VPlan. https://github.com/llvm/llvm-project/pull/72164	2024-01-01 19:51:15 +00:00
Shih-Po Hung	3d422a9859	[VPlan] Implement mayHaveSideEffects/mayWriteToMemory for VPInterleaveRecipe (#71360 ) This helps VPlanTransforms::removeDeadRecipes to work on VPInterleaveRecipe	2023-12-15 00:23:14 +08:00
Florian Hahn	173032902c	Revert "[VPlan] Mark Select VPInstructions as not having sideeffects." This reverts commit 19918ac34dc5d304ec6ad413ceae1d4394abe28f. Fixes #75298. There is still a case where we miss the correct users outside the main vector loop for reductions, and that is tail-folded loops with reductions where the final value is stored after the loop. This should be handled explicitly in #70253	2023-12-13 21:05:24 +00:00
Florian Hahn	19918ac34d	[VPlan] Mark Select VPInstructions as not having sideeffects. Select VPInstructions don't have sideeffects, mark them accordingly.	2023-12-11 12:26:32 +00:00
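
The message of 00798354c5 describes a getBestPlan function that computes the cost of all VPlans and picks the most profitable one together with the most profitable VF. A minimal, self-contained sketch of that selection idea follows. It is not LLVM code: `PlanSketch`, `Choice`, `isCheaperPerLane` and `pickBestPlan` are hypothetical names, and comparing candidates by cost per lane is an assumption made for illustration rather than something stated in the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vectorization plan: a name plus an estimated
// cost of one vector iteration for each vectorization factor (VF) it covers.
struct PlanSketch {
  std::string Name;
  std::vector<std::pair<unsigned, std::uint64_t>> CostPerVF; // {VF, cost}
};

// Result of the selection: the chosen plan (null if none), its VF and cost.
struct Choice {
  const PlanSketch *Plan;
  unsigned VF;
  std::uint64_t Cost;
};

// Assumption: candidate A beats B when cost(A)/VF(A) < cost(B)/VF(B).
// Cross-multiplying avoids division and rounding.
inline bool isCheaperPerLane(std::uint64_t CostA, unsigned VFA,
                             std::uint64_t CostB, unsigned VFB) {
  return CostA * VFB < CostB * VFA;
}

// Walk every (plan, VF) candidate and keep the cheapest one per lane.
inline Choice pickBestPlan(const std::vector<PlanSketch> &Plans) {
  Choice Best{nullptr, 0, 0};
  for (const PlanSketch &P : Plans) {
    for (const auto &Entry : P.CostPerVF) {
      unsigned VF = Entry.first;
      std::uint64_t Cost = Entry.second;
      assert(VF > 0 && "VF must be positive");
      if (!Best.Plan || isCheaperPerLane(Cost, VF, Best.Cost, Best.VF))
        Best = Choice{&P, VF, Cost};
    }
  }
  return Best;
}
```

With plans `{"A", {{1, 10}, {4, 24}}}` and `{"B", {{8, 40}}}`, the per-lane costs are 10, 6 and 5, so the sketch selects plan B at VF 8. The commit message also mentions an assert that catches disagreement between the VPlan cost model and the legacy one; a sketch like this could mirror that by running two cost functions and comparing the resulting `Choice` values.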

137 Commits