llvm-project

Author	SHA1	Message	Date
Florian Hahn	a8ec1eb843	[VPlan] Don't assign slots to VPValues with an underlying value. This makes sure the numbering for VPValues without underlying values is consecutive.	2024-04-09 21:30:51 +01:00
Alexey Bataev	413a66f339	[LV, VP] VP intrinsics support for the Loop Vectorizer + adding new tail-folding mode using EVL. (#76172) This patch introduces generating VP intrinsics in the Loop Vectorizer. Currently the Loop Vectorizer supports vector predication in a very limited capacity via tail-folding and masked load/store/gather/scatter intrinsics. However, this does not let architectures with active vector length predication support take advantage of their capabilities. Architectures with general masked predication support also can only take advantage of predication on memory operations. By having a way for the Loop Vectorizer to generate Vector Predication intrinsics, which (will) provide a target-independent way to model predicated vector instructions, these architectures can make better use of their predication capabilities. Our first approach (implemented in this patch) builds on top of the existing tail-folding mechanism in the LV (just adds a new tail-folding mode using EVL), but instead of generating masked intrinsics for memory operations it generates VP intrinsics for load/store instructions. The patch adds a new VPlanTransforms to replace the wide header predicate compare with EVL and updates codegen for load/stores to use VP store/load with EVL. Another important part of this approach is how the Explicit Vector Length is computed. (VP intrinsics define this vector length parameter as Explicit Vector Length (EVL)). We use an experimental intrinsic `get_vector_length`, that can be lowered to architecture specific instruction(s) to compute EVL. Also, added a new recipe to emit instructions for computing EVL. Using VPlan in this way will eventually help build and compare VPlans corresponding to different strategies and alternatives. Differential Revision: https://reviews.llvm.org/D99750	2024-04-04 18:30:17 -04:00
Florian Hahn	e5abd963c7	[VPlan] Remove VPTransformState::addMetadata with ArrayRef arg (NFCI). addMetadata is only ever called with a single element, clean up the variant that takes multiple values.	2024-04-03 09:43:12 +01:00
Florian Hahn	8a614c1d31	[VPlan] Rename getVPValueOrAddLiveIn -> getOrAddLiveIn (NFCI). The helper now only deals with live-ins, clarify the name.	2024-03-28 21:02:15 +00:00
Florian Hahn	06bb8c9f20	[VPlan] Explicitly handle scalar pointer inductions. (#83068) Add a new PtrAdd opcode to VPInstruction that corresponds to IRBuilder::CreatePtrAdd, which creates a GEP with source element type i8. This is then used to model scalarizing VPWidenPointerInductionRecipe by introducing scalar-steps to model the index increment followed by a PtrAdd. Note that PtrAdd needs to be able to generate code for only the first lane or for all lanes. This may warrant introducing a separate recipe for scalarizing that can be created without relying on the underlying IR. Depends on https://github.com/llvm/llvm-project/pull/80271 PR: https://github.com/llvm/llvm-project/pull/83068	2024-03-26 16:01:57 +01:00
Florian Hahn	2435dcd83a	[VPlan] Add initial pattern match implementation for VPInstruction. (#80563) Add an initial version of a pattern match for VPValues and recipes, starting with VPInstruction. PR: https://github.com/llvm/llvm-project/pull/80563	2024-03-03 21:48:58 +00:00
Florian Hahn	911055e34f	[VPlan] Consistently use (Part, 0) for first lane scalar values (#80271) At the moment, some VPInstructions create only a single scalar value, but use VPTransformState's 'vector' storage for this value. Those values are effectively uniform-per-VF (or in some cases uniform-across-VF-and-UF). Using the vector/per-part storage doesn't interact well with other recipes that more accurately use (Part, Lane) to look up scalar values, and prevents VPInstructions creating scalars from interacting with other recipes working with scalars. This PR tries to unify handling of scalars by using (Part, 0) for scalar values where only the first lane is demanded. This allows using VPInstructions with other recipes like VPScalarCastRecipe and is also needed when using VPInstructions in more cases outside the vector loop region to generate scalars. Depends on https://github.com/llvm/llvm-project/pull/80269	2024-02-26 19:06:43 +00:00
Florian Hahn	85da9f80b8	[VPlan] Remove unused VPTransformState::VPValue2Value (NFCI). Clean up unused member variable.	2024-02-25 12:14:44 +00:00
Florian Hahn	3d66d6932e	[VPlan] Support live-ins without underlying IR in type analysis. (#80723) A VPlan contains multiple live-ins without underlying IR, like VFxUF or VectorTripCount. Trying to infer the scalar type of those causes a crash at the moment. Update VPTypeAnalysis to take a VPlan in its constructor and assign types to those live-ins up front. All those live-ins share the type of the canonical IV. PR: https://github.com/llvm/llvm-project/pull/80723	2024-02-21 19:37:15 +00:00
Florian Hahn	9923d29cfa	[VPlan] Merge main VPlan verifier with HCFG verifier. Unify VPlan verifiers in verifyVPlanIsValid. This adds verification for various properties on blocks to the verifier used for VPlans generated by the inner loop vectorizer. It also adds def-use checks for the verifier used in the VPlan native path. This drops the separate flag to enable HCFG verification. Instead, all VPlans are verified once they have been created, if assertions are enabled. This also removes VPWidenPHIRecipe from VPHeaderPHIRecipe; it is used to model any phi node in the native path.	2024-02-20 16:43:57 +00:00
Florian Hahn	3444240540	[VPlan] Mark vputils::onlyFirstPartUsed arg as const (NFC) Split off https://github.com/llvm/llvm-project/pull/80269 as suggested.	2024-02-03 15:59:09 +00:00
Florian Hahn	6936479020	[VPlan] Mark vputils::onlyFirstLaneUsed arg as const (NFC) Split off https://github.com/llvm/llvm-project/pull/80269 as suggested.	2024-02-03 15:56:40 +00:00
Florian Hahn	2906f3626b	[VPlan] Update ::onlyScalarsGenerated to take IsScalable bool (NFCI). Instead of passing in a full VF, just pass IsScalable as bool.	2024-02-03 14:51:14 +00:00
Florian Hahn	1b37e8087e	[VPlan] use getVPValueOrAddLiveIn in VPlan::duplicate. Instead of creating live-ins manually, use getOrAddLiveIn which automatically takes care of adding them to VPLiveInsToFree. Also use it to create the VPValue for the trip-count. This fixes a leak: https://lab.llvm.org/buildbot/#/builders/168/builds/18308/steps/10/logs/stdio	2024-01-28 12:39:39 +00:00
Florian Hahn	ec402a2e53	[VPlan] Implement cloning of VPlans. (#73158) This patch implements cloning for VPlans and recipes. Cloning is used in the epilogue vectorization path, to clone the VPlan for the main vector loop. This means we won't re-use a VPlan when executing the VPlan for the epilogue vector loop, which in turn will enable us to perform optimizations based on UF & VF.	2024-01-27 13:30:52 +00:00
Florian Hahn	731c2049a4	[VPlan] Relax IV user assertion after 0ab539f for epilogue vec. After 0ab539fd6748adf2f638e10514dd9419597d8863, the canonical IV in the epilogue vector loop may be used by a trunc. Relax the corresponding assert. This should fix some build-bot failures, including https://lab.llvm.org/buildbot/#/builders/187/builds/14113 https://lab.llvm.org/buildbot/#/builders/98/builds/32350 https://lab.llvm.org/buildbot/#/builders/239/builds/5473	2024-01-26 13:19:25 +00:00
Florian Hahn	3683852d49	[VPlan] Use replaceUsesWithIf in replaceAllUsesWith and add comment (NFCI). Follow-up to post-commit comments for b1bfe221e6.	2024-01-21 12:56:16 +00:00
Florian Hahn	241fe83704	[VPlan] Introduce ComputeReductionResult VPInstruction opcode. (#70253) This patch introduces a new ComputeReductionResult opcode to compute the final reduction result in the middle block. The code from fixReduction has been moved to ComputeReductionResult, after some earlier cleanup changes to model parts of fixReduction explicitly elsewhere as needed. The recipe may be broken down further in the future. Note that the phi nodes to merge the reduction result from the trip count check and the middle block, to be used as resume value for the scalar remainder loop are also generated based on ComputeReductionResult. Once we have a VPValue for the reduction result, this can also be modeled explicitly and moved out of the recipe.	2024-01-04 22:53:18 +00:00
Florian Hahn	b1bfe221e6	[VPlan] Remove unneeded getNumUsers calls in replaceAllUsesWith (NFC). As suggested post-commit for a00227197, replace unnecessary getNumUsers calls by boolean variable to indicate if users changed. Note that this also requires an early exit to detect the case where a value is replaced by itself.	2023-12-15 13:43:15 +00:00
Florian Hahn	a5891fa4d2	[VPlan] Initial modeling of VF * UF as VPValue. (#74761) This patch starts initial modeling of VF * UF in VPlan. Initially, introduce a dedicated VFxUF VPValue, which is then populated during VPlan::prepareToExecute. Initially, the VF * UF applies only to the main vector loop region. Once we extend the scope of VPlan in the future, we may want to associate different VFxUFs with different vector loop regions (e.g. the epilogue vector loop). This allows explicitly parameterizing recipes that rely on the VF * UF, like the canonical induction increment. At the moment, this mainly helps to avoid generating some duplicated calls to vscale with scalable vectors. It should also allow using EVL as induction increments explicitly in D99750. Referring to VF * UF is also needed in other places that we plan to migrate to VPlan, like the minimum trip count check during skeleton creation. The first version creates the value for VF * UF directly in prepareToExecute to limit the scope of the patch. A follow-on patch will model VF * UF computation explicitly in VPlan using recipes. Moved from Phabricator (https://reviews.llvm.org/D157322)	2023-12-08 18:30:30 +00:00
Florian Hahn	99aa5311ee	[VPlan] Add missing output of live-ins to VPlan dot printing. Split off live-in printing to VPlan::printLiveIns and use it to print Live-ins when printing in the DOT format.	2023-12-04 13:41:28 +00:00
Florian Hahn	906f598263	[VPlan] Remove dead IsEpilogueVec argument from prepareToExecute (NFC).	2023-11-23 16:59:50 +00:00
Florian Hahn	34c2dcd5ac	[VPlan] Move initial skeleton construction to createInitialVPlan. (NFC) This patch moves creating the middle VPBBs and an initial empty vector loop region for the top-level loop to createInitialVPlan. This consolidates code to create the initial VPlan skeleton and enables adding other bits outside the main region during initial VPlan construction. In particular, D150398 will add the exit check & branch to the middle block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158333	2023-11-12 13:00:44 +00:00
Florian Hahn	a002271972	[VPlan] Add VPValue::replaceUsesWithIf (NFCI). Add replaceUsesWithIf helper and use it in a few places.	2023-11-06 16:08:22 +00:00
Kazu Hirata	3af0ff99b1	[llvm] Stop including llvm/ADT/DepthFirstIterator.h (NFC) Identified with misc-include-cleaner.	2023-10-22 12:15:46 -07:00
Florian Hahn	97687b7aea	[VPlan] Add active-lane-mask as VPlan-to-VPlan transformation. This patch updates the mask creation code to always create compares of the form (ICMP_ULE, wide canonical IV, backedge-taken-count) up front when tail folding and introduce active-lane-mask as later transformation. This effectively makes (ICMP_ULE, wide canonical IV, backedge-taken-count) the canonical form for tail-folding early on. Introducing more specific active-lane-mask recipes is treated as a VPlan-to-VPlan optimization. This has the advantage of keeping the logic (and complexity) of introducing active-lane-mask recipes in a single place, instead of spreading the logic out across multiple functions. It also simplifies initial VPlan construction and enables treating introducing EVL as similar optimization. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D158779	2023-09-25 13:34:45 +01:00
Florian Hahn	168e23c741	[VPlan] Remove reference to Instr when setting debug loc. (NFCI) This allows untangling references to underlying IR for various recipes.	2023-09-05 10:59:13 +01:00
Florian Hahn	19d286bca0	[VPlan] Assert that inst isn't a debug or pseudo inst (NFCI). Debug and pseudo instructions aren't modeled in VPlan. Turn a check into an assertion. This will help removing the direct use of Inst here in the future.	2023-09-03 21:31:31 +01:00
Florian Hahn	e18a547ce2	[VPlan] Fold if into return in prepareToExecute assertion (NFC). Independent simplification suggested in D157194.	2023-08-08 12:45:55 +01:00
Florian Hahn	af635a5547	[VPlan] Model wrap flags directly, remove NUW opcodes (NFC) Model wrap flags directly using VPRecipeWithIRFlags and clean up the duplicated NUW opcodes. D157144 will build on this and also model FMFs for VPInstruction. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D157194	2023-08-08 12:12:30 +01:00
Florian Hahn	deec9e7674	[VPlan] Move VPTransformState::get() to VPlan.cpp (NFC). The last dependency of code defined in LoopVectorize.cpp has been removed a while ago. Move VPTransformState::get() to VPlan.cpp where other members are also defined.	2023-08-03 21:49:58 +01:00
Florian Hahn	d1d0e135a1	[LV] Move packScalarIntoVectorValue to VPTransformState (NFC). This moves packScalarIntoVectorValue from ILV to the more appropriate VPTransformState.	2023-08-02 12:36:48 +01:00
Elliot Goodrich	b0abd4893f	[llvm] Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header.	2023-06-25 15:42:22 +01:00
Florian Hahn	96686796f6	[VPlan] Move live-out printing to VPLiveOut::print (NFC). Preparation for D150398. This brings live-out printing in line with how printing for recipes is handled.	2023-05-22 09:53:53 +01:00
Hongtao Yu	9272d0f079	[PseudoProbe] Clean up dwarf discriminator and avoid duplicating factor. A pseudo probe is created with dwarf line information shared with its nearest instruction. If the instruction comes with a dwarf discriminator, it will be shared with the probe as well. This can confuse the later FS-AFDO discriminator assignment pass. To fix this, I'm cleaning up the discriminator fields for probes when they are inserted. I also notice another possibility to change the discriminator field of pseudo probes in the pipeline before the FS discriminator assignment pass. That is the loop unroller, which assigns duplication factor to instruction being vectorized. I'm disabling that for pseudo probe intrinsics specifically, also for callsites with probes. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D148569	2023-05-10 11:26:23 -07:00
Florian Hahn	e3afe0b89d	[VPlan] Add VPWidenCastRecipe, split off from VPWidenRecipe (NFCI). To generate cast instructions, the result type is needed. To allow creating widened casts without underlying instruction, introduce a new VPWidenCastRecipe that also holds the result type. This functionality will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149081	2023-05-05 13:20:16 +01:00
Florian Hahn	147a56149c	[VPlan] Clean up preheader block after b85a402dd899fc. Fix a leak introduced in b85a402dd899fc and flagged by LSan https://lab.llvm.org/buildbot#builders/5/builds/33452	2023-05-04 16:29:57 +01:00
Florian Hahn	b85a402dd8	[VPlan] Introduce new entry block to VPlan for early SCEV expansion. This patch adds a new preheader block to the VPlan to place SCEV expansions like the trip count. This preheader block is disconnected at the moment, as the bypass blocks of the skeleton are not yet modeled in VPlan. The preheader block is executed before skeleton creation, so the SCEV expansion results can be used during skeleton creation. At the moment, the trip count expression and induction steps are expanded in the new preheader. The remainder of SCEV expansions will be moved gradually in the future. D147965 will update skeleton creation to use the steps expanded in the pre-header to fix #58811. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147964	2023-05-04 14:00:13 +01:00
Florian Hahn	79692750d2	[LV] Use VPValue for SCEV expansion in fixupIVUsers. The step is already expanded in the VPlan. Use this expansion instead. This is a step towards modeling fixing up IV users in VPlan. It also fixes a crash caused by SCEV-expanding the Step expression in fixupIVUsers, where the IR is in an incomplete state. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147963	2023-05-04 09:25:59 +01:00
Florian Hahn	b9efffa7e9	[VPlan] Add assignSlot(const VPBasicBlock *) (NFC). Factor out utility to simplify D147964 as suggested.	2023-05-03 19:51:09 +01:00
Florian Hahn	2c9d21a2a3	[VPlan] Turn Plan entry node into VPBasicBlock (NFCI). The entry to the plan is the preheader of the vector loop and guaranteed to be a VPBasicBlock. Make sure this is the case by adjusting the type. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149005	2023-04-28 12:29:06 +01:00
Florian Hahn	3157f03a34	[VPlan] Add VPValue::isLiveIn() (NFC). This helps to clarify checks in multiple places. Suggested as cleanup in D147892.	2023-04-24 17:51:12 +01:00
Florian Hahn	ff0ec4f42e	Recommit "[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI)." This reverts the revert commit 8c2276f89887d0a27298a1bbbd2181fa54bbb509. The updated patch re-orders the getDefiningRecipe check in getVPValue to avoid a use-after-free. Original commit message: Before this patch, a VPlan contained 2 mappings for Values -> VPValue: 1) Value2VPValue and 2) VPExternalDefs. This duplication is unnecessary and there are already cases where external defs are added to Value2VPValue. This patch replaces all uses of VPExternalDefs with Value2VPValue. It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue) and addVPValue (to addExternalVPValue). At the moment, this is NFC, but will enable additional simplifications in D147783. Depends on D147891. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147892	2023-04-18 10:29:31 +01:00
Vitaly Buka	8c2276f898	Revert "[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI)." Asan detects heap-use-after-free, see D147892. This reverts commit 4fc190351e5af901b6107d162d07e1fbca90934f. This reverts commit 668045eb77628be13e448ffbb855473ffca1cc43.	2023-04-17 17:24:10 -07:00
Florian Hahn	668045eb77	[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI). Before this patch, a VPlan contained 2 mappings for Values -> VPValue: 1) Value2VPValue and 2) VPExternalDefs. This duplication is unnecessary and there are already cases where external defs are added to Value2VPValue. This patch replaces all uses of VPExternalDefs with Value2VPValue. It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue) and addVPValue (to addExternalVPValue). At the moment, this is NFC, but will enable additional simplifications in D147783. Depends on D147891. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147892	2023-04-16 15:38:31 +01:00
Florian Hahn	0dbcbfe0d0	[VPlan] Don't assign slots for external defs (NFCI). External defs are VPValues wrapping an IR value and hence will get printed as ir<>. We don't need to assign a slot for a VPValue number.	2023-04-09 21:01:21 +01:00
Florian Hahn	620e011a25	[VPlan] Don't add live-outs if scalar epilogue is required. Instead of clearing live outs when a scalar epilogue is required late, don't add live outs during VPlan construction if a scalar epilogue is required. This enables more VPlan-based DCE (if the live out would be the only user in the plan) and is a step towards removing an access of the cost model in fixedVectorizedLoop (which is after VPlan execution). Depends on D147468. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147471	2023-04-09 09:18:24 +01:00
David Green	28c8616a5b	[LV] Cleanup and reformatting for some debug messages. NFC This is just some cleanup of various debug messages, pulled out of another patch to simplify it a little.	2023-04-05 17:50:01 +01:00
Kazu Hirata	398af9b43b	[llvm] Use *{Map,Set}::contains (NFC)	2023-03-15 18:06:32 -07:00
David Green	98481bc723	[LV][VPlan] Fix printing TripCount liveins. NFC The TripCount liveins would currently be printed as badref in the vplan as they are not allocated slots in the VPSlotTracker. This patch allocates them a slot and adds them to the printed Live-Ins. It also makes a minor adjustment to printing of Live-ins to reduce the empty lines when multiple Live-ins are present. Differential Revision: https://reviews.llvm.org/D145507	2023-03-13 19:44:12 +00:00

275 Commits