llvm-project

Author	SHA1	Message	Date
Florian Hahn	2265bb064b	[LV] Update generateInstruction to return produced value (NFC). Update generateInstruction to return the produced value instead of setting it for each opcode. This reduces the amount of duplicated code and is a preparation for D153696. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D154240	2023-07-05 19:53:59 +01:00
Kazu Hirata	3f8ed16c67	[Transforms] Remove unused forward declaration PredicateScalarEvolution The declaration was added without a corresponding class definition by: commit a84064bcda1a737658d33e96ca58516d01af70a6 Author: Florian Hahn <flo@fhahn.com> Date: Wed Dec 21 22:02:31 2022 +0000 It is most likely a misspelling of PredicatedScalarEvolution.	2023-06-22 23:45:52 -07:00
Kazu Hirata	c963892a45	[llvm] Use DenseMapBase::lookup (NFC)	2023-06-10 09:02:25 -07:00
Florian Hahn	1a28b9bce7	[VPlan] Handle invariant GEPs in isUniformAfterVectorization. This fixes a crash caused by legal treating a scalable GEP as invariant, but isUniformAfterVectorization does not handle GEPs. Partially fixes https://github.com/llvm/llvm-project/issues/60831. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D144434	2023-05-30 15:53:26 +01:00
Florian Hahn	299f0ff60e	[VPlan] Print IR flags for VPRecipeWithIRFlags. Now that IR flags are modeled as part of VPRecipeWithIRFlags, include the flags when printing recipes. Depends on D150027. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150029	2023-05-23 20:36:16 +01:00
Florian Hahn	96686796f6	[VPlan] Move live-out printing to VPLiveOut::print (NFC). Preparation for D150398. This brings live-out printing in line with how printing for recipes is handled.	2023-05-22 09:53:53 +01:00
Florian Hahn	701f7230cd	[VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map Update VPReplicateRecipe to use VPRecipeWithIRFlags for IR flag handling. Retire separate MayGeneratePoisonRecipes map. Depends on D149082. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D150027	2023-05-15 11:49:20 +01:00
Florian Hahn	236a0e82df	[LV] Use VPValue to get expanded value for SCEV step expressions. Update skeleton creation logic to use SCEV expansion results from expanding the pre-header. This avoids another set of SCEV expansions that may happen after the CFG has been modified. Fixes #58811. Depends on D147964. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147965	2023-05-11 16:49:19 +01:00
Florian Hahn	c096e91735	[VPlan] Address missed suggestions from D149082. This addresses 2 comments missed from D149082. It sets inbounds directly when creating the GEP and fixes the order in the enum.	2023-05-09 15:17:20 +01:00
Florian Hahn	5f3343985b	[VPlan] Use VPRecipeWithIRFlags for VPWidenGEPRecipe (NFCI). Extend VPRecipeWithIRFlags to also include InBounds and use for VPWidenGEPRecipe. The last remaining recipe that needs updating for MayGeneratePoisonRecipes is VPReplicateRecipe. Depends on D149081. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149082	2023-05-09 12:33:28 +01:00
Florian Hahn	127b00b25c	[VPlan] Record IR flags on VPWidenRecipe directly (NFC). This patch introduces a VPRecipeWithIRFlags class to record various IR flags for a recipe. This allows de-coupling of IR flags from the underlying instructions. The main benefit is that it allows dropping of IR flags from recipes directly, without the need to go through State::MayGeneratePoisonRecipes. The plan is to remove MayGeneratePoisonRecipes once all relevant recipes are transitioned. It also allows dropping IR flags during VPlan-to-VPlan transforms, which will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149079	2023-05-08 17:28:50 +01:00
Kazu Hirata	2b60bd5141	[Vectorize] Use Densemap::contains (NFC)	2023-05-06 00:02:54 -07:00
Florian Hahn	e3afe0b89d	[VPlan] Add VPWidenCastRecipe, split off from VPWidenRecipe (NFCI). To generate cast instructions, the result type is needed. To allow creating widened casts without underlying instruction, introduce a new VPWidenCastRecipe that also holds the result type. This functionality will be used in a follow-up patch to implement truncateToMinimalBitwidths as VPlan-to-VPlan transform. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149081	2023-05-05 13:20:16 +01:00
Florian Hahn	c2bef381fa	[VPlan] Remove setEntry to avoid leaks when replacing entry. Update the HCFG builder to directly connect the created CFG to the existing Plan's entry. This allows removing `setEntry`, which can cause leaks when the existing entry is replaced. Should fix https://lab.llvm.org/buildbot/#/builders/5/builds/33455/steps/13/logs/stdio	2023-05-04 19:12:02 +01:00
Florian Hahn	b85a402dd8	[VPlan] Introduce new entry block to VPlan for early SCEV expansion. This patch adds a new preheader block to the VPlan to place SCEV expansions like the trip count. This preheader block is disconnected at the moment, as the bypass blocks of the skeleton are not yet modeled in VPlan. The preheader block is executed before skeleton creation, so the SCEV expansion results can be used during skeleton creation. At the moment, the trip count expression and induction steps are expanded in the new preheader. The remainder of SCEV expansions will be moved gradually in the future. D147965 will update skeleton creation to use the steps expanded in the pre-header to fix #58811. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147964	2023-05-04 14:00:13 +01:00
Florian Hahn	79692750d2	[LV] Use VPValue for SCEV expansion in fixupIVUsers. The step is already expanded in the VPlan. Use this expansion instead. This is a step towards modeling fixing up IV users in VPlan. It also fixes a crash caused by SCEV-expanding the Step expression in fixupIVUsers, where the IR is in an incomplete state. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147963	2023-05-04 09:25:59 +01:00
Florian Hahn	2c9d21a2a3	[VPlan] Turn Plan entry node into VPBasicBlock (NFCI). The entry to the plan is the preheader of the vector loop and guaranteed to be a VPBasicBlock. Make sure this is the case by adjusting the type. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D149005	2023-04-28 12:29:06 +01:00
Florian Hahn	3157f03a34	[VPlan] Add VPValue::isLiveIn() (NFC). This helps to clarify checks in multiple places. Suggested as cleanup in D147892.	2023-04-24 17:51:12 +01:00
Florian Hahn	6f999769b9	[VPlan] Remove unnecessary includes from VPlan.h (NFC). Clean up some unnecessary includes from VPlan.h, which is imported in multiple files.	2023-04-24 16:10:46 +01:00
Florian Hahn	ff0ec4f42e	Recommit "[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI)." This reverts the revert commit 8c2276f89887d0a27298a1bbbd2181fa54bbb509. The updated patch re-orders the getDefiningRecipe check in getVPValue to avoid a use-after-free. Original commit message: Before this patch, a VPlan contained 2 mappings for Values -> VPValue: 1) Value2VPValue and 2) VPExternalDefs. This duplication is unnecessary and there are already cases where external defs are added to Value2VPValue. This patch replaces all uses of VPExternalDefs with Value2VPValue. It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue) and addVPValue (to addExternalVPValue). At the moment, this is NFC, but will enable additional simplifications in D147783. Depends on D147891. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147892	2023-04-18 10:29:31 +01:00
Vitaly Buka	8c2276f898	Revert "[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI)." Asan detects heap-use-after-free, see D147892. This reverts commit 4fc190351e5af901b6107d162d07e1fbca90934f. This reverts commit 668045eb77628be13e448ffbb855473ffca1cc43.	2023-04-17 17:24:10 -07:00
Florian Hahn	4fc190351e	[VPlan] Remove uneeded NeedsVectorIV from VPWidenIntOrFpInduction. After recent improvements, all instances of VPWidenIntOrFpInductionRecipe should need a vector IV and there's no need for a separate field.	2023-04-17 13:38:00 +01:00
Florian Hahn	668045eb77	[VPlan] Unify Value2VPValue and VPExternalDefs maps (NFCI). Before this patch, a VPlan contained 2 mappings for Values -> VPValue: 1) Value2VPValue and 2) VPExternalDefs. This duplication is unnecessary and there are already cases where external defs are added to Value2VPValue. This patch replaces all uses of VPExternalDefs with Value2VPValue. It clarifies the naming of getOrAddVPValue (to getOrAddExternalVPValue) and addVPValue (to addExternalVPValue). At the moment, this is NFC, but will enable additional simplifications in D147783. Depends on D147891. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147892	2023-04-16 15:38:31 +01:00
Florian Hahn	2db031528e	[VPlan] Check VPValue step in isCanonical (NFCI). Update the isCanonical() implementations to check the VPValue step operand instead of the step in the induction descriptor. At the moment this is NFC, but it enables further optimizations if the step is replaced by a constant in D147783. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147891	2023-04-16 14:48:03 +01:00
Craig Topper	4b47d875a1	[LV] Optimize trip count SCEV. To calculate the trip count we need to add 1 to the backedge taken count. If we need to widen the backedge count, it's better to do the add before the widening if we can guarantee it won't overflow. The code here is based on similar code I found in LoopIdiomRecognize. This is the vectorizer version of this InstCombine patch D142783. Looking at the IR diffs, this does look like it gets more cases than the InstCombine patch. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D147355	2023-04-12 16:17:58 -07:00
Florian Hahn	c255eb2c4b	[VPlan] Use VPLiveOut to update FOR live-out users. Instead of iterating over all LCSSA phis in the exit block, collect all LiveOut users of the FOR splice VPInstruction and only update those users. Building on top of D147471, this removes an access to the cost model after VPlan execution. Depends on D147471. Reviewed By: Ayal, michaelmaitland Differential Revision: https://reviews.llvm.org/D147472	2023-04-10 13:02:44 +01:00
Florian Hahn	620e011a25	[VPlan] Don't add live-outs if scalar epilogue is required. Instead of clearing live outs when a scalar epilogue is required late, don't add live outs during VPlan construction if a scalar epilogue is required. This enables more VPlan-based DCE (if the live out would be the only user in the plan) and is a step towards removing an access of the cost model in fixVectorizedLoop (which is after VPlan execution). Depends on D147468. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147471	2023-04-09 09:18:24 +01:00
Florian Hahn	c7a34d355a	[VPlan] Require VFRange.End to be a power-of-2. (NFCI) This removes the need to convert the end of the range to the next power-of-2 for the end iterator after 4bd3fda5124962 and was suggested as follow-up TODO in D147468.	2023-04-08 13:04:08 +01:00
Florian Hahn	4bd3fda512	[VPlan] Add VFRange::begin() and end() iterators. (NFCI) Add an iterator to iterate over all VFs in VFRange. This simplifies some existing code and allows using all_of, any_of and none_of on a VFRange. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147468	2023-04-08 10:22:25 +01:00
Florian Hahn	11896357d4	[VPlan] Add VPInterleaveRecipe::NeedsMaskForGaps field (NFCI). This patch adds a NeedsMaskForGaps field to VPInterleaveRecipe to record whether a mask for gaps is needed. This removes a dependence on the cost model in VPlan code-generation. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D147467	2023-04-07 13:11:03 +01:00
Michael Maitland	194f3dc8fd	[VPlan] VPWidenIntOrFpInductionRecipe inherits from VPHeaderPHIRecipe Differential Revision: https://reviews.llvm.org/D144125	2023-03-14 17:01:34 -07:00
Kazu Hirata	c8f9555c4d	[Transforms] Use *{Set,Map}::contains (NFC)	2023-03-14 00:24:30 -07:00
Florian Hahn	9be8d90e62	[VPlan] Add VPWidenSelectRecipe::getCond() (NFC). Add helper to access condition, as suggested in D144489.	2023-03-10 17:49:23 +01:00
Florian Hahn	54558fd8f3	[VPlan] Replace InvariantCond field from VPWidenSelectRecipe. There is no need to store information about invariance in the recipe. Replace the fields with checks of the operands using isDefinedOutsideVectorRegions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D144489	2023-03-10 15:28:43 +01:00
Florian Hahn	a8adb38a96	[VPlan] Replace invariance fields from VPWidenGEPRecipe. There is no need to store information about invariance in the recipe. Replace the fields with checks of the operands using isDefinedOutsideVectorRegions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D144487	2023-03-09 17:52:22 +01:00
Florian Hahn	79272ec028	[VPlan] Add predicate to VPReplicateRecipe, expand region later. This patch adds the predicate as additional operand to VPReplicateRecipe during initial construction. The predicated recipes are later moved into replicate regions. This simplifies constructions and some VPlan transformations, like fixed-order recurrence handling. It also improves codegen in some cases (e.g. for in-loop reductions), because the recipes remain in the same block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D143865	2023-03-08 20:11:28 +01:00
Sander de Smalen	fe1b51ffee	[LoopVectorize] Remove runtime check and scalar tail loop when tail-folding. When using tail-folding and using the predicate for both data and control-flow (the next vector iteration's predicate is generated with the llvm.active.lane.mask intrinsic and then tested for the backedge), the LoopVectorizer still inserts a runtime check to see if the 'i + VF' may at any point overflow for the given trip-count. When it does, it falls back to a scalar epilogue loop. We can get rid of that runtime check in the pre-header and therefore also remove the scalar epilogue loop. This reduces code-size and avoids a runtime check. Consider the following loop:

  void foo(char * __restrict__ dst, char *src, unsigned long N) {
    for (unsigned long i=0; i<N; ++i)
      dst[i] = src[i] + 42;
  }

If 'N' is e.g. ULONG_MAX, and the VF > 1, then the loop iteration counter will overflow when calculating the predicate for the next vector iteration at some point, because LLVM does:

  vector.ph:
    %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N)

  vector.body:
    %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
    %active.lane.mask = phi <vscale x 16 x i1> [ %active.lane.mask.entry, %vector.ph ], [ %active.lane.mask.next, %vector.body ]
    ...
    %index.next = add i64 %index, 16
    ; The add above may overflow, which would affect the lane mask and control flow.
    ; Hence a runtime check is needed.
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %index.next, i64 %N)
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7

The solution: What we can do instead is calculate the predicate before incrementing the loop iteration counter, such that the llvm.active.lane.mask is calculated from 'i' to 'tripcount > VF ? tripcount - VF : 0', i.e.

  vector.ph:
    %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N)
    %N_minus_VF = select %N > 16 ? %N - 16 : 0

  vector.body:
    %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
    %active.lane.mask = phi <vscale x 16 x i1> [ %active.lane.mask.entry, %vector.ph ], [ %active.lane.mask.next, %vector.body ]
    ...
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %index, i64 %N_minus_VF)
    %index.next = add i64 %index, %4
    ; The add above may still overflow, but this time the active.lane.mask is not affected
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7

For N = 20, we'd then get:

  vector.ph:
    %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N)
    ; %active.lane.mask.entry = <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1>
    %N_minus_VF = select 20 > 16 ? 20 - 16 : 0
    ; %N_minus_VF = 4

  vector.body: (1st iteration)
    ... ; using <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1> as predicate in the loop
    ...
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 4)
    ; %active.lane.mask.next = <1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>
    %index.next = add i64 0, 16 ; %index.next = 16
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0 ; %8 = 1
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7
    ; branch to %vector.body

  vector.body: (2nd iteration)
    ... ; using <1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0> as predicate in the loop
    ...
    %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 16, i64 4)
    ; %active.lane.mask.next = <0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>
    %index.next = add i64 16, 16 ; %index.next = 32
    %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0 ; %8 = 0
    br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7
    ; branch to %for.cond.cleanup

Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D142109	2023-03-01 09:01:19 +00:00
Florian Hahn	9333b97763	[VPlan] Replace AlsoPack field with shouldPack() method (NFC). There is no need to update the AlsoPack field when creating VPReplicateRecipes. It can be easily computed based on the VP def-use chains when it is needed. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D143864	2023-02-20 10:28:26 +00:00
Graham Hunter	0fa5df1959	[LV] Synthesize all true masks for masked vector function variants When vectorizing code with function calls in it, if we encounter a function which only has vectorized variants requiring a mask we can synthesize an all-true mask to enable us to proceed. Since we want the mask to be represented in vplan, the pointer to the chosen Function is now stored as part of the VPWidenCallRecipe, and mask arguments are added at the appropriate index to the recipe operands. Reviewed By: david-arm, fhahn, reames Differential Revision: https://reviews.llvm.org/D132458	2023-02-14 14:33:18 +00:00
Florian Hahn	31d46ca8aa	[Dominators] Introduce DomTreeNodeTraits to allow customization. (NFC) This patch introduces DomTreeNodeTraits for customization. Clients can implement DomTreeNodeTraitsCustom to provide custom ParentPtr, getEntryNode and getParent. There's also a default specialization if DomTreeNodeTraitsCustom is not implemented, that assumes a Function-like NodeT. This is what is used for the existing DominatorTree and MachineDominatorTree. The main motivation for this patch is using DominatorTreeBase across all regions of a VPlan, see D140513. Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D142162	2023-01-22 20:22:41 +00:00
Florian Hahn	22c9f4cf2d	[VPlan] Replace VPInterleaveRecipe::classof with VP_CLASSOF_IMPL. (NFC)	2023-01-18 14:23:22 +00:00
Florian Hahn	f615de7e26	[VPlan] Replace VPBranchOnMaskSC::classof with VP_CLASSOF_IMPL. (NFC)	2023-01-18 12:14:58 +00:00
Florian Hahn	cdd8fcdbd7	[VPlan] Replace VPExpandSCEVRecipe::classof with VP_CLASSOF_IMPL. (NFC)	2023-01-17 21:11:33 +00:00
Florian Hahn	bf1ba6bb52	[VPlan] Replace VPScalarIVStepsRecipe::classof with VP_CLASSOF_IMPL(NFC)	2023-01-17 20:53:14 +00:00
Florian Hahn	d47bdae28e	[VPlan] Remove duplicated VPValue IDs (NFCI). At the moment, both VPValue and VPDef have an ID used when casting via classof. This duplication is cumbersome, because it requires adding IDs for new recipes twice and also requires setting them twice. In a few cases, there's only a VPDef ID and no VPValue ID, which can cause some confusion. To simplify things, remove the VPValue IDs for different recipes. Instead, only retain the generic VPValue ID (= used VPValues without a corresponding defining recipe) and VPVRecipe for VPValues that are defined by recipes that inherit from VPValue. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D140848	2023-01-17 15:11:38 +00:00
Florian Hahn	133f017479	[VPlan] Remove unneeded VPUser::classof(const VPDef *) (NFC). This specialization is not needed any longer as VPRecipeBase inherits from VPUser and getDefiningRecipe returns a VPRecipeBase.	2023-01-17 09:08:33 +00:00
Florian Hahn	56ffd39c3d	[VPlan] Use VPDef prefix for VPDef IDs instead of VPRecipeBase (NFC). Various places in the code were still using the VPRecipeBase:: prefix for VPDef IDs or no prefix at all. Now that the VPDef IDs have been moved to VPDef, use this prefix instead and consistently use it.	2023-01-16 10:23:52 +00:00
Florian Hahn	ce1be13a86	[VPlan] Use VP_CLASSOF_IMPL for VPWidenCanonicalIVRecipe(NFC). Replace VPWidenCanonicalIVRecipe::classof implementation with general VP_CLASSOF_IMPL.	2023-01-02 17:52:13 +00:00
Florian Hahn	64f1d845b3	[VPlan] Use VP_CLASSOF_IMPL for VPWidenMemoryInstructionRecipe (NFC). Replace VPWidenMemoryInstructionRecipe::classof implementation with general VP_CLASSOF_IMPL.	2023-01-02 17:32:31 +00:00
Florian Hahn	2d6d47f807	[VPlan] Use VP_CLASSOF_IMPL for VPPredInstPHI (NFC). Replace VPPredInstPHI::classof implementation with general VP_CLASSOF_IMPL.	2023-01-02 17:22:34 +00:00
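
The tail-folding scheme described in commit fe1b51ffee above can be modelled outside of LLVM. The following is a minimal Python sketch, not the vectorizer's code: `get_active_lane_mask` and `run_tail_folded_loop` are hypothetical helper names, the `bits` parameter is an assumption used to make counter wrap-around easy to reproduce with a small width, and a fixed VF stands in for the scalable one in the IR. The mask helper follows the documented semantics of llvm.get.active.lane.mask (lane i is active when base + i < n, with the addition not wrapping).

```python
def get_active_lane_mask(base: int, n: int, vf: int) -> list[int]:
    """Model of llvm.get.active.lane.mask: lane i is 1 iff base + i < n.

    Python integers do not wrap, which matches the intrinsic's rule that
    the addition of the base and the lane index does not overflow.
    """
    return [1 if base + i < n else 0 for i in range(vf)]


def run_tail_folded_loop(n: int, vf: int, bits: int = 64,
                         max_iters: int = 10_000) -> list[list[int]]:
    """Return the predicate used by each vector iteration.

    The next mask is computed from the current index against
    n_minus_vf *before* the index is incremented, as in the commit
    message, so a wrapping index increment cannot corrupt the mask.
    """
    wrap = 1 << bits                          # width of the index counter
    n_minus_vf = n - vf if n > vf else 0      # 'tripcount > VF ? tripcount - VF : 0'
    mask = get_active_lane_mask(0, n, vf)     # %active.lane.mask.entry
    index = 0
    used = []
    for _ in range(max_iters):
        used.append(mask)                     # predicate for this iteration
        mask = get_active_lane_mask(index, n_minus_vf, vf)
        index = (index + vf) % wrap           # may wrap; mask is unaffected
        if not mask[0]:                       # extractelement lane 0, then branch
            return used
    raise RuntimeError("loop did not terminate within max_iters")


if __name__ == "__main__":
    # N = 20, VF = 16: one full iteration, then one with 4 active lanes.
    print([sum(m) for m in run_tail_folded_loop(20, 16)])
```

With an 8-bit counter, N = 250 and VF = 16, the index wraps on the final increment, yet the model still processes exactly 250 lanes, which is the property the commit relies on to drop the runtime check.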


363 Commits