llvm-project

Author	SHA1	Message	Date
David Green	965a090f02	Revert "[IVDescriptors] Add pointer InductionDescriptors with non-constant strides" Multiple errors have being reported on https://reviews.llvm.org/rG498aa534f472d28db893aa9a8627d0b46e17f312 Reverting until the correctness issues can be resolved. We are also seeing a lot of performance differences from the patch. Some are looking good, but some are looking pretty bad.	2023-03-31 11:08:50 +01:00
Philip Reames	498aa534f4	[IVDescriptors] Add pointer InductionDescriptors with non-constant strides This matches the handling for integer IVs. I left the non-opaque cases alone, mostly because they're largely irrelevant today. This doesn't actually make much difference in vectorization right now as we immediately fail on aliasing checks (which also bail on non-constant strides). Slightly suprisingly, it's the case which do need runtime checks which work after this patch as they don't use the same dependency analysis path. This will also enable non-constant stride pointer recurrences for other consumers. I've auditted said code, and don't see any obvious issues.	2023-03-30 11:56:00 -07:00
David Sherwood	0ef8a79b12	[LoopVectorize] Add non-zero check for MaxPowerOf2RuntimeVF in computeMaxVF This one-line patch just tightens up the code added in 1c4fedfa35aeb8b456e2d8f4f826c0e026b9d863 where we try to avoid tail-folding if we know the runtime VF will always be a multiple of the trip count.	2023-03-29 10:08:32 +00:00
David Sherwood	1c4fedfa35	[LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail Currently in LoopVectorize we avoid tail-folding if we can prove the trip count is always a multiple of the maximum fixed-width VF. This works because we know the vectoriser only ever chooses a VF that is a power of 2. However, if we are also considering scalable VFs then we conservatively bail out of the optimisation because we don't know the value of vscale, which could be an odd or prime number, etc. This patch tries to enable the same optimisation for scalable VFs by asking if vscale is known to be a power of 2. If so, we can then query the maximum value of vscale and use the same logic as we do for fixed-width VFs. I've also added a new TTI hook called isVScaleKnownToBeAPowerOfTwo that does the same thing as the existing TargetLowering hook. Differential Revision: https://reviews.llvm.org/D146199	2023-03-27 08:34:30 +00:00
Florian Hahn	ea929a07b6	[LV] Set inbounds flag using CreateGEP in vectorizeInterleaveGroup(NFC). This avoids having to cast the result of the builder to GetElementPtrInst.	2023-03-22 11:29:57 +00:00
Florian Hahn	af99aa0ff7	[LV] Set imbounds flag using CreateGEP in VPWidenMemInst (NFC). This avoids having to cast the result of the builder to GetElementPtrInst.	2023-03-21 11:44:21 +00:00
Florian Hahn	371bb2c9d3	[VPlan] Move createReplicateRegion out of VPRecipeBuilder.h. (NFC) The function doesn't use anything from VPRecipeBuilder, so move the definition to where it is actually used and turn it into a simple static function. It also makes the VPRecipeBuilder argument for createAndOptimizeReplicateRegions unnecessary.	2023-03-18 20:30:49 +00:00
Florian Hahn	6a6b65a84c	[LV] Restructure code creating replicate region (NFC). Re-order recipe and block creation to be in order, as suggested post-commit for 2db71c9851e5.	2023-03-18 17:17:07 +00:00
Florian Hahn	962c306a11	[LV] Don't consider pointer as uniform if it is also stored. Update isVectorizedMemAccessUse to also check if the pointer is stored. This prevents LV to incorrectly consider a pointer as uniform if it is used as both pointer and stored by the same StoreInst. Fixes #61396.	2023-03-17 16:26:16 +00:00
Graham Hunter	9aa01c4e89	[LV] Remove scalable constraints on creating bitcasts InnerLoopVectorizer::createBitOrPointerCast only supported fixed length vectors since it hadn't been updated. Supporting scalable vectors is just a matter of changing types and using elementcount instead of numelements, since there's nothing which actually relies on knowing the exact length of the vector. Original written by mgabka. Split out from D145163.	2023-03-17 16:19:33 +00:00
Florian Hahn	eca14a810e	[VPlan] Consolidate replicate region optimizations (NFC). As suggested in D143865, consolidate replicate region creation and optimization in a single helper that's exposed and used by LV.	2023-03-16 17:06:44 +00:00
Kazu Hirata	398af9b43b	[llvm] Use *{Map,Set}::contains (NFC)	2023-03-15 18:06:32 -07:00
Kazu Hirata	c8f9555c4d	[Transforms] Use *{Set,Map}::contains (NFC)	2023-03-14 00:24:30 -07:00
Philip Reames	dae682ce92	[IRBuilder] Add utilities for materializing scalable values [nfc] These idioms already appear a number of places in code, and upcoming changes to the various sanitizers continue to need more instances of the same patterns. Differential Revision: https://reviews.llvm.org/D145945	2023-03-13 11:54:19 -07:00
Florian Hahn	2db71c9851	[VPlan] Simplify code in createReplicateRegion (NFC). Simplify the code as suggested in D143865.	2023-03-11 11:47:23 +01:00
Arthur Eubanks	7c3c981442	[Passes] Remove some legacy passes DFAJumpThreading JumpThreading LibCallsShrink LoopVectorize SLPVectorizer DeadStoreElimination AggressiveDCE CorrelatedValuePropagation IndVarSimplify These are part of the optimization pipeline, of which the legacy version is deprecated and being removed.	2023-03-10 17:17:00 -08:00
Florian Hahn	54558fd8f3	[VPlan] Replace InvariantCond field from VPWidenSelectRecipe. There is no need to store information about invariance in the recipe. Replace the fields with checks of the operands using isDefinedOutsideVectorRegions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D144489	2023-03-10 15:28:43 +01:00
Florian Hahn	a8adb38a96	[VPlan] Replace invariance fields from VPWidenGEPRecipe. There is no need to store information about invariance in the recipe. Replace the fields with checks of the operands using isDefinedOutsideVectorRegions. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D144487	2023-03-09 17:52:22 +01:00
Florian Hahn	79272ec028	[VPlan] Add predicate to VPReplicateRecipe, expand region later. This patch adds the predicate as additional operand to VPReplicateRecipe during initial construction. The predicated recipes are later moved into replicate regions. This simplifies constructions and some VPlan transformations, like fixed-order recurrence handling. It also improves codegen in some cases (e.g. for in-loop reductions), because the recipes remain in the same block. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D143865	2023-03-08 20:11:28 +01:00
Florian Hahn	3b2cf45d6b	[VPlan] Check if recipe is in ReplicateRegion for IfPredicateInstr (NFC) Check if replicate recipe is in a replicate region when considering to collect predicated instructions. This allows use IsPredicated for recipes with a mask attached directly in D143865. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D145322	2023-03-08 11:39:44 +01:00
sgokhale	4f018e54c4	[LV][AArch64] Resolve test failure due use of unordered container AArch64/reg-usage.ll has an issue with the output ordering due to use of unordered container. This was discovered by -DLLVM_REVERSE_ITERATION:BOOL=ON cmake option. This patch tries to address it by making use of ordered container. Differential Revision: https://reviews.llvm.org/D145472/	2023-03-07 16:42:21 +05:30
Graham Hunter	a180344589	[LV] Allow scalarization of function calls when masking is required This patch adds support for scalarizing calls to a function when there is a vector variant that cannot be used, either because there isn't a masked variant or because the cost model indicated a VF without a masked variant was better. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D134422	2023-03-03 15:26:04 +00:00
Sander de Smalen	c41b41eb11	[LoopVectorize] Use overflow-check analysis to improve tail-folding. This work follows on from D142109 and addresses a possible regression when we know the loop iteration counter cannot overflow. When we know the overflow-check always evaluates to false, it's better to use the other style of tail folding where it assumes a runtime check was added, because that avoids having to calculate a modified trip-count. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D142894	2023-03-01 14:17:58 +00:00
Sander de Smalen	fe1b51ffee	[LoopVectorize] Remove runtime check and scalar tail loop when tail-folding. When using tail-folding and using the predicate for both data and control-flow (the next vector iteration's predicate is generated with the llvm.active.lane.mask intrinsic and then tested for the backedge), the LoopVectorizer still inserts a runtime check to see if the 'i + VF' may at any point overflow for the given trip-count. When it does, it falls back to a scalar epilogue loop. We can get rid of that runtime check in the pre-header and therefore also remove the scalar epilogue loop. This reduces code-size and avoids a runtime check. Consider the following loop: void foo(char * __restrict__ dst, char *src, unsigned long N) { for (unsigned long i=0; i<N; ++i) dst[i] = src[i] + 42; } If 'N' is e.g. ULONG_MAX, and the VF > 1, then the loop iteration counter will overflow when calculating the predicate for the next vector iteration at some point, because LLVM does: vector.ph: %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N) vector.body: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ] %active.lane.mask = phi <vscale x 16 x i1> [ %active.lane.mask.entry, %vector.ph ], [ %active.lane.mask.next, %vector.body ] ... %index.next = add i64 %index, 16 ; The add above may overflow, which would affect the lane mask and control flow. Hence a runtime check is needed. %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %index.next, i64 %N) %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0 br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7 The solution: What we can do instead is calculate the predicate before incrementing the loop iteration counter, such that the llvm.active.lane.mask is calculated from 'i' to 'tripcount > VF ? tripcount - VF : 0', i.e. vector.ph: %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N) %N_minus_VF = select %N > 16 ? %N - 16 : 0 vector.body: %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ] %active.lane.mask = phi <vscale x 16 x i1> [ %active.lane.mask.entry, %vector.ph ], [ %active.lane.mask.next, %vector.body ] ... %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %index, i64 %N_minus_VF) %index.next = add i64 %index, %4 ; The add above may still overflow, but this time the active.lane.mask is not affected %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0 br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7 For N = 20, we'd then get: vector.ph: %active.lane.mask.entry = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 %N) ; %active.lane.mask.entry = <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1> %N_minus_VF = select 20 > 16 ? 20 - 16 : 0 ; %N_minus_VF = 4 vector.body: (1st iteration) ... ; using <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1> as predicate in the loop ... %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 0, i64 4) ; %active.lane.mask.next = <1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0> %index.next = add i64 0, 16 ; %index.next = 16 %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0 ; %8 = 1 br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7 ; branch to %vector.body vector.body: (2nd iteration) ... ; using <1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0> as predicate in the loop ... %active.lane.mask.next = tail call <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 16, i64 4) ; %active.lane.mask.next = <0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0> %index.next = add i64 16, 16 ; %index.next = 32 %8 = extractelement <vscale x 16 x i1> %active.lane.mask.next, i64 0 ; %8 = 0 br i1 %8, label %vector.body, label %for.cond.cleanup, !llvm.loop !7 ; branch to %for.cond.cleanup Reviewed By: fhahn, david-arm Differential Revision: https://reviews.llvm.org/D142109	2023-03-01 09:01:19 +00:00
Nikita Popov	4bc254c664	[LoopVectorize] Only fetch BFI if profile summary available BlockFrequencyInfo should generally only be fetched in PGO builds where a PSI profile summary is available. However, LoopVectorize was fetching it unconditionally. This results in a small compile-time improvement for non-PGO builds. Differential Revision: https://reviews.llvm.org/D144953	2023-02-28 14:16:21 +01:00
sgokhale	4f9a5447c6	[LV] Reland "Update logic for calculating register usage due to invariants" Previously, while calculating register usage due to invariants, it was assumed that invariant would always be part of widening instructions. This resulted in calculating vector register types for vectors which cant be legalized(check the newly added test for more details). An invariant might not always need a vector register. For e.g., invariant might just be used for iteration check. This patch checks if the invariant is part of any widening instruction and considers register usage accordingly. Fixes issue 60493 Differential Revision: https://reviews.llvm.org/D143422	2023-02-28 17:32:39 +05:30
sgokhale	3c8ddbde37	Revert "[LV] Update logic for calculating register usage due to invariants" Observing test failure for llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll This reverts commit d1628266946fdddb44bdad2b3ccf3cd5fc769f42.	2023-02-28 15:46:59 +05:30
sgokhale	d162826694	[LV] Update logic for calculating register usage due to invariants Previously, while calculating register usage due to invariants, it was assumed that invariant would always be part of widening instructions. This resulted in calculating vector register types for vectors which cant be legalized(check the newly added test for more details). An invariant might not always need a vector register. For e.g., invariant might just be used for iteration check. This patch checks if the invariant is part of any widening instruction and considers register usage accordingly. Fixes issue 60493 Differential Revision: https://reviews.llvm.org/D143422	2023-02-28 11:05:26 +05:30
Sander de Smalen	9449deda12	[LV] NFC: Move logic to query maximum vscale to its own function. To query the maximum value for vscale, the LV queries the vscale_range attribute or a TTI hook. To avoid having to reimplement the same behaviour for multiple uses (such as in D142894), it makes sense to move this code to a separate function.	2023-02-23 15:12:35 +00:00
OCHyams	620a529760	[Assignment Tracking] Choose better passes for RemoveRedundantDbgInstrs call Enabling assignment tracking without this patch, a significant amount of additional compiler run time comes from the RemoveRedundantDbgInstrs call in InstCombine. This patch reduces compiler run time by choosing better places to call RemoveRedundantDbgInstrs. In non-assignment-tracking builds, RemoveRedundantDbgInstrs is called by InstCombine if LowerDbgDeclare makes a change (i.e. it is _sometimes_ called). In assignment tracking builds LowerDbgDeclare doesn't do anything. We still need to clean up redundant intrinsics to avoid a large performance hit due to the number of instructions, so the current approach is to have InstCombine _always_ call RemoveRedundantDbgInstrs. Instrumenting the compiler to run RemoveRedundantDbgInstrs after every pass and dump the numbers and building CTMark/tramp3d-v4 indicates that SROA and LoopVectorize give us a bigger bang (number removed) for buck (times pass is run). The compile time tracker reports that this patch reduces the number of instructions retired building CTMark projects by an average of 1.1%. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D144483	2023-02-22 16:28:06 +00:00
Luke Lau	b02b1e0ed6	[LV][NFC] Use ElementCount for getMaxInterleaveFactor In order to allow targets to disable interleaving for scalable vectors, pass the entire VF's ElementCount to getMaxInterleaveFactor. This is based off of the approach used here: `8d36708507` The plan would then be to disable interleaving on scalable VFs on RISC-V in a follow up patch. See https://reviews.llvm.org/D143723#4132349 Reviewed By: reames Differential Revision: https://reviews.llvm.org/D144474	2023-02-22 10:15:05 +00:00
Liren Peng	529ee9750b	[NFC] Use single quotes for single char output during `printPipline` Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D144365	2023-02-22 02:35:13 +00:00
Florian Hahn	9333b97763	[VPlan] Replace AlsoPack field with shouldPack() method (NFC). There is no need to update the AlsoPack field when creating VPReplicateRecipes. It can be easily computed based on the VP def-use chains when it is needed. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D143864	2023-02-20 10:28:26 +00:00
Florian Hahn	a3d1de3e29	[LV] Move invalid cost remark code to separate function (NFC). The code only needs access to INvalidCosts, ORE and TheLoop, so it can easily be moved into a helper to make selectVectorizationFactor more compact. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D143957	2023-02-16 11:28:19 +00:00
Hongtao Yu	eddec9de44	[Pseudo probe] Duplicate probes in vectorized loop body. Prevoius pseudo probes were dropped out of a vectorized loop body during loop vectorization. This can result in the samples of the loop entry is used for the loop body, which in turn can cause undercounting of the loop iteration count. The undercounting can further prevent the loop from being vectorized in the next build. I'm fixing this by explicting allowing pseudo probes to be kept in the vectorized loop body, and by claiming a probe instruction is not "uniform", the vectorizer will duplicate it by the number of vector lanes. For one internal service, I'm seeing the change causes the size increase of the .pseudoprobe section by 0.7%, which should count around 0.2% of the whole binary size. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D144066	2023-02-15 10:18:08 -08:00
Graham Hunter	0fa5df1959	[LV] Synthesize all true masks for masked vector function variants When vectorizing code with function calls in it, if we encounter a function which only has vectorized variants requiring a mask we can synthesize an all-true mask to enable us to proceed. Since we want the mask to be represented in vplan, the pointer to the chosen Function is now stored as part of the VPWidenCallRecipe, and mask arguments are added at the appropriate index to the recipe operands. Reviewed By: david-arm, fhahn, reames Differential Revision: https://reviews.llvm.org/D132458	2023-02-14 14:33:18 +00:00
Fangrui Song	1e6921131a	Move global namespace cl::opt inside llvm::	2023-02-14 00:09:44 -08:00
Florian Hahn	2e6430666c	[LV] Update recipe builder functions to pass VPlan directly (NFC). Passing VPlanPtr requires a dereference of std::unique_ptr on each access, which is unnecessary. Just pass the plan by reference.	2023-02-12 22:35:14 +00:00
Sander de Smalen	5a115452c4	Reland D143267: [LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices. Fixed issue where 'ConstantInt::get(IndextTy, -Part)' was executed with the wrong type for Part, e.g. IndexTy was i64, but Part was 'unsigned', which led to things like 'mul i64 .., 4294967292', which was obviously wrong. Also changed sve-vector-reverse.ll to be vectorized with UF>1 to test this. This reverts commit 1f01cdda68614dba12af3cc3aff38541d0abcc6b.	2023-02-09 09:42:29 +00:00
Florian Hahn	c83fdc905a	[LV] Perform recurrence sinking directly on VPlan. This patch updates LV to sink recipes directly using the VPlan use chains. The initial patch only moves sinking to be purely VPlan-based. Follow-up patches will move legality checks to VPlan as well. At the moment, there's a single test failure remaining. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142589	2023-02-08 15:49:29 +00:00
Sander de Smalen	1f01cdda68	Revert "[LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices." This patch causes a regression, so reverting it while I investigate the issue. This reverts commit e6eb84a191ca2a1afd5789c5bb398da68bb6065e.	2023-02-08 15:46:52 +00:00
Sander de Smalen	e6eb84a191	[LoopVectorize] Use DataLayout::getIndexType instead of i32 for non-constant GEP indices. This is specifically relevant for loops that vectorize using a scalable VF, where the code results in: %vscale = call i32 llvm.vscale.i32() %vf.part1 = mul i32 %vscale, 4 %gep = getelementptr ..., i32 %vf.part1 Which InstCombine then changes into: %vscale = call i32 llvm.vscale.i32() %vf.part1 = mul i32 %vscale, 4 %vf.part1.zext = sext i32 %vf.part1 to i64 %gep = getelementptr ..., i32 %vf.part1.zext D143016 tried to remove these extends, but that only works when the call to llvm.vscale.i32() has a single use. After doing any kind of CSE on these calls the combine no longer kicks in. It seems more sensible to ask DataLayout what type to use, rather than relying on InstCombine to insert the extend and hoping it can fold it away. I've only changed this for indices that are not constant, because I vaguely remember there was a reason for sticking with i32. It would also mean patching up loads more tests. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D143267	2023-02-07 11:47:51 +00:00
Sander de Smalen	005311399e	[LoopVectorize][TTI] NFCI: Clarify enum for the tail folding style. This NFC (intended) patch has several small changes: * It renames PredicationStyle to TailFoldingStyle. * It renames TTI.emitActiveLaneMask() to TTI.getPreferredTailFoldingStyle() * Simplifies some of its uses in the LoopVectorizer Rationale: To my surprise PredicationStyle::None did not mean 'no predication', but rather 'no active lane mask intrinsic', such that the predicate is created using a splat + compare with stepvector. The enum is also highly specific to tail folding, so it seems better to name this around that feature, i.e. 'tail folding style'. This also makes it more amenable to extend it to other tail folding styles, such as the one added in D142109. Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D142887	2023-02-03 14:59:57 +00:00
Kazu Hirata	f20b5071f3	[llvm] Use llvm::bit_floor instead of llvm::PowerOf2Floor (NFC)	2023-01-28 09:06:31 -08:00
ShihPo Hung	5fb3a57ea7	[Cost] Add CostKind to getVectorInstrCost and its related users LoopUnroll estimates the loop size via getInstructionCost(), but getInstructionCost() cannot pass CostKind to getVectorInstrCost(). And so does getShuffleCost() to getBroadcastShuffleOverhead(), getPermuteShuffleOverhead(), getExtractSubvectorOverhead(), and getInsertSubvectorOverhead(). To address this, this patch adds an argument CostKind to these functions. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D142116	2023-01-21 05:29:24 -08:00
Florian Hahn	e2c43a547b	[VPlan] Add vp_depth_first_deep (NFC) Similar to vp_depth_first_shallow (D140512) add vp_depth_first_deep to make existing code clearer and more compact. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D142055	2023-01-19 20:34:23 +00:00
Guillaume Chatelet	8fd5558b29	[NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:49:38 +00:00
Guillaume Chatelet	48f5d77eee	[NFC] Use TypeSize::getKnownMinValue() instead of TypeSize::getKnownMinSize() This change is one of a series to implement the discussion from https://reviews.llvm.org/D141134.	2023-01-11 16:36:39 +00:00
Miguel Saldivar	3c5e0d87f8	[LoopVectorize] Clear cache of `LoopAccessInfoManager` LAI is cached during the LoopDistribute pass, and is later re-used during LoopVectorize. The problem is that LoopVectorize changes SCEV, and the cached LAI does not get updated. Hence, when re-using the cached LAI, it references an invalid SCEV. Fixes #59319 Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D139601	2023-01-11 09:03:40 +00:00
Benjamin Kramer	b6942a2880	[NFC] Hide implementation details in anonymous namespaces	2023-01-08 17:37:02 +01:00

1 2 3 4 5 ...

1830 Commits