llvm-project

Author	SHA1	Message	Date
Luke Lau	8cd86ff284	[VPlan] Propagate FastMathFlags from phis to blends (#180226 ) If a phi has fast math flags, we can propagate it to the widened select. To do this, this patch makes VPPhi and VPBlendRecipe subclasses of VPRecipeWithIRFlags, and propagates it through PlainCFGBuilder and VPPredicator. Alive2 proofs for some of the FMFs (it looks like it can't reason about the full "fast" set yet) nnan: https://alive2.llvm.org/ce/z/f0bRd4 nsz: https://alive2.llvm.org/ce/z/u9P96T The actual motivation for this is to eventually be able to move the special casing for tail folding in LoopVectorizationPlanner::addReductionResultComputation into the CFG in #176143, which requires passing through FMFs.	2026-02-09 19:38:58 +08:00
Florian Hahn	05a2b146fb	[LV] Optimize FindLast recurrences to FindIV (NFCI). (#177870 ) This patch restructures Find(First|Last)IV handling. Instead of differentiating between FindLast, FindFirstIV and FindLastIV up front, this patch simplifies the logic in IVDescriptor to just identify the FindLast pattern up front. It then adds a new VPlan transformation to optimize FindLast reductions to FindIV reductions if there is a suitable sentinel value. It also merges the Find(Last|First)IV recurrence kinds into a single FindIV kind. This is simpler and more accurate, given selecting the first/last induction of the final IV reduction is directly controlled by the corresponding recurrence kind of the ComputeReductionResult. The new structure also allows further optimizations, like vectorizing FindLastIV with another boolean reduction that tracks if the condition in the loop was ever true, if there is no suitable sentinel value. PR: https://github.com/llvm/llvm-project/pull/177870	2026-02-05 13:57:20 +00:00
Florian Hahn	8240cf337a	[VPlan] Always set flags for overflowing ops etc via VPIRFlags. (#179138 ) Enforce that all VPInstructions set the correct OpType of the VPIRFlags. Flag mis-matches (e.g. VPInstruction Add without `OverflowingBinOp` being set) can cause crashes (e.g. in CSE) or potentially mis-compiles. Add a few helpers in VPBuilder to create common instructions with correct flags. PR: https://github.com/llvm/llvm-project/pull/179138	2026-02-03 12:33:23 +00:00
Florian Hahn	dd363d0629	[VPlan] Replace UnrollPart for VPScalarIVSteps with start index op (NFC) (#170906 ) Replace the unroll part operand for VPScalarIVStepsRecipe with the start index. This simplifies https://github.com/llvm/llvm-project/pull/170053 and is also a first step to break down the recipe into its components. PR: https://github.com/llvm/llvm-project/pull/170906	2026-01-21 22:13:13 +00:00
Florian Hahn	d3f2f1366d	[LV] Consider UserIC when limiting VF. (#174573 ) If a UserIC is provided, the vector loop will process VF * UserIC iterations. Pass UserIC through to computeFeasibleMaxVF and use it to limit the max VF to factors where VF * UserIC <= MaxTripCount. This avoids creating dead vector loops with user-provided interleave counts. PR: https://github.com/llvm/llvm-project/pull/174573	2026-01-20 14:19:11 +00:00
Florian Hahn	31b93d6e38	[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758 ) This patch adds VPValue sub-classes for the different cases we currently have: * VPIRValue: A live-in VPValue that wraps an underlying IR value * VPSymbolicValue: A symbolic VPValue not tied to an underlying value, e.g. the vector trip count or VF VPValues * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase. This has multiple benefits: * clearer constructors for each kind of VPValue * limited scope: for example allows moving VPDef member to VPRecipeValue, reducing size of other VPValues. * stricter type checking for member variables (e.g. using VPLiveIn in the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic member VPValues) There probably are additional opportunities for cleanups as follow-ups. PR: https://github.com/llvm/llvm-project/pull/172758	2026-01-07 20:29:05 +00:00
Florian Hahn	c2a8739cd1	[VPlan] Split off VPReductionRecipe creation for in-loop reductions (NFC) (#168784 ) This patch splits off VPReductionRecipe creation for in-loop reductions to a separate transform from adjustInLoopReductions, which has been renamed. The new transform has been updated to work directly on VPInstructions, and gets applied after header phis have been processed, once on VPlan0. Builds on top of https://github.com/llvm/llvm-project/pull/168291 and https://github.com/llvm/llvm-project/pull/166099 which should be reviewed first. PR: https://github.com/llvm/llvm-project/pull/168784	2025-12-25 14:02:58 +00:00
Florian Hahn	2befda2225	[VPlan] Populate and use VPIRFlags from initial VPInstruction. (#168450 ) Update VPlan to populate VPIRFlags during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRFlags from the underlying IR instruction each time. The VPRecipeWithIRFlags constructor taking an underlying instruction and setting the flags based on it has been removed. This centralizes initial VPIRFlags creation, ensures flags are consistently available throughout VPlan transformations, and makes sure we don't accidentally re-add flags from the underlying instruction that already got dropped during transformations. Follow-up to https://github.com/llvm/llvm-project/pull/167253, which did the same for VPIRMetadata. Should be NFC w.r.t. the generated IR. PR: https://github.com/llvm/llvm-project/pull/168450	2025-11-18 15:15:14 +00:00
Florian Hahn	3cba379e3d	[VPlan] Populate and use VPIRMetadata from VPInstructions (NFC) (#167253 ) Update VPlan to populate VPIRMetadata during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRMetadata from the underlying IR instruction each time. This centralizes VPIRMetadata in VPInstructions and ensures metadata is consistently available throughout VPlan transformations. PR: https://github.com/llvm/llvm-project/pull/167253	2025-11-17 21:28:49 +00:00
Ramkumar Ramachandra	ef023cae38	Reland [VPlan] Expand WidenInt inductions with nuw/nsw (#168354 ) Changes: The previous patch had to be reverted due to a mismatching-OpType assert in CSE. The reduced test has now been added, corresponding to an RVV pointer-induction, and the pointer-induction case has been updated to use createOverflowingBinaryOp. While at it, record VPIRFlags in VPWidenInductionRecipe.	2025-11-17 13:44:25 +00:00
Florian Hahn	f773efcffb	[VPlan] Add VPIRMetadata parameter to VPInstruction constructor. (NFC) Update VPInstruction constructor to accept VPIRMetadata between the Flags and DebugLoc parameters. This allows metadata to be passed during construction rather than assigned afterward.	2025-11-01 21:57:52 +00:00
Florian Hahn	6e83937f39	[VPlan] Add getConstantInt helpers for constant int creation (NFC). Add getConstantInt helper methods to VPlan to simplify the common pattern of creating constant integer live-ins. Suggested as follow-up in https://github.com/llvm/llvm-project/pull/164127.	2025-11-01 04:13:01 +00:00
Luke Lau	9fe1f29541	[VPlan] Set flags when constructing zexts using VPWidenCastRecipe (#164198 ) createWidenCast doesn't set the flag type, so when we simplify trunc (zext nneg x) -> zext x we would hit an assertion in CSE that the flag types don't match with other VPWidenCastRecipes that weren't simplified. This fixes it the same way trunc flags are handled too. As an aside I think it should be correct to preserve the nneg flag in this case since the input operand is still non-negative after the transform. But that's left to another PR. Fixes https://github.com/llvm/llvm-project/issues/164171	2025-10-20 10:39:16 +00:00
Florian Hahn	4bf5ab4f9d	[VPlan] Set flags when constructing truncs using VPWidenCastRecipe. VPWidenCastRecipes with Trunc opcodes were missing the correct OpType for IR flags. Update createWidenCast to set the correct flags for truncs, and use it consistently. Fixes https://github.com/llvm/llvm-project/issues/162374.	2025-10-12 14:01:12 +01:00
Florian Hahn	50b9ca4dda	[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510 ) After https://github.com/llvm/llvm-project/pull/153643, there may be a BranchOnCond with constant condition in the entry block. Simplify those in removeBranchOnConst. This removes a number of redundant conditional branches from entry blocks. In some cases, it may also make the original scalar loop unreachable, because we know it will never execute. In that case, we need to remove the loop from LoopInfo, because all unreachable blocks may dominate each other, making LoopInfo invalid. In those cases, we can also completely remove the loop, for which I'll share a follow-up patch. Depends on https://github.com/llvm/llvm-project/pull/153643. PR: https://github.com/llvm/llvm-project/pull/154510	2025-09-18 19:25:05 +01:00
Florian Hahn	8796dfdcba	[VPlan] Consolidate logic to update loop metadata and profile info. This patch consolidates updating loop metadata and profile info for both the remainder and vector loops in a single place. This is NFC, modulo consistently applying vectorization specific metadata also in the experimental VPlan-native path. Split off from https://github.com/llvm/llvm-project/pull/154510.	2025-09-04 21:50:40 +01:00
Hassnaa Hamdi	35b22764e2	[LV][AArch64] Prefer epilogue with fixed-width over scalable VF. (#155546 ) In case of equal costs, prefer an epilogue with fixed-width VF over one with scalable VF. That is helpful in cases like post-LTO vectorization, where an epilogue with fixed-width VF can be removed when we eventually know that the trip count is less than the epilogue iterations.	2025-09-04 19:31:30 +01:00
Florian Hahn	5faed1ad84	[VPlan] Add VPlan-based addMinIterCheck, replace ILV for non-epilogue. (#153643 ) This patch adds a new VPlan-based addMinimumIterationCheck, which replaces the ILV version for the non-epilogue case. The VPlan-based version constructs a SCEV expression to compute the minimum iterations and uses that to check if the check is known true or false. Otherwise it creates a VPExpandSCEV recipe and emits a compare-and-branch. When using epilogue vectorization, we still need to create the minimum trip-count check during the legacy skeleton creation. The patch moves the definitions out of ILV. PR: https://github.com/llvm/llvm-project/pull/153643	2025-08-26 15:52:31 +01:00
Ramkumar Ramachandra	97f554249c	[VPlan] Preserve nusw in createInBoundsPtrAdd (#151549 ) Rename createInBoundsPtrAdd to createNoWrapPtrAdd, and preserve nusw as well as inbounds at the callsite.	2025-08-18 17:48:42 +01:00
Florian Hahn	424258947e	[VPlan] Materialize VF and VFxUF using VPInstructions. (#152879 ) Materialize VF and VFxUF computation using VPInstruction instead of directly creating IR. This is one of the last few steps needed to model the full vector skeleton in VPlan. This is mostly NFC, although in some cases we remove some unused computations. PR: https://github.com/llvm/llvm-project/pull/152879	2025-08-12 14:13:13 +01:00
Luke Lau	94a6cd464e	[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274 ) This is the VPWidenPointerInductionRecipe equivalent of #118638, with the motivation of allowing us to use the EVL as the induction step. There is a new VPInstruction added, WidePtrAdd, to allow adding the step vector to the induction phi, since VPInstruction::PtrAdd only handles scalars or multiple scalar lanes. Originally this transformation was copied from the original recipe's execute code, but it's since been simplified by teaching `unrollWidenInductionByUF` to unroll the recipe, which brings it in line with VPWidenIntOrFpInductionRecipe.	2025-08-05 16:54:02 +08:00
Florian Hahn	c9dd14d1d4	[VPlan] Compute interleave count for VPlan. (#149702 ) Move selectInterleaveCount to LoopVectorizationPlanner and retrieve some information directly from VPlan. Register pressure was already computed for a VPlan, and with this patch we now also check for reductions directly on VPlan, as well as checking how many load and store operations remain in the loop. This should be mostly NFC, except for some edge cases where we may compute slightly different interleave counts, e.g. where dead loads have been removed. This shouldn't happen in practice, and the patch doesn't cause changes across a large test corpus on AArch64. Computing the interleave count based on VPlan allows for making better decisions in presence of VPlan optimizations, for example when operations on interleave groups are narrowed. Note that there are a few test changes for tests that were still checking the legacy cost-model output when it was computed in selectInterleaveCount. PR: https://github.com/llvm/llvm-project/pull/149702	2025-08-05 09:42:55 +01:00
Ramkumar Ramachandra	20f6ec4b29	[VPlan] Make VPBuilder APIs uniformly take ArrayRef (NFC) (#151484 )	2025-07-31 11:33:04 +01:00
Florian Hahn	004c67ea25	[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239 ) Update LV to vectorize maxnum/minnum reductions without fast-math flags, by adding an extra check in the loop if any inputs to maxnum/minnum are NaN, due to maxnum/minnum behavior w.r.t. signaling NaNs. Signed-zeros are already handled consistently by maxnum/minnum. If any input is NaN, exit the vector loop, compute the reduction result up to the vector iteration that contained NaN inputs and resume in the scalar loop. New recurrence kinds are added for reductions using maxnum/minnum without fast-math flags. PR: https://github.com/llvm/llvm-project/pull/148239	2025-07-18 21:58:19 +01:00
Florian Hahn	64686c59c3	[VPlan] Connect (MemRuntime|SCEV)Check blocks as VPlan transform (NFC). (#143879 ) Connect SCEV and memory runtime check block directly in VPlan as VPIRBasicBlocks, removing ILV::emitSCEVChecks and ILV::emitMemRuntimeChecks. The new logic is currently split across LoopVectorizationPlanner::addRuntimeChecks which collects a list of {Condition, CheckBlock} pairs and performs some checks and emits remarks if needed. The list of checks is then added to VPlan in VPlanTransforms::connectCheckBlocks. PR: https://github.com/llvm/llvm-project/pull/143879	2025-07-09 14:03:25 +02:00
Ramkumar Ramachandra	f1e1b48023	[LV] Strip redundant fn in VPBuilder (NFC) (#147499 )	2025-07-08 13:41:29 +01:00
Ramkumar Ramachandra	b334ffd4f4	[VPlan] Refine return types in VPBuilder (NFC) (#108858 )	2025-06-20 14:01:15 +01:00
Philip Reames	53ea522d1b	[LV] Introduce and use VPBuilder::createScalarZExtOrTrunc [nfc] (#144946 ) Reduce redundant code, make the flow slightly easier to read.	2025-06-19 14:12:14 -07:00
Stephen Tozer	aa8a1fa6f5	[DLCov][NFC] Annotate intentionally-blank DebugLocs in existing code (#136192 ) Following the work in PR #107279, this patch applies the annotative DebugLocs, which indicate that a particular instruction is intentionally missing a location for a given reason, to existing sites in the compiler where their conditions apply. This is NFC in ordinary LLVM builds (each function `DebugLoc::getFoo()` is inlined as `DebugLoc()`), but marks the instruction in coverage-tracking builds so that it will be ignored by Debugify, allowing only real errors to be reported. From a developer standpoint, it also communicates the intentionality and reason for a missing DebugLoc. Some notes for reviewers: - The difference between `I->dropLocation()` and `I->setDebugLoc(DebugLoc::getDropped())` is that the former _may_ decide to keep some debug info alive, while the latter will always be empty; in this patch, I always used the latter (even if the former could technically be correct), because the former could result in some (barely) different output, and I'd prefer to keep this patch purely NFC. - I've generally documented the uses of `DebugLoc::getUnknown()`, with the exception of the vectorizers - in summary, they are a huge cause of dropped source locations, and I don't have the time or the domain knowledge currently to solve that, so I've plastered it all over them as a form of "fixme".	2025-06-11 17:42:10 +01:00
Florian Hahn	567b3172da	[VPlan] Construct initial once and pass clones to tryToBuildVPlan (NFC). (#141363 ) Update to only build an initial, plain-CFG VPlan once, and then transform & optimize clones. This requires changes to ::clone() for VPInstruction and VPWidenPHIRecipe to allow for proper cloning of the recipes in the initial VPlan. PR: https://github.com/llvm/llvm-project/pull/141363	2025-05-26 13:42:47 +01:00
Florian Hahn	c0506a11f4	[VPlan] Separate out logic to manage IR flags to VPIRFlags (NFC). (#140621 ) This patch moves the logic to manage IR flags to a separate VPIRFlags class. For now, VPRecipeWithIRFlags is the only class that inherits VPIRFlags. The new class allows for simpler passing of flags when constructing recipes, simplifying the constructors for various recipes (VPInstruction in particular, which now just has 2 constructors, one taking an extra VPIRFlags argument). This mirrors the approach taken for VPIRMetadata and makes it easier to extend in the future. The patch also adds a unified flagsValidForOpcode to check if the flags in a VPIRFlags match the provided opcode. PR: https://github.com/llvm/llvm-project/pull/140621	2025-05-25 11:13:11 +01:00
Florian Hahn	f2e62cfca5	[VPlan] Add VPPhi subclass for VPInstruction with PHI opcodes.(NFC) (#139151 ) Similarly to VPInstructionWithType and VPIRPhi, add VPPhi as a subclass for VPInstruction. This allows implementing the VPPhiAccessors trait, making available helpers for generic printing of incoming values / blocks and accessors for incoming blocks and values. It will also allow properly verifying def-uses for values used by VPInstructions with PHI opcodes via https://github.com/llvm/llvm-project/pull/124838. PR: https://github.com/llvm/llvm-project/pull/139151	2025-05-10 11:08:00 +01:00
Florian Hahn	e854c381c6	[VPlan] Manage noalias/alias_scope metadata in VPlan. (#136450 ) Use VPIRMetadata added in https://github.com/llvm/llvm-project/pull/135272 to also manage no-alias metadata added by versioning. Note that this means we have to build the no-alias metadata up-front once. If it is not used, it will be discarded automatically. This also fixes a case where incorrect metadata was added to wide loads/stores that got converted from an interleave group. Compile-time impact is neutral: https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u	2025-05-09 11:19:12 +01:00
Florian Hahn	7f4e36ebf6	[VPlan] Create PHI VPInstruction using VPBuilder (NFC). Use builder to create scalar PHI VPInstructions.	2025-05-07 20:47:37 +01:00
Luke Lau	b0f2bfc7e4	[VPlan] Use correct non-FMF constructor in VPInstructionWithType createNaryOp (#137632 ) Currently if we try to create a VPInstructionWithType without a FMF via VPBuilder::createNaryOp we will use the constructor that asserts `assert(isFPMathOp() && "this op can't take fast-math flags");`. This fixes it by checking if FMFs have a value, similar to the other createNaryOp overloads. This is needed by #129508	2025-04-29 20:35:19 +08:00
Florian Hahn	54b33eba16	[VPlan] Add opcode to create step for wide inductions. (#119284 ) This patch adds a WideIVStep opcode that can be used to create a vector with the steps to increment a wide induction. The opcode has 2 operands * the vector step * the scale of the vector step The opcode is later converted into a sequence of recipes that convert the scale and step to the target type, if needed, and then multiply vector step by scale. This simplifies code that needs to materialize step vectors, e.g. replacing wide IVs as follow up to https://github.com/llvm/llvm-project/pull/108378 with an increment of the wide IV step. PR: https://github.com/llvm/llvm-project/pull/119284	2025-04-14 23:20:44 +02:00
Florian Hahn	e27a21f6a7	[VPlan] Add hasScalarTail, use instead of !CM.foldTailByMasking() (NFC). (#134674 ) Now that VPlan is able to fold away redundant branches to the scalar preheader, we can directly check in VPlan if the scalar tail may execute. hasScalarTail returns true if the tail may execute. We know that the scalar tail won't execute if the scalar preheader doesn't have any predecessors, i.e. is not reachable. This removes some late uses of the legacy cost model. PR: https://github.com/llvm/llvm-project/pull/134674	2025-04-11 12:50:59 +01:00
Florian Hahn	6a9e8fc50c	[VPlan] Introduce VPInstructionWithType, use instead of VPScalarCast(NFC) (#129706 ) There are some opcodes that currently require specialized recipes, due to their result type not being implied by their operands, including casts. This leads to duplication from defining multiple full recipes. This patch introduces a new VPInstructionWithType subclass that also stores the result type. The general idea is to have opcodes needing to specify a result type to use this general recipe. The current patch replaces VPScalarCastRecipe with VPInstructionWithType, a similar patch for VPWidenCastRecipe will follow soon. There are a few proposed opcodes that should also benefit, without the need of workarounds: * https://github.com/llvm/llvm-project/pull/129508 * https://github.com/llvm/llvm-project/pull/119284 PR: https://github.com/llvm/llvm-project/pull/129706	2025-04-10 22:30:40 +01:00
Florian Hahn	3a859b11e3	[VPlan] Set and use debug location for VPScalarIVStepsRecipe. This adds the missing debug location for VPScalarIVStepsRecipe. The location of the corresponding phi is used.	2025-04-04 21:14:36 +01:00
Florian Hahn	783a846507	[VPlan] Add VF as operand to VPScalarIVStepsRecipe. Similarly to other recipes, update VPScalarIVStepsRecipe to also take the runtime VF as argument. This removes some unnecessary runtime VF computations for scalable vectors. It will also allow dropping the UF == 1 restriction for narrowing interleave groups required in 577631f0a528.	2025-03-28 21:48:59 +00:00
Ramkumar Ramachandra	e8d882a95b	[LV] Audit and fix nits in cl::opts (NFC) (#130601 ) Non-static cl::opts should be under the llvm namespace.	2025-03-25 10:19:45 +00:00
Florian Hahn	0d3ba087f7	[LV] Move IV bypass value creation out of ILV (NFC) createInductionAdditionalBypassValues is only used for epilogue vectorization now. Move it out of ILV, which means we do not have to thread through ExpandedSCEVs and also don't have to track the bypass values in ILV. Instead, directly create them if needed after executing the epilogue plan. This moves more of the epilogue-specific logic out of the generic executePlan.	2025-03-22 20:36:45 +00:00
Florian Hahn	2e13ec561c	[VPlan] Bail out on non-intrinsic calls in VPlanNativePath. Update initial VPlan-construction in VPlanNativePath in line with the inner loop path, in that it bails out when encountering constructs it cannot handle, like non-intrinsic calls. Fixes https://github.com/llvm/llvm-project/issues/131071.	2025-03-19 21:35:15 +00:00
Florian Hahn	f8fa93193b	[LV] Add VPBuilder::insert, use to insert created vector pointer (NFC). Split off from https://github.com/llvm/llvm-project/pull/124432 as suggested. Adds VPBuilder::insert, inspired by IRBuilderBase.	2025-02-03 22:20:40 +00:00
Florian Hahn	5008277322	[VPlan] Move auxiliary declarations out of VPlan.h (NFC). (#124104 ) Nothing in VPlan.h directly depends on VPTransformState, VPCostContext, VPFRange, VPlanPrinter or VPSlotTracker. Move them out to a separate header to reduce the size of widely used VPlan.h. This is a first step towards more cleanly separating declarations in VPlan. Besides reducing VPlan.h's size, this also allows including additional VPlan-related headers in VPlanHelpers.h for use there. An example is using VPDominatorTree in VPTransformState (https://github.com/llvm/llvm-project/pull/117138). PR: https://github.com/llvm/llvm-project/pull/124104	2025-02-02 13:44:07 +00:00
Florian Hahn	f4230b4332	[VPlan] Add and use debug location for VPScalarCastRecipe. Update the recipe to always take a debug location and set it.	2025-01-05 20:08:51 +00:00
Florian Hahn	7f3428d3ed	[VPlan] Compute induction end values in VPlan. (#112145 ) Use createDerivedIV to compute IV end values directly in VPlan, instead of creating them up-front. This allows updating IV users outside the loop as follow-up. Depends on https://github.com/llvm/llvm-project/pull/110004 and https://github.com/llvm/llvm-project/pull/109975. PR: https://github.com/llvm/llvm-project/pull/112145	2024-12-29 19:05:08 +00:00
Nikita Popov	1157187496	[VPlan] Propagate all GEP flags (#119899 ) Store GEPNoWrapFlags instead of only InBounds and propagate them.	2024-12-17 13:48:50 +01:00
Florian Hahn	590f451b60	[VPlan] Allow setting IR name for VPDerivedIVRecipe (NFCI). Allow setting the name to use for the generated IR value of the derived IV in preparations for https://github.com/llvm/llvm-project/pull/112145. This is analogous to VPInstruction::Name.	2024-11-24 20:39:12 +00:00
Julian Nagele	a8538b9138	[LV] Vectorize Epilogues for loops with small VF but high IC (#108190 ) - Consider MainLoopVF * IC when determining whether Epilogue Vectorization is profitable - Allow the same VF for the Epilogue as for the main loop - Use an upper bound for the trip count of the Epilogue when choosing the Epilogue VF PR: https://github.com/llvm/llvm-project/pull/108190 --------- Co-authored-by: Florian Hahn <flo@fhahn.com>	2024-11-17 19:35:32 +00:00

180 Commits