llvm-project

Author	SHA1	Message	Date
Florian Hahn	0e11a92447	[VPlan] Implement printing VPIRMetadata. (#168385 ) Implement printing for VPIRMetadata, using generic dyn_cast to VPIRMetadata. Depends on https://github.com/llvm/llvm-project/pull/166245 PR: https://github.com/llvm/llvm-project/pull/168385	2025-12-04 10:56:16 +00:00
Shih-Po Hung	ff89558e4e	[VPlan] Fix opcode in LoadStore EVL recipe (#170594 ) After #169885 lands, vp_load/vp_store are handled by getMemIntrinsicInstrCost, so we can use the correct opcode here.	2025-12-04 09:54:47 +00:00
Florian Hahn	50916a4adc	[VPlan] Use predicate in VPInstruction::computeCost for selects. (#170278 ) In some cases, the lowering of a select depends on the predicate. If the condition of a select is a compare instruction, thread the predicate through to the TTI hook. PR: https://github.com/llvm/llvm-project/pull/170278	2025-12-03 19:48:23 +00:00
Florian Hahn	bd5fa63335	[VPlan] Remove duplicated computeCost call (NFC). Remove a redundant duplicated computeCost call. NFC, just skipping an unneeded call.	2025-12-02 21:59:40 +00:00
Ramkumar Ramachandra	dec77e4f87	[VPlan] Improve code in VPInstruction::generate (NFC) (#169470 ) Make miscellaneous improvements including inlining some expressions and re-using the existing State.Builder reference.	2025-12-01 17:47:32 +00:00
Florian Hahn	25ab47bd40	[VPlan] Use wide IV if scalar lanes > 0 are used with scalable vectors. (#169796 ) For scalable vectors, VPScalarIVStepsRecipe cannot create all scalar step values. At the moment, it creates a vector, in addition to the first lane. The only supported case for this is when only the last lane is used. A recipe should not set both scalar and vector values. Instead, we can simply use a vector induction. It would also be possible to preserve the current vector code-gen, by creating VPInstructions based on the first lane of VPScalarIVStepsRecipe, but using a vector induction seems simpler. PR: https://github.com/llvm/llvm-project/pull/169796	2025-12-01 17:33:36 +00:00
Shih-Po Hung	b9bdec3021	[TTI][Vectorize] Migrate masked/gather-scatter/strided/expand-compress costing (NFCI) (#165532 ) In #160470, there is a discussion about the possibility of exploring a general approach for handling memory intrinsics. API changes: - Remove getMaskedMemoryOpCost, getGatherScatterOpCost, getExpandCompressMemoryOpCost, getStridedMemoryOpCost from Analysis/TargetTransformInfo. - Add getMemIntrinsicInstrCost. In BasicTTIImpl, map intrinsic IDs to the existing target implementations until the legacy TTI hooks are retired. - masked_load/store → getMaskedMemoryOpCost - masked_/vp_gather/scatter → getGatherScatterOpCost - masked_expandload/compressstore → getExpandCompressMemoryOpCost - experimental_vp_strided_{load,store} → getStridedMemoryOpCost TODO: add support for vp_load_ff. No functional change intended; costs continue to route to the same target-specific hooks.	2025-11-28 05:14:37 +00:00
Florian Hahn	f8eca64a28	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 20:03:55 +00:00
Florian Hahn	d58ebe339c	Revert "Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )"" This reverts commit 72e51d389f66d9cc6b55fd74b56fbbd087672a43. Missed some test updates.	2025-11-26 19:41:39 +00:00
Florian Hahn	72e51d389f	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 19:31:25 +00:00
Sam Tebbs	071d1fb8be	[LV] Use VPReductionRecipe for partial reductions (#147513 ) Partial reductions can easily be represented by the VPReductionRecipe class by setting their scale factor to something greater than 1. This PR merges the two together and gives VPReductionRecipe a VFScaleFactor so that it can choose to generate the partial reduction intrinsic at execute time. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/156976 4. https://github.com/llvm/llvm-project/pull/160154 5. https://github.com/llvm/llvm-project/pull/147302 6. https://github.com/llvm/llvm-project/pull/162503 7. -> https://github.com/llvm/llvm-project/pull/147513 Replaces https://github.com/llvm/llvm-project/pull/146073 .	2025-11-26 16:18:22 +00:00
Ramkumar Ramachandra	2d4a8dadba	[VPlan] Use DL index type consistently for GEPs (#169396 ) In preparation to strip VPUnrollPartAccessor and unroll recipes directly, strip unnecessary complication in getGEPIndexTy, as the unroll part will no longer be available in follow-ups (see #168886 for instance). The patch also helps by doing a mass test update up-front. Narrowing the GEP index type conditionally does not yield any benefit, and the change is non-functional in terms of emitted assembly. While at it, avoid hard-coding address-space 0, and use the pointer operand's address space to get the GEP index type.	2025-11-26 12:25:55 +00:00
Ramkumar Ramachandra	cb63e99e58	[VPlan] Include flags in VectorPointerRecipe::printRecipe (#169466 ) The change is non-functional with respect to emitted IR.	2025-11-25 10:26:51 +00:00
Ramkumar Ramachandra	37f7b3128d	Reland [VPlan] Handle WidenGEP in narrowToSingleScalars (#167880 ) Changes: Fix a missed update to WidenGEP::usesFirstLaneOnly, and include reduced-case test that was previously hitting the new assert: the underlying reason was that VPWidenGEP::usesScalars was too weak, and the single-scalar WidenGEP was not narrowed by narrowToSingleScalarRecipes. This allows us to strip a special case in VPWidenGEP::execute.	2025-11-24 18:11:58 +00:00
Luke Lau	456b0512c9	[VPlan] Set ZeroIsPoison=false for FirstActiveLane (#169298 ) When interleaving a loop with an early exit, the parts before the active lane will be all zero. Currently we emit @llvm.experimental.cttz.elts with ZeroIsPoison=true for these parts, which means that they will produce poison. We don't see any miscompiles today on AArch64 because it has the same lowering for cttz.elts regardless of ZeroIsPoison, but this may cause issues on RISC-V when interleaving. This fixes it by setting ZeroIsPoison=false. The codegen is slightly worse on RISC-V when ZeroIsPoison=false and we could potentially recover it by enabling it again when UF=1, but this is left to another PR. This is split off from #168738, where LastActiveLane can get expanded to a FirstActiveLane with an all-zeroes mask.	2025-11-24 14:39:26 +00:00
Florian Hahn	996213c6ea	[VPlan] Refine mayRead/WriteFromMemory for VPInst, fix VPlan SLP check. Fix VPlan SLP check incorrectly bailing out for non-VPInstructions. Starting from the beginning of the block will include canonical IVs, which in turn are not VPInstructions. If we hit a non-VPInstruction, we should conservatively treat it as potentially unvectorizable. To keep the tests working as expected, refine mayRead/WriteFromMemory for Load and GEP VPInstructions.	2025-11-23 21:12:24 +00:00
Florian Hahn	21378fb75a	[VPlan] Merge `fcmp uno` feeding AnyOf. (#166823 ) Fold any-of (fcmp uno %A, %A), (fcmp uno %B, %B), ... -> any-of (fcmp uno %A, %B), ... This pattern is generated to check if any vector lane is NaN, and combining multiple compares is beneficial on architectures that have dedicated instructions. Alive2 Proof: https://alive2.llvm.org/ce/z/vA_aoM Combine suggested as part of https://github.com/llvm/llvm-project/pull/161735 PR: https://github.com/llvm/llvm-project/pull/166823	2025-11-23 15:52:19 +00:00
Florian Hahn	31711c908f	[VPlan] Only apply forced cost to recipes with underlying values. (#168372 ) Only apply forced instruction costs to recipes with underlying values to match the legacy cost model. A VPlan may have a number of additional VPInstructions without underlying values that are not considered for its cost, and assigning forced costs to them would incorrectly inflate its cost. This fixes a cost divergence between legacy and VPlan-based cost models with forced instruction costs. PR: https://github.com/llvm/llvm-project/pull/168372	2025-11-21 14:21:16 +00:00
Florian Hahn	7acfbc23a7	[VPlan] Remove PtrIV::IsScalarAfterVectorization, use VPlan analysis. (#168289 ) Remove `VPWidenPointerInductionRecipe::IsScalarAfterVectorization` and replace it with `onlyScalarValuesUsed`. This removes the need to carry state from the legacy cost model through VPlan, and the VPlan-based analysis gives more accurate results, avoiding a number of extracts. PR: https://github.com/llvm/llvm-project/pull/168289	2025-11-20 18:58:25 +00:00
Florian Hahn	0730913529	[VPlan] Print debug info for all recipes. (#168454 ) Use the recently refactored VPRecipeBase::print to print debug location for all recipes. PR: https://github.com/llvm/llvm-project/pull/168454	2025-11-19 10:10:08 +00:00
Shih-Po Hung	961940e1a7	[TTI] Use MemIntrinsicCostAttributes for getMaskedMemoryOpCost (#168029 ) - Split from #165532. This is a step toward a unified interface for masked/gather-scatter/strided/expand-compress cost modeling. - Replace the ad-hoc parameter list with a single attributes object. API change: ``` - InstructionCost getMaskedMemoryOpCost(Opcode, Src, Alignment, - AddressSpace, CostKind); + InstructionCost getMaskedMemoryOpCost(MemIntrinsicCostAttributes, + CostKind); ``` Notes: - NFCI intended: callers populate MemIntrinsicCostAttributes with the same information as before. - Follow-up: migrate gather/scatter, strided, and expand/compress cost queries to the same attributes-based entry point.	2025-11-19 09:51:12 +08:00
Florian Hahn	1e3ea03293	[VPlan] VPIRFlags kind for FCmp with predicate + fast-math flags (NFCI). FCmp instructions have both a predicate and fast-math flags. Introduce a new FCmp kind, that combines both to model this correctly in the current system. This should be NFC modulo VPlan printing which now includes the correct fast-math flags.	2025-11-18 22:09:53 +00:00
Florian Hahn	2befda2225	[VPlan] Populate and use VPIRFlags from initial VPInstruction. (#168450 ) Update VPlan to populate VPIRFlags during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRFlags from the underlying IR instruction each time. The VPRecipeWithIRFlags constructor taking an underlying instruction and setting the flags based on it has been removed. This centralizes initial VPIRFlags creation and ensures flags are consistently available throughout VPlan transformations and makes sure we don't accidentally re-add flags from the underlying instruction that already got dropped during transformations. Follow-up to https://github.com/llvm/llvm-project/pull/167253, which did the same for VPIRMetadata. Should be NFC w.r.t. to the generated IR. PR: https://github.com/llvm/llvm-project/pull/168450	2025-11-18 15:15:14 +00:00
Florian Hahn	3cba379e3d	[VPlan] Populate and use VPIRMetadata from VPInstructions (NFC) (#167253 ) Update VPlan to populate VPIRMetadata during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRMetadata from the underlying IR instruction each time. This centralizes VPIRMetadata in VPInstructions and ensures metadata is consistently available throughout VPlan transformations. PR: https://github.com/llvm/llvm-project/pull/167253	2025-11-17 21:28:49 +00:00
Ramkumar Ramachandra	ef023cae38	Reland [VPlan] Expand WidenInt inductions with nuw/nsw (#168354 ) Changes: The previous patch had to be reverted due to a mismatching-OpType assert in CSE. The reduced test has now been added, corresponding to an RVV pointer-induction, and the pointer-induction case has been updated to use createOverflowingBinaryOp. While at it, record VPIRFlags in VPWidenInductionRecipe.	2025-11-17 13:44:25 +00:00
Florian Hahn	7e730da128	[VPlan] Add printRecipe, prepare printing metadata in ::print (NFC) (#166244 ) Add a new printRecipe which handles printing the recipe without common info like debug info or metadata. Prepares to print them once, in ::print(), after/in combination with https://github.com/llvm/llvm-project/pull/165825. PR: https://github.com/llvm/llvm-project/pull/166244	2025-11-17 12:01:40 +00:00
Florian Hahn	67c8e38b8b	[VPlan] Delegate to other VPInstruction constructors. (NFCI) Update VPInstruction constructor to delegate to constructor with more comprehensive checking and validation. This required updating some unit tests, to make sure the constructed VPInstructions are valid.	2025-11-16 22:21:00 +00:00
Alex Bradbury	f2336d4c7e	Revert "[VPlan] Expand WidenInt inductions with nuw/nsw" (#168080 ) Reverts llvm/llvm-project#163538 This is causing build failures on the two-stage RVV buildbots. e.g. https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a reproducer and more information at https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822 This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.	2025-11-14 16:11:48 +00:00
Ramkumar Ramachandra	355e0f94af	[VPlan] Expand WidenInt inductions with nuw/nsw (#163538 ) While at it, record VPIRFlags in VPWidenInductionRecipe.	2025-11-14 12:10:55 +00:00
Mel Chen	3277f6caef	[LV] Explicitly disable in-loop reductions for AnyOf and FindIV. nfc (#163541 ) Currently, in-loop reductions for AnyOf and FindIV are not supported. They were implicitly blocked. This happened because RecurrenceDescriptor::getReductionOpChain could not detect their recurrence chain. The reason is that RecurrenceDescriptor::getOpcode was set to Instruction::Or, but the recurrence chains of AnyOf and FindIV do not actually contain an Instruction::Or. This patch explicitly disables in-loop reductions for AnyOf and FindIV instead of relying on getReductionOpChain to implicitly prevent them.	2025-11-14 09:14:07 +00:00
Florian Hahn	a6edeedbfa	Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b. This appears to be causing some runtime failures on RISCV https://lab.llvm.org/buildbot/#/builders/210/builds/5221	2025-11-13 22:34:55 +00:00
Luke Lau	c0f7d51e8a	[VPlan] Simplify ExplicitVectorLength(%AVL) -> %AVL when AVL <= VF (#167647 ) [`llvm.experimental.get.vector.length`](https://llvm.org/docs/LangRef.html#id2399) has the property that if the AVL (%cnt) is less than or equal to VF (%max_lanes) then the return value is just AVL. This patch uses SCEV to simplify this in optimizeForVFAndUF, and adds `ExplicitVectorLength` to `VPInstruction::opcodeMayReadOrWriteFromMemory` so it gets removed once dead.	2025-11-13 13:17:01 +00:00
Florian Hahn	62d1a080e6	[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 ) Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-12 15:11:00 +00:00
Florian Hahn	519cf3c2b8	[VPlan] Remove unneeded getDefiningRecipe with isa/cast/dyn_cast. (NFC) Classof for most recipes directly supports VPValue, so there is no need to call getDefiningRecipe when using isa/cast/dyn_cast.	2025-11-11 22:07:48 +00:00
Ramkumar Ramachandra	c8c328406c	Revert "[VPlan] Handle WidenGEP in narrowToSingleScalars" (#167509 ) This reverts commit fdd52f5fe130fb8b98f4aed3d15aa0789cce6b40, as it causes buildbot failures. This will give us time to investigate the failure. https://lab.llvm.org/buildbot/#/builders/210/builds/5160	2025-11-11 14:29:28 +00:00
Ramkumar Ramachandra	fdd52f5fe1	[VPlan] Handle WidenGEP in narrowToSingleScalars (#166740 ) This allows us to strip a special case in VPWidenGEP::execute.	2025-11-11 10:33:55 +00:00
Sander de Smalen	517d725463	[LV] Move condition to VPPartialReductionRecipe::execute (#166136 ) This means that VPExpressions will now be constructed for VPPartialReductionRecipe's when the loop has tail-folding predication. Note that control-flow (if/else) predication is not yet handled for partial reductions, because of the way partial reductions are recognised and built up.	2025-11-11 09:42:54 +00:00
Luke Lau	bfd4155f23	[VPlan] Don't apply predication discount to non-originally-predicated blocks (#160449 ) Split off from #158690. Currently if an instruction needs to be predicated due to tail folding, it will also have a predication discount applied to it in multiple places. This is likely inaccurate because we can expect a tail folded instruction to be executed on every iteration bar the last. This fixes it by checking if the instruction/block was originally predicated, and in doing so prevents vectorization with tail folding where we would have had to scalarize the memory op anyway. On llvm-test-suite this causes 4 loops in total to no longer be vectorized with -O3 on arm64-apple-darwin, and there's no observable performance impact.	2025-11-10 12:10:40 +00:00
Ramkumar Ramachandra	eab44600fb	[VPlan] Rename onlyFirst(Lane\|Part)Used (NFC) (#166562 ) Rename onlyFirst(Lane\|Part)Used to usesFirst(Lane\|Part)Only, in line with usesScalars, for clarity.	2025-11-06 10:07:58 +00:00
Florian Hahn	b0b4616790	[VPlan] Handle single-scalar conds in VPWidenSelectRecipe. (#165506 ) Generalize VPWidenSelectRecipe codegen to consider single-scalar conditions instead of just loop-invariant ones. If the condition is a single-scalar, we can simply use a scalar condition. PR: https://github.com/llvm/llvm-project/pull/165506	2025-11-05 22:11:29 +00:00
Florian Hahn	1c727baf69	[VPlan] Mark BranchOnCount and BranchOnCond as having side effects (NFC) BranchOnCount and BranchOnCond do not read memory, but cannot be moved. Mark them as having side-effects, but not reading/writing memory, which more accurately models that above. This allows removing some special checking for branches both in the current code and future patches.	2025-11-02 21:14:37 +00:00
Florian Hahn	f773efcffb	[VPlan] Add VPIRMetadata parameter to VPInstruction constructor. (NFC) Update VPInstruction constructor to accept VPIRMetadata between the Flags and DebugLoc parameters. This allows metadata to be passed during construction rather than assigned afterward.	2025-11-01 21:57:52 +00:00
Florian Hahn	a943132761	[VPlan] Add VPRegionBlock::getCanonicalIVType (NFC). (#164127 ) Split off from https://github.com/llvm/llvm-project/pull/156262. Similar to VPRegionBlock::getCanonicalIV, add helper to get the type of the canonical IV, in preparation for removing VPCanonicalIVPHIRecipe. PR: https://github.com/llvm/llvm-project/pull/164127	2025-10-31 20:05:02 -07:00
Florian Hahn	b2d12d6f2b	[VPlan] Extend getSCEVForVPV, use to compute VPReplicateRecipe cost. (#161276 ) Update getSCEVExprForVPValue to handle more complex expressions, to use it in VPReplicateRecipe::computeCost. In particular, it supports constructing SCEV expressions for GetElementPtr VPReplicateRecipes, with operands that are VPScalarIVStepsRecipe, VPDerivedIVRecipe and VPCanonicalIVRecipe. If we hit a sub-expression we don't support yet, we return SCEVCouldNotCompute. Note that the SCEV expression is valid for VF = 1: we only support constructing AddRecs for VPCanonicalIVRecipe, which is an AddRec starting at 0 and stepping by 1. The returned SCEV expressions could be converted to a VF specific one, by rewriting the AddRecs to ones with the appropriate step. Note that the logic for constructing SCEVs for GetElementPtr was directly ported from ScalarEvolution.cpp. Another thing to note is that we construct SCEV expression purely by looking at the operation of the recipe and its translated operands, w/o accessing the underlying IR (the exception being getting the source element type for GEPs). PR: https://github.com/llvm/llvm-project/pull/161276	2025-10-30 15:46:19 -07:00
Ramkumar Ramachandra	a2d873fb87	[VPlan] Introduce cannotHoistOrSinkRecipe, fix miscompile (#162674 ) Factor out common code to determine legality of hoisting and sinking. The patch has the side-effect of fixing an underlying bug, where a load/store pair is reordered.	2025-10-28 09:36:17 +00:00
Mel Chen	6bf948999f	[VPlan] Store memory alignment in VPWidenMemoryRecipe. nfc (#165255 ) Add a member Alignment to VPWidenMemoryRecipe to store memory alignment directly in the recipe. Update constructors, clone(), and relevant methods to use this stored alignment instead of querying the IR instruction. This allows VPWidenLoadRecipe/VPWidenStoreRecipe to be constructed without relying on the original IR instruction in the future.	2025-10-28 15:29:35 +08:00
Florian Hahn	523c796df7	[VPlan] Use VPlan type inference to get address space for recipes. (NFC) Instead of accessing the address space from the IR reference, retrieve it via type inference.	2025-10-28 04:51:24 +00:00
Sam Tebbs	6b19a546aa	[LV] Bundle partial reductions inside VPExpressionRecipe (#147302 ) This PR bundles partial reductions inside the VPExpressionRecipe class. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/156976 4. https://github.com/llvm/llvm-project/pull/160154 5. -> https://github.com/llvm/llvm-project/pull/147302 6. https://github.com/llvm/llvm-project/pull/162503 7. https://github.com/llvm/llvm-project/pull/147513	2025-10-23 11:18:55 +00:00
Florian Hahn	9317975a7a	[VPlan] Match legacy behavior w.r.t. using pointer phis as scalar addrs. When the legacy cost model scalarizes loads that are used as addresses for other loads and stores, it looks to phi nodes, if they are direct address operands of loads/stores. Match this behavior in isUsedByLoadStoreAddress, to fix a divergence between legacy and VPlan-based cost model.	2025-10-20 11:09:25 +01:00
Florian Hahn	b9ce7656e9	[VPlan] Add VPInstruction to unpack vector values to scalars. (#155670 ) Add a new Unpack VPInstruction (name to be improved) to explicitly extract scalar values from vectors. Test changes are movements of the extracts: they are now generated together and also directly after the producer. Depends on https://github.com/llvm/llvm-project/pull/155102 (included in PR) PR: https://github.com/llvm/llvm-project/pull/155670	2025-10-19 18:49:05 +00:00
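
Note on the LastActiveLane entries (#149042): the lowering they describe, Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1), can be illustrated with a small lane-level model. This is only a sketch and not LLVM code. The function names are hypothetical, masks are plain Python lists of booleans, and it assumes the mask is a prefix mask (active lanes are contiguous starting at lane 0, as one would expect from a tail-folded header mask). It also assumes FirstActiveLane returns the number of lanes when no lane is set, which is how the ZeroIsPoison=false behaviour from #169298 is modelled here.

```python
def first_active_lane(mask):
    """Return the index of the first set lane.

    Assumption: with no set lane, return the lane count instead of poison
    (models ZeroIsPoison=false).
    """
    for lane, active in enumerate(mask):
        if active:
            return lane
    return len(mask)


def last_active_lane(mask):
    """Model of Not(Mask) -> FirstActiveLane(NotMask) -> Sub(result, 1).

    Assumption: `mask` is a prefix mask, so the first inactive lane
    directly follows the last active one.
    """
    not_mask = [not active for active in mask]
    return first_active_lane(not_mask) - 1


def extract_lane(lane, vector):
    """Model of ExtractLane(lane, V): pick one element of the vector."""
    return vector[lane]


# Tail-folded final iteration: only the first three of four lanes are active.
mask = [True, True, True, False]
values = [10, 20, 30, 40]
live_out = extract_lane(last_active_lane(mask), values)
```

In this model `live_out` is the value from lane 2, the last lane that executed, rather than the value from the final vector lane.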

532 Commits