llvm-project

Author	SHA1	Message	Date
Florian Hahn	8240cf337a	[VPlan] Always set flags for overflowing ops etc via VPIRFlags. (#179138 ) Enforce that all VPInstructions set the correct OpType of the VPIRFlags. Flag mis-matches (e.g. VPInstruction Add without `OverflowingBinOp` being set) can cause crashes (e.g. in CSE) or potentially mis-compiles. Add a few helpers in VPBuilder to create common instructions with correct flags. PR: https://github.com/llvm/llvm-project/pull/179138	2026-02-03 12:33:23 +00:00
Florian Hahn	14a209f852	[VPlan] Replace ComputeFindIVRes with ComputeRdxRes + cmp + sel (NFC) (#176672 ) Replace ComputeFindIVResult with ComputeReductionResult + explicit compare + select, to model finding the first/last induction more explicitly and simply; this boils down to a min/max reduction + compare and select of the sentinel value. PR: https://github.com/llvm/llvm-project/pull/176672	2026-01-22 19:28:47 +00:00
Florian Hahn	dd363d0629	[VPlan] Replace UnrollPart for VPScalarIVSteps with start index op (NFC) (#170906 ) Replace the unroll part operand for VPScalarIVStepsRecipe with the start index. This simplifies https://github.com/llvm/llvm-project/pull/170053 and is also a first step to break down the recipe into its components. PR: https://github.com/llvm/llvm-project/pull/170906	2026-01-21 22:13:13 +00:00
Florian Hahn	459990dcf7	[VPlan] Replace PhiR operand of ComputeFindIVResult with VPIRFlags. #174026 (#175461 ) Replace the Phi recipe operand of ComputeFindIVResult with VPIRFlags, building on top of https://github.com/llvm/llvm-project/pull/174026. PR: https://github.com/llvm/llvm-project/pull/175461	2026-01-17 16:23:33 +00:00
Florian Hahn	370eeef877	[VPlan] Add matchers for reduction result VPInstructions (NFC). Add dedicated matchers for reduction result VPInstructions, to be re-used in follow-up patches, including https://github.com/llvm/llvm-project/pull/167851.	2026-01-16 11:28:30 +00:00
Florian Hahn	d5c11b9a24	[VPlan] Replace PhiR operand of ComputeRdxResult with VPIRFlags. (#174026 ) Remove the artificial PhiR operand of ComputeReductionResult, which was only used to look up recurrence kind, in-loop and ordered properties. Instead, encode them as VPIRFlags as suggested by @ayalz in https://github.com/llvm/llvm-project/pull/170223. This addresses a TODO to make codegen for ComputeReductionResult independent of looking up information from other recipes. This is NFC w.r.t. codegen, the printing has been improved to include the reduction type, and whether it is in-loop/ordered. PR: https://github.com/llvm/llvm-project/pull/174026	2026-01-14 07:45:44 +00:00
Florian Hahn	31b93d6e38	[VPlan] Add specialized VPValue subclasses for different types (NFC) (#172758 ) This patch adds VPValue sub-classes for the different cases we currently have: * VPIRValue: A live-in VPValue that wraps an underlying IR value * VPSymbolicValue: A symbolic VPValue not tied to an underlying value, e.g. the vector trip count or VF VPValues * VPRecipeValue: A VPValue defined by a VPDef/VPRecipeBase. This has multiple benefits: * clearer constructors for each kind of VPValue * limited scope: for example allows moving VPDef member to VPRecipeValue, reducing size of other VPValues. * stricter type checking for member variables (e.g. using VPLiveIn in the Value -> live-in map in VPlan, or using VPSymbolicValue for symbolic member VPValues) There probably are additional opportunities for cleanups as follow-ups. PR: https://github.com/llvm/llvm-project/pull/172758	2026-01-07 20:29:05 +00:00
Victor Chernyakin	c438773432	[LLVM][ADT] Migrate users of `make_scope_exit` to CTAD (#174030 ) This is a followup to #173131, which introduced the CTAD functionality.	2026-01-02 20:42:56 -08:00
Ramkumar Ramachandra	0636225b93	[VPlan] Directly unroll VectorPointerRecipe (#168886 ) In an effort to get rid of VPUnrollPartAccessor and directly unroll recipes, start by directly unrolling VectorPointerRecipe, allowing for VPlan-based simplifications and simplification of the corresponding execute.	2025-12-15 10:54:06 +00:00
Florian Hahn	e6e3f94b5c	[VPlan] Re-add clarifying comment regarding part to extract. (NFC) Re-add and emphasize comment regarding extracting from the last part, as suggested post-commit in https://github.com/llvm/llvm-project/pull/171145.	2025-12-12 21:51:33 +00:00
Florian Hahn	0768068ff0	[VPlan] Remove ExtractLastLane for plans with scalar VFs. (#171145 ) ExtractLastLane is a no-op for scalar VFs. Update simplifyRecipe to remove them. This also requires adjusting the code in VPlanUnroll.cpp to split off handling of ExtractLastLane/ExtractPenultimateElement for scalar VFs, which now needs to match ExtractLastPart. PR: https://github.com/llvm/llvm-project/pull/171145	2025-12-09 11:59:40 +00:00
Florian Hahn	3fc7419236	[VPlan] Replace ExtractLast(Elem|LanePerPart) with ExtractLast(Lane/Part) (#164124 ) Replace ExtractLastElement and ExtractLastLanePerPart with more generic and specific ExtractLastLane and ExtractLastPart, which model distinct parts of extracting across parts and lanes. ExtractLastElement == ExtractLastLane(ExtractLastPart) and ExtractLastLanePerPart == ExtractLastLane, the latter clarifying the name of the opcode. A new m_ExtractLastElement matcher is provided for convenience. The patch should be NFC modulo printing changes. PR: https://github.com/llvm/llvm-project/pull/164124	2025-12-07 15:15:43 +00:00
Florian Hahn	db85babddd	[VPlan] Use m_Intrinsic to match assumes/noalias_scope_decl (NFC). Use pattern matching to check for intrinsics to slightly simplify code.	2025-11-27 18:50:34 +00:00
Florian Hahn	f8eca64a28	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 20:03:55 +00:00
Florian Hahn	d58ebe339c	Revert "Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )"" This reverts commit 72e51d389f66d9cc6b55fd74b56fbbd087672a43. Missed some test updates.	2025-11-26 19:41:39 +00:00
Florian Hahn	72e51d389f	Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit a6edeedbfa308876d6f2b1648729d52970bb07e6. The following fixes have landed, addressing issues causing the original revert: * https://github.com/llvm/llvm-project/pull/169298 * https://github.com/llvm/llvm-project/pull/167897 * https://github.com/llvm/llvm-project/pull/168949 Original message: Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-26 19:31:25 +00:00
Florian Hahn	091aece72b	[VPlan] Remove redundant transferFlags call from replicateByVF (NFC). Flags are now passed on construction/cloning. Remove unnecessary transferFlags call, and make code independent of VPRecipeWithIRFlags, to support additional recipes in the future.	2025-11-25 20:57:42 +00:00
Florian Hahn	2befda2225	[VPlan] Populate and use VPIRFlags from initial VPInstruction. (#168450 ) Update VPlan to populate VPIRFlags during VPInstruction construction and use it when creating widened recipes, instead of constructing VPIRFlags from the underlying IR instruction each time. The VPRecipeWithIRFlags constructor taking an underlying instruction and setting the flags based on it has been removed. This centralizes initial VPIRFlags creation, ensures flags are consistently available throughout VPlan transformations, and makes sure we don't accidentally re-add flags from the underlying instruction that already got dropped during transformations. Follow-up to https://github.com/llvm/llvm-project/pull/167253, which did the same for VPIRMetadata. Should be NFC w.r.t. the generated IR. PR: https://github.com/llvm/llvm-project/pull/168450	2025-11-18 15:15:14 +00:00
Florian Hahn	a6edeedbfa	Revert "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 )" This reverts commit 62d1a080e69e3c5e98840e000135afa7c688a77b. This appears to be causing some runtime failures on RISCV https://lab.llvm.org/buildbot/#/builders/210/builds/5221	2025-11-13 22:34:55 +00:00
Florian Hahn	62d1a080e6	[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-folding. (#149042 ) Building on top of https://github.com/llvm/llvm-project/pull/148817, introduce a new abstract LastActiveLane opcode that gets lowered to Not(Mask) → FirstActiveLane(NotMask) → Sub(result, 1). When folding the tail, update all extracts for uses outside the loop to extract the value of the last active lane. See also https://github.com/llvm/llvm-project/issues/148603 PR: https://github.com/llvm/llvm-project/pull/149042	2025-11-12 15:11:00 +00:00
Ramkumar Ramachandra	eab44600fb	[VPlan] Rename onlyFirst(Lane|Part)Used (NFC) (#166562 ) Rename onlyFirst(Lane|Part)Used to usesFirst(Lane|Part)Only, in line with usesScalars, for clarity.	2025-11-06 10:07:58 +00:00
Florian Hahn	6e83937f39	[VPlan] Add getConstantInt helpers for constant int creation (NFC). Add getConstantInt helper methods to VPlan to simplify the common pattern of creating constant integer live-ins. Suggested as follow-up in https://github.com/llvm/llvm-project/pull/164127.	2025-11-01 04:13:01 +00:00
Florian Hahn	a943132761	[VPlan] Add VPRegionBlock::getCanonicalIVType (NFC). (#164127 ) Split off from https://github.com/llvm/llvm-project/pull/156262. Similar to VPRegionBlock::getCanonicalIV, add helper to get the type of the canonical IV, in preparation for removing VPCanonicalIVPHIRecipe. PR: https://github.com/llvm/llvm-project/pull/164127	2025-10-31 20:05:02 -07:00
Florian Hahn	b9ce7656e9	[VPlan] Add VPInstruction to unpack vector values to scalars. (#155670 ) Add a new Unpack VPInstruction (name to be improved) to explicitly extract scalar values from vectors. Test changes are movements of the extracts: they are now generated together and also directly after the producer. Depends on https://github.com/llvm/llvm-project/pull/155102 (included in PR) PR: https://github.com/llvm/llvm-project/pull/155670	2025-10-19 18:49:05 +00:00
Florian Hahn	4f23767852	[VPlan] Add m_FirstActiveLane matcher (NFC). Add m_FirstActiveLane, to slightly simplify pattern matching in preparation for https://github.com/llvm/llvm-project/pull/149042.	2025-10-15 18:55:26 +01:00
Florian Hahn	861519327a	[VPlan] Move getCanonicalIV to VPRegionBlock (NFC). (#163020 ) The canonical IV is tied to region blocks; move getCanonicalIV there and update all users. PR: https://github.com/llvm/llvm-project/pull/163020	2025-10-15 12:48:35 +01:00
Ramkumar Ramachandra	869c76dda3	[VPlan] Allow zero-operand m_BranchOn(Cond|Count) (NFC) (#162721 )	2025-10-13 08:50:09 +01:00
Ramkumar Ramachandra	f68f3b9a7e	[VPlan] Allow zero-operand m_VPInstruction (NFC) (#159550 )	2025-09-18 12:40:31 +01:00
Ramkumar Ramachandra	148a83543b	[LV] Introduce m_One and improve (0|1)-match (NFC) (#157419 )	2025-09-15 10:34:06 +00:00
Florian Hahn	b8eaceb39b	[VPlan] Explicitly replicate VPInstructions by VF. (#155102 ) Extend replicateByVF added in #142433 (aa240293190) to also explicitly unroll replicating VPInstructions. Now the only remaining case where we replicate for all lanes is VPReplicateRecipes in replicate regions. PR: https://github.com/llvm/llvm-project/pull/155102	2025-09-12 17:06:26 +01:00
Florian Hahn	1efa997317	[VPlan] Handle stores to single-scalar addr in narrowToSingleScalars. Move handling of stores to single-scalar/uniform address from replicateByVF to narrowToSingleScalar.	2025-09-10 21:58:29 +01:00
Ramkumar Ramachandra	1e0e0e0a56	[VPlan] Improve style around container-inserts (NFC) (#155174 )	2025-08-26 14:12:59 +01:00
Florian Hahn	9f87cd68a4	[VPlan] Add m_ExtractLastElement matcher. (NFC)	2025-08-23 21:21:03 +01:00
Florian Hahn	d67dba5e88	[VPlan] Check Def2LaneDefs first in cloneForLane. (NFC) If we have entries in Def2LaneDefs, we always have to use it. Move the check before. Otherwise we may not pick the correct operand, e.g. if Op was a replicate recipe that got single-scalar after replicating it. Fixes https://github.com/llvm/llvm-project/issues/154330.	2025-08-21 11:34:49 +01:00
Florian Hahn	7e9989390d	[VPlan] Materialize Build(Struct)Vectors for VPReplicateRecipes. (NFCI) (#151487 ) Materialize Build(Struct)Vectors explicitly for VPReplicateRecipes, to serve their users requiring a vector, instead of doing so when unrolling by VF. Now we only need to implicitly build vectors in VPTransformState::get for VPInstructions. Once they are also unrolled by VF we can remove the code-path altogether. PR: https://github.com/llvm/llvm-project/pull/151487	2025-08-18 20:49:42 +01:00
Luke Lau	aea82a780a	[VPlan] Remove some getCanonicalIV() uses. NFC (#152969 ) A lot of the time getCanonicalIV() is used to get the canonical IV type, e.g. to instantiate a VPTypeAnalysis or to get the LLVMContext. However, VPTypeAnalysis has a constructor that takes the VPlan directly and there's a method on VPlan to get the LLVMContext directly, so use those instead where possible. This lets us remove a constructor on VPTypeAnalysis. Also remove an unused LLVMContext argument in UnrollState whilst we're here.	2025-08-11 18:12:05 +08:00
Luke Lau	94a6cd464e	[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274 ) This is the VPWidenPointerInductionRecipe equivalent of #118638, with the motivation of allowing us to use the EVL as the induction step. There is a new VPInstruction added, WidePtrAdd, to allow adding the step vector to the induction phi, since VPInstruction::PtrAdd only handles scalars or multiple scalar lanes. Originally this transformation was copied from the original recipe's execute code, but it's since been simplified by teaching `unrollWidenInductionByUF` to unroll the recipe, which brings it in line with VPWidenIntOrFpInductionRecipe.	2025-08-05 16:54:02 +08:00
Florian Hahn	80c43b6c07	[VPlan] Add ExtractLane VPInst to extract across multiple parts. (#148817 ) This patch adds a new ExtractLane VPInstruction which extracts across multiple parts using a wide index, to be used in combination with FirstActiveLane. The patch updates early-exit codegen to use it instead of ExtractElement, which is only per-part. With this change, interleaving should work correctly with early-exit loops. The patch removes the restrictions added in 6f43754e9 (#145877), but does not yet automatically select interleave counts > 1 for early-exit loops. I'll share a patch as follow-up. The cost of extracting a lane adds non-trivial overhead in the exit block, so that should be considered when picking the interleave count. PR: https://github.com/llvm/llvm-project/pull/148817	2025-07-27 08:08:25 +01:00
James Y Knight	093afed969	[VPlan] Fix miscompile after PR #142433 . (#147398 ) This fixes a bug introduced by aa2402931908317f5cc19b164ef17c5a74f2ae67, "[VPlan] Unroll VPReplicateRecipe by VF", which cloned a VPReplicateRecipe without transferring the flags from the original. That can cause incorrect nsw/nuw flags to be emitted on the new instructions, which may result in miscompiles. It turns out there were no test-cases in the repo which end up hitting the situation where the recipe requires instruction clones to have different flags from the underlying instruction. The existing tests covered the flags being correct when the replacement instruction is a vectorized version of the initial instruction, but not when it required clones. A new test is added covering this.	2025-07-08 14:51:27 -04:00
Florian Hahn	20fbbd7675	[LV] Add support for cmp reductions with decreasing IVs. (#140451 ) Similar to FindLastIV, add FindFirstIVSMin to support select (icmp(), x, y) reductions where one of x or y is a decreasing induction, producing a SMin reduction. It uses signed max as sentinel value. PR: https://github.com/llvm/llvm-project/pull/140451	2025-06-29 11:17:03 +01:00
Florian Hahn	1949536494	[VPlan] Also visit VPBBs outside loop region when unrolling by VF. Make sure all VPBBs outside the top-level loop region and directly inside the region are visited; all those blocks may contain VPReplicateRecipes that need unrolling. This makes sure we unroll VPReplicateRecipes by VF if they are hoisted out of the loop, but cannot be converted to single scalar recipes yet.	2025-06-28 19:02:22 +01:00
Florian Hahn	ec62dee703	[VPlan] Handle FirstActiveLane when unrolling. (#145394 ) Currently FirstActiveLane is not handled correctly during unrolling. This is currently causing mis-compiles when vectorizing early-exit loops with interleaving forced. This patch updates handling of FirstActiveLane to be analogous to computing final reduction results: during unrolling, the created copies for its original operand are added as additional operands, and FirstActiveLane will always produce the index of the first active lane across all unrolled iterations. Note that some of the generated code is still incorrect, as we also need to handle ExtractElement with FirstActiveLane operands. I will share patches for those soon as well. PR: https://github.com/llvm/llvm-project/pull/145394	2025-06-27 08:44:57 +01:00
Florian Hahn	5b76cdba5a	[VPlan] Handle AnyOf when unrolling. (#145340 ) Currently AnyOf is not handled correctly during unrolling. This is currently causing mis-compiles when vectorizing early-exit loops with interleaving forced (even though selectInterleaveCount will currently only pick IC = 1, unless forced by the user). This patch updates handling of AnyOf to be analogous to computing final reduction results: during unrolling, the created copies for its original operand are added as additional operands, and AnyOf will always produce the reduced value across all unrolled iterations. Note that the generated code is still incorrect, as we also need to handle FirstActiveLane and ExtractElement with FirstActiveLane operands. I will share patches for those soon as well. PR: https://github.com/llvm/llvm-project/pull/145340	2025-06-26 14:19:38 +01:00
Florian Hahn	aa24029319	[VPlan] Unroll VPReplicateRecipe by VF. (#142433 ) Explicitly unroll VPReplicateRecipes outside replicate regions by VF, replacing them by VF single-scalar recipes. Extracts for operands are added as needed and the scalar results are combined to a vector using a new BuildVector VPInstruction. It also adds a few folds to simplify unnecessary extracts/BuildVectors. It also adds a BuildStructVector opcode for handling of calls that have struct return types. VPReplicateRecipes in replicate regions can be unrolled as a follow-up, turning non-single-scalar VPReplicateRecipes into 'abstract', i.e. non-executable, recipes. PR: https://github.com/llvm/llvm-project/pull/142433	2025-06-26 11:19:09 +01:00
Florian Hahn	c4c2d777f4	[VPlan] Fix handling of ReductionStartVector for rdxs when unrolling. Update handling of ReductionStartVector in VPlanUnroll for partial reductions. The new code makes sure all parts are properly set to the cloned ReductionStartVector. Fixes a mis-compile reported for https://github.com/llvm/llvm-project/pull/142290.	2025-06-19 13:26:19 +01:00
Florian Hahn	f68848015f	[VPlan] Manage Sentinel value for FindLastIV in VPlan. (#142291 ) Similar to modeling the start value as operand, also model the sentinel value as operand explicitly. This makes all required information for code-gen available directly in VPlan. PR: https://github.com/llvm/llvm-project/pull/142291	2025-06-13 19:17:01 +01:00
Florian Hahn	6108d50aed	[VPlan] Add ReductionStartVector VPInstruction. (#142290 ) Add a new VPInstruction::ReductionStartVector opcode to create the start values for wide reductions. This more accurately models the start value creation in VPlan and simplifies VPReductionPHIRecipe::execute. Down the line it also allows removing VPReductionPHIRecipe::RdxDesc. PR: https://github.com/llvm/llvm-project/pull/142290	2025-06-09 20:59:12 +01:00
Florian Hahn	5520ab3d50	[VPlan] Add ComputeAnyOfResult VPInstruction (NFC) (#141932 ) Add a dedicated opcode for any-of reduction, similar to https://github.com/llvm/llvm-project/pull/132689 and https://github.com/llvm/llvm-project/pull/132690. The patch also explicitly adds the start value to not require RecurrenceDescriptor during execute. It also allows freezing the start value to make it poison-safe. PR: https://github.com/llvm/llvm-project/pull/141932	2025-06-03 14:33:53 +01:00
Florian Hahn	c0506a11f4	[VPlan] Separate out logic to manage IR flags to VPIRFlags (NFC). (#140621 ) This patch moves the logic to manage IR flags to a separate VPIRFlags class. For now, VPRecipeWithIRFlags is the only class that inherits VPIRFlags. The new class allows for simpler passing of flags when constructing recipes, simplifying the constructors for various recipes (VPInstruction in particular, which now just has 2 constructors, one taking an extra VPIRFlags argument). This mirrors the approach taken for VPIRMetadata and makes it easier to extend in the future. The patch also adds a unified flagsValidForOpcode to check if the flags in a VPIRFlags match the provided opcode. PR: https://github.com/llvm/llvm-project/pull/140621	2025-05-25 11:13:11 +01:00
Florian Hahn	df21288247	[VPlan] Replace ExtractFromEnd with Extract(Last|Penultimate)Element (NFC). (#137030 ) ExtractFromEnd only has 2 uses, extracting the last and penultimate elements. Replace it with 2 separate opcodes, removing the need to materialize and handle a constant argument. PR: https://github.com/llvm/llvm-project/pull/137030	2025-04-25 16:27:29 +01:00

64 Commits