llvm-project

Author	SHA1	Message	Date
Kazu Hirata	3a7876d789	[llvm] Delete pointers without null checks (NFC) (#168183 ) Identified with readability-delete-null-pointer.	2025-11-15 08:06:08 -08:00
Florian Hahn	519cf3c2b8	[VPlan] Remove unneeded getDefiningRecipe with isa/cast/dyn_cast. (NFC) Classof for most recipes directly supports VPValue, so there is no need to call getDefiningRecipe when using isa/cast/dyn_cast.	2025-11-11 22:07:48 +00:00
Florian Hahn	0767c64043	[VPlan] Use getDefiningRecipe instead of directly accessing Def. (NFC) Use getDefiningRecipe to future-proof the code. Split off from https://github.com/llvm/llvm-project/pull/156262 as suggested.	2025-11-10 21:55:19 +00:00
Kazu Hirata	3ce5df408b	[Vectorize] Remove a redundant declaration (NFC) (#167188 ) EnableVPlanNativePath is declared in LoopVectorizationPlanner.h. Identified with readability-redundant-declaration.	2025-11-08 22:28:00 -08:00
Ramkumar Ramachandra	c1ca4a55d4	[VPlan] Strip redundant code in VPTransformState::get (NFC) (#166145 ) vputils::isSingleScalar is sufficient.	2025-11-05 21:59:47 +00:00
Florian Hahn	bfc322dd72	Revert "[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706 )" This reverts commit 8d29d09309654541fb2861524276ada6a3ebf84c. There have been reports of mis-compiles in https://github.com/llvm/llvm-project/pull/149706. Revert while I investigate.	2025-10-22 21:27:11 +01:00
Florian Hahn	82b59345fe	[VPlan] Clarify naming for helpers to create loop&replicate regions (NFC) Split off to clarify naming, as suggested in https://github.com/llvm/llvm-project/pull/156262.	2025-10-21 20:41:54 +01:00
Ramkumar Ramachandra	2ec01e430a	[VPlan] Move two VPBlockUtils members (NFC) (#162507 )	2025-10-21 16:40:13 +01:00
Florian Hahn	8d29d09309	[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706 ) Move narrowInterleaveGroups to the general VPlan optimization stage. To do so, narrowInterleaveGroups now has to find a suitable VF where all interleave groups are consecutive and saturate the full vector width. If such a VF is found, the original VPlan is split into 2: a) a new clone which contains all VFs of Plan, except VFToOptimize, and b) the original Plan with VFToOptimize as single VF. The original Plan is then optimized. If a new copy for the other VFs has been created, it is returned and the caller has to add it to the list of candidate plans. Together with https://github.com/llvm/llvm-project/pull/149702, this allows taking the narrowed interleave groups into account when computing costs to choose the best VF and interleave count. One example where we currently miss interleaving/unrolling when narrowing interleave groups is https://godbolt.org/z/Yz77zbacz PR: https://github.com/llvm/llvm-project/pull/149706	2025-10-21 11:37:42 +01:00
Ramkumar Ramachandra	0a4702407b	[VPlan] Improve code around canConstantBeExtended (NFC) (#161652 ) Follow up on 7c4f188 ([LV] Support multiplies by constants when forming scaled reductions), introducing m_APInt, and improving code around canConstantBeExtended: we change canConstantBeExtended to take an APInt.	2025-10-16 13:03:13 +01:00
Ramkumar Ramachandra	869c76dda3	[VPlan] Allow zero-operand m_BranchOn(Cond\|Count) (NFC) (#162721 )	2025-10-13 08:50:09 +01:00
Florian Hahn	ae7b15f2e2	[VPlan] Return invalid for scalable VF in VPReplicateRecipe::computeCost Replication is currently not supported for scalable VFs. Make sure VPReplicateRecipe::computeCost returns an invalid cost early, for scalable VFs if the recipe is not a single-scalar. Note that this moves the existing invalid-costs.ll out of the AArch64 subdirectory, as it does not use a target triple. Fixes https://github.com/llvm/llvm-project/issues/160792.	2025-10-11 19:28:02 +01:00
Florian Hahn	74af5784a5	Reapply "[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053 )" (#162157 ) This reverts commit f80c0baf058dbdc5 and 94eade61a02ae5. Recommit a small fix for targets using prefersVectorizedAddressing. Original message: Update VPReplicateRecipe::computeCost to compute costs of more replicating loads/stores. There are 2 cases that require extra checks to match the legacy cost model: 1. If the pointer is based on an induction, the legacy cost model passes its SCEV to getAddressComputationCost. In those cases, still fall back to the legacy cost. SCEV computations will be added as follow-up 2. If a load is used as part of an address of another load, the legacy cost model skips the scalarization overhead. Those cases are currently handled by a usedByLoadOrStore helper. Note that getScalarizationOverhead also needs updating, because when the legacy cost model computes the scalarization overhead, scalars have not been collected yet, so we can't each for replicating recipes to skip their cost, except other loads. This again can be further improved by modeling inserts/extracts explicitly and consistently, and compute costs for those operations directly where needed. PR: https://github.com/llvm/llvm-project/pull/160053	2025-10-06 22:16:08 +01:00
Alexey Bataev	f80c0baf05	Revert "Reapply "[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053 )" (#161724 )" This reverts commit 8f2466bc72a5ab163621cb1bf4bf53a27f1cefe7 to fix crashes reported in commits	2025-10-05 08:38:17 -07:00
Florian Hahn	8f2466bc72	Reapply "[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053 )" (#161724 ) This reverts commit f61be4352592639a0903e67a9b5d3ec664ad4d23. Recommit a small fix handling scalarization overhead consistently with legacy cost model if a load is used directly as operand of another memory operation, which fixes https://github.com/llvm/llvm-project/issues/161404. Original message: Update VPReplicateRecipe::computeCost to compute costs of more replicating loads/stores. There are 2 cases that require extra checks to match the legacy cost model: 1. If the pointer is based on an induction, the legacy cost model passes its SCEV to getAddressComputationCost. In those cases, still fall back to the legacy cost. SCEV computations will be added as follow-up 2. If a load is used as part of an address of another load, the legacy cost model skips the scalarization overhead. Those cases are currently handled by a usedByLoadOrStore helper. Note that getScalarizationOverhead also needs updating, because when the legacy cost model computes the scalarization overhead, scalars have not been collected yet, so we can't each for replicating recipes to skip their cost, except other loads. This again can be further improved by modeling inserts/extracts explicitly and consistently, and compute costs for those operations directly where needed. PR: https://github.com/llvm/llvm-project/pull/160053	2025-10-02 22:00:22 +01:00
Florian Hahn	7c4f188f27	[LV] Support multiplies by constants when forming scaled reductions. (#161092 ) We can create partial reductions for multiplies with constants, if the constant is small enough to be extended from source to destination type w/o changing the value. This only handles constant on the right side of a multiply, relying on other passes to canonicalize the input. Alive2 Proofs: https://alive2.llvm.org/ce/z/iWRMr6 PR: https://github.com/llvm/llvm-project/pull/161092	2025-10-02 10:53:17 +00:00
Florian Hahn	94b2617590	[VPlan] Remove VPIRPhis in exit blocks when deleting scalar loop BBs. DeleteDeadBlocks will remove single-entry phis. Remove them from the exit VPIRBBs in VPlan as well, otherwise we would retain references to deleted IR instructions. Fixes MSan failures after 8907b6d39 https://lab.llvm.org/buildbot/#/builders/164/builds/14013	2025-10-01 22:01:23 +01:00
Florian Hahn	8907b6d393	[VPlan] Remove original loop blocks if dead. (#155497 ) Build on top of https://github.com/llvm/llvm-project/pull/154510 to completely remove the blocks of dead scalar loops. Depends on https://github.com/llvm/llvm-project/pull/154510. PR: https://github.com/llvm/llvm-project/pull/155497	2025-10-01 16:53:59 +00:00
Florian Hahn	f61be43525	Revert "[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053 )" This reverts commit b4be7ecaf06bfcb4aa8d47c4fda1eed9bbe4ae77. See https://github.com/llvm/llvm-project/issues/161404 for a crash exposed by the change. Revert while I investigate.	2025-09-30 22:13:06 +01:00
Florian Hahn	b4be7ecaf0	[VPlan] Compute cost of more replicating loads/stores in ::computeCost. (#160053 ) Update VPReplicateRecipe::computeCost to compute costs of more replicating loads/stores. There are 2 cases that require extra checks to match the legacy cost model: 1. If the pointer is based on an induction, the legacy cost model passes its SCEV to getAddressComputationCost. In those cases, still fall back to the legacy cost. SCEV computations will be added as follow-up 2. If a load is used as part of an address of another load, the legacy cost model skips the scalarization overhead. Those cases are currently handled by a usedByLoadOrStore helper. Note that getScalarizationOverhead also needs updating, because when the legacy cost model computes the scalarization overhead, scalars have not been collected yet, so we can't each for replicating recipes to skip their cost, except other loads. This again can be further improved by modeling inserts/extracts explicitly and consistently, and compute costs for those operations directly where needed. PR: https://github.com/llvm/llvm-project/pull/160053	2025-09-29 08:08:09 +00:00
Florian Hahn	41f3438362	[VPlan] Remove dead code for scalar VFs in VPRegionBlock::cost (NFC). The VPlan cost model is not used to compute costs of scalar VFs currently, as conversion to replicate regions makes accurately computing the original scalar cost difficult. Remove left over, dead code.	2025-09-28 17:30:57 +01:00
Shih-Po Hung	0d22f8344a	[LV][EVL] Remove metadata on EVL vectorized loops (#155760 ) This patch removes the metadata emission for EVL‑vectorized loops, since there is no current in-tree consumer: 1) after VPlan performs canonical IV replacement #147222 and 2) RISCV dropped EVLIndVarSimplifyPass #151483, which was the only user of this metadata.	2025-09-23 07:39:33 +08:00
Florian Hahn	50b9ca4dda	[VPlan] Simplify Plan's entry in removeBranchOnConst. (#154510 ) After https://github.com/llvm/llvm-project/pull/153643, there may be a BranchOnCond with constant condition in the entry block. Simplify those in removeBranchOnConst. This removes a number of redundant conditional branch from entry blocks. In some cases, it may also make the original scalar loop unreachable, because we know it will never execute. In that case, we need to remove the loop from LoopInfo, because all unreachable blocks may dominate each other, making LoopInfo invalid. In those cases, we can also completely remove the loop, for which I'll share a follow-up patch. Depends on https://github.com/llvm/llvm-project/pull/153643. PR: https://github.com/llvm/llvm-project/pull/154510	2025-09-18 19:25:05 +01:00
Florian Hahn	30e9cbacab	[VPlan] Move logic to compute scalarization overhead to cost helper(NFC) Extract the logic to compute the scalarization overhead to a helper for easy re-use in the future.	2025-09-13 20:41:44 +01:00
Florian Hahn	b8eaceb39b	[VPlan] Explicitly replicate VPInstructions by VF. (#155102 ) Extend replicateByVF added in #142433 (aa240293190) to also explicitly unroll replicating VPInstructions. Now the only remaining case where we replicate for all lanes is VPReplicateRecipes in replicate regions. PR: https://github.com/llvm/llvm-project/pull/155102	2025-09-12 17:06:26 +01:00
Florian Hahn	8796dfdcba	[VPlan] Consolidate logic to update loop metadata and profile info. This patch consolidates updating loop metadata and profile info for both the remainder and vector loops in a single place. This is NFC, modulo consistently applying vectorization specific metadata also in the experimental VPlan-native path. Split off from https://github.com/llvm/llvm-project/pull/154510.	2025-09-04 21:50:40 +01:00
Florian Hahn	507ff082c2	[VPlan] Move runtime check blocks to correct position during exec (NFC). Move adjusting the position of completely disconnected IR blocks to VPIRBasicBlock::execute.	2025-09-01 16:15:02 +01:00
Florian Hahn	a53a5ed65d	[VPlan] Add VPBlockBase::hasPredecessors (NFC). Split off from https://github.com/llvm/llvm-project/pull/154510/, add helper to check if a block has any predecessors.	2025-09-01 09:44:49 +01:00
Ramkumar Ramachandra	1e0e0e0a56	[VPlan] Improve style around container-inserts (NFC) (#155174 )	2025-08-26 14:12:59 +01:00
Florian Hahn	7e9989390d	[VPlan] Materialize Build(Struct)Vectors for VPReplicateRecipes. (NFCI) (#151487 ) Materialize Build(Struct)Vectors explicitly for VPReplicateRecipes, to serve their users requiring a vector, instead of doing so when unrolling by VF. Now we only need to implicitly build vectors in VPTransformState::get for VPInstructions. Once they are also unrolled by VF we can remove the code-path altogether. PR: https://github.com/llvm/llvm-project/pull/151487	2025-08-18 20:49:42 +01:00
Florian Hahn	5892a2beec	[VPlan] Remove dead code from GetBroadCastInstr (NFCI). All relevant places should already explicitly materialize broadcasts. Remove dead code from VPTransformState::get	2025-08-17 21:51:14 +01:00
Florian Hahn	424258947e	[VPlan] Materialize VF and VFxUF using VPInstructions. (#152879 ) Materialize VF and VFxUF computation using VPInstruction instead of directly creating IR. This is one of the last few steps needed to model the full vector skeleton in VPlan. This is mostly NFC, although in some cases we remove some unused computations. PR: https://github.com/llvm/llvm-project/pull/152879	2025-08-12 14:13:13 +01:00
Luke Lau	aea82a780a	[VPlan] Remove some getCanonicalIV() uses. NFC (#152969 ) A lot of the time getCanonicalIV() is used to get the canonical IV type, e.g. to instantiate a VPTypeAnalysis or to get the LLVMContext. However VPTypeAnalysis has a constructor that takes the VPlan directly and there's a method on VPlan to get the LLVMContext directly, so use those instead where possible. This lets us remove a constructor on VPTypeAnalysis. Also remove an unused LLVMContext argument in UnrollState whilst we're here.	2025-08-11 18:12:05 +08:00
Florian Hahn	82d633e9ff	[VPlan] Materialize vector trip count using VPInstructions. (#151925 ) Materialize the vector trip count computation using VPInstruction instead of directly creating IR. This is one of the last few steps needed to model the full vector skeleton in VPlan. It also simplifies vector-trip count computations for scalable vectors, as we can re-use the UF x VF computation. PR: https://github.com/llvm/llvm-project/pull/151925	2025-08-08 11:44:32 +01:00
Florian Hahn	95c32bf2d4	[VPlan] Return invalid cost if any skeleton block has invalid costs. (#151940 ) We need to reject plans that contain recipes with invalid costs. LICM can move recipes with invalid costs out of the loop region, which then get missed by the main cost computation. Extend the logic to check recipes for invalid cost currently only covering the middle block to include all skeleton blocks. Fixes https://github.com/llvm/llvm-project/issues/144358 Fixes https://github.com/llvm/llvm-project/issues/151664 PR: https://github.com/llvm/llvm-project/pull/151940	2025-08-07 10:45:27 +01:00
Luke Lau	94a6cd464e	[VPlan] Expand VPWidenPointerInductionRecipe into separate recipes (#148274 ) This is the VPWidenPointerInductionRecipe equivalent of #118638, with the motivation of allowing us to use the EVL as the induction step. There is a new VPInstruction added, WidePtrAdd to allow adding the step vector to the induction phi, since VPInstruction::PtrAdd only handles scalars or multiple scalar lanes. Originally this transformation was copied from the original recipe's execute code, but it's since been simplified by teaching `unrollWidenInductionByUF` to unroll the recipe, which brings it in line with VPWidenIntOrFpInductionRecipe.	2025-08-05 16:54:02 +08:00
Florian Hahn	559d1dff89	[VPlan] Materialize BackedgeTakenCount using VPInstructions. Explicitly compute the backedge-taken count using VPInstruction. This is needed to model the full skeleton in VPlan. NFC modulo some instruction re-ordering.	2025-08-03 12:21:28 +01:00
Florian Hahn	fa3ec0c17c	[VPlan] Materialize constant vector trip counts before final opts. (#142309 ) Materialize constant vector trip counts before ::execute, if the trip count can be computed as Original (TC / (VF * UF)) * (VF * UF). For now this excludes when the tail is folded or scalar epilogues are required. This enables removing a number of redundant branches from the middle block. For now this is also only done when not vectorizing the epilogue, as the simplification complicates stitching the 2 plans together. PR: https://github.com/llvm/llvm-project/pull/142309	2025-07-26 17:16:36 +01:00
Florian Hahn	64686c59c3	[VPlan] Connect (MemRuntime\|SCEV)Check blocks as VPlan transform (NFC). (#143879 ) Connect SCEV and memory runtime check block directly in VPlan as VPIRBasicBlocks, removing ILV::emitSCEVChecks and ILV::emitMemRuntimeChecks. The new logic is currently split across LoopVectorizationPlanner::addRuntimeChecks which collects a list of {Condition, CheckBlock} pairs and performs some checks and emits remarks if needed. The list of checks is then added to VPlan in VPlanTransforms::connectCheckBlocks. PR: https://github.com/llvm/llvm-project/pull/143879	2025-07-09 14:03:25 +02:00
Igor Kirillov	aeec2c6e48	[VPlan] Speed up VPSlotTracker by using ModuleSlotTracker (#139881 ) Currently, when VPSlotTracker is initialized with a VPlan, its assignName method calls printAsOperand on each underlying instruction. Each such call recomputes slot numbers for the entire function, leading to O(N × M) complexity, where M is the number of instructions in the loop and N is the number of instructions in the function. This results in slow debug output for large loops. For example, printing costs of all instructions becomes O(M² × N), which is especially painful when enabling verbose dumps. This patch improves debugging performance by caching slot numbers using ModuleSlotTracker. It avoids redundant recomputation and makes debug output significantly faster.	2025-06-26 22:40:48 +01:00
Florian Hahn	aa24029319	[VPlan] Unroll VPReplicateRecipe by VF. (#142433 ) Explicitly unroll VPReplicateRecipes outside replicate regions by VF, replacing them by VF single-scalar recipes. Extracts for operands are added as needed and the scalar results are combined to a vector using a new BuildVector VPInstruction. It also adds a few folds to simplify unnecessary extracts/BuildVectors. It also adds a BuildStructVector opcode for handling of calls that have struct return types. VPReplicateRecipe in replicate regions will be unrolled as follow up, turning non-single-scalar VPReplicateRecipes into 'abstract', i.e. not executable. PR: https://github.com/llvm/llvm-project/pull/142433	2025-06-26 11:19:09 +01:00
LiqinWeng	4ac4726d00	[VPlan] Format some print forms.NFC (#144644 )	2025-06-25 16:14:50 +08:00
Florian Hahn	9f7a155394	[VPlan] Update packScalarIntoVector to take and return wide value (NFC) Make the function more flexible in preparation for new users.	2025-06-21 18:03:14 +01:00
Arthur Eubanks	dfe4d44d8d	Revert "[VPlan] Remove unnecessary DomTreeUpdater flush (NFC)." (#144758 ) This reverts commit 2e337349f436d75af112c081df5ec683871cbcc8. Causes breakages internally, will post reproducer later.	2025-06-18 11:00:13 -07:00
Luke Lau	9dd1c66e8f	[VPlan] Expand VPWidenIntOrFpInductionRecipe into separate recipes (#118638 ) The motivation of this PR is to make #115274 easier to implement, and should allow us to add EVL support by just passing EVL to the VF operand. The current difficulty with widening IVs with EVL is that VPWidenIntOrFpInductionRecipe generates its own backedge value. Since it's a VPHeaderPHIRecipe the VF operand must be in the preheader, which means we can't use the EVL since it's defined in the loop body. The gist in this PR is to take the approach in #114305 and expand VPWidenIntOrFpInductionRecipe into several recipes for the initial value, phi and backedge value just before execution. I.e. this example: ``` vector.ph: Successor(s): vector loop <x1> vector loop: { vector.body: WIDEN-INDUCTION %i = phi %start, %step, %vf ... EMIT branch-on-count ... No successors } ``` gets expanded to: ``` vector.ph: ... vp<%induction.start> = ... vp<%induction.increment> = ... Successor(s): vector loop <x1> vector loop: { vector.body: ir<%i> = WIDEN-PHI vp<%induction.start>, vp<%vec.ind.next> ... vp<%vec.ind.next> = add ir<%i>, vp<%induction.increment> EMIT branch-on-count ... No successors } ``` This allows us to use a value defined in the loop in the backedge value, and also means we can just reuse the existing backedge fixups in VPlan::execute without having to specially handle it ourselves. After this #115274 should just become a matter of setting the VF operand to EVL (and building the increment step in the loop body, not the preheader).	2025-06-17 18:24:07 +01:00
Florian Hahn	790df93298	[VPlan] Mark VPFirstOrderRecurrencePHI as not reading/writing memory. First-order recurrence phis don't have side-effects and don't read or write memory. Mark them as such.	2025-06-15 22:00:47 +01:00
David Sherwood	541e5118ce	[LV] Use getFixedValue instead of getKnownMinValue when appropriate (#143526 ) There are many places in VPlan and LoopVectorize where we use getKnownMinValue to discover the number of elements in a vector. Where we expect the vector to have a fixed length, I have used the stronger getFixedValue call. I believe this is clearer and adds extra protection in the form of an assert in getFixedValue that the vector is not scalable. While looking at VPFirstOrderRecurrencePHIRecipe::computeCost I also took the liberty of simplifying the code. In theory I believe this patch should be NFC, but I'm reluctant to add that to the title in case we're just missing tests for some of the VPlan changes. I built and ran the LLVM test suite when targeting neoverse-v1 and it seemed ok.	2025-06-13 11:43:50 +01:00
Stephen Tozer	a08a831515	[DLCov][NFC] Propagate annotated DebugLocs through transformations (#138047 ) Part of the coverage-tracking feature, following #107279. In order for DebugLoc coverage testing to work, we firstly have to set annotations for intentionally-empty DebugLocs, and secondly we have to ensure that we do not drop these annotations as we propagate DebugLocs throughout compilation. As the annotations exist as part of the DebugLoc class, and not the underlying DILocation, they will not survive a DebugLoc->DILocation->DebugLoc roundtrip. Therefore this patch modifies a number of places in the compiler to propagate DebugLocs directly rather than via the underlying DILocation. This has no effect on the output of normal builds; it only ensures that during coverage builds, we do not incorrectly drop annotations and therefore create false positives. The bulk of these changes are in replacing DILocation::getMergedLocation(s) with a DebugLoc equivalent, and in changing the IRBuilder to store a DebugLoc directly rather than storing DILocations in its general Metadata array. We also use a new function, `DebugLoc::orElse`, which selects the "best" DebugLoc out of a pair (valid location > annotated > empty), preferring the current DebugLoc on a tie - this encapsulates the existing behaviour at a few sites where we _may_ assign a DebugLoc to an existing instruction, while extending the logic to handle annotation DebugLocs at the same time.	2025-06-12 14:06:27 +01:00
Florian Hahn	2e337349f4	[VPlan] Remove unnecessary DomTreeUpdater flush (NFC). The current version does not need the explicit flush at this point.	2025-06-05 08:17:42 +01:00
Florian Hahn	2eab83f618	[VPlan] Remove CanonicalIV when dissolving loop regions (NFC). (#142372 ) Directly replace the canonical IV when we dissolve the containing region. That ensures that it won't get removed before the region gets removed, which would result in an invalid region. This removes the current ordering constraint between convertToConcreteRecipes and dissolving regions. PR: https://github.com/llvm/llvm-project/pull/142372	2025-06-03 10:05:28 +01:00

452 Commits