llvm-project

Author	SHA1	Message	Date
Ramkumar Ramachandra	1de55c9693	[VPlan] Avoid sinking allocas in sinkScalarOperands (#166135 ) Use cannotHoistOrSinkRecipe to forbid sinking allocas.	2025-11-05 13:06:24 +00:00
Florian Hahn	290ff955f0	[VPlan] Verify incoming values of VPIRPhi matches before checking (NFC) Update the verifier to first check if the number of incoming values matches the number of predecessors, before using incoming_values_and_blocks. We unfortunately need to also check here, as this may be called before verifyPhiRecipes runs. Also update the verifier unit tests, to actually fail for the expected recipes.	2025-11-04 18:34:14 +00:00
Florian Hahn	af9a4263a1	[LAA] Only use inbounds/nusw in isNoWrap if the GEP is dereferenced. (#161445 ) Update isNoWrap to only use the inbounds/nusw flags from GEPs that are guaranteed to be dereferenced on every iteration. This fixes a case where we incorrectly determine no dependence. I think the issue is isolated to code that evaluates the resulting AddRec at BTC, just using it to compute the distance between accesses should still be fine; if the access does not execute in a given iteration, there's no dependence in that iteration. But isolating the code is not straight-forward, so be conservative for now. The practical impact should be very minor (only one loop changed across a corpus with 27k modules from large C/C++ workloads). Fixes https://github.com/llvm/llvm-project/issues/160912. PR: https://github.com/llvm/llvm-project/pull/161445	2025-11-04 17:08:12 +00:00
Julian Nagele	28a20b4af9	[VectorCombine] Avoid inserting freeze when scalarizing extend-extract if all extracts would lead to UB on poison. (#164683 ) This change aims to avoid inserting a freeze instruction between the load and bitcast when scalarizing extend-extract. This is particularly useful in combination with https://github.com/llvm/llvm-project/pull/164682, which can then potentially further scalarize, provided there is no freeze. alive2 proof: https://alive2.llvm.org/ce/z/W-GD88	2025-11-04 12:39:04 +00:00
Ramkumar Ramachandra	0a95a86634	[VPlan] Fix first-lane comment in sinkScalarOperands (NFC) (#166347 ) To follow-up on a post-commit review.	2025-11-04 12:02:58 +00:00
Ramkumar Ramachandra	0cae0af520	[VPlan] Shorten insert-idiom in sinkScalarOperands (NFC) (#166343 ) To follow-up on a post-commit review.	2025-11-04 10:04:57 +00:00
Florian Hahn	ce925820d8	[VPlan] Use operands() directly in VPInstruction::clone() (NFC). There's no need to create temporary SmallVectors.	2025-11-03 16:28:27 +00:00
Alexey Bataev	7d5659083c	[SLP]Do not create copyable node, if parent node is non-schedulable and has a use in binop. If the parent node is non-schedulable (only externally used instructions), and at least one instruction has multiple uses and is used in the binop, such copyable node should not be created. Otherwise, it may contain a wrong def-use chain model, which cannot be effectively detected. Fixes #166035	2025-11-03 08:00:22 -08:00
Mel Chen	40a042e49c	[VPlanTransform] Specialize simplifyRecipe for VPSingleDefRecipe pointer. nfc (#165568 ) The function simplifyRecipe now takes a VPSingleDefRecipe pointer since it only simplifies single-def recipes for now.	2025-11-03 09:00:54 +00:00
Luke Lau	97d4e96cc5	[VPlan] Perform optimizeMaskToEVL in terms of pattern matching (#155394 ) Currently in optimizeMaskToEVL we convert every widened load, store or reduction to a VP predicated recipe with EVL, regardless of whether or not it uses the header mask. So currently we have to be careful when working on other parts of VPlan to make sure that the EVL transform doesn't break or transform something incorrectly, because it's not a semantics preserving transform. Forgetting to do so has caused miscompiles before, like the case that was fixed in #113667. This PR rewrites it to work in terms of pattern matching, so it now only converts a recipe to a VP predicated recipe if it is exactly masked with the header mask. After this the transform should be a true optimisation and not change any semantics, so it shouldn't miscompile things if other parts of VPlan change. This fixes #152541, and allows us to move addExplicitVectorLength into tryToBuildVPlanWithVPRecipes in #153144. It also splits out the load/store transforms into separate patterns for reversed and non-reversed, which should make #146525 easier to implement and reason about.	2025-11-03 16:53:18 +08:00
Ramkumar Ramachandra	912cc5f098	[VPlan] Improve getOrCreateVPValueForSCEVExpr (NFC) (#165699 ) Use early exit in getOrCreateVPValueForSCEVExpr.	2025-11-03 06:44:30 +00:00
Ramkumar Ramachandra	03eb3cdaaa	[VPlan] Rewrite sinkScalarOperands (NFC) (#151696 ) Rewrite sinkScalarOperands in VPlanTransforms for clarity, in preparation for follow-up work to extend it to handle more recipes.	2025-11-03 06:43:42 +00:00
Kazu Hirata	902b0bd04a	[llvm] Remove "const" in the presence of "constexpr" (NFC) (#166109 ) "const" is extraneous in the presence of "constexpr" for simple variables and arrays.	2025-11-02 15:52:44 -08:00
Kazu Hirata	707bab651f	[llvm] Remove redundant typename (NFC) (#166087 ) Identified with readability-redundant-typename.	2025-11-02 13:15:16 -08:00
Florian Hahn	1c727baf69	[VPlan] Mark BranchOnCount and BranchOnCond as having side effects (NFC) BranchOnCount and BranchOnCond do not read memory, but cannot be moved. Mark them as having side-effects, but not reading/writing memory, which more accurately models that above. This allows removing some special checking for branches both in the current code and future patches.	2025-11-02 21:14:37 +00:00
Kazu Hirata	c9ef3d8eb8	[Transforms] Use "= default" (NFC) (#166043 ) Identified with modernize-use-equals-default.	2025-11-02 08:59:24 -08:00
Florian Hahn	b7e922a3da	[VPlan] Convert BuildVector with all-equal values to Broadcast. (#165826 ) Fold BuildVector where all operands are equal to Broadcast of the first operand. This will subsequently make it easier to remove additional buildvectors/broadcasts, e.g. via https://github.com/llvm/llvm-project/pull/165506. PR: https://github.com/llvm/llvm-project/pull/165826	2025-11-01 17:28:42 -07:00
Florian Hahn	f773efcffb	[VPlan] Add VPIRMetadata parameter to VPInstruction constructor. (NFC) Update VPInstruction constructor to accept VPIRMetadata between the Flags and DebugLoc parameters. This allows metadata to be passed during construction rather than assigned afterward.	2025-11-01 21:57:52 +00:00
Florian Hahn	6e83937f39	[VPlan] Add getConstantInt helpers for constant int creation (NFC). Add getConstantInt helper methods to VPlan to simplify the common pattern of creating constant integer live-ins. Suggested as follow-up in https://github.com/llvm/llvm-project/pull/164127.	2025-11-01 04:13:01 +00:00
Florian Hahn	a943132761	[VPlan] Add VPRegionBlock::getCanonicalIVType (NFC). (#164127 ) Split off from https://github.com/llvm/llvm-project/pull/156262. Similar to VPRegionBlock::getCanonicalIV, add helper to get the type of the canonical IV, in preparation for removing VPCanonicalIVPHIRecipe. PR: https://github.com/llvm/llvm-project/pull/164127	2025-10-31 20:05:02 -07:00
Alexey Bataev	964c7711f4	[SLP]Fix the minbitwidth analysis for alternate opcodes If the alternate operation is stricter than the main operation, we cannot rely on the analysis of the main operation. In such a case, it is better to avoid doing the analysis at all, since it may affect the overall result and lead to incorrect optimization. Fixes #165878	2025-10-31 15:25:13 -07:00
Florian Hahn	317b42ef5c	[VPlan] Remove original recipe after narrowing to single-scalar. Directly remove RepOrWidenR after replacing all uses. Removing the dead user early unlocks additional opportunities for further narrowing.	2025-10-31 04:38:16 +00:00
Florian Hahn	683b00bb50	[VPlan] Limit VPScalarIVSteps to step == 1 in getSCEVExprForVPValue. For now, just support VPScalarIVSteps with step == 1 in getSCEVExprForVPValue. This fixes a crash when the step would be != 1.	2025-10-31 02:22:56 +00:00
Florian Hahn	b2d12d6f2b	[VPlan] Extend getSCEVForVPV, use to compute VPReplicateRecipe cost. (#161276 ) Update getSCEVExprForVPValue to handle more complex expressions, to use it in VPReplicateRecipe::computeCost. In particular, it supports constructing SCEV expressions for GetElementPtr VPReplicateRecipes, with operands that are VPScalarIVStepsRecipe, VPDerivedIVRecipe and VPCanonicalIVRecipe. If we hit a sub-expression we don't support yet, we return SCEVCouldNotCompute. Note that the SCEV expression is valid for VF = 1: we only support constructing AddRecs for VPCanonicalIVRecipe, which is an AddRec starting at 0 and stepping by 1. The returned SCEV expressions could be converted to a VF specific one, by rewriting the AddRecs to ones with the appropriate step. Note that the logic for constructing SCEVs for GetElementPtr was directly ported from ScalarEvolution.cpp. Another thing to note is that we construct SCEV expressions purely by looking at the operation of the recipe and its translated operands, w/o accessing the underlying IR (the exception being getting the source element type for GEPs). PR: https://github.com/llvm/llvm-project/pull/161276	2025-10-30 15:46:19 -07:00
Ramkumar Ramachandra	01fbbda62c	[LV] Strengthen assert: VPlan0 doesn't have WidenPHIs (NFC) (#165715 ) VPWidenCanonicalIV and VPBlend recipes are created by VPPredicator, and VPCanonicalIVPHI and VPInstruction recipes are created by VPlanConstruction. WidenPHIs are never created.	2025-10-30 18:32:33 +00:00
Florian Hahn	4c46ae3948	[LV] Only skip scalarization overhead for members used as address. Refine logic to scalarize interleave group members: only skip scalarization overhead for members being used as addresses. For others, use the regular scalar memory op cost. This currently doesn't trigger in practice as far as I could find, but fixes a potential divergence between VPlan- and legacy cost models. It fixes a concrete divergence with a follow-up patch, https://github.com/llvm/llvm-project/pull/161276.	2025-10-30 05:04:34 +00:00
Alexey Bataev	db6ba82acc	[SLP] Do not match the gather node with copyable parent, containing insert instruction If the gather/buildvector node has the match and this matching node has a scheduled copyable parent, and the parent node of the original node has a last instruction, which is non-schedulable and is part of the schedule copyable parent, such matching node should be excluded as non-matching, since it produces wrong def-use chain. Fixes #165435	2025-10-29 11:50:47 -07:00
Florian Hahn	98d3a25f74	[VPlan] Don't preserve LCSSA in expandSCEVs. (#165505 ) This follows similar reasoning as 45ce88758d24 (https://github.com/llvm/llvm-project/pull/159556): LV does not preserve LCSSA, it constructs it just before processing a loop to vectorize. Runtime check expressions are invariant to that loop, so expanding them should not break LCSSA form for the loop we are about to vectorize. LV creates SCEV and memory runtime checks early on and then disconnects the blocks temporarily. The patch fixes a mis-compile, where previously LCSSA construction during SCEV expand may replace uses in currently unreachable SCEV/memory check blocks. Fixes https://github.com/llvm/llvm-project/issues/162512 PR: https://github.com/llvm/llvm-project/pull/165505	2025-10-29 18:25:46 +00:00
Alexey Bataev	cf1f4896a7	[SLP]Check only instructions with unique parent instruction user Need to re-check the instruction with the non-schedulable parent, only if this parent has a user phi node (i.e. it is used only outside the block) and the user instruction has unique parent instruction. Fixes issue reported in `20675ee67d (commitcomment-168863594)`	2025-10-28 11:14:18 -07:00
Sam Tebbs	22f860a55d	[LV] Bundle (partial) reductions with a mul of a constant (#162503 ) A reduction (including partial reductions) with a multiply of a constant value can be bundled by first converting it from `reduce.add(mul(ext, const))` to `reduce.add(mul(ext, ext(const)))` as long as it is safe to extend the constant. This PR adds such bundling by first truncating the constant to the source type of the other extend, then extending it to the destination type of the extend. The first truncate is necessary so that the types of each extend's operand are then the same, and the call to canConstantBeExtended proves that the extend following a truncate is safe to do. The truncate is removed by optimisations. This is a stacked PR, 1a and 1b can be merged in any order: 1a. https://github.com/llvm/llvm-project/pull/147302 1b. https://github.com/llvm/llvm-project/pull/163175 2. -> https://github.com/llvm/llvm-project/pull/162503	2025-10-28 16:59:53 +00:00
Ramkumar Ramachandra	a2d873fb87	[VPlan] Introduce cannotHoistOrSinkRecipe, fix miscompile (#162674 ) Factor out common code to determine legality of hoisting and sinking. The patch has the side-effect of fixing an underlying bug, where a load/store pair is reordered.	2025-10-28 09:36:17 +00:00
Mel Chen	6bf948999f	[VPlan] Store memory alignment in VPWidenMemoryRecipe. nfc (#165255 ) Add a member Alignment to VPWidenMemoryRecipe to store memory alignment directly in the recipe. Update constructors, clone(), and relevant methods to use this stored alignment instead of querying the IR instruction. This allows VPWidenLoadRecipe/VPWidenStoreRecipe to be constructed without relying on the original IR instruction in the future.	2025-10-28 15:29:35 +08:00
Florian Hahn	523c796df7	[VPlan] Use VPlan type inference to get address space for recipes. (NFC) Instead of accessing the address space from the IR reference, retrieve it via type inference.	2025-10-28 04:51:24 +00:00
Kazu Hirata	042ac912b1	[llvm] Add "override" where appropriate (NFC) (#165168 ) Note that "override" makes "virtual" redundant. Identified with modernize-use-override.	2025-10-26 13:34:32 -07:00
Alexey Bataev	a7b188983f	[SLP]Consider non-inst operands, when checking insts, used outside only If the instructions in the node do not require scheduling and used outside basic block only, still need to check, if their operands are non-inst too. Such nodes should be emitted in the beginning of the block. Fixes #165151	2025-10-26 12:53:48 -07:00
Hassnaa Hamdi	be29f0dd86	[LV]: Improve accuracy of calculating remaining iterations of MainLoopVF (#156723 ) Transform TC and VF to same numerical space when they are different.	2025-10-26 14:45:44 +00:00
Florian Hahn	d020b2da54	[VPlan] Move isSingleScalar implementation to VPlanUtils.cpp (NFC) Move the implementation of vputils::isSingleScalar to VPlanUtils.cpp to enable code sharing.	2025-10-25 21:56:03 +01:00
Florian Hahn	8c29bce1e9	[VPlan] Remove SCEVToExpansion mapping (NFC). (#164490 ) VPlan::SCEVToExpansion isn't needed any longer, as SCEV expansion de-duplication is handled locally in expandSCEVs. PR: https://github.com/llvm/llvm-project/pull/164490	2025-10-24 21:38:23 +00:00
Ramkumar Ramachandra	2c6c2689c5	[VPlan] Extend tryToFoldLiveIns to fold binary intrinsics (#161703 ) InstSimplifyFolder can fold binary intrinsics, so take the opportunity to unify code with getOpcodeOrIntrinsicID, and handle the case. The additional handling of WidenGEP is non-functional, as the GEP is simplified before it is widened, as the included test shows.	2025-10-24 10:21:39 +00:00
Florian Hahn	301fa24671	[VPlan] Limit narrowInterleaveGroups to single block regions for now. Currently only regions with a single block are supported by the legality checks.	2025-10-23 23:55:59 +01:00
Sam Tebbs	6b19a546aa	[LV] Bundle partial reductions inside VPExpressionRecipe (#147302 ) This PR bundles partial reductions inside the VPExpressionRecipe class. Stacked PRs: 1. https://github.com/llvm/llvm-project/pull/147026 2. https://github.com/llvm/llvm-project/pull/147255 3. https://github.com/llvm/llvm-project/pull/156976 4. https://github.com/llvm/llvm-project/pull/160154 5. -> https://github.com/llvm/llvm-project/pull/147302 6. https://github.com/llvm/llvm-project/pull/162503 7. https://github.com/llvm/llvm-project/pull/147513	2025-10-23 11:18:55 +00:00
Florian Hahn	bfc322dd72	Revert "[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706 )" This reverts commit 8d29d09309654541fb2861524276ada6a3ebf84c. There have been reports of mis-compiles in https://github.com/llvm/llvm-project/pull/149706. Revert while I investigate.	2025-10-22 21:27:11 +01:00
Kerry McLaughlin	45c0b29171	[LV] Ignore user-specified interleave count when unsafe. (#153009 ) When a VF is specified via a loop hint, it will be clamped to a safe VF or ignored if it is found to be unsafe. This is not the case for user-specified interleave counts, which can lead to loops such as the following with a memory dependence being vectorised with interleaving: ``` #pragma clang loop interleave_count(4) for (int i = 4; i < LEN; i++) b[i] = b[i - 4] + a[i]; ``` According to [1], loop hints are ignored if they are not safe to apply. This patch adds a check to prevent vectorisation with interleaving if isSafeForAnyVectorWidth() returns false. This is already checked in selectInterleaveCount(). [1] https://llvm.org/docs/LangRef.html#llvm-loop-vectorize-and-llvm-loop-interleave	2025-10-22 15:21:27 +01:00
Florian Hahn	aca53f4375	[VPlan] Skip masked interleave groups in narrowInterleaveGroups. 8d29d09309 exposed a crash due to incorrectly trying to handle masked interleave recipes. For now, the current code does not support masked interleave recipes. Bail out for them.	2025-10-22 14:10:01 +01:00
Michael Kruse	6e0553f545	Reapply "[Polly] Update ScopInliner for NPM (#125427 )" (#164601 ) An assertion failed when Polly was registering for the pass manager which assumed that there would be only Polly passes. Since this does not need to be the case, re-apply with the assert removed. Includes a non-Polly change to trigger the premerge CI to trigger check-llvm which failed for 0b9a7b80c0674c5c6f746139912111bea7eae63b, but pre-merge did not catch.	2025-10-22 15:00:28 +02:00
Florian Hahn	82b59345fe	[VPlan] Clarify naming for helpers to create loop&replicate regions (NFC) Split off to clarify naming, as suggested in https://github.com/llvm/llvm-project/pull/156262.	2025-10-21 20:41:54 +01:00
Ramkumar Ramachandra	2ec01e430a	[VPlan] Move two VPBlockUtils members (NFC) (#162507 )	2025-10-21 16:40:13 +01:00
Alexey Bataev	20675ee67d	[SLP] Check all copyable children for non-schedulable parent nodes If the parent node is non-schedulable and it includes several copies of the same instruction, its operand might be replaced by the copyable nodes in multiple children nodes, and if the instruction is commutative, they can be used in different operands. The compiler shall consider this opportunity, taking into account that non-copyable children are scheduled only once for the same parent instruction. Fixes #164242	2025-10-21 06:39:49 -07:00
Florian Hahn	8d29d09309	[VPlan] Run narrowInterleaveGroups during general VPlan optimizations. (#149706 ) Move narrowInterleaveGroups to the general VPlan optimization stage. To do so, narrowInterleaveGroups now has to find a suitable VF where all interleave groups are consecutive and saturate the full vector width. If such a VF is found, the original VPlan is split into 2: a) a new clone which contains all VFs of Plan, except VFToOptimize, and b) the original Plan with VFToOptimize as single VF. The original Plan is then optimized. If a new copy for the other VFs has been created, it is returned and the caller has to add it to the list of candidate plans. Together with https://github.com/llvm/llvm-project/pull/149702, this allows taking the narrowed interleave groups into account when computing costs to choose the best VF and interleave count. One example where we currently miss interleaving/unrolling when narrowing interleave groups is https://godbolt.org/z/Yz77zbacz PR: https://github.com/llvm/llvm-project/pull/149706	2025-10-21 11:37:42 +01:00
Ramkumar Ramachandra	3fbae10faa	[VPlan] Improve code using m_APInt (NFC) (#161683 )	2025-10-21 10:27:03 +01:00

6709 Commits