llvm-project

Author	SHA1	Message	Date
Florian Hahn	72376e19df	[VPlan] Remove unused VPWidenIntrinsicRecipe constructor (NFC)	2025-03-06 20:24:30 +00:00
Ramkumar Ramachandra	ddffb74afd	[LV] Strip unreachable SCEV-check blocks (#130079 ) emitSCEVChecks checks if SCEVCheckCond matches zero, and returns nullptr. However, it sets SCEVCheckCond as used before it does this, which prevents it from being removed during cleanup, resulting in unreachable blocks being emitted. Fix this.	2025-03-06 19:30:25 +00:00
Ramkumar Ramachandra	00f3089c2e	[LV] Use PatternMatch in emitTransformedIndex (NFC) (#130081 )	2025-03-06 19:23:31 +00:00
Alexey Bataev	4959025bbc	[SLP]Fix non-determinism in reused elements analysis Use consistent storage for the unique elements when iterating over them, to avoid non-determinism in reused elements analysis. Fixes #130082	2025-03-06 10:12:49 -08:00
Luke Lau	6d89c042e3	[VPlan] Remove dead AnyOf reduction case in VPReductionRecipe. NFCI (#130048 ) From what I understand, we only create VPReductionRecipes for in-loop reductions, and we don't currently support in-loop AnyOf reductions. We only create VPReductionRecipes in the !PhiR->isInLoop() section of adjustRecipesForReductions, and this comment from the initial patch seems to confirm this https://reviews.llvm.org/D108136#anchor-inline-1038338, so I think we can remove this check in the condition logic. I checked compiling SPEC 2017 with -prefer-inloop-predicates and the added assertion doesn't trigger.	2025-03-07 01:05:53 +08:00
Alexey Bataev	31845cf06c	Revert "[SLP]Fix non-determinism in reused elements analysis" This reverts commit 3158525afdc3677457712963ef45c83f4f8f900f to fix a bug revealed in https://lab.llvm.org/buildbot/#/builders/123/builds/14930	2025-03-06 08:59:08 -08:00
Alexey Bataev	3158525afd	[SLP]Fix non-determinism in reused elements analysis Use consistent storage for the unique elements when iterating over them, to avoid non-determinism in reused elements analysis. Fixes #130082	2025-03-06 08:51:31 -08:00
Alexey Bataev	1182be503d	[SLP]Fix a crash for buildvector nodes with parent phi nodes with same incoming blocks When trying to find a matching buildvector node for another node, if both nodes are used by vectorized phi nodes and come from the same parent block, these nodes should be considered matched to avoid a crash.	2025-03-06 07:42:43 -08:00
hanbeom	5d1029b4a8	[VectorCombine] Handle shuffle of selects (#128032 ) shuffle(select(c1,t1,f1), select(c2,t2,f2), m) -> select(shuffle(c1,c2,m), shuffle(t1,t2,m), shuffle(f1,f2,m)) The behaviour of SelectInst on vectors is `V'select[i] = Condition[i] ? V'True[i] : V'False[i]`. A ShuffleVector performed on two selects picks, for each mask element, one lane of those per-lane results: `V'[mask] = (V'select[i] = Condition[i] ? V'True[i] : V'False[i])`. That is why a ShuffleVector of two SelectInsts is equivalent to first applying the ShuffleVector to the Condition/True/False operands and then applying a SelectInst to the results. This patch implements the transform described above. Proof: https://alive2.llvm.org/ce/z/97wfHp Fixes #120775	2025-03-06 12:43:47 +00:00
Luke Lau	5e54c92314	[VPlan] Fix crash when unrolling in-loop reduction chains (#129840 ) If an in-loop reduction is chained e.g. WIDEN-REDUCTION-PHI ir<%rdx> = phi ir<0>, ir<%add2> REDUCE ir<%add1> = ir<%rdx> + reduce.add (ir<%x>) REDUCE ir<%add2> = ir<%add1> + reduce.add (ir<%y>) When we try to unroll the second add reduction, we crash because we currently expect the chain to be a VPReductionPHIRecipe, when in fact it's the previous reduction. This relaxes the cast to a dyn_cast, so we end up unrolling to: WIDEN-REDUCTION-PHI ir<%rdx> = phi ir<0>, ir<%add2> WIDEN-REDUCTION-PHI ir<%rdx>.1 = phi ir<0>, ir<%add2>.1, ir<1> WIDEN-REDUCTION-PHI ir<%rdx>.2 = phi ir<0>, ir<%add2>.2, ir<2> WIDEN-REDUCTION-PHI ir<%rdx>.3 = phi ir<0>, ir<%add2>.3, ir<3> REDUCE ir<%add1> = ir<%rdx> + reduce.add (ir<%x>) REDUCE ir<%add1>.1 = ir<%rdx>.1 + reduce.add (ir<%x>.1) REDUCE ir<%add1>.2 = ir<%rdx>.2 + reduce.add (ir<%x>.2) REDUCE ir<%add1>.3 = ir<%rdx>.3 + reduce.add (ir<%x>.3) REDUCE ir<%add2> = ir<%add1> + reduce.add (ir<%y>) REDUCE ir<%add2>.1 = ir<%add1>.1 + reduce.add (ir<%y>.1) REDUCE ir<%add2>.2 = ir<%add1>.2 + reduce.add (ir<%y>.2) REDUCE ir<%add2>.3 = ir<%add1>.3 + reduce.add (ir<%y>.3) This fixes a crash when building 525.x264_r from SPEC CPU 2017 on AArch64 with -mllvm -prefer-inloop-reductions	2025-03-05 19:13:23 +08:00
Luke Lau	e1cea0d928	[LV][TTI] Remove unused ReductionFlags. NFC (#129858 ) No in-tree targets currently use it in the preferInLoopReduction/preferPredicatedReductionSelect TTI hooks. It looks like it used to be used in LoopUtils, at least in 8ca60db40bd944dc5f67e0f200a403b4e03818ea, but I presume it was replaced by RecurrenceDescriptor.	2025-03-05 18:31:12 +08:00
Alexey Bataev	855178af99	[SLP]Fix/improve getSpillCost analysis The previous implementation could take extra time because it walked over the same instructions several times. It also did not include proper analysis for cross-basic-block use of the vectorized values. This version fixes both issues. It walks over the tree and checks the deps between entries and their operands. If there are non-vectorized calls in between, it adds a single(!) spill cost, because the vector value should be spilled/reloaded only once. This version also caches the analysis for each entry it has processed and does not repeat it, reusing the data found during the analysis of previous nodes. It also has an internal limit: if the number of instructions between nodes and their operands is too big (greater than ScheduleRegionSizeBudget / VectorizableTree.size()), a spill is considered to be required. This improves compile time. Reviewers: preames, RKSimon, mikhailramalho Reviewed By: preames Pull Request: https://github.com/llvm/llvm-project/pull/129258	2025-03-04 15:47:23 -05:00
Florian Hahn	b2d70e8796	[VPlan] Use Builder to create cast recipes in VPlanTransforms (NFC). Use VPBuilder in a few more places. This avoids manual insertions and will make changing the cast recipe easier in the future.	2025-03-04 13:39:12 +00:00
Luke Lau	47fb9c4bb9	[VPlan] Add Name argument to VPWidenPHIRecipe. NFC (#129527 ) This allows a different IR name for the generated phi to be used. This is split off from #118638 and helps remove some of the diffs in it.	2025-03-04 16:47:21 +08:00
Ramkumar Ramachandra	80bdfcd411	[LoopUtils] Don't wrap in getLoopEstimatedTripCount (#129080 ) getLoopEstimatedTripCount returns the trip count based on profiling data, and its documentation says that it could return 0 when the trip count is zero, but this is not the case: a valid trip count can never be zero, and it returns 0 when the unsigned ExitCount is incremented by 1 and wraps. Some callers are careful about checking for this zero value in an std::optional, but it makes for an API with footguns, as a std::optional return value indicates that a non-nullopt value would be a valid trip count. Fix this by explicitly returning std::nullopt when the return value would wrap, and strip additional checks in callers. This also fixes a minor bug in LoopVectorize.	2025-03-04 08:43:08 +00:00
Florian Hahn	15770a1e9d	[VPlan] Remove dead recipes in entry when merging regions. (NFC) Also remove recipes in the entry of the region that will be removed. This makes sure we don't leave any dead users around. NFC at the moment, but avoids causing issues in the future.	2025-03-04 08:26:27 +00:00
Florian Hahn	dfc5f37e3a	[VPlan] Move onlyFirstLaneUsed to VPWidenInductionRecipe (NFC). Move onlyFirstLaneUsed from VPWidenIntOrFpInductionRecipe and VPWidenPointerInduction to VPWidenInductionRecipe. Also mark step value as having only its first lane used.	2025-03-03 22:43:47 +00:00
Florian Hahn	87f837cb26	[VPlan] Remove unneeded classof with VPHeaderRecipe args (NFC). The extra classof implementation is not needed any longer.	2025-03-03 20:52:28 +00:00
Mel Chen	9b4ad2fe50	[LV][EVL] Support fixed-order recurrence idiom with EVL tail folding. (#124093 ) This patch converts the llvm.vector.splice intrinsic to llvm.experimental.vp.splice, ensuring that fixed-order recurrences execute correctly when tail folding by EVL is enabled. Due to the non-VFxUF penultimate EVL issue, the EVL from the previous iteration will be preserved and used in llvm.experimental.vp.splice.	2025-03-03 21:27:13 +08:00
chrisPyr	71f4c7dabe	[NFC]Make file-local cl::opt global variables static (#126486 ) #125983	2025-03-03 13:46:33 +07:00
Florian Hahn	ba7e27381f	[VPlan] Use VP_CLASSOF_IMPL in VPWidenRecipe. (NFC)	2025-03-02 11:40:23 +00:00
Florian Hahn	f937b17e85	[LV] Don't query SCEV for non-invariant values in cost model. This fixes a divergence between VPlan and legacy cost model, matching behavior further up in getInstructionCost as well. Fixes https://github.com/llvm/llvm-project/issues/129236.	2025-03-02 10:55:52 +00:00
Florian Hahn	75270e3750	[VPlan] Don't print VPlan DT after VPlan construction. (NFC) Remove unnecessary code to just print VPlan dominator tree.	2025-03-01 21:15:56 +00:00
Simon Pilgrim	5ddf40fa78	[VectorCombine] scalarizeLoadExtract - don't create scalar loads if any extract is waiting to be erased (#129375 ) If any extract is waiting to be erased, then bail out as this will distort the cost calculation and possibly lead to infinite loops. Fixes #129373	2025-03-01 16:54:22 +00:00
Florian Hahn	9f37cdca52	[VPlan] Update VPTransformState accessors to take const VPValue (NFC). This will enable using const VPValue * pointers in more places.	2025-03-01 13:15:37 +00:00
Simon Pilgrim	8f4d2e02be	[VectorCombine] scalarizeLoadExtract - add debug message for match + cost-comparison Helps with debugging by showing that the fold found the match, and shows the old + new costs to indicate whether the fold was/wasn't profitable.	2025-03-01 09:57:08 +00:00
Jie Fu	7cf2f602df	[Vectorize] Fix unused variable warnings (NFC) /llvm-project/llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes/TransactionAcceptOrRevert.cpp:24:8: error: unused variable 'CostBefore' [-Werror,-Wunused-variable] auto CostBefore = SB.getBeforeCost(); ^ /llvm-project/llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes/TransactionAcceptOrRevert.cpp:25:8: error: unused variable 'CostAfter' [-Werror,-Wunused-variable] auto CostAfter = SB.getAfterCost(); ^ 2 errors generated.	2025-03-01 09:39:58 +08:00
vporpo	45d018097c	[SandboxVec][NFC] Add LLVM_DEBUG dumps (#129335 ) This patch updates/adds LLVM_DEBUG dumps. It moves the DEBUG_TYPE into SandboxVectorizer/Debug.h such that it can be shared across all components of the vectorizer.	2025-02-28 16:10:34 -08:00
vporpo	6ff0f69fec	[SandboxVec][BottomUpVec] Fix vectorization of vector constants (#129290 ) This patch fixes the value we generate when we vectorize constants.	2025-02-28 14:37:48 -08:00
Alexey Bataev	a36a67c79a	[SLP]Fix the analysis of the user buildvector nodes for minbitwidth If the user node is a buildvector/gather node and it has no internal instructions state, need to check properly for this state and check the type of the node itself, not its operands. Fixes #129242	2025-02-28 13:17:14 -08:00
Florian Hahn	f9b2497055	[VPlan] Use const for VPBasicBlock* in key in VPBB2IRBB (NFC). This allows queries in places where only a const pointer to VPBasicBlocks is available.	2025-02-28 20:45:11 +00:00
Alexey Bataev	e1e20c07e4	[SLP]Fix bitwidth analysis for signed nodes, incoming into UITOFP nodes If the signed node is the operand of UITOFP, the bitwidth analysis should consider minimum value between incoming bitwidth and the bitwidth of the UITOFP node. Fixes #129244	2025-02-28 11:50:50 -08:00
vporpo	c7529248cd	[SandboxVec][BottomUpVec] Add -sbvec-stop-bndl flag for debugging (#129132 ) This patch adds a helper flag for bisection debugging. This flag force-stops vectorization after this many bundles have been considered for vectorization. Using -sbvec-stop-bndl=0 will not vectorize the code at all.	2025-02-28 11:19:41 -08:00
Florian Hahn	c0bf4b2c57	[VPlan] Remove unneeded VPValue::getLiveInIRValue() const (NFC). The accessor is not needed/used.	2025-02-28 17:01:19 +00:00
vporpo	32bcc9f0d3	[SandboxVec] Add option -sbvec-allow-file for bisection debugging (#129127 ) This new option lets you specify an allow-list of source files and disables vectorization if the IR is not in the list. This can be used for debugging miscompiles.	2025-02-27 14:15:04 -08:00
vporpo	adf0abf354	[SandboxVec][BottomUpVec] Add -sbvec-stop-at flag for debugging (#129097 ) When debugging miscompiles we need a way to force-stop the vectorizer early. This helps figure out which invocation is generating incorrect code.	2025-02-27 13:33:54 -08:00
Florian Hahn	6ce41db6b0	[VPlan] Preserve DebugLoc for VPBranchOnMaskRecipe. Update code to set and generate debug location for branch recipe	2025-02-27 20:19:42 +00:00
Florian Hahn	253d691596	[VPlan] Update VPBranchOnMaskRecipe to always set the mask (NFC). The mask is always available at construction time. Make it non-optional to simplify code.	2025-02-27 18:53:24 +00:00
vporpo	e2b0d5df84	[SandboxVec][Scheduler] Enforce scheduling SchedBundle instrs back-to-back (#128092 ) This patch fixes the behavior of the scheduler by making sure the instrs that are part of a SchedBundle are scheduled back-to-back.	2025-02-27 10:23:50 -08:00
Florian Hahn	1e1b9bccc0	[VPlan] Simplify BLEND %a, %b, NOT(%m) -> BLEND %b, %a, %m. (#128375 ) Avoid negations for normalized blends by reordering operands. PR: https://github.com/llvm/llvm-project/pull/128375	2025-02-27 17:43:24 +00:00
Alexey Bataev	69effe054c	[SLP]Check for potential safety of the truncation for vectorized scalars with multi uses If the vectorized scalars have multiple uses, we need to check whether it is safe to truncate the vectorized value before actually doing so. Otherwise, the compiler may lose some important bits, which may lead to a miscompilation. Fixes #129057	2025-02-27 08:41:46 -08:00
David Sherwood	65c45bfa7d	[LoopVectorize][NFC] Fix formatting issue with a comment (#129033 )	2025-02-27 12:51:04 +00:00
John Brawn	8150ab93f7	[LoopVectorize] Use CodeSize as the cost kind for minsize (#124119 ) Functions marked with minsize should aim for minimum code size, so the vectorizer should use CodeSize for the cost kind and also the cost we compare should be the cost for the entire loop: it shouldn't be divided by the number of vector elements and block costs shouldn't be divided by the block probability. Possibly we should also be doing this for optsize as well, but there are a lot of tests that assume the current behaviour and the definition of optsize is less clear than minsize (for minsize the goal is to "keep the code size of this function as small as possible" whereas for optsize it's "keep the code size of this function low").	2025-02-27 11:07:02 +00:00
Benjamin Maxwell	3307b0374a	[LV] Teach the loop vectorizer llvm.sincos is trivially vectorizable (#128035 ) Depends on #123210	2025-02-27 09:37:06 +00:00
Alexey Bataev	39bab1de33	[SLP]Check if the operand for removal is the reduction operand, awaiting for the reduction If the operand of the instruction-to-be-removed is a reduction value, which is not reduced yet, and, thus, it has no users, it may be removed during operands analysis. Fixes #128736	2025-02-26 14:17:11 -08:00
Alexey Bataev	418a987285	[SLP]Do not use node, if it is a subvector or buildvector node If the buildvector has some matches with another node, which is a subvector of another buildvector node, need to check for this and cancel matching to avoid incorrect ordering of the nodes. Fixes #128770	2025-02-26 13:25:37 -08:00
Florian Hahn	4277c21059	[VPlan] Introduce explicit broadcasts for live-ins. (#124644 ) Add a new VPInstruction::Broadcast opcode and use it to materialize explicit broadcasts of live-ins. The initial patch only materializes the broadcasts if the vector preheader dominates all uses that need it. Later patches will pick the best valid insert point, thus retiring implicit hoisting of broadcasts from VPTransformState::get(). PR: https://github.com/llvm/llvm-project/pull/124644	2025-02-26 13:57:51 +00:00
Han-Kuan Chen	a12ca57c1c	[SLP][REVEC] Add getScalarizationOverhead helper function to reduce error when REVEC is enabled. (#128530 )	2025-02-25 23:16:05 +08:00
Florian Hahn	522b05afb6	[VPlan] Construct immutable VPIRBBs for exit blocks at construction(NFC) (#128374 ) Construct immutable VPIRBasicBlocks for all exit blocks up front and keep a list of them. Same as the scalar header, they are leaf nodes of the VPlan and won't change. Some exit blocks may be unreachable, e.g. if the scalar epilogue always executes or depending on optimizations. This simplifies both the way we retrieve the exit blocks as well as hooking up the exit blocks. PR: https://github.com/llvm/llvm-project/pull/128374	2025-02-25 14:23:27 +00:00
Elvis Wang	8009c1fd81	[LV][VPlan] Prevent calculate cost for skiped instructions in precomputeCosts(). (#127966 ) Skip calculating instruction costs for exit conditions in precomputeCosts() when it should be skipped. Reported from: https://github.com/llvm/llvm-project/issues/115744#issuecomment-2670479463 Godbolt for reduced test cases: https://godbolt.org/z/fr4YMeqcv	2025-02-25 11:09:09 +08:00
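
The shuffle-of-selects fold described in commit 5d1029b4a8 can be checked informally on a toy model. The sketch below is not LLVM code: it models vectors as Python lists, and the helper names (`select`, `shuffle`, `check_equivalence`) are hypothetical names chosen for illustration. It assumes every mask element is a valid index into the concatenation of the two inputs, so poison/undef mask elements are not modelled; the Alive2 proof linked in the commit remains the authoritative argument.

```python
import random


def select(cond, true_vals, false_vals):
    """Lane-wise select: result[i] = true_vals[i] if cond[i] else false_vals[i]."""
    return [t if c else f for c, t, f in zip(cond, true_vals, false_vals)]


def shuffle(a, b, mask):
    """Toy shufflevector: each mask element indexes into the concatenation a + b."""
    both = a + b
    return [both[i] for i in mask]


def shuffle_of_selects(c1, t1, f1, c2, t2, f2, mask):
    """Form before the fold: shuffle(select(c1,t1,f1), select(c2,t2,f2), m)."""
    return shuffle(select(c1, t1, f1), select(c2, t2, f2), mask)


def select_of_shuffles(c1, t1, f1, c2, t2, f2, mask):
    """Form after the fold: select(shuffle(c1,c2,m), shuffle(t1,t2,m), shuffle(f1,f2,m))."""
    return select(shuffle(c1, c2, mask), shuffle(t1, t2, mask), shuffle(f1, f2, mask))


def check_equivalence(width=4, trials=1000, seed=0):
    """Compare both forms on random inputs; return True if they always agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        c1 = [rng.random() < 0.5 for _ in range(width)]
        c2 = [rng.random() < 0.5 for _ in range(width)]
        t1, f1, t2, f2 = ([rng.randrange(100) for _ in range(width)] for _ in range(4))
        mask = [rng.randrange(2 * width) for _ in range(width)]
        before = shuffle_of_selects(c1, t1, f1, c2, t2, f2, mask)
        after = select_of_shuffles(c1, t1, f1, c2, t2, f2, mask)
        if before != after:
            return False
    return True
```

Calling `check_equivalence()` exercises both forms on random conditions, operands and masks. Random testing of a list model only illustrates why the two forms agree lane by lane; it says nothing about the cost comparison that decides whether VectorCombine actually applies the fold.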

5663 Commits