llvm-project

Author	SHA1	Message	Date
Alexey Bataev	94ec7ffa46	[SLP] Do not skip tiny trees with gathered loads to vectorize The isTreeTinyAndNotFullyVectorizable check for 2-node trees (insertelement root + gather child) was too aggressive: it rejected trees even when LoadEntriesToVectorize was non-empty, preventing gathered loads from being vectorized into masked loads/strided loads, etc. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/190040	2026-04-02 06:47:53 -04:00
Alexey Bataev	c6669c4993	[SLP] Guard FMulAdd conversion to require single-use/non-reordered FMul operands The FMulAdd (CombinedVectorize) transformation in transformNodes() marks an FMul child entry with zero cost, assuming it is fully absorbed into the fmuladd intrinsic. However, when any FMul scalar has multiple uses (e.g., also stored separately), the FMul must survive as a separate node. Reviewers: hiraditya, RKSimon, bababuck Pull Request: https://github.com/llvm/llvm-project/pull/189692	2026-04-01 17:14:52 -04:00
Alexey Bataev	1e06cd634e	[SLP][NFC] Fix uninitialized ReductionRoot in getTreeCost ReductionRoot was initialized to nullptr instead of the RdxRoot parameter. This caused two ScaleCost calls (for MinBWs cast cost and ReductionBitWidth resize cost) to pass nullptr as the user instruction, and suppressed the "Reduction Cost" line in debug output. In practice the scale factor is the same because the tree root's main op and the reduction root share the same basic block, so this is NFC. Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/189994	2026-04-01 12:22:02 -04:00
Alexey Bataev	c20e233020	[SLP] Replace TrackedToOrig DenseMap with parallel SmallVector in reduction Replace the DenseMap<Value, Value> TrackedToOrig with a SmallVector<Value*> indexed in parallel with Candidates. This avoids hash-table overhead for the tracked-value-to-original-value mapping in horizontal reduction processing. Fixes #189686	2026-03-31 16:22:57 -07:00
Demetrius Kanios	96bd7b6e15	[CodeGen] Add additional params to `TargetLoweringBase::getTruncStoreAction` (#187422) The truncating store analogue of #181104. Adds `Alignment` and `AddrSpace` parameters to `TargetLoweringBase::getTruncStoreAction` and dependents, and introduces a `getCustomTruncStoreAction` hook for targets to customize legalization behavior using this new information. This change is fully backwards compatible from the target's point of view, with `setTruncStoreAction` having identical functionality. The change is purely additive.	2026-03-30 16:52:45 -07:00
Alexey Bataev	26e0d15eaa	[SLP] Prefer to trim equal-cost alternate-shuffle subtrees If the trimming candidate subtree is rooted at an alternate-shuffle node with binary ops, and this subtree has the same cost as the buildvector node, it is better to stick with the buildvector node to avoid runtime perf regressions from shuffle/extra operations overhead that the cost model may underestimate. Skip trimming if the subtree contains ExtractElement nodes, since those operate on already-materialized vectors, which may reduce vector-to-scalar code movement and give better perf. Reviewers: hiraditya, bababuck, fhahn, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/188272	2026-03-30 16:03:18 -04:00
Alexey Bataev	c7908d3320	[SLP][NFC]Use passing-by-ref in the range based loop to prevent warnings/errors	2026-03-30 03:47:00 -07:00
Alexey Bataev	4450891580	[SLP] Check if potential bitcast/bswap candidate is a root of reduction Need to check if the potential bitcast/bswap-like construct is a root of the reduction, otherwise it cannot represent a bitcast/bswap construct. Fixes #189184	2026-03-28 13:58:22 -07:00
Ryan Buchner	a125d9b5ef	[SLP][NFC] Reapply "Refactor to prepare for constant stride stores" (#188689) Refactor to precede #185964. Much of this is a refactor to address these issues. Instead of iterating over one chain at a time, attempting all VFs for that given chain, we now iterate over VFs, trying each chain for the current VF. Includes a fix for a use-after-free bug.	2026-03-27 10:11:49 -07:00
Alexey Bataev	1759b81de9	[SLP]Improve analysis of copyables operands for commutative main instruction For commutative copyables, instruction operands are always placed on the LHS and the others on the RHS. But if an instruction is the main one and has 2 instruction operands, and its RHS is more compatible with the LHS operands than its LHS is, such operands need to be swapped for better analysis. Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/185320	2026-03-26 16:03:58 -04:00
Alexey Bataev	d9a44c818f	[SLP]Initial support for vector register spills/reloads estimation Adds initial support for spill/reload estimation. Currently, it just runs over the operands and calculates the number of registers used by them. If this number is greater than the total number of available registers, it considers the first (full) groups as the candidates for spills/reloads. Reviewers: hiraditya, RKSimon, bababuck Pull Request: https://github.com/llvm/llvm-project/pull/187594	2026-03-26 14:27:27 -04:00
Alexey Bataev	1cb9a78b5a	[SLP] Fix incorrect operand info for select in getCmpSelInstrCost The operand info passed to getCmpSelInstrCost for Select instructions was using operands 0 and 1 (condition and true value), but the API expects info about the data operands (true and false values). For selects, the data operands are at indices 1 and 2, not 0 and 1. This led to the cost model receiving the condition's operand info instead of the false arm's, potentially producing inaccurate cost estimates. Reviewers: bababuck, hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/188506	2026-03-26 07:09:28 -04:00
Jordan Rupprecht	6937866f5c	Revert "[SLP][NFC] Refactor to prepare for constant stride stores" (#188669) Revert 26f344e1703229aea20df616b1dbc949fbc332e1. Causes crashes. Reduced test case: https://github.com/llvm/llvm-project/pull/185997#issuecomment-4131405777	2026-03-26 04:12:42 +00:00
Ryan Buchner	26f344e170	[SLP][NFC] Refactor to prepare for constant stride stores (#185997) Refactor to precede the addition of strided store chain vectorization. Instead of iterating over one chain at a time, attempting all VFs for that given chain, we now iterate over VFs, trying each chain for the current VF. This will allow us to handle chains that share elements.	2026-03-25 15:17:44 -07:00
Ryan Buchner	b3455c1b84	[SLP][NFC] Remove duplicated cast (#188532) Introduced in #188103.	2026-03-25 20:47:22 +00:00
Alexey Bataev	34889601a9	[SLP]Mark candidate instruction as reduced value, if it is the operand of another reduced value If the next candidate is the operand of one of the reduced value candidates, such an instruction should also be marked as a reduced value, not a reduction operation, even if all other requirements are met. This reduces compile time. Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/188103	2026-03-25 11:23:51 -04:00
Jinsong Ji	642bde76c3	[SLP] Fix infinite loop in ordered reduction worklist processing (#188342) The ordered reduction support introduced in 94e366ef2060 can cause an infinite loop when processing complex reduction chains. The worklist algorithm re-adds instructions from PossibleOrderedReductionOps when switching to ordered mode, but doesn't track which instructions have already been processed. This allows instructions to be re-added and processed multiple times, creating cycles. Add a Visited set to track processed instructions and skip any that have already been handled, preventing the infinite loop.	2026-03-24 21:19:53 +00:00
Alexey Bataev	5f0b3d6af4	[SLP][NFC]Fix formatting and debug printing, NFC	2026-03-24 09:06:22 -07:00
Alexey Bataev	af37ac8aee	[SLP]Use reduction root explicitly from reduction analysis to avoid non-determinism Initially, the reduction root was detected using the last member of the UserIgnoreList set, which is unordered. Better to use the reduction root explicitly to avoid non-determinism in the reduction parent block, which may cause incorrect scale factor estimation for the reduction cost.	2026-03-23 09:46:33 -07:00
Alexey Bataev	5b7ad38d6b	[SLP]Fix codegen of compares with consts, being trunced If the constant values have more active bits than requested by the other operand of the compare, such constants should not be truncated, to avoid miscompilation	2026-03-23 07:49:19 -07:00
Alexey Bataev	85f529dda1	Revert "[SLP]Fix codegen of compares with consts, being trunced" This reverts commit 16e0cc8308379857ecd69e6fe1aaf71e15b94910 to add a new test case for the miscompile	2026-03-23 06:27:08 -07:00
Alexey Bataev	16e0cc8308	[SLP]Fix codegen of compares with consts, being trunced If the constant values have more active bits than requested by the other operand of the compare, such constants should not be truncated, to avoid miscompilation	2026-03-23 06:04:08 -07:00
Alexey Bataev	b2ba79578b	[SLP]Fix patterns for compile time blow up with ordered reductions Excluded patterns, leading to compile time blow up for integer ordered reductions.	2026-03-22 13:42:54 -07:00
Alexey Bataev	88f830aed8	[SLP]Do not try to reduced instruction, marked for deletion in previous attempts Need to skip instructions, which were vectorized and marked for deletion to prevent a compiler crash	2026-03-22 10:10:48 -07:00
Alexey Bataev	616240369e	[SLP]Do not consider copyable node with SplitVectorize parent If the copyables are schedulable and the parent node is a split vectorize node, the scheduling analysis for such nodes must be skipped to avoid a compiler crash	2026-03-21 06:56:59 -07:00
Alexey Bataev	db143fb2b9	[SLP][NFC]Use block number instead of pointer for stable sorting, NFC	2026-03-21 04:30:32 -07:00
Alexey Bataev	b260861b38	[SLP]Update values after ordered vectorization Need to update matching between the original reduced values and their vectorized matches after ordered reduction vectorization to avoid a compiler crash	2026-03-20 13:33:40 -07:00
Alexey Bataev	94e366ef20	[SLP] Initial support for ordered reductions Patch models ordered reductions as a series of extractelements for the cases which cannot be modeled as unordered reductions. Fixes #50590 Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/182644	2026-03-20 13:45:14 -04:00
Alexey Bataev	2bb0fa46a8	[SLP]Prefer copyable over alternate If the instructions state is alternate and/or contains non-directly matching instructions, need to check if it is better to represent such operations as non-alternate with copyables. To do this, we need to compare operands between the instructions in their different representations and choose the best one for optimal vectorization. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/183777	2026-03-20 11:59:59 -04:00
Alexey Bataev	7d76a3122d	[SLP]Improve analysis for the shl-based reduced values with copyables (#185485) shl-based reduced values in many cases serve as the root of a bitcast/bswap-based transformation, but the analysis needs to be improved for better matching. This patch merges reduction candidates into a single reduced value array if there are only 2 different candidate arrays, one of which has only a single element and the other is a list of shl instructions. It also sorts these shl instructions by their shift amount and merges them with the single candidate, if it is profitable to have a copyable reduction.	2026-03-19 14:16:53 -04:00
Alexey Bataev	9050794e06	[SLP]Improve reductions for copyables/split nodes The original support for copyables leads to a regression in x264 on RISCV. This patch improves detection of the copyable candidates by more precise checking of the profitability and adds an extra check for split node reduction, if it is profitable. Fixes #184313 Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/185697	2026-03-19 12:03:05 -04:00
Alexey Bataev	582fa78753	[SLP]Do not match buildvector node, if current node is part of its combined nodes If current buildvector node is part of the combined nodes of the matching candidate node, this matching candidate must be considered as non-matching to prevent wrong def-use chain Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/187491	2026-03-19 08:15:32 -04:00
Alexey Bataev	abdcde9bbc	[SLP] Loop aware cost model/tree building Currently, the SLP vectorizer does not care about loops and their trip count. It may lead to inefficient vectorization in some cases. Patch adds loop nest-aware tree building and cost estimation. When it comes to tree building, it now checks that the tree does not span across different loop nests. The nodes from other loop nests are immediate buildvector nodes. The cost model adds the knowledge about loop trip count. If it is unknown, the default value is used, controlled by the -slp-cost-loop-min-trip-count=<value> option. The cost of the vector nodes in the loop is multiplied by the number of iterations (trip count), because each vector node will be executed the trip count number of times. This allows better cost estimation. Original Reviewers: jdenny-ornl, vporpo, hiraditya, RKSimon Original PR: https://github.com/llvm/llvm-project/pull/150450 Recommit after revert in c7bd3062f1dac975cf9b706f457b3c55b4bf57ff and in 4e500bd0015042b0cd4b7c87b81caeea06072d24 Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/187391	2026-03-18 17:54:01 -04:00
Alexey Bataev	4e500bd001	Revert "[SLP] Loop aware cost model/tree building" This reverts commit 6261cb4487f153c599a040d7a77524561b520240 to try to fix compile time regressions	2026-03-18 09:46:39 -07:00
Alexey Bataev	6261cb4487	[SLP] Loop aware cost model/tree building Currently, the SLP vectorizer does not care about loops and their trip count. It may lead to inefficient vectorization in some cases. Patch adds loop nest-aware tree building and cost estimation. When it comes to tree building, it now checks that the tree does not span across different loop nests. The nodes from other loop nests are immediate buildvector nodes. The cost model adds the knowledge about loop trip count. If it is unknown, the default value is used, controlled by the -slp-cost-loop-min-trip-count=<value> option. The cost of the vector nodes in the loop is multiplied by the number of iterations (trip count), because each vector node will be executed the trip count number of times. This allows better cost estimation. Reviewers: jdenny-ornl, vporpo, hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/150450 Recommit after revert in c7bd3062f1dac975cf9b706f457b3c55b4bf57ff	2026-03-18 07:33:07 -07:00
Ryan Buchner	af67e30a63	[SLP][NFC] Refactor BinOpSameOpcodeHelper BIT enum (#187067) More readable syntax and increase type width to avoid silent errors if we reach 17 members.	2026-03-17 12:38:14 -07:00
Alexis Engelke	43ec60eee5	Reland "[DomTree] Assert non-null block for pre-dom tree" (#187005) Reland #186790 with fix for SCEV. A loop can have more than one latch, in which case getLoopLatch returns null.	2026-03-17 14:10:04 +00:00
Alexey Bataev	d117f98ff6	[SLP]Fix legality checks for bswap-based transformations Fix the checks for the non-power-of-2 base bswaps by checking the power-of-2 of the source type, not the target scalar type. Plus, add cost estimation for zext, if the source type does not match the scalar type and fixes final bitcasting for the reduced values. Fixes https://github.com/llvm/llvm-project/pull/184018#issuecomment-4053477562	2026-03-16 11:56:24 -07:00
Alexis Engelke	e30aa40aa6	Revert "[DomTree] Assert non-null block for pre-dom tree" (#186831) Reverts llvm/llvm-project#186790. Breaks buildbots; there are more SLPVectorizer problems. https://lab.llvm.org/buildbot/#/builders/52/builds/15810	2026-03-16 17:29:35 +01:00
Alexey Bataev	61a9e30045	Revert "[SLP]Fix legality checks for bswap-based transformations" This reverts commit 2d4daea3b66469420fc164e76c15558b34e44c75 to fix a buildbot https://lab.llvm.org/buildbot/#/builders/164/builds/19737	2026-03-16 09:01:49 -07:00
Alexey Bataev	2d4daea3b6	[SLP]Fix legality checks for bswap-based transformations Fix the checks for the non-power-of-2 base bswaps by checking the power-of-2 of the source type, not the target scalar type. Plus, add cost estimation for zext, if the source type does not match the scalar type. Fixes https://github.com/llvm/llvm-project/pull/184018#issuecomment-4053477562	2026-03-16 08:40:44 -07:00
Alexis Engelke	d4c22859db	[DomTree] Assert non-null block for pre-dom tree (#186790) In a pre-dominator tree, blocks should never be null.	2026-03-16 16:07:49 +01:00
Alexey Bataev	50822d6b25	[SLP]Do not request the last instruction for first buildvector nodes with no state If looking for the match of the gather/buildvector node and its root is the first node, which is also a buildvector/gather and has no state, we should skip the analysis for such nodes to prevent a compiler crash Fixes #185851	2026-03-11 10:11:09 -07:00
Alexey Bataev	aa90add989	[SLP]Track vectorized values in reductions for correct handling between vectorization Need to use WeakTrackingVH handler instead of the Value * to correctly track modified/replaced vectorized instructions Fixes https://github.com/llvm/llvm-project/pull/182760#issuecomment-4036706233	2026-03-11 06:05:08 -07:00
Alexey Bataev	c7bd3062f1	Revert "[SLP] Loop aware cost model/tree building" This reverts commit 8963edb534e28d548d8381675bb18af1770c3041 to fix miscompilations/compile time regressions, reported in https://github.com/llvm/llvm-project/pull/150450#issuecomment-4037224288, https://github.com/llvm/llvm-project/pull/150450#issuecomment-4037481719 and https://github.com/llvm/llvm-project/pull/150450#issuecomment-4038134121	2026-03-11 04:37:54 -07:00
Alexey Bataev	8963edb534	[SLP] Loop aware cost model/tree building Currently, the SLP vectorizer does not care about loops and their trip count. It may lead to inefficient vectorization in some cases. Patch adds loop nest-aware tree building and cost estimation. When it comes to tree building, it now checks that the tree does not span across different loop nests. The nodes from other loop nests are immediate buildvector nodes. The cost model adds the knowledge about loop trip count. If it is unknown, the default value is used, controlled by the -slp-cost-loop-min-trip-count=<value> option. The cost of the vector nodes in the loop is multiplied by the number of iterations (trip count), because each vector node will be executed the trip count number of times. This allows better cost estimation. Reviewers: jdenny-ornl, vporpo, hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/150450	2026-03-10 16:14:57 -04:00
tudinhh	c192e8c9e3	[SLP] Fix misvectorization in commutative to non-commutative conversion (#185230) Summary: Fixes a miscompilation where commutative operations (e.g., or, and, mul) with a left-hand side constant were incorrectly transformed into non-commutative operations (e.g., shl, sub). The Problem: In `BinOpSameOpcodeHelper::getOperand`, when a constant is at `Pos == 0`, the helper was failing to swap operand order for new non-commutative target opcodes. This resulted in inverted logic, such as transforming `or 0, %x` into `shl 0, %x` (resulting in 0) instead of the correct `%x << 0`. The Fix: The existing logic only protected the Sub opcode. This patch generalizes the fix to all non-commutative instructions by using `!Instruction::isCommutative(ToOpcode)`. This ensures that for any directional operation, the variable is correctly placed on the LHS and the constant on the RHS. Changes: SLPVectorizer.cpp: Replaced the specific Sub check with a general isCommutative check. Regression Test: Added lhs-constant-non-cummutative.ll to cover shl, sub, and ashr targets. Fixes #185186	2026-03-09 16:17:39 -04:00
Alexey Bataev	95919ecd57	[SLP]Allow bitcast/bswap based reductions for types, larger than the total strided size Added support for zero extending the bitcasted/bswapped type to the original type, if it is larger than the original scalar type Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/184018	2026-03-08 10:37:09 -04:00
Alexey Bataev	e0e5000ea7	[SLP]Remove Alternate early profitability checks in favor of throttling Removes early check, which may prevent some further optimizations, in favor of tree throttling. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/182760	2026-03-08 09:37:51 -04:00
Alexey Bataev	d8b718a3fa	[SLP]Match the mask size, when copying mask for full match Need to be careful, when filling the mask for fully matched nodes, the masks may differ in sizes Fixes a crash reported in test/Transforms/SLPVectorizer/X86/mask-size-less-common-mask.ll	2026-03-08 05:33:30 -07:00

2528 Commits