llvm-project

Author	SHA1	Message	Date
Alexey Bataev	6261cb4487	[SLP] Loop aware cost model/tree building Currently, the SLP vectorizer does not take loops and their trip counts into account, which may lead to inefficient vectorization in some cases. The patch adds loop nest-aware tree building and cost estimation. During tree building, it now checks that the tree does not span different loop nests. The nodes from other loop nests are immediate buildvector nodes. The cost model adds the knowledge about loop trip count. If it is unknown, a default value is used, controlled by the -slp-cost-loop-min-trip-count=<value> option. The cost of the vector nodes in the loop is multiplied by the number of iterations (trip count), because each vector node will be executed the trip count number of times. This allows better cost estimation. Reviewers: jdenny-ornl, vporpo, hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/150450 Recommit after revert in c7bd3062f1dac975cf9b706f457b3c55b4bf57ff	2026-03-18 07:33:07 -07:00
Ryan Buchner	af67e30a63	[SLP][NFC] Refactor BinOpSameOpcodeHelper BIT enum (#187067) More readable syntax and increase type width to avoid silent errors if we reach 17 members.	2026-03-17 12:38:14 -07:00
Alexis Engelke	43ec60eee5	Reland "[DomTree] Assert non-null block for pre-dom tree" (#187005) Reland #186790 with fix for SCEV. A loop can have more than one latch, in which case getLoopLatch returns null.	2026-03-17 14:10:04 +00:00
Alexey Bataev	d117f98ff6	[SLP]Fix legality checks for bswap-based transformations Fix the checks for the non-power-of-2 base bswaps by checking the power-of-2 of the source type, not the target scalar type. Plus, add cost estimation for zext, if the source type does not match the scalar type and fixes final bitcasting for the reduced values. Fixes https://github.com/llvm/llvm-project/pull/184018#issuecomment-4053477562	2026-03-16 11:56:24 -07:00
Alexis Engelke	e30aa40aa6	Revert "[DomTree] Assert non-null block for pre-dom tree" (#186831) Reverts llvm/llvm-project#186790 Breaks buildbots, there are more SLPVectorizer problems. https://lab.llvm.org/buildbot/#/builders/52/builds/15810	2026-03-16 17:29:35 +01:00
Alexey Bataev	61a9e30045	Revert "[SLP]Fix legality checks for bswap-based transformations" This reverts commit 2d4daea3b66469420fc164e76c15558b34e44c75 to fix a buildbot https://lab.llvm.org/buildbot/#/builders/164/builds/19737	2026-03-16 09:01:49 -07:00
Alexey Bataev	2d4daea3b6	[SLP]Fix legality checks for bswap-based transformations Fix the checks for the non-power-of-2 base bswaps by checking the power-of-2 of the source type, not the target scalar type. Plus, add cost estimation for zext, if the source type does not match the scalar type. Fixes https://github.com/llvm/llvm-project/pull/184018#issuecomment-4053477562	2026-03-16 08:40:44 -07:00
Alexis Engelke	d4c22859db	[DomTree] Assert non-null block for pre-dom tree (#186790) In a pre-dominator tree, blocks should never be null.	2026-03-16 16:07:49 +01:00
Alexey Bataev	50822d6b25	[SLP]Do not request the last instruction for first buildvector nodes with no state If looking for the match of the gather/buildvector node and its root is a first node, which is also a buildvector/gather and has no state, we should skip the analysis for such nodes to prevent a compiler crash. Fixes #185851	2026-03-11 10:11:09 -07:00
Alexey Bataev	aa90add989	[SLP]Track vectorized values in reductions for correct handling between vectorization Need to use WeakTrackingVH handler instead of the Value * to correctly track modified/replaced vectorized instructions Fixes https://github.com/llvm/llvm-project/pull/182760#issuecomment-4036706233	2026-03-11 06:05:08 -07:00
Alexey Bataev	c7bd3062f1	Revert "[SLP] Loop aware cost model/tree building" This reverts commit 8963edb534e28d548d8381675bb18af1770c3041 to fix miscompilations/compile time regressions, reported in https://github.com/llvm/llvm-project/pull/150450#issuecomment-4037224288, https://github.com/llvm/llvm-project/pull/150450#issuecomment-4037481719 and https://github.com/llvm/llvm-project/pull/150450#issuecomment-4038134121	2026-03-11 04:37:54 -07:00
Alexey Bataev	8963edb534	[SLP] Loop aware cost model/tree building Currently, the SLP vectorizer does not take loops and their trip counts into account, which may lead to inefficient vectorization in some cases. The patch adds loop nest-aware tree building and cost estimation. During tree building, it now checks that the tree does not span different loop nests. The nodes from other loop nests are immediate buildvector nodes. The cost model adds the knowledge about loop trip count. If it is unknown, a default value is used, controlled by the -slp-cost-loop-min-trip-count=<value> option. The cost of the vector nodes in the loop is multiplied by the number of iterations (trip count), because each vector node will be executed the trip count number of times. This allows better cost estimation. Reviewers: jdenny-ornl, vporpo, hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/150450	2026-03-10 16:14:57 -04:00
tudinhh	c192e8c9e3	[SLP] Fix misvectorization in commutative to non-commutative conversion (#185230) Summary: Fixes a miscompilation where commutative operations (e.g., or, and, mul) with a left-hand side constant were incorrectly transformed into non-commutative operations (e.g., shl, sub). The Problem: In `BinOpSameOpcodeHelper::getOperand`, when a constant is at `Pos == 0`, the helper was failing to swap operand order for new non-commutative target opcodes. This resulted in inverted logic, such as transforming `or 0, %x` into `shl 0, %x` (resulting in 0) instead of the correct `%x << 0`. The Fix: The existing logic only protected the Sub opcode. This patch generalizes the fix to all non-commutative instructions by using `!Instruction::isCommutative(ToOpcode)`. This ensures that for any directional operation, the variable is correctly placed on the LHS and the constant on the RHS. Changes: SLPVectorizer.cpp: Replaced the specific Sub check with a general isCommutative check. Regression Test: Added lhs-constant-non-cummutative.ll to cover shl, sub, and ashr targets. Fixes #185186	2026-03-09 16:17:39 -04:00
Alexey Bataev	95919ecd57	[SLP]Allow bitcast/bswap based reductions for types, larger than the total strided size Added support for zero extending the bitcasted/bswapped type to the original type, if it is larger than the original scalar type Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/184018	2026-03-08 10:37:09 -04:00
Alexey Bataev	e0e5000ea7	[SLP]Remove Alternate early profitability checks in favor of throttling Removes early check, which may prevent some further optimizations, in favor of tree throttling. Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/182760	2026-03-08 09:37:51 -04:00
Alexey Bataev	d8b718a3fa	[SLP]Match the mask size, when copying mask for full match Need to be careful, when filling the mask for fully matched nodes, the masks may differ in sizes Fixes a crash reported in test/Transforms/SLPVectorizer/X86/mask-size-less-common-mask.ll	2026-03-08 05:33:30 -07:00
Alexey Bataev	a96a0ded25	[SLP]Fix the matching of the nodes with the same scalars, but reused If the scalars are reused and the ReuseShuffleIndices is set, we may miss matching for the buildvector/gather nodes and add an extra cost	2026-03-07 10:29:34 -08:00
Alexey Bataev	2714583317	[SLP]Do not consider split vectorize nodes as vector phi nodes Split vectorize nodes should not be considered as vector PHI nodes, when trying to find the insertion point for the postponed nodes. Fixes #184585	2026-03-04 14:03:20 -08:00
Alexey Bataev	789bf51f0c	[SLP]Do not consider condition with multiple uses and negate predicate as a candidate for inversed select If the select/zext comparison has negate predicate and is used in several places, it should not be considered as a candidate for inversed zext/select pattern, it will be replaced by a negate vector predicate, leading to an incorrect codegen for other uses	2026-03-01 12:01:19 -08:00
Alexey Bataev	9730d31284	[SLP]Fix types for reductions in revec Need to consider vector inputs, when building casts for the reduced values Fixes #170828	2026-03-01 07:54:13 -08:00
Alexey Bataev	a6e7c38ea6	[SLP]Do not vectorize select nodes with scalar and vector conditions If the select node contains selects with mixed scalar/vector conditions, such nodes should not be revectorized. Fixes #170836	2026-03-01 07:01:04 -08:00
Alexey Bataev	e317f42455	[SLP]Recalculate dependencies for the buildvector schedule node, if they have copyable node Need to recalculate the deps for all buildvector nodes with copyable deps to prevent a compiler crash during scheduling of instructions	2026-02-28 12:29:47 -08:00
Alexey Bataev	12e1075b64	[SLP]Fix operand reordering when estimating profitability of operands Need to swap operands for a single instruction, not for the same lane of the first and second instruction in the list	2026-02-27 16:16:22 -08:00
Akash Dutta	cf28f23f10	[SLP] Reject duplicate shift amounts in matchesShlZExt reorder path (#183627) In the reordered RHS path of matchesShlZExt, the code never checked that each shift amount (0, Stride, 2×Stride, …) appears at most once. When the same shift appeared in multiple lanes, it still filled Order, producing a non-permutation (e.g. Order = [0,0,0,1]). That led to bad shuffle masks and miscompilation (e.g. shuffles with poison). The patch adds an explicit duplicate check: before setting Order[Idx] = Pos, it ensures Pos has not been seen before, using a SmallBitVector SeenPositions(VF). If a position is seen twice, the function returns false and the optimization is not applied.	2026-02-27 13:00:58 -06:00
Alexey Bataev	c08079d8e7	[SLP]Add single-use check for the bitcasted reduction If the reduced value, to be bitcasted, is used multiple times, it will require emission of the extractelement instruction. Such nodes should not be bitcasted, should be vectorized as vector instructions. Fixes https://github.com/llvm/llvm-project/pull/181940#issuecomment-3950734168	2026-02-24 05:27:38 -08:00
Alexey Bataev	95a960daa0	[SLP]Do not convert inversed cmp nodes, if they reordered/reused If the cmp node with inversed compares must be reordered/shuffled with the reuses, disable transformation for such nodes for now, they require some special processing. Fixes https://github.com/llvm/llvm-project/pull/181580#issuecomment-3933026221	2026-02-20 06:04:51 -08:00
Alexey Bataev	29d4fea59b	[SLP]Handle mixed select-to-bitcasts and general reductions If the reduction tree represents mixed select-to-bitcasts and general reductions, need to handle them correctly to avoid a compiler crash Fixes https://github.com/llvm/llvm-project/pull/181940#issuecomment-3929220929	2026-02-19 13:38:34 -08:00
Alexey Bataev	38d804725f	[SLP]Do not mark for transforming to buildvector inversed compares Inversed compares must remain vector nodes, they should be converted to gathers to generate correct code. Fixes issue reported in https://github.com/llvm/llvm-project/pull/181580#issuecomment-3926951332	2026-02-19 09:37:50 -08:00
Alexey Bataev	c6425aa9ae	[SLP]Support reduced or selects of bitmask as cmp bitcast Converts reduced or(select %cmp, bitmask, 0) to zext(bitcast %vector_cmp to i<num_reduced_values>) to iN Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/181940	2026-02-18 18:01:42 -05:00
Alexey Bataev	a7c25ba33c	[SLP][NFC]Fix reorered -> reordered	2026-02-18 13:54:39 -08:00
Ryan Buchner	a6e8de7407	[SLP][NFC] Fix `MainOp/AltOp` assertion to check the correct value (#182093) Previously both assertions were checking MainOp. Initial assertion added incorrectly in d41e517748e2d.	2026-02-18 13:08:10 -08:00
Alexey Bataev	a5aaa9dc63	[SLP]Convert compares from zexts, promoted to selects, to inversed op, if improves codegen Some of the zext i1 (cmp) + select sequences can be transformed by inverting compare predicates to remove extra shuffles, like zext i1 (cmp ne) + select (cmp eq), 0, 2 can be modeled as select <2 x > (cmp ne), <1, 2>, zeroinitializer Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/181580	2026-02-17 13:26:06 -05:00
Alexey Bataev	26f944bb50	[SLP]Fix an ArrayRef out-of-bounds access in slice If the revec is enabled, may have the number of parts (registers) for the combined node, not a single element node, so need to check for potential out-of-bounds access Fixes #181798	2026-02-17 10:00:13 -08:00
Alexey Bataev	ef52df4365	[SLP]Do not increase depth for type-changing nodes and NotProfitableForVectorization removal The patch changes the maximum tree size analysis. 1. Do not increase depth for type-changing nodes (like casts and compares), allowing deeper trees to be built. 2. Removes the NotProfitableForVectorization workaround, which is no longer needed now that throttling is enabled Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/180950	2026-02-17 09:52:23 -05:00
Alexey Bataev	2ff4ec172a	[SLP]Fix revec in split nodes Initially split nodes do not support vector entries in revec mode, patch fixes the issue by adding analysis for the scale factor Fixes #181546	2026-02-16 14:09:28 -08:00
Alexey Bataev	7ec7907b80	[SLP] Fix a very long loads offset, being stored in DenseMap Added a check for a very long offset to avoid a crash in the compiler Fixes #181682	2026-02-16 11:07:22 -08:00
Alexey Bataev	255b493673	[SLP]Do not overflow number of the reduced values Need to trunc the total number of the reduced values, in case if the number is too big Fixes #181520	2026-02-15 11:02:32 -08:00
Ryan Buchner	f2903793de	[SLP][NFC] Use static_assert to confirm SupportedOps is sorted (#181397) Can be checked at compile time.	2026-02-13 11:30:59 -08:00
Alexey Bataev	e93829e807	[SLP]Fix crash with deleted non-copyable node in scheduling copyables If the copyables are parts of the deleted nodes, need to check the actual tree to correctly handle the scheduling of copyables	2026-02-12 11:42:02 -08:00
Ryan Buchner	95ef1a5c31	[SLP] Use the correct identity when combining binary opcodes with AND/MUL (#180457) Fixes #180456 Fix bug in the following SLP lowering: ``` define void @sub_mul(ptr %p, ptr %s) { entry: %p1 = getelementptr i16, ptr %p, i64 1 %l0 = load i16, ptr %p %l1 = load i16, ptr %p1 %mul0 = sub i16 %l0, 0 %mul1 = mul i16 %l1, 5 %s1 = getelementptr i16, ptr %s, i64 1 store i16 %mul0, ptr %s store i16 %mul1, ptr %s1 ret void } ``` to ``` define void @sub_mul(ptr %p, ptr %s) { entry: %tmp0 = load <2 x i16>, ptr %p, align 2 %tmp1 = mul <2 x i16> %tmp0, <i16 0, i16 5> -> updates to <i16 1, i16 5> store <2 x i16> %tmp1, ptr %s, align 2 ret void } ```	2026-02-12 09:34:44 -08:00
Alexey Bataev	fc648683cd	[SLP]Add external uses estimations into tree throttling Added basic estimations for the external uses, when calculating the cost of the non-profitable trees. Excluding stores/insertelement, as they are very good candidates for vectorization. Also, tuned buildvector/gather cost with minimum bitwidth analysis data. Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/178024	2026-02-11 16:14:34 -05:00
Alexey Bataev	601364f1ba	[SLP]Correctly process deleted gathered loads and short trees If the gathered loads nodes are marked for deletion, they need to be actually deleted from the tree. Also, if the remaining tree is too short (buildvector + gather node), such trees need to be skipped to avoid hanging. Fixes #180846	2026-02-11 10:27:01 -08:00
Alexey Bataev	54cdd903b8	[SLP]Skip operands comparing on non-matching (but compatible) instructions If the instructions are compatible but non-matching (zext-select pair as example), no need to perform operands analysis, just return that they are matching.	2026-02-11 04:55:29 -08:00
David Sherwood	6f0b8a7ebc	[SLP] Use the correct calling convention for vector math routines (#180759) When vectorising calls to math intrinsics such as llvm.pow we correctly detect and generate calls to the corresponding vector math variant. However, we don't pick up and use the calling convention for the vector math function. This matters for veclibs such as ArmPL where the aarch64_vector_pcs calling convention can improve codegen by reducing the number of registers that need saving across calls.	2026-02-11 10:52:58 +00:00
Alexey Bataev	78490acb32	[SLP]Support for zext i1 %x modeling as select %x, 1, 0 Model zext i1 %x to iN as select i1 %x, iN 1, iN 0 in case, if there are other select instructions, which can be combined into a bundle. Fixes #178403 Recommit after revert in 993e1f66afcfe9da03bd813e669eada341b11d2f Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/180635	2026-02-10 12:54:12 -08:00
Alexey Bataev	993e1f66af	Revert "[SLP]Support for zext i1 %x modeling as select %x, 1, 0" This reverts commit 70aebae2a13114f4e3d5e2460c052d8f3de295be to fix buildbots https://lab.llvm.org/buildbot/#/builders/85/builds/18614	2026-02-10 06:49:53 -08:00
Alexey Bataev	70aebae2a1	[SLP]Support for zext i1 %x modeling as select %x, 1, 0 Model zext i1 %x to iN as select i1 %x, iN 1, iN 0 in case, if there are other select instructions, which can be combined into a bundle. Fixes #178403 Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/180635	2026-02-10 08:59:44 -05:00
Alexey Bataev	fe754dff6d	[SLP]Remove LoadCombine workaround after handling of the copyables LoadCombine pattern handling was added as a workaround for the cases, where the SLP vectorizer could not vectorize the code effectively. With the copyables support, it can handle it directly. Also, patch adds support for scalar loads[ + bswap] pattern for byte sized loads (+ reverse bytes for bswap) Recommit after revert in 6377c86d718232fe60c548dfd7ab439f7ff84df7 Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/174205	2026-02-05 11:16:08 -08:00
Alexey Bataev	6377c86d71	Revert "[SLP]Remove LoadCombine workaround after handling of the copyables" This reverts commit 8dbb9f66e8b14a8a06f1873a2c1b7dce366ed2d6 to fix buildbot issues https://lab.llvm.org/buildbot/#/builders/224/builds/2795	2026-02-05 09:57:00 -08:00
Alexey Bataev	8dbb9f66e8	[SLP]Remove LoadCombine workaround after handling of the copyables LoadCombine pattern handling was added as a workaround for the cases, where the SLP vectorizer could not vectorize the code effectively. With the copyables support, it can handle it directly. Also, patch adds support for scalar loads[ + bswap] pattern for byte sized loads (+ reverse bytes for bswap) Reviewers: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/174205	2026-02-05 10:42:08 -05:00
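
The trip-count scaling described in 8963edb534 / 6261cb4487 ("the cost of the vector nodes in the loop is multiplied by the number of iterations") can be illustrated with a small standalone sketch. It is not the SLPVectorizer code: `scaleCostByTripCount` and its parameters are hypothetical names, and the fallback value passed in stands in for whatever `-slp-cost-loop-min-trip-count=<value>` is set to (the real default is not stated in the log).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, for illustration only: scale the cost of one vector
// node by the trip count of the loop that contains it. When the trip count
// is unknown, fall back to a caller-supplied default (an assumed stand-in
// for the -slp-cost-loop-min-trip-count option).
std::int64_t scaleCostByTripCount(std::int64_t NodeCost,
                                  std::optional<std::uint32_t> KnownTripCount,
                                  std::uint32_t DefaultTripCount) {
  assert(DefaultTripCount > 0 && "fallback trip count must be positive");
  const std::uint32_t TripCount = KnownTripCount.value_or(DefaultTripCount);
  // Each vector node in the loop body executes once per iteration.
  return NodeCost * static_cast<std::int64_t>(TripCount);
}
```

Under these assumptions, a node that saves 2 cost units per iteration in a loop known to run 100 times contributes -200, while the same node in a loop with an unknown trip count and a fallback of 4 contributes -8.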

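
The duplicate-position check from cf28f23f10 can be sketched outside LLVM as follows. This is a minimal illustration under stated assumptions, not the patch itself: `buildOrderFromShifts` is a hypothetical name, and `std::vector<bool>` stands in for the `SmallBitVector SeenPositions(VF)` mentioned in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper: map each lane's shift amount to a position
// Pos = Shift / Stride and record Order[Idx] = Pos. Returns std::nullopt
// unless the shifts are a permutation of 0, Stride, 2*Stride, ...
std::optional<std::vector<unsigned>>
buildOrderFromShifts(const std::vector<unsigned> &Shifts, unsigned Stride) {
  assert(Stride > 0 && "stride must be non-zero");
  const std::size_t VF = Shifts.size();
  std::vector<unsigned> Order(VF, 0);
  std::vector<bool> SeenPositions(VF, false);
  for (std::size_t Idx = 0; Idx < VF; ++Idx) {
    // The shift must be a multiple of the stride and land inside the vector.
    if (Shifts[Idx] % Stride != 0)
      return std::nullopt;
    const unsigned Pos = Shifts[Idx] / Stride;
    if (Pos >= VF)
      return std::nullopt;
    // Reject a position that is already taken: without this check the
    // result could be a non-permutation such as [0, 0, 0, 1].
    if (SeenPositions[Pos])
      return std::nullopt;
    SeenPositions[Pos] = true;
    Order[Idx] = Pos;
  }
  return Order;
}
```

With a stride of 8, the shifts `{8, 0, 24, 16}` yield the order `{1, 0, 3, 2}`, while `{0, 0, 0, 8}` is rejected because position 0 appears more than once.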

2494 Commits
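
The identity issue fixed in 95ef1a5c31 (`sub x, 0` rewritten as a `mul` lane needs the constant 1, not 0) can be shown with a small standalone sketch. The `BinOp` enum and both functions are hypothetical and only model the arithmetic on `i16` values; they are not the `BinOpSameOpcodeHelper` interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical opcode set, for illustration only.
enum class BinOp { Add, Sub, Mul, And, Or, Xor };

// Right-hand constant C for which `x op C == x` holds for every x.
std::int16_t identityRHS(BinOp Op) {
  switch (Op) {
  case BinOp::Mul:
    return 1;  // x * 1 == x
  case BinOp::And:
    return -1; // all bits set: x & -1 == x
  case BinOp::Add:
  case BinOp::Sub:
  case BinOp::Or:
  case BinOp::Xor:
    return 0;  // x + 0, x - 0, x | 0, x ^ 0 are all x
  }
  assert(false && "unknown opcode");
  return 0;
}

// Evaluate `X op C` on 16-bit values.
std::int16_t apply(BinOp Op, std::int16_t X, std::int16_t C) {
  switch (Op) {
  case BinOp::Add: return static_cast<std::int16_t>(X + C);
  case BinOp::Sub: return static_cast<std::int16_t>(X - C);
  case BinOp::Mul: return static_cast<std::int16_t>(X * C);
  case BinOp::And: return static_cast<std::int16_t>(X & C);
  case BinOp::Or:  return static_cast<std::int16_t>(X | C);
  case BinOp::Xor: return static_cast<std::int16_t>(X ^ C);
  }
  assert(false && "unknown opcode");
  return 0;
}
```

Reusing the constant of the source opcode reproduces the miscompile: `apply(BinOp::Mul, 7, identityRHS(BinOp::Sub))` gives 0, whereas `apply(BinOp::Mul, 7, identityRHS(BinOp::Mul))` preserves 7.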