Verify the extract index is a multiple of the result vector element count,
instead of asserting this in random combines.
The testcase in #153808 fails at the wrong point. Add an
assert to getNode so the invalid extract asserts at construction
instead of at use.
Reported from
https://github.com/llvm/llvm-project/pull/153393#issuecomment-3189898813
During DAGCombine, an intermediate extract_subvector sequence was
generated:
```
t8: v9i16 = extract_subvector t3, Constant:i64<9>
t24: v8i16 = extract_subvector t8, Constant:i64<0>
```
One of the DAGCombine rules, which turns `(extract_subvector
(extract_subvector X, C), 0)` into `(extract_subvector X, C)`, kicked in
and turned that into `v8i16 = extract_subvector t3, Constant:i64<9>`. But
it forgot to check that the extract index is a multiple of the minimum
vector length of the result type, hence the crash.
This patch fixes this by adding an additional check.
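For illustration, here is a minimal standalone C++ sketch of the check (the helper name is hypothetical, not the actual DAGCombiner code): the fold is only valid when the inner extract index is a multiple of the result type's minimum element count.
```
#include <cassert>

// Hypothetical helper mirroring the EXTRACT_SUBVECTOR rule: the index must be
// a multiple of the result type's (minimum) element count.
bool isValidExtractSubvectorIndex(unsigned ExtractIdx,
                                  unsigned ResultMinNumElts) {
  return ExtractIdx % ResultMinNumElts == 0;
}

int main() {
  // v8i16 = extract_subvector t3, 9: index 9 is not a multiple of 8 elements,
  // so the (extract_subvector (extract_subvector X, C), 0) fold must be
  // skipped here.
  assert(!isValidExtractSubvectorIndex(/*ExtractIdx=*/9,
                                       /*ResultMinNumElts=*/8));
  // An index of 8 would have been fine.
  assert(isValidExtractSubvectorIndex(8, 8));
  return 0;
}
```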
The histogram DAG combine went into an infinite loop of creating the
same histogram node due to an incorrect use of the `refineUniformBase`
and `refineIndexType` APIs.
These APIs take SDValues by reference (SDValue&) and return `true` if
they were "refined" (i.e., set to new values).
Previously, this DAG combine created the `Ops` array (used to
create the new histogram node) before calling the `refine*` APIs, which
copied the SDValues into the array, so the refined values were never
used to create the new histogram node.
Reproducer: https://godbolt.org/z/hsGWhTaqY (it will timeout)
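To illustrate the ordering issue, here is a minimal standalone C++ sketch (the names are illustrative stand-ins, not the actual refineUniformBase/refineIndexType code): because the refine-style API updates its argument in place, snapshotting the operands before calling it rebuilds the node from stale values.
```
#include <array>
#include <cassert>

// Stand-in for a refine* API: refine the value in place, report if changed.
bool refineValue(int &V) {
  if (V % 2 != 0) {
    V += 1;
    return true;
  }
  return false;
}

int main() {
  // Buggy order: copy the operand into the array first, refine afterwards.
  int Base = 3;
  std::array<int, 1> StaleOps{Base};
  bool Changed = refineValue(Base);
  assert(Changed && StaleOps[0] == 3 && Base == 4); // stale value still used

  // Fixed order: refine first, then build the operand list.
  int Base2 = 3;
  refineValue(Base2);
  std::array<int, 1> FreshOps{Base2};
  assert(FreshOps[0] == 4); // the refined value is what gets used
  return 0;
}
```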
Now that #146490 removed the assertion in visitFreeze that the node was
still isGuaranteedNotToBeUndefOrPoison, we no longer need this
reduced-depth hack (which had to account for the difference in depth of
freeze(op()) vs op(freeze())).
Helps with some of the minor regressions in #150017
Similar to InstCombinerImpl::freezeOtherUses, attempt to ensure that we
merge multiple frozen/unfrozen uses of an SDValue. This fixes a number of
hasOneUse() problems when trying to push FREEZE nodes through the DAG.
Remove SimplifyMultipleUseDemandedBits handling of FREEZE nodes as we
now want to keep the common node, and not bypass for some nodes just
because of DemandedElts.
Fixes #149799
Add a new combine to replace
```
(store ch (vselect cond truevec (load ch ptr offset)) ptr offset)
```
with
```
(mstore ch truevec ptr offset cond)
```
This saves a blend operation on targets that support conditional stores.
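As an illustration of why the fold is sound, here is a small standalone C++ sketch (plain arrays standing in for vectors and memory, not DAG nodes), assuming the load and store use the same address with no intervening writes:
```
#include <array>
#include <cassert>

int main() {
  std::array<int, 4> Mem{10, 20, 30, 40};        // memory at ptr
  std::array<bool, 4> Cond{true, false, true, false};
  std::array<int, 4> TrueVec{1, 2, 3, 4};

  // Original form: load, blend with vselect, store everything back.
  std::array<int, 4> AfterBlendStore = Mem;
  for (int i = 0; i < 4; ++i)
    AfterBlendStore[i] = Cond[i] ? TrueVec[i] : Mem[i];

  // Masked-store form: only write the lanes where Cond is true.
  std::array<int, 4> AfterMaskedStore = Mem;
  for (int i = 0; i < 4; ++i)
    if (Cond[i])
      AfterMaskedStore[i] = TrueVec[i];

  assert(AfterBlendStore == AfterMaskedStore);
  return 0;
}
```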
This patch adds two DAG combines:
1. vector_interleave(splat, splat, ...) -> {splat,splat,...}
2. concat_vectors(splat, splat, ...) -> wide_splat
where all the input splats are identical. Both of these
together enable us to fold
concat_vectors(vector_interleave(splat, splat, ...))
into a wide splat. Post-legalisation we must only do the
concat_vectors combine if the wider type and splat operation
are legal.
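As an illustration of why the combined fold is valid, here is a minimal standalone C++ sketch (plain std::vector, not ISD nodes): interleaving identical splats and concatenating the results still yields a single wider splat of the same value.
```
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Interleave the parts element-wise, then concatenate the result.
std::vector<int> interleave(const std::vector<std::vector<int>> &Parts) {
  std::vector<int> Out;
  for (std::size_t i = 0; i < Parts[0].size(); ++i)
    for (const auto &P : Parts)
      Out.push_back(P[i]);
  return Out;
}

int main() {
  std::vector<int> Splat(4, 7); // a splat of the value 7
  std::vector<int> Wide = interleave({Splat, Splat, Splat});
  // The interleaved and concatenated result is itself a splat of 7.
  assert(Wide.size() == 12);
  assert(std::all_of(Wide.begin(), Wide.end(), [](int V) { return V == 7; }));
  return 0;
}
```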
For fixed-width vectors the DAG combine only occurs for
interleave factors of 3 or more; however, it's not currently
safe to test this for AArch64 since there isn't any lowering
support for fixed-width interleaves. I've only added
fixed-width tests for RISCV.
Fold sequences where we extract a bunch of contiguous bits from a value,
merge them into the low bit and then check if the low bits are zero or
not.
Usually the `and` masks would sit at the leaves of the expression,
but the DAG canonicalizes them into a single `and` at the root of the
expression.
The reason I put this in DAGCombiner instead of a target combine is
that this is a generic, valid transform that's also fairly niche, so I
don't think there is much risk of a combine loop.
See #136727
Split out from https://github.com/llvm/llvm-project/pull/150248:
Specify that the argument of lifetime.start/lifetime.end is ignored and
will be removed in the future.
Remove lifetime size handling from SDAG. The size was previously
discarded during isel, so was always ignored for stack coloring anyway.
Where necessary, obtain the size of the full frame index.
getNode updates flags correctly for CSE. Calling setFlags after getNode
may set flags on a node for which they don't apply.
I've added a Flags argument to getSelectCC and to the getNode overload
that takes an ArrayRef of EVTs.
This is a partial revert of #145939 (I've kept the BUILD_VECTOR(FREEZE(UNDEF), FREEZE(UNDEF), elt2, ...) canonicalization) as we're getting reports of infinite loops (#148084).
The issue appears to be due to deep chains of nodes and how visitFREEZE replaces all instances of an operand with a common frozen version - other users of the original frozen node then get added back to the worklist but might no longer be able to confirm a node isn't poison due to recursion depth limits on isGuaranteedNotToBeUndefOrPoison.
The issue still exists with the old implementation but by only allowing a single frozen operand it helps prevent cases of interdependent frozen nodes.
I'm still working on supporting multiple operands as it's critical for topological DAG handling, but I need to get a fix in for trunk and 21.x.
Fixes #148084
After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed
that the argument is an alloca, so we don't need to look at underlying
objects (which was not a correct thing to do anyway).
This also drops the offset argument for lifetime nodes in SDAG. The
offset is now fixed to zero. (Peculiarly, while SDAG pretended to have
an offset, it was just silently dropped during selection.)
calculateByteProvider only cares about scalars or a single element
within a vector. For the latter there is the VectorIndex parameter to
identify the element. All other properties, and specifically Index, are
related to the underlying scalar type, and thus when taking the size of a
type it's the scalar size that matters.
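For illustration, a tiny standalone C++ sketch of the size distinction (illustrative values only, not the calculateByteProvider API): VectorIndex selects the element, while Index addresses bytes within that element, so bounds checks must use the scalar size.
```
#include <cassert>

int main() {
  const unsigned ScalarSizeBytes = 4;  // an i32 element
  const unsigned NumElts = 4;          // a <4 x i32> vector
  const unsigned VectorSizeBytes = ScalarSizeBytes * NumElts;

  unsigned VectorIndex = 2; // which element of the vector is referenced
  unsigned Index = 3;       // byte offset within that element

  // Index is only meaningful within one scalar element; comparing it against
  // the full vector size (16 bytes) would wrongly accept indices like 7.
  assert(Index < ScalarSizeBytes);
  assert(VectorIndex < NumElts);
  assert(VectorSizeBytes == 16);
  return 0;
}
```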
Fixes https://github.com/llvm/llvm-project/issues/148387
getNode has logic to intersect flags correctly if the new node happens
to CSE with an existing node. Setting node flags after getNode bypasses
this logic and may change the node for other uses where the flags don't
hold.
Change isBuildVectorAll* -> isConstantSplatVectorAll* in VSelect, in case
the fold happens after the BuildVector has been canonically transformed
into a splat, or if the splat is already present in the vselect.
- Fixes #73454
- Update related test cases, add extra tests in wasm
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
This allows truncated splat / buildvector in isBoolConstant, so that
certain `not` patterns can be recognized post-legalization and
vselects can be optimized.
An override for x86 avx512 predicated vectors is required to avoid an
infinite recursion from the code that detects zero vectors. From:
```
// Check if the first operand is all zeros and Cond type is vXi1.
// If this an avx512 target we can improve the use of zero masking by
// swapping the operands and inverting the condition.
```
DAGCombiner can already constant fold build vectors of constants/undefs
to a new vector type, but it has to be incredibly careful after
legalization to not affect a target's canonicalized constants.
This patch proposes we move the implementation inside SelectionDAG to
make it easier for targets to manually use the constant folding whenever
they deem it safe to do so.
I've also altered the method to take the BuildVectorSDNode input
directly and consistently use the same SDLoc.
ISD::ABDS can be used if the signed subtraction will not overflow (this
is an extension to handle cases where the NSW flag has been lost).
ISD::ABDU can be used if both operands have at least one zero sign bit.
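As an illustration of the legality reasoning, here is a small standalone C++ sketch (plain i8 arithmetic, not the DAG combine itself): abs(sub(x, y)) only matches the absolute difference when the subtraction cannot wrap, and known-zero sign bits on both operands guarantee that.
```
#include <cassert>
#include <cstdint>
#include <cstdlib>

int main() {
  // With i8 operands 100 and -100 the true absolute difference is 200, but
  // the i8 subtraction wraps (200 becomes -56), so abs(sub(x, y)) is wrong
  // and ABDS may only be formed when no signed wrap can occur.
  int8_t X = 100, Y = -100;
  int8_t Wrapped = static_cast<int8_t>(X - Y);
  assert(std::abs(static_cast<int>(Wrapped)) == 56);

  // If both sign bits are known zero, both values lie in [0, 127], the
  // subtraction cannot wrap, and the absolute difference is exact.
  int8_t A = 120, B = 7;
  assert(std::abs(A - B) == 113);
  return 0;
}
```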
Fixes #147049
This PR resolves https://github.com/llvm/llvm-project/issues/144513
The modification includes five patterns (patterns 3 and 4 are illustrated
in the sketch after the list):
1. vselect Cond, 0, 0 → 0
2. vselect Cond, -1, 0 → bitcast Cond
3. vselect Cond, -1, x → or Cond, x
4. vselect Cond, x, 0 → and Cond, x
5. vselect Cond, 000..., X → andn Cond, X
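As an illustration (plain 32-bit lanes, not DAG nodes), the sketch below shows why patterns 3 and 4 hold when every condition lane is a full-width mask (all ones for true, all zeros for false), as with sign-extended compare results.
```
#include <cassert>
#include <cstdint>

int main() {
  const uint32_t AllOnes = 0xFFFFFFFFu;
  uint32_t Mask[2] = {AllOnes, 0u};          // one true lane, one false lane
  uint32_t X[2] = {0x12345678u, 0x9ABCDEF0u};

  for (int i = 0; i < 2; ++i) {
    uint32_t M = Mask[i];
    uint32_t SelMinusOne = M ? AllOnes : X[i]; // vselect Cond, -1, x
    uint32_t SelZero = M ? X[i] : 0u;          // vselect Cond, x, 0
    assert(SelMinusOne == (M | X[i]));         // pattern 3: or Cond, x
    assert(SelZero == (M & X[i]));             // pattern 4: and Cond, x
  }
  return 0;
}
```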
Patterns 1-4 have been migrated to DAGCombine; pattern 5 is still in the
x86 code. The reason is that the andn instruction cannot be used directly
in DAGCombine, only and+xor, which introduces optimization order issues.
For example, in the x86 backend, for select Cond, 0, x → (~Cond) & x, the
backend first checks whether the cond node of (~Cond) is a setcc node; if
so, it modifies the comparison operator of the condition, so the x86
backend cannot complete the andn optimization. In short, I think it is a
better choice to keep the vselect Cond, 000..., X pattern instead of
and+xor in DAGCombine.
Of the commits, the first contains the code changes and x86 tests (note 1),
and the second contains tests for other backends (note 2).
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
Always try to fold freeze(op(....)) -> op(freeze(),freeze(),freeze(),...).
This patch proposes we drop the opt-in limit for opcodes that are allowed to push a freeze through the op to freeze all its operands, through the tree towards the roots.
I'm struggling to find a strong reason for this limit apart from the DAG freeze handling being immature for so long - as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison it looks like the regressions are not as severe.
Hopefully this will help some of the regression issues in #143102 etc.
After 901e1390c9778a191256335d37802bc631c2d183 (#127770), the DAG
combine would transform `fma(x, 0.0, 1.0)` into `1.0` if
`-fp-contract=fast` was enabled, in addition to when 'x' is marked
nnan/ninf.
It's only valid in the latter case, not the former, so delete the extra
condition.
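As an illustration of why the nnan/ninf requirement matters, here is a small standalone C++ check (plain libm fma, not the DAG combine): with an infinite or NaN x, x * 0.0 is NaN, so fma(x, 0.0, 1.0) is NaN rather than 1.0.
```
#include <cassert>
#include <cmath>
#include <limits>

int main() {
  double Inf = std::numeric_limits<double>::infinity();
  double NaN = std::numeric_limits<double>::quiet_NaN();
  // x * 0.0 is NaN for infinite or NaN x, so the fold to 1.0 is invalid.
  assert(std::isnan(std::fma(Inf, 0.0, 1.0)));
  assert(std::isnan(std::fma(NaN, 0.0, 1.0)));
  // For finite x the identity does hold.
  assert(std::fma(42.0, 0.0, 1.0) == 1.0);
  return 0;
}
```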
Although it is nice to be able to prove the freeze can be moved, this can
fail immediately after freeze(op(...)) -> op(freeze(),freeze(),...) creation
if any of the new freeze nodes now prevents value tracking from seeing
through to the source values (e.g. that shift amounts/element indices are
in bounds, etc.).
This will allow us to remove the isGuaranteedNotToBeUndefOrPoison checks
inside canCreateUndefOrPoison that were discussed on #146361
Remove `UnsafeFPMath` in `visitFMULForFMADistributiveCombine`,
`visitFSUBForFMACombine` and `visitFDIV`.
All affected tests are fixed by adding fast math flags manually.
Propagate fast math flags when lowering fdiv in the NVPTX backend, so it
can produce an optimized DAG when `unsafe-fp-math` is absent.