llvm-project

Author	SHA1	Message	Date
Fabian Ritter	d5607694e1	[AMDGPU][SDAG] DAGCombine PTRADD -> disjoint OR (#146075 ) If we can't fold a PTRADD's offset into its users, lowering them to disjoint ORs is preferable: Often, a 32-bit OR instruction suffices where we'd otherwise use a pair of 32-bit additions with carry. This needs to be a DAGCombine (and not a selection rule) because its main purpose is to enable subsequent DAGCombines for bitwise operations. We don't want to just turn PTRADDs into disjoint ORs whenever that's sound because this transform loses the information that the operation implements pointer arithmetic, which AMDGPU for instance needs when folding constant offsets. For SWDEV-516125.	2025-09-19 11:58:41 +02:00
Fabian Ritter	771c94c8db	[SDAG][AMDGPU] Allow opting in to OOB-generating PTRADD transforms (#146074 ) This PR adds a TargetLowering hook, canTransformPtrArithOutOfBounds, that targets can use to allow transformations to introduce out-of-bounds pointer arithmetic. It also moves two such transformations from the AMDGPU-specific DAG combines to the generic DAGCombiner. This is motivated by target features like AArch64's checked pointer arithmetic, CPA, which does not tolerate the introduction of out-of-bounds pointer arithmetic.	2025-09-19 11:07:59 +02:00
Björn Pettersson	1c4c7bd808	[SelectionDAG] Deal with POISON for INSERT_VECTOR_ELT/INSERT_SUBVECTOR (#143102 ) As reported in https://github.com/llvm/llvm-project/issues/141034 SelectionDAG::getNode had some unexpected behaviors when trying to create vectors with UNDEF elements. Since we treat both UNDEF and POISON as undefined (when using isUndef()) we can't just fold away INSERT_VECTOR_ELT/INSERT_SUBVECTOR based on isUndef(), as that could make the resulting vector more poisonous. Same kind of bug existed in DAGCombiner::visitINSERT_SUBVECTOR. Here are some examples: This fold was done even if vec[idx] was POISON: INSERT_VECTOR_ELT vec, UNDEF, idx -> vec This fold was done even if any of vec[idx..idx+size] was POISON: INSERT_SUBVECTOR vec, UNDEF, idx -> vec This fold was done even if the elements not extracted from vec could be POISON: sub = EXTRACT_SUBVECTOR vec, idx INSERT_SUBVECTOR UNDEF, sub, idx -> vec With this patch we avoid such folds unless we can prove that the result isn't more poisonous when eliminating the insert. Fixes https://github.com/llvm/llvm-project/issues/141034	2025-09-17 21:04:00 +00:00
guan jian	6aab826e23	[DAGCombiner] add fold (xor (smin(x, C), C)) and fold (xor (smax(x, C), C)) (#155141 ) Hi, I compared the following LLVM IR with GCC and Clang, and there is a small difference between the two. The LLVM IR is: ``` define i64 @test_smin_neg_one(i64 %a) { %1 = tail call i64 @llvm.smin.i64(i64 %a, i64 -1) %retval.0 = xor i64 %1, -1 ret i64 %retval.0 } ``` GCC generates: ``` cmp x0, 0 csinv x0, xzr, x0, ge ret ``` Clang generates: ``` cmn x0, #1 csinv x8, x0, xzr, lt mvn x0, x8 ret ``` Clang keeps flipping x0 through x8 unnecessarily. So I added the following folds to DAGCombiner: fold (xor (smax(x, C), C)) -> select (x > C), xor(x, C), 0 fold (xor (smin(x, C), C)) -> select (x < C), xor(x, C), 0 alive2: https://alive2.llvm.org/ce/z/gffoir --------- Co-authored-by: Yui5427 <785369607@qq.com> Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-09-16 15:30:57 +00:00
Arthur Eubanks	984251acad	Revert "[DAGCombiner] Relax condition for extract_vector_elt combine" (#157953 ) Reverts llvm/llvm-project#157658 Causes hangs, see https://github.com/llvm/llvm-project/pull/157658#issuecomment-3276441812	2025-09-10 21:33:44 +00:00
ZhaoQi	4621e17dee	[DAGCombiner] Relax condition for extract_vector_elt combine (#157658 ) Checking `isOperationLegalOrCustom` instead of `isOperationLegal` allows more optimization opportunities. In particular, if a target wants to mark `extract_vector_elt` as `Custom` rather than `Legal` in order to optimize some certain cases, this combiner would otherwise miss some improvements. Previously, using `isOperationLegalOrCustom` was avoided due to the risk of getting stuck in infinite loops (as noted in `61ec738b60`). After testing, the issue no longer reproduces, but the coverage is limited to the regression/unit tests and the test-suite.	2025-09-10 15:51:52 +08:00
guan jian	83af24dd85	[DAG] Generalize fold (not (neg x)) -> (add X, -1) (#154348 ) Generalize `fold (not (neg x)) -> (add X, -1)` to `fold (not (sub Y, X)) -> (add X, ~Y)` --------- Co-authored-by: Yui5427 <785369607@qq.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-09-08 15:12:59 +00:00
paperchalice	667f919214	[SelectionDAG][ARM] Propagate fast math flags in visitBRCOND (#156647 ) Factor out from #151275.	2025-09-06 20:44:25 +08:00
Yingwei Zheng	0d29279465	[DAGCombine] Propagate nuw when evaluating sub with narrower types (#156710 ) Proof: https://alive2.llvm.org/ce/z/cdbzSL Closes https://github.com/llvm/llvm-project/issues/156559.	2025-09-04 10:17:45 +08:00
Yingwei Zheng	5d111a20c5	[DAGCombiner] Avoid double deletion when replacing multiple frozen/unfrozen uses (#155427 ) Closes https://github.com/llvm/llvm-project/issues/155345. In the original case, we have one frozen use and two unfrozen uses: ``` t73: i8 = select t81, Constant:i8<0>, t18 t75: i8 = select t10, t18, t73 t59: i8 = freeze t18 (combining) t80: i8 = freeze t59 (another user of t59) ``` In `DAGCombiner::visitFREEZE`, we replace all uses of `t18` with `t59`. After updating the uses, `t59: i8 = freeze t18` will be updated to `t59: i8 = freeze t59` (`AddModifiedNodeToCSEMaps`) and CSEed into `t80: i8 = freeze t59` (`ReplaceAllUsesWith`). As the previous call to `AddModifiedNodeToCSEMaps` already removed `t59` from the CSE map, `ReplaceAllUsesWith` cannot remove `t59` again. For clarity, see the following call graph: ``` ReplaceAllUsesOfValueWith(t18, t59) ReplaceAllUsesWith(t18, t59) RemoveNodeFromCSEMaps(t73) update t73 AddModifiedNodeToCSEMaps(t73) RemoveNodeFromCSEMaps(t75) update t75 AddModifiedNodeToCSEMaps(t75) RemoveNodeFromCSEMaps(t59) <- first deletion update t59 AddModifiedNodeToCSEMaps(t59) ReplaceAllUsesWith(t59, t80) RemoveNodeFromCSEMaps(t59) <- second deletion Boom! ``` This patch unfreezes all the uses first to avoid triggering CSE when introducing cycles.	2025-08-27 11:21:22 +08:00
Craig Topper	56289647be	[DAGCombiner] Preserve nuw when converting mul to shl. Use nuw in srl+shl combine. (#155043 ) If the srl+shl have the same shift amount and the shl has the nuw flag, we can remove both. In the affected test, the InterleavedAccess pass will emit a udiv after the `mul nuw`. We expect them to combine away. The remaining shifts on the RV64 tests are because we didn't add the zeroext attribute to the incoming evl operand.	2025-08-25 20:44:06 -07:00
Alex MacLean	8ab917a241	Reland "[NVPTX] Legalize aext-load to zext-load to expose more DAG combines" (#155063 ) The original version of this change inadvertently dropped b6e19b35cd87f3167a0f04a61a12016b935ab1ea. This version retains that fix as well as adding tests for it and an explanation for why it is needed.	2025-08-25 09:15:44 -07:00
XChy	fd330dedcb	[DAG] Constant fold ISD::FSHL/FSHR nodes (#154480 ) Fixes #153612. This patch handles trinary scalar integers for FSHL/R in `FoldConstantArithmetic`. Pending until #153790 is merged.	2025-08-23 10:08:21 +09:00
Joseph Huber	d439c9ea4a	Revert "[NVPTX] Legalize aext-load to zext-load to expose more DAG combines (#154251 )" Causes failures in the LLVM libc test suite https://lab.llvm.org/buildbot/#/builders/69/builds/26327/steps/12/logs/stdio. This reverts commit a3ed96b899baddd4865f1ef09f01a83da011db5c.	2025-08-22 16:13:58 -05:00
paperchalice	2014890c09	[SelectionDAG] Remove `UnsafeFPMath` in `visitFP_ROUND` (#154768 ) Remove the `UnsafeFPMath` use in `visitFP_ROUND`. It blocks some bugfixes related to clang, and the ultimate goal is to remove the `resetTargetOptions` method in `TargetMachine`; see the FIXME in `resetTargetOptions`. See also https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract Now all UnsafeFPMath uses are eliminated in LLVMCodeGen.	2025-08-22 19:46:33 +08:00
paperchalice	945a186089	[DAGCombiner] Remove most `UnsafeFPMath` references (#146295 ) This pull request removes all references to `UnsafeFPMath` in dag combiner except FP_ROUND. - Set fast math flags in some tests.	2025-08-22 15:27:25 +08:00
Alex MacLean	a3ed96b899	[NVPTX] Legalize aext-load to zext-load to expose more DAG combines (#154251 )	2025-08-21 15:33:23 -07:00
Jim Lin	fd28257195	[DAGCombiner] Fold umax/umin operations with vscale operands (#154461 ) umax/umin operations with vscale operands can be constant folded.	2025-08-21 09:15:40 +08:00
Matt Arsenault	276c1d8114	DAG: Add assert to getNode for EXTRACT_SUBVECTOR indexes (#154099 ) Verify it's a multiple of the result vector element count instead of asserting this in random combines. The testcase in #153808 fails in the wrong point. Add an assert to getNode so the invalid extract asserts at construction instead of use.	2025-08-20 09:55:43 +09:00
Simon Pilgrim	fcb36ca8cc	[DAG] visitTRUNCATE - merge the trunc(abd) and trunc(avg) handling which are almost identical (#154301 ) CC @houngkoungting	2025-08-19 12:59:39 +01:00
黃國庭	0773854575	[DAG] Fold trunc(avg(x,y)) for avgceil/floor u/s nodes if they have sufficient leading zero/sign bits (#152273 ) avgceil version : https://alive2.llvm.org/ce/z/2CKrRh Fixes #147773 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-18 16:36:26 +01:00
Simon Pilgrim	858d1dfa2c	[DAG] visitTRUNCATE - early out from computeKnownBits/ComputeNumSignBits failures. NFC. (#154111 ) Avoid unnecessary (costly) computeKnownBits/ComputeNumSignBits calls - use MaskedValueIsZero instead of computeKnownBits directly to simplify code.	2025-08-18 14:55:09 +01:00
Simon Pilgrim	681ecae913	[DAG] visitTRUNCATE - test abd legality early to avoid unnecessary computeKnownBits/ComputeNumSignBits calls. NFC. (#154085 ) isOperationLegal is much cheaper than value tracking	2025-08-18 11:06:29 +01:00
Simon Pilgrim	bcb4984a0b	[X86] select-smin-smax.ll - add i128 tests Helps check quality of legality codegen (all we had was x86 i64 handling)	2025-08-15 13:48:13 +01:00
Min-Yih Hsu	abe92a5000	[DAGCombine] Fix an incorrect folding of extract_subvector (#153709 ) Reported from https://github.com/llvm/llvm-project/pull/153393#issuecomment-3189898813 During DAGCombine, an intermediate extract_subvector sequence was generated: ``` t8: v9i16 = extract_subvector t3, Constant:i64<9> t24: v8i16 = extract_subvector t8, Constant:i64<0> ``` One of the DAGCombine rules, which turns `(extract_subvector (extract_subvector X, C), 0)` into `(extract_subvector X, C)`, kicked in and turned that into `v8i16 = extract_subvector t3, Constant:i64<9>`. But it forgot to check whether the extracted index is a multiple of the minimum vector length of the result type, hence the crash. This patch fixes this by adding an additional check.	2025-08-14 23:37:22 +00:00
woruyu	95b16d1264	[DAG] Fold trunc(abdu(x,y)) and trunc(abds(x,y)) if they have sufficient leading zero/sign bits (#151471 ) This PR resolves https://github.com/llvm/llvm-project/issues/147683 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-08 10:43:14 +01:00
Benjamin Maxwell	94c48a21bb	[AArch64][SVE] Fix hang in VECTOR_HISTOGRAM DAG combine (#152539 ) The histogram DAG combine went into an infinite loop of creating the same histogram node due to an incorrect use of the `refineUniformBase` and `refineIndexType` APIs. These APIs take SDValues by reference (SDValue&) and return `true` if they were "refined" (i.e., set to new values). Previously, this DAG combine would create the `Ops` array (used to create the new histogram node) before calling the `refine*` APIs, which copies the SDValues into the array, meaning the updated values were not used to create the new histogram node. Reproducer: https://godbolt.org/z/hsGWhTaqY (it will time out)	2025-08-08 09:59:24 +01:00
Craig Topper	57045a137f	[DAGCombiner] Avoid repeated calls to WideVT.getScalarSizeInBits() in DAGCombiner::mergeTruncStores. NFC (#152231 ) We already have a variable, WideNumBits, that contains the same information. Use it and delay the creation of WideVT until we really need it.	2025-08-06 09:10:02 -07:00
Simon Pilgrim	9f50224b25	[DAG] Remove Depth=1 hack from isGuaranteedNotToBeUndefOrPoison checks (#152127 ) Now that #146490 removed the assertion in visitFreeze that the node was still isGuaranteedNotToBeUndefOrPoison, we no longer need this reduced depth hack (which had to account for the difference in depth of freeze(op()) vs op(freeze())). Helps with some of the minor regressions in #150017	2025-08-05 13:35:04 +01:00
Simon Pilgrim	d561259a08	[DAG] visitFREEZE - replace multiple frozen/unfrozen uses of an SDValue with just the frozen node (#150017 ) Similar to InstCombinerImpl::freezeOtherUses, attempt to ensure that we merge multiple frozen/unfrozen uses of a SDValue. This fixes a number of hasOneUse() problems when trying to push FREEZE nodes through the DAG. Remove SimplifyMultipleUseDemandedBits handling of FREEZE nodes as we now want to keep the common node, and not bypass for some nodes just because of DemandedElts. Fixes #149799	2025-08-05 09:24:09 +01:00
woruyu	38bfe9ae56	[DAG] combineVSelectWithAllOnesOrZeros - missing freeze (#150388 ) This PR resolves https://github.com/llvm/llvm-project/issues/150069 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-04 15:55:12 +01:00
Simon Pilgrim	5c2054a4ea	[DAG] getMinMaxOpcodeForFP - split if-else chain. NFC. (#151938 ) (style) All cases return so split the chain	2025-08-04 15:32:08 +01:00
Abhishek Kaushik	1c0ac80d4a	[DAG] Combine `store + vselect` to `masked_store` (#145176 ) Add a new combine to replace ``` (store ch (vselect cond truevec (load ch ptr offset)) ptr offset) ``` to ``` (mstore ch truevec ptr offset cond) ``` This saves a blend operation on targets that support conditional stores.	2025-08-04 19:05:36 +05:30
AZero13	23022a4683	[SelectionDAG] Move sign pattern check from AArch64 and ARM to general SelectionDAG (#151736 ) This works on all cases much like the XOR case above it in SelectionDAG.	2025-08-01 14:46:51 -07:00
Paul Walker	ceb2b9c141	[LLVM][DAGCombiner] fold (shl (X * vscale(C0)), C1) -> (X * vscale(C0 << C1)). (#150651 )	2025-08-01 11:42:45 +01:00
黃國庭	f04ea2ef1c	Add m_SelectCCLike matcher to match SELECT_CC or SELECT with SETCC (#149646 ) Fix #147282 and Follow-up to #148834 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-01 10:12:05 +01:00
David Sherwood	05b16aff0f	[DAGCombiner] Add combine for vector interleave of splats (#151110 ) This patch adds two DAG combines: 1. vector_interleave(splat, splat, ...) -> {splat,splat,...} 2. concat_vectors(splat, splat, ...) -> wide_splat where all the input splats are identical. Both of these together enable us to fold concat_vectors(vector_interleave(splat, splat, ...)) into a wide splat. Post-legalisation we must only do the concat_vector combine if the wider type and splat operation is legal. For fixed-width vectors the DAG combine only occurs for interleave factors of 3 or more, however it's not currently safe to test this for AArch64 since there isn't any lowering support for fixed-width interleaves. I've only added fixed-width tests for RISCV.	2025-08-01 09:58:05 +01:00
Pierre van Houtryve	c4b1557097	[DAG] Fold (setcc ((x \| x >> c0 \| ...) & mask)) sequences (#146054 ) Fold sequences where we extract a bunch of contiguous bits from a value, merge them into the low bit and then check if the low bits are zero or not. Usually the and would be on the outside (the leaves) of the expression, but the DAG canonicalizes it to a single `and` at the root of the expression. The reason I put this in DAGCombiner instead of the target combiner is because this is a generic, valid transform that's also fairly niche, so there isn't much risk of a combine loop I think. See #136727	2025-07-30 10:27:19 +02:00
Pierre van Houtryve	250f2a6367	[DAG] Remove AssertZext if the input is masked (#146052 ) Remove AssertZext if the input ensures the assert cannot fail.	2025-07-29 13:05:30 +02:00
Nikita Popov	ab1f6ce482	[IR][SDAG] Remove lifetime size handling from SDAG (#150944 ) Split out from https://github.com/llvm/llvm-project/pull/150248: Specify that the argument of lifetime.start/lifetime.end is ignored and will be removed in the future. Remove lifetime size handling from SDAG. The size was previously discarded during isel, so was always ignored for stack coloring anyway. Where necessary, obtain the size of the full frame index.	2025-07-29 09:53:59 +02:00
Craig Topper	8d549cf036	[SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852 ) getNode updates flags correctly for CSE. Calling setFlags after getNode may set the flags where they don't apply. I've added a Flags argument to getSelectCC and the signature of getNode that takes an ArrayRef of EVTs.	2025-07-22 08:06:30 -07:00
Simon Pilgrim	c37942df00	[DAG] visitFREEZE - limit freezing of multiple operands (#149797 ) This is a partial revert of #145939 (I've kept the BUILD_VECTOR(FREEZE(UNDEF), FREEZE(UNDEF), elt2, ...) canonicalization) as we're getting reports of infinite loops (#148084). The issue appears to be due to deep chains of nodes and how visitFREEZE replaces all instances of an operand with a common frozen version - other users of the original frozen node then get added back to the worklist but might no longer be able to confirm a node isn't poison due to recursion depth limits on isGuaranteedNotToBeUndefOrPoison. The issue still exists with the old implementation but by only allowing a single frozen operand it helps prevent cases of interdependent frozen nodes. I'm still working on supporting multiple operands as it's critical for topological DAG handling but need to get a fix in for trunk and 21.x. Fixes #148084	2025-07-22 15:40:55 +01:00
Nikita Popov	a7a1df8f72	[CodeGen] Remove handling for lifetime.start/end on non-alloca (#149838 ) After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed that the argument is an alloca, so we don't need to look at underlying objects (which was not a correct thing to do anyway). This also drops the offset argument for lifetime nodes in SDAG. The offset is fixed to zero now. (Peculiarly, while SDAG pretended to have an offset, it just gets silently dropped during selection.)	2025-07-22 09:44:59 +02:00
Simon Pilgrim	92e2d4e9e1	[DAG] visitFREEZE - remove unused HadMaybePoisonOperands check. NFC. (#149517 ) Redundant since #145939	2025-07-18 17:38:11 +01:00
Alex MacLean	f73e163278	[DAGCombiner] Fold [us]itofp of truncate (#149391 )	2025-07-18 08:10:20 -07:00
Paul Walker	44cd5027f8	[LLVM][CodeGen][SVE] List MVTs that are desirable for extending loads. (#149153 ) Extend AArch64TargetLowering::isVectorLoadExtDesirable to specify the set of MVT for which load extension is desirable. Fixes https://github.com/llvm/llvm-project/issues/148939	2025-07-18 15:34:48 +01:00
Piotr Fusik	9fa3971fac	[DAGCombiner] Fold vector subtraction if above threshold to `umin` (#148834 ) This extends #134235 and #135194 to vectors.	2025-07-17 16:37:59 +02:00
Craig Topper	36e4174989	[DAGCombiner][AArch64] Prevent SimplifyVCastOp from creating illegal scalar types after type legalization. (#148970 ) Fixes #148949	2025-07-15 18:22:25 -07:00
Paul Walker	bd4e7f5f5d	[LLVM][DAGCombiner] Fix size calculations in calculateByteProvider. (#148425 ) calculateByteProvider only cares about scalars or a single element within a vector. For the latter there is the VectorIndex parameter to identify the element. All other properties, and specifically Index, are related to the underlying scalar type and thus when taking the size of a type it's the scalar size that matters. Fixes https://github.com/llvm/llvm-project/issues/148387	2025-07-15 11:05:38 +01:00
Craig Topper	eea5c291bb	[DAGCombiner] Pass SDNodeFlags to getNode instead of modifying nodes. (#148744 ) getNode has logic to intersect flags correctly if the new node happens to CSE with an existing node. Setting node flags after getNode bypasses this logic and may change the node for other uses where the flags don't hold.	2025-07-14 20:53:14 -07:00
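
Several of the folds listed above are plain integer identities and can be sanity-checked by brute force at a small bit width. The sketch below is not how the commits were validated (they cite Alive2 proofs); it is only an illustrative exhaustive check at 8 bits. It covers the generalized `(not (sub Y, X)) -> (add X, ~Y)` fold from 83af24dd85, the `xor(smin/smax(x, C), C)` folds from 6aab826e23, and the removal of a same-amount `srl` of a `shl nuw` from 56289647be. All function names (`check_not_sub`, `check_xor_smin_smax`, `check_shl_nuw_srl`, `to_signed`) are hypothetical helpers, not LLVM APIs, and poison/undef semantics are not modeled.

```python
from itertools import product

BITS = 8                      # assumption: a small width keeps the search exhaustive
MASK = (1 << BITS) - 1


def to_signed(v):
    """Interpret a BITS-wide unsigned value as two's-complement signed."""
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def check_not_sub():
    """(not (sub Y, X)) == (add X, (not Y)) for all X, Y, with wrapping."""
    for x, y in product(range(1 << BITS), repeat=2):
        lhs = ~((y - x) & MASK) & MASK
        rhs = (x + (~y & MASK)) & MASK
        if lhs != rhs:
            return False
    return True


def check_xor_smin_smax():
    """xor(smin(x, C), C) == select(x < C, xor(x, C), 0), and the smax dual."""
    for x, c in product(range(1 << BITS), repeat=2):
        sx, sc = to_signed(x), to_signed(c)
        smin = min(sx, sc) & MASK   # back to the unsigned bit pattern
        smax = max(sx, sc) & MASK
        if (smin ^ c) != ((x ^ c) if sx < sc else 0):
            return False
        if (smax ^ c) != ((x ^ c) if sx > sc else 0):
            return False
    return True


def check_shl_nuw_srl():
    """srl(shl nuw x, c), c == x whenever the shl does not wrap (nuw holds)."""
    for x, c in product(range(1 << BITS), range(BITS)):
        if (x << c) > MASK:         # bits would be shifted out: nuw is violated, skip
            continue
        if (((x << c) & MASK) >> c) != x:
            return False
    return True


if __name__ == "__main__":
    print("not/sub fold holds:      ", check_not_sub())
    print("xor smin/smax folds hold:", check_xor_smin_smax())
    print("shl nuw + srl fold holds:", check_shl_nuw_srl())
```

An exhaustive 8-bit check says nothing about wider types by itself, but these identities do not depend on the width, so it is a cheap way to catch sign or operand-order mistakes when reading the commit messages.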

4117 Commits