llvm-project

Author	SHA1	Message	Date
Kazu Hirata	31b8ba5670	[Analysis, CodeGen] Use ArrayRef instead of const ArrayRef (NFC) (#166026 ) This patch improves readability by using "ArrayRef<T>" instead of "const ArrayRef<T>" and "const ArrayRef<T> &" in function parameter types.	2025-11-01 23:20:19 -07:00
Fabian Ritter	8ea447b4c4	[SDAG] Set InBounds when computing offsets into memory objects (#165425 ) When a load or store accesses N bytes starting from a pointer P, and we want to compute an offset pointer within these N bytes after P, we know that the arithmetic to add the offset must be inbounds. This is for example relevant when legalizing too-wide memory accesses, when lowering memcpy&Co., or when optimizing "vector-load -> extractelement" into an offset load. For SWDEV-516125.	2025-10-31 11:27:55 +01:00
Fabian Ritter	a85e84b854	[SDAG] Preserve InBounds in DAGCombines (#165424 ) This PR preserves the InBounds flag (#162477) where possible in PTRADD-related DAGCombines. We can't preserve them in all the cases that we could in the analogous GISel change (#152495) because SDAG usually represents pointers as integers, which means that pointer provenance is not preserved between PTRADD operations (see the discussion at PR #162477 for more details). This PR marks the places in the DAGCombiner where this is relevant explicitly. For SWDEV-516125.	2025-10-31 10:25:39 +01:00
Princeton Ferro	68e74f8f84	[DAGCombiner] Lower dynamic insertelt chain more efficiently (#162368 ) For an insertelt with a dynamic index, the default handling in DAGTypeLegalizer and LegalizeDAG will reserve a stack slot for the vector, lower the insertelt to a store, then load the modified vector back into temporaries. The vector store and load may be legalized into a sequence of smaller operations depending on the target. Let V = the vector size and L = the length of a chain of insertelts with dynamic indices. In the worst case, this chain will lower to O(VL) operations, which can increase code size dramatically. Instead, identify such chains, reserve one stack slot for the vector, and lower all of the insertelts to stores at once. This requires only O(V + L) operations. This change only affects the default lowering behavior.	2025-10-29 09:46:01 -07:00
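The O(VL) versus O(V + L) argument in the entry above can be illustrated with a toy cost model. This is only a sketch, not the DAGCombiner code: the function names, the `pieces` parameter (how many legal-sized operations one whole-vector store or load splits into), and the assumption that every index is in range are all hypothetical.

```python
def lower_one_at_a_time(vec, inserts, pieces):
    """Toy model: every insertelt spills the vector, stores one element, reloads."""
    ops = 0
    for idx, val in inserts:
        slot = list(vec)   # store the whole vector to the stack slot
        ops += pieces
        slot[idx] = val    # store the single element (index assumed in range)
        ops += 1
        vec = list(slot)   # load the whole vector back
        ops += pieces
    return vec, ops


def lower_as_chain(vec, inserts, pieces):
    """Toy model: one spill, all element stores, one reload."""
    slot = list(vec)       # single whole-vector store
    ops = pieces
    for idx, val in inserts:
        slot[idx] = val    # one element store per insertelt
        ops += 1
    ops += pieces          # single whole-vector load
    return list(slot), ops


vec = [0] * 8
chain = [(3, 10), (1, 20), (3, 30)]
slow_result, slow_ops = lower_one_at_a_time(vec, chain, pieces=4)
fast_result, fast_ops = lower_as_chain(vec, chain, pieces=4)
```

Under this model both strategies produce the same vector, while the operation count drops from L * (2 * pieces + 1) to 2 * pieces + L.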
Lauren	e964acf85f	[DAG] Fold mismatched widened avg idioms to narrow form (#147946 ) (#163366 ) [DAG] Fold mismatched widened avg idioms to narrow form (fixes half of [llvm#147946](https://github.com/llvm/llvm-project/issues/147946)) 1. `trunc(avgceilu(sext(x), sext(y))) -> avgceils(x, y)` 2. `trunc(avgceils(zext(x), zext(y))) -> avgceilu(x, y)` When inputs are sign-extended, unsigned and signed averaging operations produce identical results after truncation, allowing us to use the semantically correct narrow operation. alive2: https://alive2.llvm.org/ce/z/ZRbfHT	2025-10-27 12:24:41 +00:00
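The two folds listed in the entry above can be checked by brute force on small types. The sketch below assumes i8 narrow values widened to i16 and models `avgceilu`/`avgceils` as `(a + b + 1) >> 1` evaluated without intermediate overflow; the helper names are illustrative and are not LLVM APIs.

```python
NARROW, WIDE = 8, 16


def mask(bits):
    return (1 << bits) - 1


def to_signed(v, bits):
    v &= mask(bits)
    return v - (1 << bits) if v >> (bits - 1) else v


def sext(v, src, dst):
    return to_signed(v, src) & mask(dst)


def zext(v, src, dst):
    return v & mask(src)


def trunc(v, bits):
    return v & mask(bits)


def avgceilu(a, b, bits):
    # Unsigned rounding-up average, computed in unbounded precision.
    return ((a & mask(bits)) + (b & mask(bits)) + 1) >> 1


def avgceils(a, b, bits):
    # Signed rounding-up average; Python's >> on negative ints is arithmetic.
    return ((to_signed(a, bits) + to_signed(b, bits) + 1) >> 1) & mask(bits)


def check_avg_folds():
    for x in range(1 << NARROW):
        for y in range(1 << NARROW):
            wide_u = avgceilu(sext(x, NARROW, WIDE), sext(y, NARROW, WIDE), WIDE)
            if trunc(wide_u, NARROW) != avgceils(x, y, NARROW):
                return False
            wide_s = avgceils(zext(x, NARROW, WIDE), zext(y, NARROW, WIDE), WIDE)
            if trunc(wide_s, NARROW) != avgceilu(x, y, NARROW):
                return False
    return True
```

Calling `check_avg_folds()` walks all 65536 input pairs for both patterns; the linked alive2 proof remains the authoritative check for arbitrary widths.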
David Green	332f786a35	[DAG][AArch64] Ensure that ResNo is correct for uses of Ptr when considering postinc. (#164810 ) We might be looking at a different use, for example in the uses of a i32,i64,ch preindex load. Fixes #164775	2025-10-24 11:33:08 +01:00
paperchalice	15d11ebc84	[NFC] "unsafe-fp-math" post cleanup (code comments part) (#164582 )	2025-10-22 11:07:23 +00:00
kper	e83eee335c	[DAG] Create SDPatternMatch method `m_SelectLike` to match `ISD::Select` and `ISD::VSelect` (#164069 ) Fixes #150019	2025-10-22 09:49:34 +00:00
Simon Pilgrim	f8edcba62d	[DAG] visitTRUNCATE - more aggressively fold trunc(add(x,x)) -> add(trunc(x),trunc(x)) (#164227 ) We're very careful not to truncate binary arithmetic ops if it will affect legality, or cause additional truncation instructions, hence we currently limit this to cases where one operand is constant. But if both ops are the same (i.e. for some add/mul cases) then we wouldn't increase the number of truncations, so can be slightly more aggressive at folding the truncation.	2025-10-21 10:17:57 +00:00
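The equivalence behind the fold in the entry above is that truncation commutes with wrapping add and mul. A minimal sketch, assuming an i16 source truncated to i8 (the function name is illustrative):

```python
def check_trunc_of_same_operand(wide_bits=16, narrow_bits=8):
    wide_mask = (1 << wide_bits) - 1
    narrow_mask = (1 << narrow_bits) - 1
    for x in range(1 << wide_bits):
        t = x & narrow_mask  # trunc(x): only one truncation is needed
        # trunc(add(x, x)) == add(trunc(x), trunc(x))
        if ((x + x) & wide_mask) & narrow_mask != (t + t) & narrow_mask:
            return False
        # trunc(mul(x, x)) == mul(trunc(x), trunc(x))
        if ((x * x) & wide_mask) & narrow_mask != (t * t) & narrow_mask:
            return False
    return True
```

Because both operands are the same value, the narrow form uses a single `trunc(x)`, which is why the fold does not add truncation instructions.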
Simon Pilgrim	a51e498ea6	[DAG] combineTruncationShuffle - ensure the _EXTEND_VECTOR_INREG node didn't come from a smaller type (#164160 ) The _EXTEND_VECTOR_INREG source vector must be the same size as the destination. We already have a similar TODO to handle more types. Fixes #164107	2025-10-19 14:15:33 +00:00
paperchalice	bfee9db785	[DAGCombiner] Remove NoNaNsFPMath uses (#163504 ) Users should use `nnan` flag instead.	2025-10-15 21:22:13 +08:00
paperchalice	dd44e63c8e	[DAGCombiner] Use `FlagInserter` in `visitFSQRT` (#163301 ) Propagate fast-math flags for TLI.getSqrtEstimate etc.	2025-10-15 09:03:15 +08:00
Paul Walker	d7fc770340	[LLVM][DAGCombiner] Improve simplifyDivRem's effectiveness after type legalisation. (#162706 ) simplifyDivRem does not work as well after type legalisation because splatted constants can have a size mismatch between the scalar to splat and the element type of the splatted result. simplifyDivRem does not seem to care about this mismatch so I've updated the "is one" check for the divisor to allow truncation.	2025-10-14 11:23:53 +01:00
Sam Parker	1820102167	Wasm fmuladd relaxed (#163177 ) Reland #161355, after fixing up the cross-projects-tests for the wasm simd intrinsics. Original commit message: Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions. If we have FP16, then lower v8f16 fmuladds to FMA. I've introduced an ISD node for fmuladd to maintain the rounding ambiguity through legalization / combine / isel.	2025-10-13 16:50:53 +01:00
Sam Parker	30d3441cf0	Revert "[WebAssembly] Lower fmuladd to madd and nmadd" (#163171 ) Reverts llvm/llvm-project#161355 Looks like I've broken some intrinsic code generation.	2025-10-13 11:53:40 +01:00
Sam Parker	a4eb7ea225	[WebAssembly] Lower fmuladd to madd and nmadd (#161355 ) Lower v4f32 and v2f64 fmuladd calls to relaxed_madd instructions. If we have FP16, then lower v8f16 fmuladds to FMA. I've introduced an ISD node for fmuladd to maintain the rounding ambiguity through legalization / combine / isel.	2025-10-13 10:36:08 +01:00
paperchalice	a61107472b	[SelectionDAG] Remove NoInfsFPMath uses (#162788 ) Users should use fast-math flags instead.	2025-10-12 09:34:24 +08:00
AZero13	d95f8ffee4	[ARM][TargetLowering] Combine Level should not be a factor in shouldFoldConstantShiftPairToMask (NFC) (#156949 ) This should be based on the type and instructions, and only thumb uses combine level anyway.	2025-10-11 10:58:48 +09:00
Yi-Chi Lee	a9c8e94b43	[DAGCombiner] Extend FP-to-Int cast without requiring nsz (#161093 ) This patch updates the FP-to-Int conversion handling: - For signed integers: use `ftrunc` followed by clamping to the target integer range. - For unsigned integers: apply `fabs` + `ftrunc`, then clamp. This removes the previous dependence on `nsz` and ensures correct lowering for both signed and unsigned cases. I've tested the code generation of -mtriple=amdgcn. It seems that the assembly code is expected, but I'm not sure how to write a general testcase for every target. Fixes #160623.	2025-10-11 00:34:33 +09:00
Lewis Crawford	4c2b1d495a	[DAGCombiner] Improve FMin/FMax DAGCombines (#161352 ) Add several improvements to DAGCombine patterns for fmin/fmax: * Fix incorrect results due to minimumnum not being marked as IsMin - e.g. nnan minimumnum(x, +inf) returned +inf instead of x * Fix incorrect results checking maximumnum for vecreduce patterns * Make maxnum/minnum return QNaN if one input is SNaN instead of X * Quiet SNaN inputs when propagating them e.g. maximum(x, SNaN) = QNaN * Update comments to mark when SNaN propagation is being ignored	2025-10-09 18:00:50 +01:00
paperchalice	4967bc17df	[DAGCombiner] Remove NoSignedZerosFPMath in visitFNEG (#162052 ) Remove the `NoSignedZerosFPMath` use in `visitFNEG`. Now the only use of `NoSignedZerosFPMath` is in `foldFPToIntToFP`, but adding fast-math flags support for `uitofp` may introduce breaking changes.	2025-10-08 17:01:47 +08:00
Yatao Wang	178e2a704b	[LLVM][CodeGen] Check Non Saturate Case in isSaturatingMinMax (#160637 ) Fix Issue #160611	2025-10-03 20:39:45 +01:00
Florian Hahn	e86b3386fd	[DAGCombine] Support (shl %x, constant) in foldPartialReduceMLAMulOp. (#160663 ) Support shifts in foldPartialReduceMLAMulOp by treating (shl %x, %c) as (mul %x, (shl 1, %c)). PR: https://github.com/llvm/llvm-project/pull/160663	2025-10-01 09:06:01 +00:00
paperchalice	c6d3b517ee	[DAGCombiner] Remove most `NoSignedZerosFPMath` uses (#161180 ) The two remaining uses are related to fneg and foldFPToIntToFP; some AMDGPU tests are duplicated and regenerated.	2025-09-30 11:44:34 +08:00
paperchalice	84e4c0686e	[DAGCombiner] Remove NoSignedZerosFPMath uses in visitFSUB (#160974 ) Remove NoSignedZerosFPMath in visitFSUB part, we should always use instruction level fast math flags.	2025-09-29 19:19:18 +08:00
paperchalice	1e01c02996	[DAGCombiner] Remove `NoSignedZerosFPMath` uses in `visitFADD` (#160635 ) Remove these global flags and use node level flags instead.	2025-09-26 11:24:02 +08:00
kper	0b1318f2a8	[DAG] Fold rem(rem(A, BCst), Op1Cst) -> rem(A, Op1Cst) (#159517 ) Fixes [157370](https://github.com/llvm/llvm-project/issues/157370) UREM General proof: https://alive2.llvm.org/ce/z/b_GQJX SREM General proof: https://alive2.llvm.org/ce/z/Whkaxh I have added it as rv32i and rv64i tests because they are the only architectures where I could verify that it works.	2025-09-22 09:30:10 +00:00
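The entry above does not spell out when the nested remainder can be dropped. The sketch below assumes the precondition is that `BCst` is a multiple of `Op1Cst`; that condition is an assumption made for illustration, not something stated in the commit message, and the helper names are hypothetical. It models `srem` as a truncating remainder whose sign follows the dividend.

```python
def urem(a, b):
    return a % b


def srem(a, b):
    # Truncating remainder: the result takes the sign of the dividend.
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def check_rem_fold():
    for b in range(1, 128):
        for c in range(1, 128):
            if b % c != 0:  # assumed precondition: Op1Cst divides BCst
                continue
            for a in range(256):
                if urem(urem(a, b), c) != urem(a, c):
                    return False
            for a in range(-128, 128):
                if srem(srem(a, b), c) != srem(a, c):
                    return False
    return True
```

Without the divisibility assumption the rewrite is not valid in general: `urem(urem(7, 5), 3)` is 2 while `urem(7, 3)` is 1.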
Abhishek Kaushik	f65d5a7a56	[DAG] Skip `mstore` combine for `<1 x ty>` vectors (#159915 ) Fixes #159912	2025-09-21 11:06:49 -07:00
Fabian Ritter	d5607694e1	[AMDGPU][SDAG] DAGCombine PTRADD -> disjoint OR (#146075 ) If we can't fold a PTRADD's offset into its users, lowering them to disjoint ORs is preferable: Often, a 32-bit OR instruction suffices where we'd otherwise use a pair of 32-bit additions with carry. This needs to be a DAGCombine (and not a selection rule) because its main purpose is to enable subsequent DAGCombines for bitwise operations. We don't want to just turn PTRADDs into disjoint ORs whenever that's sound because this transform loses the information that the operation implements pointer arithmetic, which AMDGPU for instance needs when folding constant offsets. For SWDEV-516125.	2025-09-19 11:58:41 +02:00
Fabian Ritter	771c94c8db	[SDAG][AMDGPU] Allow opting in to OOB-generating PTRADD transforms (#146074 ) This PR adds a TargetLowering hook, canTransformPtrArithOutOfBounds, that targets can use to allow transformations to introduce out-of-bounds pointer arithmetic. It also moves two such transformations from the AMDGPU-specific DAG combines to the generic DAGCombiner. This is motivated by target features like AArch64's checked pointer arithmetic, CPA, which does not tolerate the introduction of out-of-bounds pointer arithmetic.	2025-09-19 11:07:59 +02:00
Björn Pettersson	1c4c7bd808	[SelectionDAG] Deal with POISON for INSERT_VECTOR_ELT/INSERT_SUBVECTOR (#143102 ) As reported in https://github.com/llvm/llvm-project/issues/141034 SelectionDAG::getNode had some unexpected behaviors when trying to create vectors with UNDEF elements. Since we treat both UNDEF and POISON as undefined (when using isUndef()) we can't just fold away INSERT_VECTOR_ELT/INSERT_SUBVECTOR based on isUndef(), as that could make the resulting vector more poisonous. Same kind of bug existed in DAGCombiner::visitINSERT_SUBVECTOR. Here are some examples: This fold was done even if vec[idx] was POISON: INSERT_VECTOR_ELT vec, UNDEF, idx -> vec This fold was done even if any of vec[idx..idx+size] was POISON: INSERT_SUBVECTOR vec, UNDEF, idx -> vec This fold was done even if the elements not extracted from vec could be POISON: sub = EXTRACT_SUBVECTOR vec, idx INSERT_SUBVECTOR UNDEF, sub, idx -> vec With this patch we avoid such folds unless we can prove that the result isn't more poisonous when eliminating the insert. Fixes https://github.com/llvm/llvm-project/issues/141034	2025-09-17 21:04:00 +00:00
guan jian	6aab826e23	[DAGCombiner] add fold (xor (smin(x, C), C)) and fold (xor (smax(x, C), C)) (#155141 ) Hi, I compared the following LLVM IR with GCC and Clang, and there is a small difference between the two. The LLVM IR is: ``` define i64 @test_smin_neg_one(i64 %a) { %1 = tail call i64 @llvm.smin.i64(i64 %a, i64 -1) %retval.0 = xor i64 %1, -1 ret i64 %retval.0 } ``` GCC generates: ``` cmp x0, 0 csinv x0, xzr, x0, ge ret ``` Clang generates: ``` cmn x0, #1 csinv x8, x0, xzr, lt mvn x0, x8 ret ``` Clang keeps flipping x0 through x8 unnecessarily. So I added the following folds to DAGCombiner: fold (xor (smax(x, C), C)) -> select (x > C), xor(x, C), 0 fold (xor (smin(x, C), C)) -> select (x < C), xor(x, C), 0 alive2: https://alive2.llvm.org/ce/z/gffoir --------- Co-authored-by: Yui5427 <785369607@qq.com> Co-authored-by: Matt Arsenault <arsenm2@gmail.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-09-16 15:30:57 +00:00
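The two select forms quoted in the entry above can be checked exhaustively on a small type. A minimal sketch, assuming 8-bit signed values; the helper names are illustrative.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    v &= MASK
    return v - (1 << BITS) if v >> (BITS - 1) else v


def xor8(a, b):
    # XOR on the 8-bit two's-complement encodings, returned as a signed value.
    return to_signed((a & MASK) ^ (b & MASK))


def check_xor_minmax_folds():
    for x in range(-128, 128):
        for c in range(-128, 128):
            # xor(smin(x, C), C) -> select (x < C), xor(x, C), 0
            if xor8(min(x, c), c) != (xor8(x, c) if x < c else 0):
                return False
            # xor(smax(x, C), C) -> select (x > C), xor(x, C), 0
            if xor8(max(x, c), c) != (xor8(x, c) if x > c else 0):
                return False
    return True
```

For the `smin(%a, -1)` case from the message, `xor8(min(5, -1), -1)` is 0 and `xor8(min(-7, -1), -1)` is 6, matching the select form.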
Arthur Eubanks	984251acad	Revert "[DAGCombiner] Relax condition for extract_vector_elt combine" (#157953 ) Reverts llvm/llvm-project#157658 Causes hangs, see https://github.com/llvm/llvm-project/pull/157658#issuecomment-3276441812	2025-09-10 21:33:44 +00:00
ZhaoQi	4621e17dee	[DAGCombiner] Relax condition for extract_vector_elt combine (#157658 ) Checking `isOperationLegalOrCustom` instead of `isOperationLegal` allows more optimization opportunities. In particular, if a target wants to mark `extract_vector_elt` as `Custom` rather than `Legal` in order to optimize some certain cases, this combiner would otherwise miss some improvements. Previously, using `isOperationLegalOrCustom` was avoided due to the risk of getting stuck in infinite loops (as noted in `61ec738b60`). After testing, the issue no longer reproduces, but the coverage is limited to the regression/unit tests and the test-suite.	2025-09-10 15:51:52 +08:00
guan jian	83af24dd85	[DAG] Generalize fold (not (neg x)) -> (add X, -1) (#154348 ) Generalize `fold (not (neg x)) -> (add X, -1)` to `fold (not (sub Y, X)) -> (add X, ~Y)` --------- Co-authored-by: Yui5427 <785369607@qq.com> Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-09-08 15:12:59 +00:00
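The generalized identity in the entry above follows from `~v == -v - 1` in two's complement, so `~(Y - X) == X - Y - 1 == X + ~Y`. A small exhaustive check on 8-bit values (the function name is illustrative):

```python
def check_not_sub_fold(bits=8):
    mask = (1 << bits) - 1
    for x in range(1 << bits):
        for y in range(1 << bits):
            not_sub = ~((y - x) & mask) & mask   # not (sub Y, X)
            add_not = (x + (~y & mask)) & mask   # add X, ~Y
            if not_sub != add_not:
                return False
    return True
```

Setting `Y = 0` recovers the original special case, `not (neg x) -> add x, -1`.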
paperchalice	667f919214	[SelectionDAG][ARM] Propagate fast math flags in visitBRCOND (#156647 ) Factor out from #151275.	2025-09-06 20:44:25 +08:00
Yingwei Zheng	0d29279465	[DAGCombine] Propagate nuw when evaluating sub with narrower types (#156710 ) Proof: https://alive2.llvm.org/ce/z/cdbzSL Closes https://github.com/llvm/llvm-project/issues/156559.	2025-09-04 10:17:45 +08:00
Yingwei Zheng	5d111a20c5	[DAGCombiner] Avoid double deletion when replacing multiple frozen/unfrozen uses (#155427 ) Closes https://github.com/llvm/llvm-project/issues/155345. In the original case, we have one frozen use and two unfrozen uses: ``` t73: i8 = select t81, Constant:i8<0>, t18 t75: i8 = select t10, t18, t73 t59: i8 = freeze t18 (combining) t80: i8 = freeze t59 (another user of t59) ``` In `DAGCombiner::visitFREEZE`, we replace all uses of `t18` with `t59`. After updating the uses, `t59: i8 = freeze t18` will be updated to `t59: i8 = freeze t59` (`AddModifiedNodeToCSEMaps`) and CSEed into `t80: i8 = freeze t59` (`ReplaceAllUsesWith`). As the previous call to `AddModifiedNodeToCSEMaps` already removed `t59` from the CSE map, `ReplaceAllUsesWith` cannot remove `t59` again. For clarity, see the following call graph: ``` ReplaceAllUsesOfValueWith(t18, t59) ReplaceAllUsesWith(t18, t59) RemoveNodeFromCSEMaps(t73) update t73 AddModifiedNodeToCSEMaps(t73) RemoveNodeFromCSEMaps(t75) update t75 AddModifiedNodeToCSEMaps(t75) RemoveNodeFromCSEMaps(t59) <- first deletion update t59 AddModifiedNodeToCSEMaps(t59) ReplaceAllUsesWith(t59, t80) RemoveNodeFromCSEMaps(t59) <- second deletion Boom! ``` This patch unfreezes all the uses first to avoid triggering CSE when introducing cycles.	2025-08-27 11:21:22 +08:00
Craig Topper	56289647be	[DAGCombiner] Preserve nuw when converting mul to shl. Use nuw in srl+shl combine. (#155043 ) If the srl+shl have the same shift amount and the shl has the nuw flag, we can remove both. In the affected test, the InterleavedAccess pass will emit a udiv after the `mul nuw`. We expect them to combine away. The remaining shifts on the RV64 tests are because we didn't add the zeroext attribute to the incoming evl operand.	2025-08-25 20:44:06 -07:00
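The reasoning in the entry above can be sketched as follows: `nuw` on a shift left means no set bits are shifted out, so a logical shift right by the same amount restores the input, and a `mul nuw` by a power of two is the same operation. The model below assumes 8-bit values and represents a violated `nuw` (poison) as `None`; the helper names are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1


def shl_nuw(x, c):
    wide = x << c
    if wide >> BITS:       # a set bit was shifted out: nuw violated
        return None        # stands in for poison in this toy model
    return wide


def check_srl_of_shl_nuw():
    for x in range(1 << BITS):
        for c in range(BITS):
            shifted = shl_nuw(x, c)
            if shifted is None:
                continue               # fold is unconstrained on poison
            if shifted >> c != x:      # srl (shl nuw x, c), c -> x
                return False
            if x * (1 << c) != shifted:  # mul nuw x, 2^c matches shl nuw x, c
                return False
    return True


# Without nuw the pair of shifts is not a no-op: the top bit is lost.
wrapped = ((0x81 << 1) & MASK) >> 1
```

Here `wrapped` is 1 rather than 0x81, which is why the fold relies on the flag.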
Alex MacLean	8ab917a241	Reland "[NVPTX] Legalize aext-load to zext-load to expose more DAG combines" (#155063 ) The original version of this change inadvertently dropped b6e19b35cd87f3167a0f04a61a12016b935ab1ea. This version retains that fix as well as adding tests for it and an explanation for why it is needed.	2025-08-25 09:15:44 -07:00
XChy	fd330dedcb	[DAG] Constant fold ISD::FSHL/FSHR nodes (#154480 ) Fixes #153612. This patch handles trinary scalar integers for FSHL/R in `FoldConstantArithmetic`. Pending until #153790 is merged.	2025-08-23 10:08:21 +09:00
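For reference, the value a constant folder has to produce for funnel shifts can be modelled directly from the LLVM `fshl`/`fshr` intrinsic semantics: concatenate the two operands, shift by the amount modulo the bit width, and take the high or low half. This is a sketch of the arithmetic only, not the code added to `FoldConstantArithmetic`; the function names are illustrative.

```python
def fshl(a, b, c, bits):
    mask = (1 << bits) - 1
    concat = ((a & mask) << bits) | (b & mask)
    shift = c % bits                       # shift amount is taken modulo the width
    return ((concat << shift) >> bits) & mask   # high half after shifting left


def fshr(a, b, c, bits):
    mask = (1 << bits) - 1
    concat = ((a & mask) << bits) | (b & mask)
    shift = c % bits
    return (concat >> shift) & mask        # low half after shifting right
```

A zero shift returns the first operand for `fshl` and the second for `fshr`, and passing the same value for both operands gives a rotate, e.g. `fshl(0x81, 0x81, 1, 8)` yields `0x03`.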
Joseph Huber	d439c9ea4a	Revert "[NVPTX] Legalize aext-load to zext-load to expose more DAG combines (#154251 )" Causes failures in the LLVM libc test suite https://lab.llvm.org/buildbot/#/builders/69/builds/26327/steps/12/logs/stdio. This reverts commit a3ed96b899baddd4865f1ef09f01a83da011db5c.	2025-08-22 16:13:58 -05:00
paperchalice	2014890c09	[SelectionDAG] Remove `UnsafeFPMath` in `visitFP_ROUND` (#154768 ) Remove `UnsafeFPMath` in `visitFP_ROUND` part, it blocks some bugfixes related to clang and the ultimate goal is to remove `resetTargetOptions` method in `TargetMachine`, see FIXME in `resetTargetOptions`. See also https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast https://discourse.llvm.org/t/allowfpopfusion-vs-sdnodeflags-hasallowcontract Now all UnsafeFPMath uses are eliminated in LLVMCodeGen	2025-08-22 19:46:33 +08:00
paperchalice	945a186089	[DAGCombiner] Remove most `UnsafeFPMath` references (#146295 ) This pull request removes all references to `UnsafeFPMath` in dag combiner except FP_ROUND. - Set fast math flags in some tests.	2025-08-22 15:27:25 +08:00
Alex MacLean	a3ed96b899	[NVPTX] Legalize aext-load to zext-load to expose more DAG combines (#154251 )	2025-08-21 15:33:23 -07:00
Jim Lin	fd28257195	[DAGCombiner] Fold umax/umin operations with vscale operands (#154461 ) If umax/umin operations have vscale operands, they can be constant folded.	2025-08-21 09:15:40 +08:00
Matt Arsenault	276c1d8114	DAG: Add assert to getNode for EXTRACT_SUBVECTOR indexes (#154099 ) Verify it's a multiple of the result vector element count instead of asserting this in random combines. The testcase in #153808 fails in the wrong point. Add an assert to getNode so the invalid extract asserts at construction instead of use.	2025-08-20 09:55:43 +09:00
Simon Pilgrim	fcb36ca8cc	[DAG] visitTRUNCATE - merge the trunc(abd) and trunc(avg) handling which are almost identical (#154301 ) CC @houngkoungting	2025-08-19 12:59:39 +01:00
黃國庭	0773854575	[DAG] Fold trunc(avg(x,y)) for avgceil/floor u/s nodes if they have sufficient leading zero/sign bits (#152273 ) avgceil version : https://alive2.llvm.org/ce/z/2CKrRh Fixes #147773 --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-08-18 16:36:26 +01:00
Simon Pilgrim	858d1dfa2c	[DAG] visitTRUNCATE - early out from computeKnownBits/ComputeNumSignBits failures. NFC. (#154111 ) Avoid unnecessary (costly) computeKnownBits/ComputeNumSignBits calls - use MaskedValueIsZero instead of computeKnownBits directly to simplify code.	2025-08-18 14:55:09 +01:00

4145 Commits