llvm-project

Author	SHA1	Message	Date
paperchalice	c53acf0443	[SelectionDAGBuilder] Remove NoNaNsFPMath uses (#169904 ) Replaced by checking fast-math flags or value tracking results.	2026-02-09 09:48:07 +08:00
David Sherwood	e958bcdd17	[DAGCombiner] Look through freeze for ext(freeze(extload(x))) (#178669 ) This patch fixes a regression introduced by PR #175022, where a freeze was introduced with the following transformation: ext(freeze(load(x))) -> freeze(extload(x)) If a new extend is introduced afterwards we then have ext(freeze(extload(x))) which doesn't get picked up by existing DAG combines due to the freeze getting in the way.	2026-02-06 15:50:17 +00:00
Steffen Larsen	5654ecd5dd	[DAGCombiner] Fix exact power-of-two signed division for large integers (#177340 ) Previously, the DAG combiner did not optimize exact signed division by a power-of-two constant divisor for integer types exceeding the size of division supported by the target architecture (e.g., i128 on x86-64). However, such an optimization was expected by the division expansion logic, leading to unsupported division operations making it to instruction selection. This commit addresses this issue by making an exception to the existing exclusion of signed division with the exact flag for the aforementioned operations. That is, the DAG combiner will now optimize exact signed division if the divisor is a power-of-two constant and the integer type exceeds the size of division supported by the target architecture. --------- Signed-off-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com>	2026-02-06 09:40:32 +01:00
Nicolai Hähnle	af836ff60c	[CodeGen] Add getTgtMemIntrinsic overload for multiple memory operands (NFC) (#175843 ) There are target intrinsics that logically require two MMOs, such as llvm.amdgcn.global.load.lds, which is a copy from global memory to LDS, so there's both a load and a store to different addresses. Add an overload of getTgtMemIntrinsic that produces intrinsic info in a vector, and implement it in terms of the existing (now protected) overload. GlobalISel and SelectionDAG paths are updated to support multiple MMOs. The main part of this change is supporting multiple MMOs in MemIntrinsicNodes. Converting the backends to using the new overload is a fairly mechanical step that is done in a separate change in the hope that that allows reducing merging pains during review and for downstreams. A later change will then enable using multiple MMOs in AMDGPU.	2026-02-02 21:58:42 +00:00
DaKnig	fbda30607c	[SDAG] (abs (add nsw a, -b)) -> (abds a, b) (#175801 ) This is beneficial for bv of constants. alive2: https://alive2.llvm.org/ce/z/e3GsWZ	2026-02-02 15:11:16 +00:00
Simon Pilgrim	a372152cb5	[DAG] visitVECTOR_SHUFFLE - ensure correct resno when folding shuffle(bop(shuffle(x,y),shuffle(z,w))) (#179124 ) TLI.isBinOp recognises some opcodes that have multiple results, including UADDO etc. In most cases we currently just bail if a binop has multiple results, but shuffle combining was missing the check and it's pretty trivial to add handling in this case. I've added add/sub-overflow opcodes to verifyNode to help catch these cases in the future - IIRC there was a plan to autogen these, but there isn't anything at the moment. Fixes #179112	2026-02-02 09:22:48 +00:00
Benjamin Maxwell	1818b23a99	[SDAG] Check for `nsz` in DAG.canIgnoreSignBitOfZero() (#178905 ) Follow up to #174423	2026-02-01 15:58:38 +00:00
陈子昂	a994198906	[DAG] Reland: Enable bitcast STLF for Constant/Undef (#178890 ) This is a reland of #172523. The original patch caused an assertion failure on RISC-V because it attempted to create a bitcast from an illegal type (i32 on RV64) during the post-type-legalization DAGCombine stage. Added a `TLI.isTypeLegal(Val.getValueType())` check to ensure we only proceed with the bitcast STLF optimization when the source value's type is legal for the target.	2026-01-30 18:21:32 +01:00
Alex Bradbury	41f453efe2	Revert "[DAG] Enable bitcast STLF for Constant/Undef" (#178872 ) Reverts llvm/llvm-project#172523 As explained in https://github.com/llvm/llvm-project/pull/172523#issuecomment-3823234270 (along with reproducer), this causes compiler crashes building llvm-test-suite for RVV targets.	2026-01-30 12:18:38 +00:00
陈子昂	d3c64633c3	[DAG] Enable bitcast STLF for Constant/Undef (#172523 ) This patch introduces support for Store-to-Load Forwarding (STLF) in `DAGCombiner::ForwardStoreValueToDirectLoad` when the store and load have different types but equal memory size (e.g., storing an `i32` then loading a `float` from the same location). ### What this patch does: Enables Optimization: It allows for the safe forwarding of the stored value as a Bitcast when the value is: * A Constant (`ConstantSDNode`, `ConstantFPSDNode`, `ConstantPoolSDNode`). * Undef. * And the memory sizes (`LdMemSize` == `StMemSize`) match. ### Scope and Next Steps: This patch only implements forwarding for constant and undef values that have the same memory size so far. I am submitting this initial patch to get early review feedback on the core logic and fix the immediate crashes before tackling the more complex scenarios. For the simple case: ```llvm ; Case Handled by this PR so far (e.g., zeroinitializer is a constant) define float @test_stlf_integer(ptr %p, float %v) { store i32 0, ptr %p, align 4 %f = load float, ptr %p, align 4 ; ... } ``` Fixes: #151683	2026-01-30 10:11:59 +01:00
Craig Topper	80cbd1d696	[RISCV] Support ISD::CLMUL/CLMULH for i64 scalable vectors with Zvbc. (#178340 ) We also get some i32->i64 promotion for CLMULH. The DAGCombiner change is to prevent an infinite loop from that. Test file was rewritten to cover all types and split between clmul and clmulh. I added a couple masked tests to show that VectorPeephole works. The test outputs were already large so I didn't want to add more than a couple.	2026-01-29 13:17:03 -08:00
David Sherwood	73c7c562dd	[LLVM][DAGCombiner] Look through freeze when combining extensions of loads (#175022 ) Following on from https://github.com/llvm/llvm-project/pull/172484 I have added support to tryToFoldExtOfLoad for looking through freezes, in order to catch more cases of extending loads. This type of code is sometimes seen being generated by the loop vectoriser. For now I've limited this to cases where the load is only used by the freeze, since otherwise it leads to worse code in some X86 tests.	2026-01-29 12:01:43 +00:00
Simon Pilgrim	9aec188b77	[DAG] SDPatternMatch - allow m_BinOp / m_c_BinOp to take an optional SDNodeFlags required for matching (#178435 ) BinaryOpc_match is already wired up for this - but allow us to use m_BinOp/m_c_BinOp with the required flags directly Updated the foldShiftToAvg folds to make use of this	2026-01-28 18:50:42 +00:00
Matt Arsenault	544c300f43	DAG: Use poison instead of undef in some vector combines (#177612 ) Use poison for the unused or out of bounds vector components.	2026-01-25 18:52:51 +01:00
Stefan Weigl-Bosker	2370bf206d	[DAG] Extend MinMax matchers to detect flippable sign (#177504 ) Fixes #174328	2026-01-25 15:35:57 +00:00
Matt Arsenault	c928d7903f	DAG: Use correct shift type for big endian store forwarding case (#177752 ) Theoretically the shift amount type could differ, it just happens none of the big endian targets do this.	2026-01-24 11:21:35 +00:00
Vishruth Thimmaiah	7a10fc8d54	[DAG] Add basic folds for CLMUL nodes (#176961 ) Closes #176783 Adds support for folding `ISD::CLMUL`/`CLMULH`/`CLMULR` nodes.	2026-01-23 13:11:07 +00:00
DaKnig	4016592bf7	[SDAG] (abd? (?ext x), c) -> (zext (abd? x, c)) (#176366 ) just the existing pattern, with constants	2026-01-22 16:28:12 +00:00
Simon Pilgrim	39028cc55a	[DAG] foldAddToAvg - add patterns to form avgceil(A, B) from ((A >> 1) + (B >> 1)) + ((A | B) & 1) (#174719 ) Alive2 proof: https://alive2.llvm.org/ce/z/mcatXZ I've raised #174718 as supposedly PPC has AVGCEIL instructions, but the patterns in PPCInstrAltivec.td are either incorrect or the instructions don't account for overflow. Fixes #128377	2026-01-20 12:12:51 +00:00
Luke Lau	45a0f9cc1b	[DAGCombiner] Fold min/max vscale, C -> C (#174708 ) This fixes a regression in #174693 caused by using ISD::UMIN to clamp offset into a vector address. For (umin x, y) if we know the minimum value of x is >= the maximum value of y, then y will always be the smaller operand and we can fold to y. We can do similar folds for umax, smin and smax too. In practice the only time we get a useful ConstantRange is with VScale and a constant RHS, so this patch limits it to this case. I tried generalizing it with computeKnownBits but it didn't have any effect on existing tests.	2026-01-20 07:57:58 +00:00
fbrv	dd29183f33	[DAG] Allow MIN/MAX signedness flip when operands are known-negative (#174469 ) Extend the existing DAGCombine logic in visitIMINMAX so that signed and unsigned MIN/MAX can be flipped not only when both operands are known non-negative but also when both operands are known negative. This replaces the old SignBitIsZero checks with computeKnownBits and explicit tests for non-negative or negative operands while keeping all existing legality and saturation gating in place. Add regression tests to cover both the known-negative case and the known-non-negative case. Fixes #174325	2026-01-16 18:48:54 +00:00
Matt Arsenault	01e6245af4	DAG: Avoid querying libcall info from TargetLowering (#176268 ) Libcall lowering decisions should come from the LibcallLoweringInfo analysis. Query this through the DAG, so eventually the source can be the analysis. For the moment this is just a wrapper around the TargetLowering information.	2026-01-16 09:02:49 +00:00
Ramkumar Ramachandra	d69335bac9	[LLVM] Clean up code using [not_]equal_to (NFC) (#175824 ) Use llvm::[not_]equal_to landed in d2a521750 ([ADT] Introduce bind_{front,back}, [not_]equal_to, #175056) across LLVM for cleaner code.	2026-01-13 21:19:39 +00:00
actink	ad3e3d809e	[SDAG] fix miss opt: shl nuw + zext adds unnecessary masking (#172046 ) close: #171750	2026-01-13 22:03:47 +08:00
DaKnig	e51f25a3ca	[SDAG] Combine select into ABD?, for const (#173581 ) (select (setcc ...) (sub a, b) (sub b, a)) When b is const, the `sub a, b` becomes `add a, -b` which we take care of in this patch with the m_SpecificNeg() matcher.	2026-01-12 10:12:10 +00:00
Ramkumar Ramachandra	9e5e267a03	[ISel] Introduce llvm.clmul intrinsic (#168731 ) In line with a std proposal to introduce the llvm.clmul family of intrinsics corresponding to carry-less multiply operations. This work builds upon 727ee7e ([APInt] Introduce carry-less multiply primitives), and follow-up patches will introduce custom-lowering on supported targets, replacing target-specific clmul intrinsics. Testing is done on the RISC-V target, which should be sufficient to prove that the intrinsics work, since no RISC-V specific lowering has been added. Ref: https://isocpp.org/files/papers/P3642R3.html Co-authored-by: Craig Topper <craig.topper@sifive.com>	2026-01-05 20:24:06 +00:00
Craig Topper	1b43f5cec6	[RISCV][SelectionDAG] Add an ISD::CTLS node for count leading redundant sign bits. Use it to select CLS(W). (#173417 ) The RISC-V P extension adds an instruction equivalent to __builtin_clrsb. AArch64 has a similar instruction that we currently fail to select when using the builtin. This patch adds a combine based on the canonical version of the pattern emitted by clang for the builtin, (add (ctlz (xor x, (sra x, bw-1))), -1). I'm starting the combine at the ctlz because the outer add can easily be combined into other nodes obscuring the full pattern. So we generate (add (ctls x), 1) and hope the add will be combined away. I've also added a combine for the pattern AArch64 recognizes (ctlz_zero_undef (or (shl (xor x, (sra x, bw-1)), 1), 1)). I've only enabled the combines when the target has a Legal or Custom action for the operation, taking into account type promotion. We can relax this in the future by adding a default expansion to LegalizeDAG and adding more type legalization rules.	2026-01-04 18:00:00 -08:00
Islam Imad	7ceecfad40	[CodeGen] Fix EVT::changeVectorElementType assertion on simple-to-extended fallback (#173413 ) Fixes #171608	2025-12-28 18:51:18 +00:00
Guy David	ec1a65ff61	[DAGCombiner] Relax nsz constraint with fp->int->fp optimizations (#164503 ) `NoSignedZerosFPMath` isn't a hard requirement and in some contexts we can still apply the truncation without worrying. For example, in cases where the users of this sequence are overwriting the sign-bit (fabs) or simply ignoring it (fcmp). I think the same logic can be applied elsewhere for other DAG optimizations.	2025-12-23 23:11:03 +02:00
Guy David	1cb99036b4	[DAGCombiner] Extend fp->int->fp optimizations to include clamping (#164502 ) Extends the original pattern to allow min/max operations between the conversions.	2025-12-23 20:26:40 +02:00
Paul Walker	c4088b27ea	[LLVM][DAGCombiner] Look through freeze when combining extensions of extending-masked-loads. (#172484 ) Extensions in this context mean post legalisation extensions (i.e. and, sext-in-reg) because that's the point the freeze blocks the existing combine.	2025-12-23 11:42:49 +00:00
natanelh-mobileye	fa78d6a5f1	[SDAG] Shrink (abd? (?ext x) (?ext y)) (#171865 ) Alive2 test: https://alive2.llvm.org/ce/z/maryYU Lit test before change: https://godbolt.org/z/nEKWdPbMv Fixes #171640	2025-12-17 16:30:52 +00:00
Simon Pilgrim	a68fde5780	[DAG] foldAddToAvg - optimize nested m_Reassociatable matchers (#171681 ) The use of nested m_Reassociatable matchers by #169644 can result in high compile times as the inner m_Reassociatable call is being repeated a lot while the outer call is trying to match. Place the inner m_ReassociatableAnd at the beginning of the pattern so it is not repeatedly matched in recursion.	2025-12-15 13:41:02 +00:00
Shubham Sandeep Rastogi	16e6055273	Revert "[SelectionDAG] Salvage debuginfo when combining load and sext… (#171745 ) … instrs. (#169779)" This reverts commit 2b958b9ee24b8ea36dcc777b2d1bcfb66c4972b6. I might have broken the sanitizer-x86_64-linux bot /home/b/sanitizer-x86_64-linux/build/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_procmaps_linux.cpp clang++: /home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:248: const T &llvm::ArrayRef<llvm::DbgValueLocEntry>::operator[](size_t) const [T = llvm::DbgValueLocEntry]: Assertion `Index < Length && "Invalid index!"' failed.	2025-12-10 16:49:59 -08:00
Shubham Sandeep Rastogi	2b958b9ee2	[SelectionDAG] Salvage debuginfo when combining load and sext instrs. (#169779 ) SelectionDAG uses the DAGCombiner to fold a load followed by a sext to a load and sext instruction. For example, in x86 we will see that ``` %1 = load i32, ptr @GlobArr #dbg_value(i32 %1, !43, !DIExpression(), !52) %2 = sext i32 %1 to i64, !dbg !53 ``` is converted to: ``` %0:gr64_nosp = MOVSX64rm32 $rip, 1, $noreg, @GlobArr, $noreg, debug-instr-number 1, debug-location !51 DBG_VALUE $noreg, $noreg, !"Idx", !DIExpression(), debug-location !52 ``` The `DBG_VALUE` needs to be transferred correctly to the new combined instruction, and it needs to be appended with a `DIExpression` which contains a `DW_OP_LLVM_fragment`, describing that the lower bits of the virtual register contain the value. This patch fixes the above described problem.	2025-12-10 14:43:38 -08:00
Simon Pilgrim	804e768bda	[DAG] Recognise AVGFLOOR (((A >> 1) + (B >> 1)) + (A & B & 1)) patterns (#169644 ) Recognise 'LSB' style AVGFLOOR patterns. Alive2: [https://alive2.llvm.org/ce/z/nfSSk_](https://alive2.llvm.org/ce/z/nfSSk_) Fixes #53648	2025-12-10 08:44:11 +00:00
Guy David	29611f4cbe	[DAGCombiner] Relax nsz constraint for FP optimizations (#165011 ) Some floating-point optimizations don't trigger because they can produce incorrect results around signed zeros, and rely on the existence of the nsz flag which commonly appears when fast-math is enabled. However, this flag is not a hard requirement when all of the users of the combined value are either guaranteed to overwrite the sign-bit or simply ignore it (comparisons, etc.). The optimizations affected: - fadd x, +0.0 -> x - fsub x, -0.0 -> x - fsub +0.0, x -> fneg x - fdiv(x, sqrt(x)) -> sqrt(x) - frem lowering with power-of-2 divisors	2025-12-09 12:07:46 +02:00
David Green	0959bb3001	[DAG] Generate UMULH/SMULH with wider vector types (#170283 ) The existing code for generating umulh/smulh was checking that the getTypeToTransformTo was a LegalOrCustom operation. This only takes a single legalization step though, so if v4i32 was legal, a v8i32 would be transformed but a v16i32 would not. This patch introduces a getLegalTypeToTransformTo that performs getTypeToTransformTo until a legal type is reached. The umulh/smulh code can then use it to check if the final resultant type will be legal.	2025-12-08 19:35:32 +00:00
Hongyu Chen	11866c499b	[DAGCombiner] Don't peek through bitcast when checking isMulAddWithConstProfitable (#171056 ) Fixes https://github.com/llvm/llvm-project/issues/171035 Peeking through bitcast may cause type mismatch between `AddNode` and `ConstNode` in `isMulAddWithConstProfitable`.	2025-12-08 22:09:12 +08:00
Valeriy Savchenko	5c6918f24d	[DAGCombiner] Allow promoted constants in MULHU by power-of-2 -> SRL transform (#170562 ) Type legalization can promote constant operands. The MULHU optimization `mulhu x, (1 << c) -> x >> (bitwidth - c)` was failing when constants were promoted because: 1. `isConstantOrConstantVector` check rejected promoted constants 2. `BuildLogBase2` -> `takeInexpensiveLog2` -> `matchUnaryPredicate` rejected promoted constants This fixes both by adding `AllowTruncation=true`, following the pattern from the recent UDIV fix (#169491).	2025-12-04 13:32:19 +00:00
Valeriy Savchenko	8e53a88de3	[DAGCombiner] Handle type-promoted constants in SDIV lowering (#169924 ) Builds up on the solution proposed for #169491 and applies it for SDIV as well.	2025-12-04 11:33:19 +00:00
Valeriy Savchenko	c5fa1f8c4b	[DAGCombiner] Handle type-promoted constants in UDIV lowering (#169491 )	2025-12-03 19:34:21 +00:00
Matt Arsenault	cdb501064f	DAG: Avoid more uses of getLibcallName (#170402 )	2025-12-03 13:01:04 -05:00
Lewis Crawford	ea3fdc5972	Avoid maxnum(sNaN, x) optimizations / folds (#170181 ) The behaviour of constant-folding `maxnum(sNaN, x)` and `minnum(sNaN, x)` has become controversial, and there are ongoing discussions about which behaviour we want to specify in the LLVM IR LangRef. See: - https://github.com/llvm/llvm-project/issues/170082 - https://github.com/llvm/llvm-project/pull/168838 - https://github.com/llvm/llvm-project/pull/138451 - https://github.com/llvm/llvm-project/pull/170067 - https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006 This patch removes optimizations and constant-folding support for `maxnum(sNaN, x)` but keeps it folded/optimized for `qNaN`. This should allow for some more flexibility so the implementation can conform to either the old or new version of the semantics specified without any changes. As far as I am aware, optimizations involving constant `sNaN` should generally be edge-cases that rarely occur, so there should hopefully be very little real-world performance impact from disabling these optimizations.	2025-12-02 12:43:03 +00:00
Simon Pilgrim	38678a91f3	[DAG] getCarry - always succeed if we encounter a i1 type during trunc/ext peeling (#169777 ) If we are force reconstructing a carry from a raw MVT::i1 type, make sure we don't miss any cases while peeling through trunc/ext chains - check for i1 types at the start of the while loop Fixes #169691	2025-11-30 18:26:24 +00:00
Hongyu Chen	3fec26e329	[DAGCombiner] Don't optimize insert_vector_elt into shuffle if implicit truncation exists (#169022 ) Fixes #169017	2025-11-22 03:33:53 +08:00
Craig Topper	01e5e4fd00	[DAGCombiner] Remove unneeded m_BitReverse from visitBITREVERSE. NFC (#168918 ) We already know we're looking at BITREVERSE, we can match on the source operand.	2025-11-20 18:20:47 +00:00
Matt Arsenault	0e1cb2de90	Reapply "DAG: Allow select ptr combine for non-0 address spaces" (#168292 ) (#168786 ) This reverts commit 6d5f87fc4284c4c22512778afaf7f2ba9326ba7b. Previously this failed due to treating the unknown MachineMemOperand value as known uniform.	2025-11-20 12:13:46 -05:00
Sander de Smalen	f369a53d82	[DAGCombiner] Fold select into partial.reduce.add operands. (#167857 ) This generates more optimal codegen when using partial reductions with predication. ``` partial_reduce_mla(acc, sel(p, mul(ext(a), ext(b)), splat(0)), splat(1)) -> partial_reduce_mla(acc, sel(p, a, splat(0)), b) partial.reduce.mla(acc, sel(p, ext(op), splat(0)), splat(1)) -> partial.reduce.*mla(acc, sel(p, op, splat(0)), splat(trunc(1))) ```	2025-11-18 09:49:42 +00:00
ronlieb	6d5f87fc42	Revert "DAG: Allow select ptr combine for non-0 address spaces" (#168292 ) Reverts llvm/llvm-project#167909	2025-11-16 18:35:51 -05:00


4205 Commits
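
Several of the commits above rest on small integer identities: the 'LSB' style AVGFLOOR and AVGCEIL patterns (#169644, #174719) and the `mulhu x, (1 << c) -> x >> (bitwidth - c)` transform (#170562). The sketch below is not taken from LLVM; it is a brute-force check of those identities for unsigned 8-bit values only, and every helper name in it (`avgceil_lsb`, `avgfloor_lsb`, `mulhu`, `check_all`) is hypothetical. The commits themselves cite Alive2 proofs, which remain the authoritative argument.

```python
BITS = 8                 # assumed bit width for the exhaustive check
MASK = (1 << BITS) - 1   # largest unsigned value of that width


def avgceilu(a, b):
    # Reference: rounded-up unsigned average, computed in unbounded precision.
    return (a + b + 1) >> 1


def avgflooru(a, b):
    # Reference: rounded-down unsigned average, computed in unbounded precision.
    return (a + b) >> 1


def avgceil_lsb(a, b):
    # Pattern from the commit log: ((A >> 1) + (B >> 1)) + ((A | B) & 1).
    return ((a >> 1) + (b >> 1)) + ((a | b) & 1)


def avgfloor_lsb(a, b):
    # Pattern from the commit log: ((A >> 1) + (B >> 1)) + (A & B & 1).
    return ((a >> 1) + (b >> 1)) + (a & b & 1)


def mulhu(x, y):
    # High BITS bits of the 2*BITS-bit unsigned product.
    return ((x * y) >> BITS) & MASK


def check_all():
    # Compare each pattern with its reference over every unsigned input pair.
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            assert avgceil_lsb(a, b) == avgceilu(a, b)
            assert avgfloor_lsb(a, b) == avgflooru(a, b)
            # The shifted form never needs more than BITS bits.
            assert avgceil_lsb(a, b) <= MASK
    # mulhu by a power of two equals a logical right shift (c = 0 is skipped,
    # since a shift by the full bit width is not a valid shift amount).
    for x in range(1 << BITS):
        for c in range(1, BITS):
            assert mulhu(x, 1 << c) == x >> (BITS - c)
    return True


if __name__ == "__main__":
    print(check_all())
```

The shifted forms halve each operand before adding, so the sum cannot overflow the original width; the final term restores the bit lost to the two shifts. Signed variants are not modelled here.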