llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	6832709dc0	[DAG] SDPatternMatch - rename m_Opc -> m_SpecificOpc (#190215 ) Matches the naming convention for other m_Specific* matchers, and frees up the m_Opc() matcher for future use in #84940 to allow us to capture the opcode of an unknown binop. Moving to m_SpecificOpc does mess up the formatting in a few places; I've tried to refactor to use the m_Value(SDValue, ....) matcher where I can to retrieve some whitespace	2026-04-03 18:03:00 +00:00
Simon Pilgrim	5674755cb6	[DAG] visitMUL - cleanup pattern matchers to use m_Shl and (commutative) m_Mul directly (#190339 ) Based on feedback on #190215	2026-04-03 13:21:51 +00:00
DaKnig	d6b8163f3f	Retry "[SDAG] (abs (add nsw a, -b)) -> (abds a, b) (#175801 )" (#186659 ) A better version of #175801 . see that for more info. Fixes #185467 The original patch was checking the correctness of the transformation based on the original Op1 , which was then negated (in the case of IsAdd). This patch fixes that issue by inverting the sign bit in that case. Also pushed a slight nfc there to simplify the code and remove some duplication. alive2 proofs: abds: https://alive2.llvm.org/ce/z/oJQPss abdu: https://alive2.llvm.org/ce/z/HfPF5q Note that the regression test is not (wrongly) affected anymore by the patch (as it did before)	2026-04-01 13:37:29 +00:00
Simon Pilgrim	9a33125e42	[DAG] Add basic ISD::IS_FPCLASS constant/identity folds (#189944 ) Attempts to match middle-end implementation in InstructionSimplify/foldIntrinsicIsFPClass Fixes #189919	2026-04-01 13:06:27 +00:00
natanelh-mobileye	46dd9d6f52	[SDAG][abd] Combine abd of small types (#181538 ) It is beneficial to combine abd of illegal, small types (types that get promoted to wider scalar size).	2026-03-31 13:40:51 +00:00
Demetrius Kanios	96bd7b6e15	[CodeGen] Add additional params to `TargetLoweringBase::getTruncStoreAction` (#187422 ) The truncating store analogue of #181104. Adds `Alignment` and `AddrSpace` parameters to `TargetLoweringBase::getTruncStoreAction` and dependents, and introduces a `getCustomTruncStoreAction` hook for targets to customize legalization behavior using this new information. This change is fully backwards compatible from the target's point of view, with `setTruncStoreAction` having identical functionality. The change is purely additive.	2026-03-30 16:52:45 -07:00
Alexis Engelke	bbef10d9f1	[CodeGen][NFC] Compute MaximumLegalStoreInBits just once (#189355 ) Instead of iterating over all value types per basic block, pre-compute the TLI-specific value once when constructing the TLI.	2026-03-30 18:44:18 +02:00
Jim Lin	2b41985405	[DAG] Fix incorrect ForSigned handling in computeConstantRange calls (#188889 ) Fix two places where ForSigned was incorrectly passed to computeConstantRange, causing wrong signed/unsigned range computation. In computeConstantRangeIncludingKnownBits (DemandedElts overload), the call omitted ForSigned, so Depth (unsigned) was implicitly converted to bool for the ForSigned parameter. Introduced in a6a66a4e6915. In visitIMINMAX, the call always passed ForSigned=false, even when folding SMAX/SMIN which query signed bounds from the resulting range.	2026-03-30 10:30:19 +00:00
Simon Pilgrim	207598a827	[DAG] Add command line option and TLI hook to enable DAG topological sorting (#188636 ) The very first step towards #83422 - which will move DAG combines to be processed in topological order. There is a lot of churn on existing tests that need to be addressed before this can be switched on globally, this patch gives the ability to enable it both on a per-target basis, and via a command line option to assist with testing and triage. At the moment I'm focusing on addressing the x86 regressions (example in the patch's basic test coverage) as that's the target I'm most familiar with and will help with many other targets as well, but there might be other/simpler targets that would benefit from earlier handling.	2026-03-27 07:40:53 +00:00
Craig Topper	0ebef5e5e2	[DAGCombine] Enable div by constant optimization for odd sized vectors before type legalization. (#188313 ) If we are going to legalize to a vector with the same element type and mulh or mul_lohi are supported, allow the optimization before type legalization. RISC-V will widen vectors using vp.udiv/sdiv that doesn't support division by constant optimization. In addition, type legalization will create a build_vector with undef elements making it hard to match after type legalization. Other targets may need to widen by a combination of vector and scalar divisions to avoid traps if we widen a vector with garbage. I had to enable the MULHU->SRL DAG combine before type legalization to prevent regressions. After type legalization, the multiply constant build_vector will have undef elements and the combine won't trigger.	2026-03-26 09:16:46 -07:00
Neil Phan	a6a66a4e69	[DAG] Define computeConstantRange for VSCALE folding (#176027 ) Resolves #175150 Defines computeConstantRange and computeConstantRangeIncludingKnownBits in the SelectionDAG. Currently only handles `ISD::VSCALE` operation related to #174708. Test cases were constructed to test varying VSCALE ranges on AArch64. Further testing can be implemented as needed by review.	2026-03-25 20:32:09 +00:00
Nikita Popov	f064a9979f	[DAGCombine] Optimize away cond ? 1 : 0 post-legalization (#186771 ) Selects of the form `cond ? 1 : 0` are created during unrolling of setcc+vselect. Currently these are not optimized away post-legalization even if fully redundant. Having these extra selects sitting between things can prevent other folds from applying. Enabling this requires some mitigations in the ARM backend, in particular in the interaction with MVE support. There's two changes here: * Form CSINV/CSNEG/CSINC from CMOV, rather than only creating it during SELECT_CC lowering. (After this change, the lowering in SELECT_CC can be dropped without test changes, let me know if I should do that.) * Support pushing negations through CMOV in more cases, in particular if the operands are constant or the negation can be handled by flipping lshr/ashr. Additionally, in the X86 backend, try to simplify CMOV to SETCC if only the low bit is demanded.	2026-03-20 16:23:18 +01:00
Paul Walker	7663802125	[LLVM][DAGCombiner] Limit extract_subvec(extract_subvec()) combine to vectors of the same type. (#187334 ) The index operand of ISD::EXTRACT_SUBVECTOR is implicitly scaled by vscale, which is effectively always one for fixed-length vectors. When combining nested extracts we must ensure all use the same implicit scaling otherwise the transform is not equivalent. Fixes https://github.com/llvm/llvm-project/issues/186563	2026-03-19 11:14:30 +00:00
Craig Topper	9dd2e3792a	[DAGCombiner] Move the XORHandle in rebuildSetCC inside the while loop. (#187189 ) If N was changed on the previous loop iteration, we need the handle to point at the new N. Fixes #186969.	2026-03-18 09:30:05 -07:00
Demetrius Kanios	351501799a	[CodeGen] Improve `getLoadExtAction` and friends (#181104 ) Alternative approach to the same goals as #162407 This takes `TargetLoweringBase::getLoadExtAction`, renames it to `TargetLoweringBase::getLoadAction`, merges `getAtomicLoadExtAction` into it, and adds more inputs for relevant information (alignment, address space). The `isLoadExtLegal[OrCustom]` helpers are also modified in a matching manner. This is fully backwards compatible, with the existing `setLoadExtAction` working as before. But this allows targets to override a new hook to allow the query to make more use of the information. The hook `getCustomLoadAction` is called with all the parameters whenever the table lookup yields `LegalizeAction::Custom`, and can return any other action it wants.	2026-03-17 23:40:19 -07:00
Iasonaskrpr	b44434474e	Improved ISD::SRL handling in isKnownToBeAPowerOfTwo (#182562 ) Fixes #181651 Added a DemandedElts argument to the isConstOrConstSplat and isKnownToBeAPowerOfTwo calls, and OrZero || isKnownNeverZero(Val, Depth) is now checked before isKnownToBeAPowerOfTwo. Also added unit tests.	2026-03-14 18:49:08 +00:00
Gergo Stomfai	0fb8f7f9c3	[DAG] Fold away identity FSHL and FSHR patterns (#185667 ) Fold away identity FSHL and FSHR patterns Came up in #185175, this seems to be the cleanest way to get rid of this pattern Alive2 proofs: `fshl(lshr(x, amnt), shl(c, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/AEzthY `fshl(lshr(x, amnt), fshl(x, _, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/oDpaqF `fshl(lshr(x, amnt), fshr(x, _, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/aCxQch `fshl(fshr(_, x, amnt), shl(c, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/89NQME `fshl(fshr(_, x, amnt), fshl(x, _, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/KdR3Mp `fshl(fshr(_, x, amnt), fshr(x, _, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/2Gkc7m `fshl(fshl(_, x, BW - amnt), shl(c, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/LNjr_R `fshl(fshl(_, x, BW - amnt), fshl(x, _, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/cwGjhL `fshl(fshl(_, x, BW - amnt), fshr(x, _, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/UChZW4 `fshr(lshr(x, BW - amnt), shl(c, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/uiSBEQ `fshr(lshr(x, BW - amnt), fshl(x, _, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/11pXpJ `fshr(lshr(x, BW - amnt), fshr(x, _, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/7mvxH7 `fshr(fshr(_, x, BW - amnt), shl(c, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/ybswip `fshr(fshr(_, x, BW - amnt), fshl(x, _, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/fNUQQv `fshr(fshr(_, x, BW - amnt), fshr(x, _, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/9bFnec `fshr(fshl(_, x, amnt), shl(c, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/vuYAYn `fshr(fshl(_, x, amnt), fshl(x, _, amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/kP94MG `fshr(fshl(_, x, amnt), fshr(x, _, BW - amnt), amnt) -> x`: ✓ https://alive2.llvm.org/ce/z/X8u__v	2026-03-12 11:21:29 +00:00
Simon Pilgrim	9a8147b553	Revert "[SDAG] (abs (add nsw a, -b)) -> (abds a, b)" (#17580 ) (#186068 ) Reverts llvm/llvm-project#175801 while #185467 miscompilation is being investigated	2026-03-12 10:35:24 +00:00
Craig Topper	53a2fd99aa	[DAGCombiner] Combine (fshl A, B, S) | (fshr C, D, BW-S) --> (fshl (A|C), (B|D), S) (#180889 ) This is similar to the FSHL/FSHR handling in hoistLogicOpWithSameOpcodeHands. Here the opcodes aren't exactly the same, but the operations are equivalent. Fixes regressions from #180888	2026-03-10 18:53:40 -07:00
YunQiang Su	757a0f85c8	SelectionDAG: Use ISD::AssertNoFPClass for Load with nofpclass metadata (#184952 ) 1. Use ISD::AssertNoFPClass if LoadInst has !nofpclass metadata. 2. Strip ISD::AssertNoFPClass when trying to combine load with bitcast in DAGCombiner::visitBITCAST.	2026-03-11 08:14:27 +08:00
Nikita Popov	f90b783c3f	[WebAssembly] Do not form minnum/maxnum (#184796 ) For wasm, forming minnum/maxnum style ISD nodes is non-profitable, because (in cases where any float min/max support exists at all), it has pmin/pmax instructions that correspond to the fcmp+select semantics, or relaxed_fmin/relaxed_fmax (for the nnan+nsz case) with even looser semantics. As such, return false from isProfitableToCombineMinNumMaxNum(), and also respect that hook in the SDAGBuilder.	2026-03-06 09:05:51 +01:00
Chaitanya Koparkar	a631af32c0	Add EVT::changeVectorElementCount and MVT::changeVectorElementCount (#182266 ) Fixes #174584.	2026-03-05 07:19:19 -05:00
Lewis Crawford	fa6eef8378	Revert "Avoid maxnum(sNaN, x) optimizations / folds (#170181 )" (#184125 ) This reverts commit ea3fdc5972db7f2d459e543307af05c357f2be26. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in #172012 .	2026-03-03 12:45:26 +00:00
David Sherwood	0b36d4265e	[AArch64] Add vector expansion support for ISD::FCBRT when using ArmPL (#183750 ) This patch teaches the backend how to lower the FCBRT DAG node to the vector math library function when using ArmPL. This is similar to what we already do for llvm.pow/FPOW, however the only way to expose this is via a DAG combine that converts FPOW(<2 x double> %x, <2 x double> <double 1.0/3.0, double 1.0/3.0>) into FCBRT(<2 x double> %x) when the appropriate fast math flags are present on the node. I've updated the DAG combine to handle vector types and only perform the transformation if there exists a vector library variant of cbrt.	2026-03-03 10:39:21 +00:00
fbrv	482a7718a8	[DAG] visitCLMUL - fold (clmul x, c_pow2) -> (shl x, log2(c_pow2)) (#184049 ) Implements the missing basic folds for `ISD::CLMUL` in `visitCLMUL`: - `(clmul x, 1)` → `x` - `(clmul x, c_pow2)` → `(shl x, log2(c_pow2))` These were previously only folded during scalar expansion (`expandCLMUL`), so targets with native CLMUL support (e.g. X86 pclmul, RISCV Zbc) never had the opportunity to simplify these cases. Fixes #181831	2026-03-02 10:52:51 +00:00
Simon Pilgrim	90b3fd7101	[DAG] Move (X +/- Y) & Y --> ~X & Y fold from visitAnd to SimplifyDemandedBits (#183270 ) Add DemandedElts handling to allow better vector support To prevent RISCV falling back to a mul call in known-never-zero.ll I've had to tweak the (mul step_vector(C0), C1) to (step_vector(C0 * C1)) fold to only occur if C0 is already non-power-of-2, C0 * C1 is a power-of-2 or the target has good mul support.	2026-02-26 11:26:00 +00:00
Aadarsh Keshri	6d7ec4b7c3	[DAG] Improved ISD::SHL handling in isKnownToBeAPowerOfTwo (#181882 ) Fixes #181650	2026-02-26 10:10:56 +00:00
Simon Pilgrim	efcf64e898	[DAG] visitOR - attempt to fold (or buildvector(), buildvector()) -> buildvector() (#183032 ) See if we can fold all elements of an OR of buildvectors: OR(-1,X) -> -1, OR(0,X) -> X, etc.	2026-02-25 10:03:15 +00:00
zGoldthorpe	61d40e22c4	[SDPatternMatch] Add `m_ConstInt` overloads with `uint64_t`/`int64_t` operands (#182615 ) Adds overloads ```cpp auto m_ConstInt(uint64_t &); auto m_ConstInt(int64_t &); ``` which behave analogously to `m_ConstInt(APInt &)`, but only match if the captured integer fits within 64 bits.	2026-02-24 10:07:06 -07:00
AbdallahRashed	188346d433	[DAGCombiner] Add legality check for CLMULR fold to prevent infinite loop (#182376 ) The bitreverse(clmul(bitreverse, bitreverse)) -> clmulr fold was missing a legality check, causing an infinite loop when CLMULR isn't supported on the target. Added the check to match other folds in visitBITREVERSE. Fixes #182270	2026-02-22 16:36:02 +00:00
Craig Topper	15430ba094	[DAGCombiner] Use APInt::isPower2() instead of popcount() == 1. NFC (#182600 )	2026-02-21 09:00:15 -08:00
Simon Pilgrim	f8f799c640	[DAG] Fold (X +/- Y) & Y --> ~X & Y when Y is a power of 2 (or zero). (#181677 ) Same as InstCombinerImpl::visitAnd To prevent RISCV falling back to a mul call in known-never-zero.ll I've had to tweak the (sub X, (vscale * C)) to (add X, (vscale * -C)) fold to not occur if C is power-of-2 and the target has poor mul support. Alive2: https://alive2.llvm.org/ce/z/Khvs5H	2026-02-18 12:19:21 +00:00
Craig Topper	03ad6549fc	[DAGCombiner] Combine (fshl A, X, Y) | (shl X, Y) --> fshl (A|X), X, Y (#180887 ) Similar for (fshr X, B, Y) | (srl X, Y) --> fshr X, (X|B), Y This is similar to the FSHL/FSHR handling in hoistLogicOpWithSameOpcodeHands but here we treat a shl/shr like a fshl/fshr with 0. The pattern doesn't require X to be the same in both sides, but that's what occurred in the case I was looking at so that's what is implemented. Alive2: https://alive2.llvm.org/ce/z/eUou-u	2026-02-17 09:00:45 -08:00
DaKnig	75aa83c0c0	[SDAG] foldSelectToABD - canonicalize compare of abd (#180952 )	2026-02-17 14:02:53 +00:00
Liao Chunyu	cfe1b46b46	[DAGCombiner] Fold trunc(build_vector(ext(x), ext(x)) -> build_vector(x,x) (#179857 ) The original implementation performed the transformation when isTruncateFree was true: truncate(build_vector(x, x)) -> build_vector(truncate(x), truncate(x)). In some cases, x comes from an ext, try to pre-truncate build_vectors source operands when the source operands of build_vectors comes from an ext. Testcase from: https://gcc.godbolt.org/z/bbxbYK7dh	2026-02-15 10:01:47 +08:00
陈子昂	6e23353c39	[DAGCombiner] Fix crash caused by illegal InterVT in ForwardStoreValueToDirectLoad (#181175 ) This patch fixes an assertion failure in ForwardStoreValueToDirectLoad during DAGCombine. The crash occurs when `STLF (Store-to-Load Forwarding)` creates an illegal intermediate bitcast type (e.g., `v128i1` when bridging a 128-bit store to a `<32 x i1>` load on X86). Since `v128i1` is not a legal mask type for the backend, it violates the expectations of the LegalizeDAG pass. The fix adds a `TLI.isTypeLegal(InterVT)` check to ensure that the intermediate type used for the transformation is supported by the target. Fixes #181130	2026-02-14 21:05:31 +08:00
Björn Pettersson	6420099bcc	[SelectionDAG] Make sure demanded lanes for AND/MUL-by-zero are frozen (#180727 ) DAGCombiner can fold a chain of INSERT_VECTOR_ELT into a vector AND/OR operation. This patch adds protection to avoid that we end up making the vector more poisonous by freezing the source vector when the elements that should be set to 0/-1 may be poison in the source vector. The patch also fixes a bug in SimplifyDemandedVectorElts for MUL/MULHU/MULHS/AND that could result in making the vector more poisonous. The problem was that we skipped demanding elements from Op0 that were known to be zero in Op1. But that could result in elements being simplified into poison when simplifying Op0, and then the result would be poison and not zero after the MUL/MULHU/MULHS/AND. The solution is to defensively make sure that we demand all the elements originally demanded also when simplifying Op0. These bugs were found when analysing the miscompiles in https://github.com/llvm/llvm-project/issues/179448 Main culprit in #179448 seems to have been the bug in DAGCombiner. The bug in SimplifyDemandedVectorElts surfaced when fixing the DAGCombiner, as that fix typically introduces the (AND (FREEZE x), y) pattern that wasn't handled correctly in SimplifyDemandedVectorElts. Also fixes #180409. Also fixes #176682.	2026-02-12 10:58:29 +01:00
陈子昂	6117bdd903	[DAGCombiner] Fix subvector extraction index for big-endian STLF (#180795 ) This PR fixes a big-endian regression in `ForwardStoreValueToDirectLoad` where the wrong subvector was being extracted. In big-endian, memory offset 0 corresponds to the high bits, so the extraction index needs to be adjusted. As suggested by @KennethHilmersson, calculate the extraction index as the difference between the number of elements in the intermediate vector and the load vector when in big-endian mode. Special thanks to Kenneth Hilmersson for providing the fix logic and the ARM regression test. https://github.com/llvm/llvm-project/pull/172523#issuecomment-3878065191 https://github.com/llvm/llvm-project/pull/172523#issuecomment-3879575092	2026-02-12 17:42:13 +08:00
Alexander Weinrauch	1e086d06e9	[DAGCombiner] Fix crash in reassociationCanBreakAddressingModePattern for multi-memop nodes (#180268 ) Two code paths in `reassociationCanBreakAddressingModePattern` were missing a `hasUniqueMemOperand()` guard before calling `getAddressSpace()`. Note that on `L1214` we already have the same guard in place. `getAddressSpace()` chains through `getPointerInfo()` to `getMemOperand()`, which asserts that the node has exactly one memory operand.	2026-02-10 13:53:36 -08:00
paperchalice	c53acf0443	[SelectionDAGBuilder] Remove NoNaNsFPMath uses (#169904 ) Replaced by checking fast-math flags or value tracking results.	2026-02-09 09:48:07 +08:00
David Sherwood	e958bcdd17	[DAGCombiner] Look through freeze for ext(freeze(extload(x))) (#178669 ) This patch fixes a regression introduced by PR #175022, where a freeze was introduced with the following transformation: ext(freeze(load(x))) -> freeze(extload(x)) If a new extend is introduced afterwards we then have ext(freeze(extload(x))) which doesn't get picked up by existing DAG combines due to the freeze getting in the way.	2026-02-06 15:50:17 +00:00
Steffen Larsen	5654ecd5dd	[DAGCombiner] Fix exact power-of-two signed division for large integers (#177340 ) Previously, the DAG combiner did not optimize exact signed division by a power-of-two constant divisor for integer types exceeding the size of division supported by the target architecture (e.g., i128 on x86-64). However, such an optimization was expected by the division expansion logic, leading to unsupported division operations making it to instruction selection. This commit addresses this issue by making an exception to the existing exclusion of signed division with the exact flag for the aforementioned operations. That is, the DAG combiner will now optimize exact signed division if the divisor is a power-of-two constant and the integer type exceeds the size of division supported by the target architecture. --------- Signed-off-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com>	2026-02-06 09:40:32 +01:00
Nicolai Hähnle	af836ff60c	[CodeGen] Add getTgtMemIntrinsic overload for multiple memory operands (NFC) (#175843 ) There are target intrinsics that logically require two MMOs, such as llvm.amdgcn.global.load.lds, which is a copy from global memory to LDS, so there's both a load and a store to different addresses. Add an overload of getTgtMemIntrinsic that produces intrinsic info in a vector, and implement it in terms of the existing (now protected) overload. GlobalISel and SelectionDAG paths are updated to support multiple MMOs. The main part of this change is supporting multiple MMOs in MemIntrinsicNodes. Converting the backends to using the new overload is a fairly mechanical step that is done in a separate change in the hope that that allows reducing merging pains during review and for downstreams. A later change will then enable using multiple MMOs in AMDGPU.	2026-02-02 21:58:42 +00:00
DaKnig	fbda30607c	[SDAG] (abs (add nsw a, -b)) -> (abds a, b) (#175801 ) This is beneficial for bv of constants. alive2: https://alive2.llvm.org/ce/z/e3GsWZ	2026-02-02 15:11:16 +00:00
Simon Pilgrim	a372152cb5	[DAG] visitVECTOR_SHUFFLE - ensure correct resno when folding shuffle(bop(shuffle(x,y),shuffle(z,w)) (#179124 ) TLI.isBinOp recognises some opcodes that have multiple results, including UADDO etc. In most cases we currently just bail if a binop has multiple results, but shuffle combining was missing the check and it's pretty trivial to add handling in this case. I've added add/sub-overflow opcodes to verifyNode to help catch these cases in the future - IIRC there was a plan to autogen these, but there isn't anything at the moment. Fixes #179112	2026-02-02 09:22:48 +00:00
Benjamin Maxwell	1818b23a99	[SDAG] Check for `nsz` in DAG.canIgnoreSignBitOfZero() (#178905 ) Follow up to #174423	2026-02-01 15:58:38 +00:00
陈子昂	a994198906	[DAG] Reland: Enable bitcast STLF for Constant/Undef (#178890 ) This is a reland of #172523. The original patch caused an assertion failure on RISC-V because it attempted to create a bitcast from an illegal type (i32 on RV64) during the post-type-legalization DAGCombine stage. Added a `TLI.isTypeLegal(Val.getValueType())` check to ensure we only proceed with the bitcast STLF optimization when the source value's type is legal for the target.	2026-01-30 18:21:32 +01:00
Alex Bradbury	41f453efe2	Revert "[DAG] Enable bitcast STLF for Constant/Undef" (#178872 ) Reverts llvm/llvm-project#172523 As explained in https://github.com/llvm/llvm-project/pull/172523#issuecomment-3823234270 (along with reproducer), this causes compiler crashes building llvm-test-suite for RVV targets.	2026-01-30 12:18:38 +00:00
陈子昂	d3c64633c3	[DAG] Enable bitcast STLF for Constant/Undef (#172523 ) This patch introduces support for Store-to-Load Forwarding (STLF) in `DAGCombiner::ForwardStoreValueToDirectLoad` when the store and load have different types but equal memory size (e.g., storing an `i32` then loading a `float` from the same location). ### What this patch does: Enables Optimization: It allows for the safe forwarding of the stored value as a Bitcast when the value is: * A Constant (`ConstantSDNode`, `ConstantFPSDNode`, `ConstantPoolSDNode`). * Undef. * And the memory sizes (`LdMemSize` == `StMemSize`) match. ### Scope and Next Steps: This patch only implements forwarding for constant and undef values that have the same memory size so far. I am submitting this initial patch to get early review feedback on the core logic and fix the immediate crashes before tackling the more complex scenarios. For the simple case: ```llvm ; Case Handled by this PR so far (e.g., zeroinitializer is a constant) define float @test_stlf_integer(ptr %p, float %v) { store i32 0, ptr %p, align 4 %f = load float, ptr %p, align 4 ; ... } ``` Fixes: #151683	2026-01-30 10:11:59 +01:00
Craig Topper	80cbd1d696	[RISCV] Support ISD::CLMUL/CLMULH for i64 scalable vectors with Zvbc. (#178340 ) We also get some i32->i64 promotion for CLMULH. The DAGCombiner change is to prevent an infinite loop from that. Test file was rewritten to cover all types and split between clmul and clmulh. I added a couple masked tests to show that VectorPeephole works. The test outputs were already large so I didn't want to add more than a couple.	2026-01-29 13:17:03 -08:00
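
Note on f8f799c640 and 90b3fd7101 above: the fold (X +/- Y) & Y --> ~X & Y is stated for Y being a power of 2 or zero. Adding or subtracting a single set bit flips exactly that bit of X (any carry or borrow only moves upwards), so masking with Y leaves the inverted bit. The following is a minimal standalone sketch of that identity, checked exhaustively over 8-bit values. It is not LLVM code and does not use the SelectionDAG API; the helper names isPow2OrZero, foldHolds and checkAndFold are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if Y is zero or has exactly one bit set.
static bool isPow2OrZero(uint8_t Y) { return (Y & (Y - 1)) == 0; }

// Hypothetical helper: true if both (X + Y) & Y and (X - Y) & Y equal
// ~X & Y for this particular 8-bit pair (arithmetic wraps modulo 256).
static bool foldHolds(uint8_t X, uint8_t Y) {
  uint8_t Expected = static_cast<uint8_t>(~X & Y);
  uint8_t Add = static_cast<uint8_t>((X + Y) & Y);
  uint8_t Sub = static_cast<uint8_t>((X - Y) & Y);
  return Add == Expected && Sub == Expected;
}

// Exhaustively checks the identity for every 8-bit X and every 8-bit Y
// that is a power of 2 or zero.
static bool checkAndFold() {
  for (unsigned Yi = 0; Yi < 256; ++Yi) {
    uint8_t Y = static_cast<uint8_t>(Yi);
    if (!isPow2OrZero(Y))
      continue;
    for (unsigned Xi = 0; Xi < 256; ++Xi)
      if (!foldHolds(static_cast<uint8_t>(Xi), Y))
        return false;
  }
  return true;
}
```

Calling checkAndFold() from a test driver should return true, while a Y with more than one set bit (for example X = 1, Y = 3) breaks the identity, which is why the precondition on Y matters. This only exercises 8-bit scalars; the Alive2 link in the commit message remains the authoritative proof.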


4244 Commits