4244 Commits

Author SHA1 Message Date
Simon Pilgrim
6832709dc0
[DAG] SDPatternMatch - rename m_Opc -> m_SpecificOpc (#190215)
Match naming convention for other m_Specific* matchers, and frees up the
m_Opc() matcher for future use in #84940 to allow us to capture the
opcode of an unknown binop

Moving to m_SpecificOpc does mess up the formatting in a few places,
I've tried to refactor to use the m_Value(SDValue, ....) matcher where I
can to retrieve some whitespace
2026-04-03 18:03:00 +00:00
Simon Pilgrim
5674755cb6
[DAG] visitMUL - cleanup pattern matchers to use m_Shl and (commutative) m_Mul directly (#190339)
Based on feedback on #190215
2026-04-03 13:21:51 +00:00
DaKnig
d6b8163f3f
Retry "[SDAG] (abs (add nsw a, -b)) -> (abds a, b) (#175801)" (#186659)
A better version of #175801. See that for more info.

Fixes #185467

The original patch was checking the correctness of the transformation
based on the original Op1 , which was then negated (in the case of
IsAdd). This patch fixes that issue by inverting the sign bit in that
case.

Also pushed a slight nfc there to simplify the code and remove some
duplication.

alive2 proofs:

abds: https://alive2.llvm.org/ce/z/oJQPss

abdu: https://alive2.llvm.org/ce/z/HfPF5q

Note that the regression test is no longer (wrongly) affected by the
patch (as it was before)
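
The fold can be sanity-checked outside LLVM with a brute-force sketch on 8-bit values. This is not the DAGCombiner code: `abds8` and `absOfAddNegMatchesAbds` are hypothetical names, and the check assumes that neither the negation of `b` nor the `nsw` add overflows (inputs that violate this are skipped, since the IR result would be poison or the negation would wrap).

```cpp
#include <cstdint>
#include <cstdlib>

// Reference semantics for a signed absolute difference on i8:
// |a - b| computed in a wider type, then truncated to 8 bits.
constexpr std::uint8_t abds8(std::int8_t a, std::int8_t b) {
  const int wide = a > b ? int(a) - int(b) : int(b) - int(a);
  return static_cast<std::uint8_t>(wide);
}

// Exhaustively compares abs(a + (-b)) with abds(a, b) on i8.
// Skips inputs where -b wraps or where the add would violate nsw.
bool absOfAddNegMatchesAbds() {
  for (int a = -128; a <= 127; ++a) {
    for (int b = -128; b <= 127; ++b) {
      if (b == -128)
        continue; // negating INT8_MIN wraps (assumption: excluded)
      const int sum = a + (-b);
      if (sum < -128 || sum > 127)
        continue; // nsw violated, result would be poison
      const auto viaAbs = static_cast<std::uint8_t>(std::abs(sum));
      if (viaAbs != abds8(static_cast<std::int8_t>(a),
                          static_cast<std::int8_t>(b)))
        return false;
    }
  }
  return true;
}
```

The skipped `b == -128` case is where the two sides can differ, which is why the sign of the negated operand matters for the legality check.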
2026-04-01 13:37:29 +00:00
Simon Pilgrim
9a33125e42
[DAG] Add basic ISD::IS_FPCLASS constant/identity folds (#189944)
Attempts to match middle-end implementation in InstructionSimplify/foldIntrinsicIsFPClass

Fixes #189919
2026-04-01 13:06:27 +00:00
natanelh-mobileye
46dd9d6f52
[SDAG][abd] Combine abd of small types (#181538)
It is beneficial to combine abd of illegal, small types (types that get promoted to wider scalar size).
2026-03-31 13:40:51 +00:00
Demetrius Kanios
96bd7b6e15
[CodeGen] Add additional params to TargetLoweringBase::getTruncStoreAction (#187422)
The truncating store analogue of #181104.

Adds `Alignment` and `AddrSpace` parameters to
`TargetLoweringBase::getTruncStoreAction` and dependents, and introduces
a `getCustomTruncStoreAction` hook for targets to customize legalization
behavior using this new information.

This change is fully backwards compatible from the target's point of
view, with `setTruncStoreAction` having identical functionality. The
change is purely additive.
2026-03-30 16:52:45 -07:00
Alexis Engelke
bbef10d9f1
[CodeGen][NFC] Compute MaximumLegalStoreInBits just once (#189355)
Instead of iterating over all value types per basic block, pre-compute
the TLI-specific value once when constructing the TLI.
2026-03-30 18:44:18 +02:00
Jim Lin
2b41985405
[DAG] Fix incorrect ForSigned handling in computeConstantRange calls (#188889)
Fix two places where ForSigned was incorrectly passed to
computeConstantRange, causing wrong signed/unsigned range computation.

In computeConstantRangeIncludingKnownBits (DemandedElts overload),
the call omitted ForSigned, so Depth (unsigned) was implicitly
converted to bool for the ForSigned parameter. Introduced in
a6a66a4e6915.

In visitIMINMAX, the call always passed ForSigned=false, even when
folding SMAX/SMIN which query signed bounds from the resulting range.
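
The first bug is an instance of a general C++ pitfall, sketched below with a hypothetical function. `describeQuery` is not the real `computeConstantRange` signature; it only assumes a `bool` parameter followed by a defaulted `unsigned` one, which is enough to reproduce the silent conversion.

```cpp
#include <string>

// Hypothetical stand-in for an API with a bool flag followed by a depth.
std::string describeQuery(bool ForSigned = false, unsigned Depth = 0) {
  return std::string(ForSigned ? "signed" : "unsigned") + "@" +
         std::to_string(Depth);
}

// Buggy call: the caller meant to pass Depth, but the single argument binds
// to ForSigned (any non-zero depth becomes true) and Depth stays at 0.
std::string buggyCall(unsigned Depth) { return describeQuery(Depth); }

// Fixed call: both arguments are passed explicitly.
std::string fixedCall(bool ForSigned, unsigned Depth) {
  return describeQuery(ForSigned, Depth);
}
```

With these definitions `buggyCall(3)` yields `"signed@0"`, whereas `fixedCall(false, 3)` yields `"unsigned@3"`.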
2026-03-30 10:30:19 +00:00
Simon Pilgrim
207598a827
[DAG] Add command line option and TLI hook to enable DAG topological sorting (#188636)
The very first step towards #83422 - which will move DAG combines to be
processed in topological order.

There is a lot of churn on existing tests that need to be addressed
before this can be switched on globally, this patch gives the ability to
enable it both on a per-target basis, and via a command line option to
assist with testing and triage.

At the moment I'm focusing on addressing the x86 regressions (example in
the patch's basic test coverage) as that's the target I'm most familiar
with and will help with many other targets as well, but there might be
other/simpler targets that would benefit from earlier handling.
2026-03-27 07:40:53 +00:00
Craig Topper
0ebef5e5e2
[DAGCombine] Enable div by constant optimization for odd sized vectors before type legalization. (#188313)
If we are going to legalize to a vector with the same element type
and mulh or mul_lohi are supported, allow the optimization before type
legalization.

RISC-V will widen vectors using vp.udiv/sdiv, which don't support the
division by constant optimization. In addition, type legalization will create
a build_vector with undef elements making it hard to match after type
legalization.

Other targets may need to widen by a combination of vector and scalar
divisions to avoid traps if we widen a vector with garbage.

I had to enable the MULHU->SRL DAG combine before type legalization to
prevent regressions. After type legalization, the multiply constant
build_vector will have undef elements and the combine won't trigger.
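
For readers unfamiliar with the optimization, here is a scalar sketch of division by a constant via a high multiply. It is not taken from the patch: `mulhu32`, `udivBy3`, and the checker are illustrative names, and the magic constant `0xAAAAAAAB` with a post-shift of 1 is the well-known choice for unsigned 32-bit division by 3.

```cpp
#include <cstdint>

// High 32 bits of the 64-bit product, i.e. what a MULHU node computes on i32.
constexpr std::uint32_t mulhu32(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>((std::uint64_t(a) * b) >> 32);
}

// x / 3 without a divide: multiply by ceil(2^33 / 3) and shift right by 33.
constexpr std::uint32_t udivBy3(std::uint32_t x) {
  return mulhu32(x, 0xAAAAAAABu) >> 1;
}

// Spot-checks the identity on a strided sample of the 32-bit range.
bool udivBy3MatchesDivision() {
  for (std::uint64_t x = 0; x <= 0xFFFFFFFFull; x += 65521) {
    const auto v = static_cast<std::uint32_t>(x);
    if (udivBy3(v) != v / 3)
      return false;
  }
  return udivBy3(0xFFFFFFFFu) == 0xFFFFFFFFu / 3;
}
```

In the vector case the multiplier becomes a constant build_vector, which is why undef lanes introduced by type legalization get in the way of the later MULHU combine.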
2026-03-26 09:16:46 -07:00
Neil Phan
a6a66a4e69
[DAG] Define computeConstantRange for VSCALE folding (#176027)
Resolves #175150 

Defines computeConstantRange and computeConstantRangeIncludingKnownBits
in the SelectionDAG. Currently only handles `ISD::VSCALE` operation
related to #174708.

Test cases were constructed to test varying VSCALE ranges on AArch64.
Further testing can be implemented as needed by review.
2026-03-25 20:32:09 +00:00
Nikita Popov
f064a9979f
[DAGCombine] Optimize away cond ? 1 : 0 post-legalization (#186771)
Selects of the form `cond ? 1 : 0` are created during unrolling of
setcc+vselect. Currently these are not optimized away post-legalization
even if fully redundant. Having these extra selects sitting between
things can prevent other folds from applying.

Enabling this requires some mitigations in the ARM backend, in
particular in the interaction with MVE support. There are two changes
here:

* Form CSINV/CSNEG/CSINC from CMOV, rather than only creating it during
SELECT_CC lowering. (After this change, the lowering in SELECT_CC can be
dropped without test changes, let me know if I should do that.)
* Support pushing negations through CMOV in more cases, in particular if
the operands are constant or the negation can be handled by flipping
lshr/ashr.

Additionally, in the X86 backend, try to simplify CMOV to SETCC if only the
low bit is demanded.
2026-03-20 16:23:18 +01:00
Paul Walker
7663802125
[LLVM][DAGCombiner] Limit extract_subvec(extract_subvec()) combine to vectors of the same type. (#187334)
The index operand of ISD::EXTRACT_SUBVECTOR is implicitly scaled by
vscale, which is effectively always one for fixed-length vectors. When
combining nested extracts we must ensure all use the same implicit
scaling otherwise the transform is not equivalent.

Fixes https://github.com/llvm/llvm-project/issues/186563
2026-03-19 11:14:30 +00:00
Craig Topper
9dd2e3792a
[DAGCombiner] Move the XORHandle in rebuildSetCC inside the while loop. (#187189)
If N was changed on the previous loop iteration, we need the handle to
point at the new N.

Fixes #186969.
2026-03-18 09:30:05 -07:00
Demetrius Kanios
351501799a
[CodeGen] Improve getLoadExtAction and friends (#181104)
Alternative approach to the same goals as #162407

This takes `TargetLoweringBase::getLoadExtAction`, renames it to
`TargetLoweringBase::getLoadAction`, merges `getAtomicLoadExtAction`
into it, and adds more inputs for relevant information (alignment,
address space).

The `isLoadExtLegal[OrCustom]` helpers are also modified in a matching
manner.

This is fully backwards compatible, with the existing `setLoadExtAction`
working as before. But this allows targets to override a new hook to
allow the query to make more use of the information. The hook
`getCustomLoadAction` is called with all the parameters whenever the
table lookup yields `LegalizeAction::Custom`, and can return any other
action it wants.
2026-03-17 23:40:19 -07:00
Iasonaskrpr
b44434474e
Improved ISD::SRL handling in isKnownToBeAPowerOfTwo (#182562)
Fixes #181651

Added a DemandedElts argument to the isConstOrConstSplat and
isKnownToBeAPowerOfTwo calls, and OrZero || isKnownNeverZero(Val, Depth) is
now checked before isKnownToBeAPowerOfTwo. Also added unit tests.
2026-03-14 18:49:08 +00:00
Gergo Stomfai
0fb8f7f9c3
[DAG] Fold away identity FSHL and FSHR patterns (#185667)
Fold away identity FSHL and FSHR patterns

Came up in #185175, this seems to be the cleanest way to get rid of this
pattern

Alive2 proofs:
`fshl(lshr(x, amnt), shl(c, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/AEzthY
`fshl(lshr(x, amnt), fshl(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/oDpaqF
`fshl(lshr(x, amnt), fshr(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/aCxQch
`fshl(fshr(_, x, amnt), shl(c, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/89NQME
`fshl(fshr(_, x, amnt), fshl(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/KdR3Mp
`fshl(fshr(_, x, amnt), fshr(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/2Gkc7m
`fshl(fshl(_, x, BW - amnt), shl(c, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/LNjr_R
`fshl(fshl(_, x, BW - amnt), fshl(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/cwGjhL
`fshl(fshl(_, x, BW - amnt), fshr(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/UChZW4
`fshr(lshr(x, BW - amnt), shl(c, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/uiSBEQ
`fshr(lshr(x, BW - amnt), fshl(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/11pXpJ
`fshr(lshr(x, BW - amnt), fshr(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/7mvxH7
`fshr(fshr(_, x, BW - amnt), shl(c, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/ybswip
`fshr(fshr(_, x, BW - amnt), fshl(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/fNUQQv
`fshr(fshr(_, x, BW - amnt), fshr(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/9bFnec
`fshr(fshl(_, x, amnt), shl(c, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/vuYAYn
`fshr(fshl(_, x, amnt), fshl(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/kP94MG
`fshr(fshl(_, x, amnt), fshr(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/X8u__v
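
Two of the listed identities can also be brute-forced on 8-bit values as a quick cross-check. This is only a sketch: `fshl8`, `fshr8`, and `identityFoldsHold` are hypothetical helpers, `BW` is fixed to 8, and the shift amount is assumed to satisfy `0 < amnt < BW` (the Alive2 links carry the exact preconditions).

```cpp
#include <cstdint>

constexpr unsigned BW = 8;

// fshl(a, b, s): concatenate a:b, shift left by s mod BW, keep the high half.
constexpr std::uint8_t fshl8(std::uint8_t a, std::uint8_t b, unsigned s) {
  s %= BW;
  return s == 0 ? a : static_cast<std::uint8_t>((a << s) | (b >> (BW - s)));
}

// fshr(a, b, s): concatenate a:b, shift right by s mod BW, keep the low half.
constexpr std::uint8_t fshr8(std::uint8_t a, std::uint8_t b, unsigned s) {
  s %= BW;
  return s == 0 ? b : static_cast<std::uint8_t>((a << (BW - s)) | (b >> s));
}

// Checks fshl(lshr(x, amnt), fshr(x, _, amnt), amnt) == x and
//        fshr(fshl(_, x, amnt), fshl(x, _, amnt), amnt) == x
// for every x, every "don't care" operand, and every 0 < amnt < BW.
bool identityFoldsHold() {
  for (unsigned amnt = 1; amnt < BW; ++amnt) {
    for (unsigned x = 0; x < 256; ++x) {
      for (unsigned junk = 0; junk < 256; ++junk) {
        const auto X = static_cast<std::uint8_t>(x);
        const auto J = static_cast<std::uint8_t>(junk);
        const auto NotJ = static_cast<std::uint8_t>(~junk);

        const auto Hi1 = static_cast<std::uint8_t>(X >> amnt);
        if (fshl8(Hi1, fshr8(X, J, amnt), amnt) != X)
          return false;

        if (fshr8(fshl8(J, X, amnt), fshl8(X, NotJ, amnt), amnt) != X)
          return false;
      }
    }
  }
  return true;
}
```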
2026-03-12 11:21:29 +00:00
Simon Pilgrim
9a8147b553
Revert "[SDAG] (abs (add nsw a, -b)) -> (abds a, b)" (#17580) (#186068)
Reverts llvm/llvm-project#175801 while #185467 miscompilation is being investigated
2026-03-12 10:35:24 +00:00
Craig Topper
53a2fd99aa
[DAGCombiner] Combine (fshl A, B, S) | (fshr C, D, BW-S) --> (fshl (A|C), (B|D), S) (#180889)
This is similar to the FSHL/FSHR handling in
hoistLogicOpWithSameOpcodeHands.
Here the opcodes aren't exactly the same, but the operations are
equivalent.

Fixes regressions from #180888
2026-03-10 18:53:40 -07:00
YunQiang Su
757a0f85c8
SelectionDAG: Use ISD::AssertNoFPClass for Load with nofpclass metadata (#184952)
1. Use ISD::AssertNoFPClass if LoadInst has !nofpclass metadata.
2. Strip ISD::AssertNoFPClass when trying to combine load with bitcast
    in DAGCombiner::visitBITCAST.
2026-03-11 08:14:27 +08:00
Nikita Popov
f90b783c3f
[WebAssembly] Do not form minnum/maxnum (#184796)
For wasm, forming minnum/maxnum style ISD nodes is non-profitable,
because (in cases where any float min/max support exists at all), it has
pmin/pmax instructions that correspond to the fcmp+select semantics, or
relaxed_fmin/relaxed_fmax (for the nnan+nsz case) with even looser
semantics.

As such, return false from isProfitableToCombineMinNumMaxNum(), and also
respect that hook in the SDAGBuilder.
2026-03-06 09:05:51 +01:00
Chaitanya Koparkar
a631af32c0
Add EVT::changeVectorElementCount and MVT::changeVectorElementCount (#182266)
Fixes #174584.
2026-03-05 07:19:19 -05:00
Lewis Crawford
fa6eef8378
Revert "Avoid maxnum(sNaN, x) optimizations / folds (#170181)" (#184125)
This reverts commit ea3fdc5972db7f2d459e543307af05c357f2be26.

Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel,
and SelectionDAG.

Re-enable optimizations that depend on maxnum/minnum sNaN semantics in
InstCombine and DAGCombiner.

Now that maxnum(x, sNaN) is specified to non-deterministically produce
either NaN or x, these constant-foldings and optimizations are now valid
again according to the newly clarified semantics in #172012.
2026-03-03 12:45:26 +00:00
David Sherwood
0b36d4265e
[AArch64] Add vector expansion support for ISD::FCBRT when using ArmPL (#183750)
This patch teaches the backend how to lower the FCBRT DAG node to the
vector math library function when using ArmPL. This is similar to what
we already do for llvm.pow/FPOW, however the only way to expose this is
via a DAG combine that converts

  FPOW(<2 x double> %x, <2 x double> <double 1.0/3.0, double 1.0/3.0>)

into

  FCBRT(<2 x double> %x)

when the appropriate fast math flags are present on the node. I've
updated the DAG combine to handle vector types and only perform the
transformation if there exists a vector library variant of cbrt.
2026-03-03 10:39:21 +00:00
fbrv
482a7718a8
[DAG] visitCLMUL - fold (clmul x, c_pow2) -> (shl x, log2(c_pow2)) (#184049)
Implements the missing basic folds for `ISD::CLMUL` in `visitCLMUL`:
 - `(clmul x, 1)` → `x` 
 - `(clmul x, c_pow2)` → `(shl x, log2(c_pow2))`

These were previously only folded during scalar expansion
(`expandCLMUL`), so targets with native CLMUL support (e.g. X86 pclmul,
RISCV Zbc) never had the opportunity to simplify these cases.

Fixes #181831
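
The two folds follow directly from the definition of carry-less multiplication, as the scalar sketch below shows. It uses a hypothetical 32-bit reference implementation (`clmul32`), not the `ISD::CLMUL` lowering itself.

```cpp
#include <cstdint>

// Carry-less multiply: XOR together a copy of `a` shifted left by i
// for every set bit i of `b`, keeping the low 32 bits.
constexpr std::uint32_t clmul32(std::uint32_t a, std::uint32_t b) {
  std::uint32_t result = 0;
  for (unsigned i = 0; i < 32; ++i)
    if ((b >> i) & 1u)
      result ^= a << i;
  return result;
}

// With a single set bit in the multiplier only one term survives,
// so clmul(x, 1 << k) is just x << k (and clmul(x, 1) is x).
bool clmulByPow2IsShift() {
  const std::uint32_t samples[] = {0u, 1u, 0xBu, 0x12345678u, 0xFFFFFFFFu};
  for (std::uint32_t x : samples)
    for (unsigned k = 0; k < 32; ++k)
      if (clmul32(x, 1u << k) != (x << k))
        return false;
  return true;
}
```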
2026-03-02 10:52:51 +00:00
Simon Pilgrim
90b3fd7101
[DAG] Move (X +/- Y) & Y --> ~X & Y fold from visitAnd to SimplifyDemandedBits (#183270)
Add DemandedElts handling to allow better vector support

To prevent RISCV falling back to a mul call in known-never-zero.ll I've
had to tweak the (mul step_vector(C0), C1) to (step_vector(C0 * C1))
fold to only occur if C0 is already non-power-of-2, C0 * C1 is a
power-of-2 or the target has good mul support.
2026-02-26 11:26:00 +00:00
Aadarsh Keshri
6d7ec4b7c3
[DAG] Improved ISD::SHL handling in isKnownToBeAPowerOfTwo (#181882)
Fixes #181650
2026-02-26 10:10:56 +00:00
Simon Pilgrim
efcf64e898
[DAG] visitOR - attempt to fold (or buildvector(), buildvector()) -> buildvector() (#183032)
See if we can fold all elements of an OR of buildvectors: OR(-1,X) ->
-1, OR(0,X) -> X, etc.
2026-02-25 10:03:15 +00:00
zGoldthorpe
61d40e22c4
[SDPatternMatch] Add m_ConstInt overloads with uint64_t/int64_t operands (#182615)
Adds overloads
```cpp
auto m_ConstInt(uint64_t &);
auto m_ConstInt(int64_t &);
```
which behave analogously to `m_ConstInt(APInt &)`, but only match if the
captured integer fits within 64 bits.
2026-02-24 10:07:06 -07:00
AbdallahRashed
188346d433
[DAGCombiner] Add legality check for CLMULR fold to prevent infinite loop (#182376)
The bitreverse(clmul(bitreverse, bitreverse)) -> clmulr fold was missing
a legality check, causing an infinite loop when CLMULR isn't supported
on the target. Added the check to match other folds in visitBITREVERSE.

Fixes #182270
2026-02-22 16:36:02 +00:00
Craig Topper
15430ba094
[DAGCombiner] Use APInt::isPower2() instead of popcount() == 1. NFC (#182600)
2026-02-21 09:00:15 -08:00
Simon Pilgrim
f8f799c640
[DAG] Fold (X +/- Y) & Y --> ~X & Y when Y is a power of 2 (or zero). (#181677)
Same as InstCombinerImpl::visitAnd

To prevent RISCV falling back to a mul call in known-never-zero.ll I've
had to tweak the (sub X, (vscale * C)) to (add X, (vscale * -C)) fold to
not occur if C is power-of-2 and the target has poor mul support.

Alive2: https://alive2.llvm.org/ce/z/Khvs5H
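
As a complement to the Alive2 proof, the identity can be checked exhaustively on 8-bit values. The helper names below are hypothetical. The intuition is that adding or subtracting a single bit `Y` always flips that bit of `X`, with carries or borrows only affecting higher bits.

```cpp
#include <cstdint>

// Left-hand sides of the fold, on i8 with wrapping arithmetic.
constexpr std::uint8_t addThenAnd(std::uint8_t X, std::uint8_t Y) {
  return static_cast<std::uint8_t>((X + Y) & Y);
}
constexpr std::uint8_t subThenAnd(std::uint8_t X, std::uint8_t Y) {
  return static_cast<std::uint8_t>((X - Y) & Y);
}

// Right-hand side of the fold.
constexpr std::uint8_t notThenAnd(std::uint8_t X, std::uint8_t Y) {
  return static_cast<std::uint8_t>(~X & Y);
}

// Checks every X against Y == 0 and every power-of-two Y.
bool addSubAndFoldHolds() {
  for (unsigned x = 0; x < 256; ++x) {
    for (int bit = -1; bit < 8; ++bit) {
      const auto X = static_cast<std::uint8_t>(x);
      const auto Y = static_cast<std::uint8_t>(bit < 0 ? 0u : 1u << bit);
      if (addThenAnd(X, Y) != notThenAnd(X, Y))
        return false;
      if (subThenAnd(X, Y) != notThenAnd(X, Y))
        return false;
    }
  }
  return true;
}
```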
2026-02-18 12:19:21 +00:00
Craig Topper
03ad6549fc
[DAGCombiner] Combine (fshl A, X, Y) | (shl X, Y) --> fshl (A|X), X, Y (#180887)
Similar for (fshr X, B, Y) | (srl X, Y) --> fshr X, (X|B), Y

This is similar to the FSHL/FSHR handling in
hoistLogicOpWithSameOpcodeHands but here we treat a shl/shr like a
fshl/fshr with 0.

The pattern doesn't require X to be the same in both sides, but that's
what occurred in the case I was looking at so that's what is
implemented.

Alive2: https://alive2.llvm.org/ce/z/eUou-u
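
A brute-force sketch on 8-bit values illustrates both the fshl and the fshr form. The helpers `fshl8`, `fshr8`, and `orOfFunnelAndPlainShiftFolds` are hypothetical, and the shift amount is assumed to be in range (`Y < 8`).

```cpp
#include <cstdint>

constexpr unsigned Bits = 8;

// Funnel shift left on i8: high half of (a:b) << (s mod Bits).
constexpr std::uint8_t fshl8(std::uint8_t a, std::uint8_t b, unsigned s) {
  s %= Bits;
  return s == 0 ? a : static_cast<std::uint8_t>((a << s) | (b >> (Bits - s)));
}

// Funnel shift right on i8: low half of (a:b) >> (s mod Bits).
constexpr std::uint8_t fshr8(std::uint8_t a, std::uint8_t b, unsigned s) {
  s %= Bits;
  return s == 0 ? b : static_cast<std::uint8_t>((a << (Bits - s)) | (b >> s));
}

// Checks (fshl A, X, Y) | (shl X, Y) == fshl (A|X), X, Y
// and    (fshr X, B, Y) | (srl X, Y) == fshr X, (X|B), Y
// for all 8-bit operands and all in-range shift amounts.
bool orOfFunnelAndPlainShiftFolds() {
  for (unsigned y = 0; y < Bits; ++y) {
    for (unsigned a = 0; a < 256; ++a) {
      for (unsigned x = 0; x < 256; ++x) {
        const auto A = static_cast<std::uint8_t>(a);
        const auto X = static_cast<std::uint8_t>(x);
        const auto Or = static_cast<std::uint8_t>(a | x);
        const auto Shl = static_cast<std::uint8_t>(X << y);
        const auto Srl = static_cast<std::uint8_t>(X >> y);

        if (static_cast<std::uint8_t>(fshl8(A, X, y) | Shl) != fshl8(Or, X, y))
          return false;
        if (static_cast<std::uint8_t>(fshr8(X, A, y) | Srl) != fshr8(X, Or, y))
          return false;
      }
    }
  }
  return true;
}
```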
2026-02-17 09:00:45 -08:00
DaKnig
75aa83c0c0
[SDAG] foldSelectToABD - canonicalize compare of abd (#180952)
2026-02-17 14:02:53 +00:00
Liao Chunyu
cfe1b46b46
[DAGCombiner] Fold trunc(build_vector(ext(x), ext(x)) -> build_vector(x,x) (#179857)
The original implementation performed the following transformation when
isTruncateFree was true:
 truncate(build_vector(x, x)) -> build_vector(truncate(x), truncate(x)).

In some cases x comes from an ext. Try to pre-truncate the build_vector's
source operands when they come from an ext.

Testcase from: https://gcc.godbolt.org/z/bbxbYK7dh
2026-02-15 10:01:47 +08:00
陈子昂
6e23353c39
[DAGCombiner] Fix crash caused by illegal InterVT in ForwardStoreValueToDirectLoad (#181175)
This patch fixes an assertion failure in ForwardStoreValueToDirectLoad
during DAGCombine.

The crash occurs when `STLF (Store-to-Load Forwarding)` creates an
illegal intermediate bitcast type (e.g., `v128i1` when bridging a
128-bit store to a `<32 x i1>` load on X86). Since `v128i1` is not a
legal mask type for the backend, it violates the expectations of the
LegalizeDAG pass.

The fix adds a `TLI.isTypeLegal(InterVT)` check to ensure that the
intermediate type used for the transformation is supported by the
target.

Fixes #181130
2026-02-14 21:05:31 +08:00
Björn Pettersson
6420099bcc
[SelectionDAG] Make sure demanded lanes for AND/MUL-by-zero are frozen (#180727)
DAGCombiner can fold a chain of INSERT_VECTOR_ELT into a vector AND/OR
operation. This patch adds protection to avoid that we end up making the
vector more poisonous by freezing the source vector when the elements
that should be set to 0/-1 may be poison in the source vector.

The patch also fixes a bug in SimplifyDemandedVectorElts for
MUL/MULHU/MULHS/AND that could result in making the vector more
poisonous. Problem was that we skipped demanding elements from Op0 that
were known to be zero in Op1. But that could result in elements being
simplified into poison when simplifying Op0, and then the result would
be poison and not zero after the MUL/MULHU/MULHS/AND. The solution is to
defensively make sure that we demand all the elements originally
demanded also when simplifying Op0.

These bugs were found when analysing the miscompiles in
https://github.com/llvm/llvm-project/issues/179448

Main culprit in #179448 seems to have been the bug in DAGCombiner. The
bug in SimplifyDemandedVectorElts surfaced when fixing the DAGCombiner,
as that fix typically introduces the (AND (FREEZE x), y) pattern that
wasn't handled correctly in SimplifyDemandedVectorElts.

Also fixes #180409.
Also fixes #176682.
2026-02-12 10:58:29 +01:00
陈子昂
6117bdd903
[DAGCombiner] Fix subvector extraction index for big-endian STLF (#180795)
This PR fixes a big-endian regression in `ForwardStoreValueToDirectLoad`
where the wrong subvector was being extracted. In big-endian, memory
offset 0 corresponds to the high bits, so the extraction index needs to
be adjusted.

As suggested by @KennethHilmersson, calculate the extraction index as
the difference between the number of elements in the intermediate vector
and the load vector when in big-endian mode.

Special thanks to Kenneth Hilmersson for providing the fix logic and the
ARM regression test.
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3878065191
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3879575092
2026-02-12 17:42:13 +08:00
Alexander Weinrauch
1e086d06e9
[DAGCombiner] Fix crash in reassociationCanBreakAddressingModePattern for multi-memop nodes (#180268)
Two code paths in `reassociationCanBreakAddressingModePattern` were
missing a `hasUniqueMemOperand()` guard before calling
`getAddressSpace()`. Note that on `L1214` we already have the same guard
in place.

`getAddressSpace()` chains through `getPointerInfo()` to
`getMemOperand()`, which asserts that the node has exactly one memory
operand.
2026-02-10 13:53:36 -08:00
paperchalice
c53acf0443
[SelectionDAGBuilder] Remove NoNaNsFPMath uses (#169904)
Replaced by checking fast-math flags or value tracking results.
2026-02-09 09:48:07 +08:00
David Sherwood
e958bcdd17
[DAGCombiner] Look through freeze for ext(freeze(extload(x))) (#178669)
This patch fixes a regression introduced by PR #175022, where
a freeze was introduced with the following transformation:

  ext(freeze(load(x))) -> freeze(extload(x))

If a new extend is introduced afterwards we then have

  ext(freeze(extload(x)))

which doesn't get picked up by existing DAG combines due to
the freeze getting in the way.
2026-02-06 15:50:17 +00:00
Steffen Larsen
5654ecd5dd
[DAGCombiner] Fix exact power-of-two signed division for large integers (#177340)
Previously, the DAG combiner did not optimize exact signed division by a
power-of-two constant divisor for integer types exceeding the size of
division supported by the target architecture (e.g., i128 on x86-64).
However, such an optimization was expected by the division expansion
logic, leading to unsupported division operations making it to
instruction selection.
This commit addresses this issue by making an exception to the existing
exclusion of signed division with the exact flag for the aforementioned
operations. That is, the DAG combiner will now optimize exact signed
division if the divisor is a power-of-two constant and the integer type
exceeds the size of division supported by the target architecture.
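
The arithmetic behind the fold can be sketched on a 64-bit integer (the commit itself concerns wider types such as i128). `exactSDivPow2` and the checker are hypothetical names, and the code assumes `>>` on a signed value is an arithmetic shift, which C++20 guarantees and mainstream compilers implemented before that.

```cpp
#include <cstdint>

// For an exact division (remainder zero) by 2^k, an arithmetic shift right
// gives the same result as a signed divide.
constexpr std::int64_t exactSDivPow2(std::int64_t x, unsigned k) {
  return x >> k;
}

// Checks the identity for multiples of 2^k, including negative ones.
bool exactSDivMatchesShift() {
  for (unsigned k = 0; k <= 40; ++k) {
    const std::int64_t divisor = std::int64_t{1} << k;
    for (std::int64_t m = -100; m <= 100; ++m) {
      const std::int64_t x = m * divisor; // exact multiple by construction
      if (x / divisor != exactSDivPow2(x, k))
        return false;
    }
  }
  return true;
}
```

The exact flag is what makes the plain shift valid: for an inexact negative dividend the two disagree, since `-7 / 2` is `-3` whereas `-7 >> 1` is `-4`.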

---------

Signed-off-by: Steffen Holst Larsen <HolstLarsen.Steffen@amd.com>
2026-02-06 09:40:32 +01:00
Nicolai Hähnle
af836ff60c
[CodeGen] Add getTgtMemIntrinsic overload for multiple memory operands (NFC) (#175843)
There are target intrinsics that logically require two MMOs, such as
llvm.amdgcn.global.load.lds, which is a copy from global memory to LDS,
so there's both a load and a store to different addresses.

Add an overload of getTgtMemIntrinsic that produces intrinsic info in a
vector, and implement it in terms of the existing (now protected)
overload.

GlobalISel and SelectionDAG paths are updated to support multiple MMOs.
The main part of this change is supporting multiple MMOs in
MemIntrinsicNodes.

Converting the backends to using the new overload is a fairly mechanical step
that is done in a separate change in the hope that that allows reducing merging
pains during review and for downstreams. A later change will then enable
using multiple MMOs in AMDGPU.
2026-02-02 21:58:42 +00:00
DaKnig
fbda30607c
[SDAG] (abs (add nsw a, -b)) -> (abds a, b) (#175801)
This is beneficial for bv of constants.

alive2: https://alive2.llvm.org/ce/z/e3GsWZ
2026-02-02 15:11:16 +00:00
Simon Pilgrim
a372152cb5
[DAG] visitVECTOR_SHUFFLE - ensure correct resno when folding shuffle(bop(shuffle(x,y),shuffle(z,w)) (#179124)
TLI.isBinOp recognises some opcodes that have multiple results,
including UADDO etc.

In most cases we currently just bail if a binop has multiple results,
but shuffle combining was missing the check and it's pretty trivial to
add handling in this case.

I've added add/sub-overflow opcodes to verifyNode to help catch these
cases in the future - IIRC there was a plan to autogen these, but there
isn't anything at the moment.

Fixes #179112
2026-02-02 09:22:48 +00:00
Benjamin Maxwell
1818b23a99
[SDAG] Check for nsz in DAG.canIgnoreSignBitOfZero() (#178905)
Follow up to #174423
2026-02-01 15:58:38 +00:00
陈子昂
a994198906
[DAG] Reland: Enable bitcast STLF for Constant/Undef (#178890)
This is a reland of #172523.

The original patch caused an assertion failure on RISC-V because it
attempted to create a bitcast from an illegal type (i32 on RV64) during
the post-type-legalization DAGCombine stage.

Added a `TLI.isTypeLegal(Val.getValueType())` check to ensure we only
proceed with the bitcast STLF optimization when the source value's type
is legal for the target.
2026-01-30 18:21:32 +01:00
Alex Bradbury
41f453efe2
Revert "[DAG] Enable bitcast STLF for Constant/Undef" (#178872)
Reverts llvm/llvm-project#172523

As explained in
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3823234270
(along with reproducer), this causes compiler crashes building
llvm-test-suite for RVV targets.
2026-01-30 12:18:38 +00:00
陈子昂
d3c64633c3
[DAG] Enable bitcast STLF for Constant/Undef (#172523)
This patch introduces support for Store-to-Load Forwarding (STLF) in
`DAGCombiner::ForwardStoreValueToDirectLoad` when the store and load
have **different types but equal memory size** (e.g., storing an `i32`
then loading a `float` from the same location).

### What this patch does:
**Enables Optimization:** It allows for the safe forwarding of the
stored value as a Bitcast when the value is:
* A **Constant** (`ConstantSDNode`, `ConstantFPSDNode`,
  `ConstantPoolSDNode`).
* **Undef**.
* And the memory sizes (`LdMemSize` == `StMemSize`) match.

### Scope and Next Steps:

This patch **only implements forwarding for constant and undef values
that have the same memory size** so far.

**I am submitting this initial patch to get early review feedback on the
core logic and fix the immediate crashes before tackling the more
complex scenarios.**

For the simple case:
```llvm
; Case Handled by this PR so far (e.g., zeroinitializer is a constant)
define float @test_stlf_integer(ptr %p, float %v) {
  store i32 0, ptr %p, align 4 
  %f = load float, ptr %p, align 4 
  ; ...
}
```
Fixes: #151683
2026-01-30 10:11:59 +01:00
Craig Topper
80cbd1d696
[RISCV] Support ISD::CLMUL/CLMULH for i64 scalable vectors with Zvbc. (#178340)
We also get some i32->i64 promotion for CLMULH. The DAGCombiner
change is to prevent an infinite loop from that.

Test file was rewritten to cover all types and split between clmul
and clmulh.

I added a couple masked tests to show that VectorPeephole works.
The test outputs were already large so I didn't want to add more than a couple.
2026-01-29 13:17:03 -08:00