4102 Commits

Author SHA1 Message Date
paperchalice
945a186089
[DAGCombiner] Remove most UnsafeFPMath references (#146295)
This pull request removes all references to `UnsafeFPMath` in dag
combiner except FP_ROUND.
- Set fast math flags in some tests.
2025-08-22 15:27:25 +08:00
Alex MacLean
a3ed96b899
[NVPTX] Legalize aext-load to zext-load to expose more DAG combines (#154251) 2025-08-21 15:33:23 -07:00
Jim Lin
fd28257195
[DAGCombiner] Fold umax/umin operations with vscale operands (#154461)
If umax/umin operations with vscale operands, that can be constant
folded.
2025-08-21 09:15:40 +08:00
Matt Arsenault
276c1d8114
DAG: Add assert to getNode for EXTRACT_SUBVECTOR indexes (#154099)
Verify it's a multiple of the result vector element count
instead of asserting this in random combines.

The testcase in #153808 fails in the wrong point. Add an
assert to getNode so the invalid extract asserts at construction
instead of use.
2025-08-20 09:55:43 +09:00
Simon Pilgrim
fcb36ca8cc
[DAG] visitTRUNCATE - merge the trunc(abd) and trunc(avg) handling which are almost identical (#154301)
CC @houngkoungting
2025-08-19 12:59:39 +01:00
黃國庭
0773854575
[DAG] Fold trunc(avg(x,y)) for avgceil/floor u/s nodes if they have sufficient leading zero/sign bits (#152273)
avgceil version :  https://alive2.llvm.org/ce/z/2CKrRh  

Fixes #147773 

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-18 16:36:26 +01:00
Simon Pilgrim
858d1dfa2c
[DAG] visitTRUNCATE - early out from computeKnownBits/ComputeNumSignBits failures. NFC. (#154111)
Avoid unnecessary (costly) computeKnownBits/ComputeNumSignBits calls - use MaskedValueIsZero instead of computeKnownBits directly to simplify code.
2025-08-18 14:55:09 +01:00
Simon Pilgrim
681ecae913
[DAG] visitTRUNCATE - test abd legality early to avoid unnecessary computeKnownBits/ComputeNumSignBits calls. NFC. (#154085)
isOperationLegal is much cheaper than value tracking
2025-08-18 11:06:29 +01:00
Simon Pilgrim
bcb4984a0b [X86] select-smin-smax.ll - add i128 tests
Helps check quality of legality codegen (all we had was x86 i64 handling)
2025-08-15 13:48:13 +01:00
Min-Yih Hsu
abe92a5000
[DAGCombine] Fix an incorrect folding of extract_subvector (#153709)
Reported from
https://github.com/llvm/llvm-project/pull/153393#issuecomment-3189898813

During DAGCombine, an intermediate extract_subvector sequence was
generated:
```
  t8: v9i16 = extract_subvector t3, Constant:i64<9>
t24: v8i16 = extract_subvector t8, Constant:i64<0>
```
And one of the DAGCombine rule which turns `(extract_subvector
(extract_subvector X, C), 0)` into `(extract_subvector X, C)` kicked in
and turn that into `v8i16 = extract_subvector t3, Constant:i64<9>`. But
it forgot to check if the extracted index is a multiple of the minimum
vector length of the result type, hence the crash.

This patch fixes this by adding an additional check.
2025-08-14 23:37:22 +00:00
woruyu
95b16d1264
[DAG] Fold trunc(abdu(x,y)) and trunc(abds(x,y)) if they have sufficient leading zero/sign bits (#151471)
This PR resolves https://github.com/llvm/llvm-project/issues/147683

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-08 10:43:14 +01:00
Benjamin Maxwell
94c48a21bb
[AArch64][SVE] Fix hang in VECTOR_HISTOGRAM DAG combine (#152539)
The histogram DAG combine went into an infinite loop of creating the
same histogram node due to an incorrect use of the `refineUniformBase`
and `refineIndexType` APIs.

These APIs take SDValues by reference (SDValue&) and return `true` if
they were "refined" (i.e., set to new values).

Previously, this DAG combine would create the `Ops` array (used to
create the new histogram node) before calling the `refine*` APIs, which
copies the SDValues into the array, meaning the updated values were not
used to create the new histogram node.

Reproducer: https://godbolt.org/z/hsGWhTaqY (it will timeout)
2025-08-08 09:59:24 +01:00
Craig Topper
57045a137f
[DAGCombiner] Avoid repeated calls to WideVT.getScalarSizeInBits() in DAGCombiner::mergeTruncStores. NFC (#152231)
We already have a variable, WideNumBits, that contains the same
information. Use it and delay the creation of WideVT until we really
need it.
2025-08-06 09:10:02 -07:00
Simon Pilgrim
9f50224b25
[DAG] Remove Depth=1 hack from isGuaranteedNotToBeUndefOrPoison checks (#152127)
Now that #146490 removed the assertion in visitFreeze to assert that the
node was still isGuaranteedNotToBeUndefOrPoison we no longer need this
reduced depth hack (which had to account for the difference in depth of
freeze(op()) vs op(freeze())

Helps with some of the minor regressions in #150017
2025-08-05 13:35:04 +01:00
Simon Pilgrim
d561259a08
[DAG] visitFREEZE - replace multiple frozen/unfrozen uses of an SDValue with just the frozen node (#150017)
Similar to InstCombinerImpl::freezeOtherUses, attempt to ensure that we
merge multiple frozen/unfrozen uses of a SDValue. This fixes a number of
hasOneUse() problems when trying to push FREEZE nodes through the DAG.

Remove SimplifyMultipleUseDemandedBits handling of FREEZE nodes as we
now want to keep the common node, and not bypass for some nodes just
because of DemandedElts.

Fixes #149799
2025-08-05 09:24:09 +01:00
woruyu
38bfe9ae56
[DAG] combineVSelectWithAllOnesOrZeros - missing freeze (#150388)
This PR resolves https://github.com/llvm/llvm-project/issues/150069

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-04 15:55:12 +01:00
Simon Pilgrim
5c2054a4ea
[DAG] getMinMaxOpcodeForFP - split if-else chain. NFC. (#151938)
(style) All cases return so split the chain
2025-08-04 15:32:08 +01:00
Abhishek Kaushik
1c0ac80d4a
[DAG] Combine store + vselect to masked_store (#145176)
Add a new combine to replace
```
(store ch (vselect cond truevec (load ch ptr offset)) ptr offset)
```
to
```
(mstore ch truevec ptr offset cond)
```

This saves a blend operation on targets that support conditional stores.
2025-08-04 19:05:36 +05:30
AZero13
23022a4683
[SelectionDAG] Move sign pattern check from AArch64 and ARM to general SelectionDAG (#151736)
This works on all cases much like the XOR case above it in SelectionDAG.
2025-08-01 14:46:51 -07:00
Paul Walker
ceb2b9c141
[LLVM][DAGCombiner] fold (shl (X * vscale(C0)), C1) -> (X * vscale(C0 << C1)). (#150651) 2025-08-01 11:42:45 +01:00
黃國庭
f04ea2ef1c
Add m_SelectCCLike matcher to match SELECT_CC or SELECT with SETCC (#149646)
Fix #147282 and  Follow-up to #148834

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-01 10:12:05 +01:00
David Sherwood
05b16aff0f
[DAGCombiner] Add combine for vector interleave of splats (#151110)
This patch adds two DAG combines:

1. vector_interleave(splat, splat, ...) -> {splat,splat,...}
2. concat_vectors(splat, splat, ...) -> wide_splat

where all the input splats are identical. Both of these
together enable us to fold
  concat_vectors(vector_interleave(splat, splat, ...))
into a wide splat. Post-legalisation we must only do the
concat_vector combine if the wider type and splat operation
is legal.

For fixed-width vectors the DAG combine only occurs for
interleave factors of 3 or more, however it's not currently
safe to test this for AArch64 since there isn't any lowering
support for fixed-width interleaves. I've only added
fixed-width tests for RISCV.
2025-08-01 09:58:05 +01:00
Pierre van Houtryve
c4b1557097
[DAG] Fold (setcc ((x | x >> c0 | ...) & mask)) sequences (#146054)
Fold sequences where we extract a bunch of contiguous bits from a value,
merge them into the low bit and then check if the low bits are zero or
not.

Usually the and would be on the outside (the leaves) of the expression,
but the DAG canonicalizes it to a single `and` at the root of the
expression.

The reason I put this in DAGCombiner instead of the target combiner is
because this is a generic, valid transform that's also fairly niche, so
there isn't much risk of a combine loop I think.

See #136727
2025-07-30 10:27:19 +02:00
Pierre van Houtryve
250f2a6367
[DAG] Remove AssertZext if the input is masked (#146052)
Remove AssertZext if the input ensures the assert cannot fail.
2025-07-29 13:05:30 +02:00
Nikita Popov
ab1f6ce482
[IR][SDAG] Remove lifetime size handling from SDAG (#150944)
Split out from https://github.com/llvm/llvm-project/pull/150248:

Specify that the argument of lifetime.start/lifetime.end is ignored and
will be removed in the future.

Remove lifetime size handling from SDAG. The size was previously
discarded during isel, so was always ignored for stack coloring anyway.
Where necessary, obtain the size of the full frame index.
2025-07-29 09:53:59 +02:00
Craig Topper
8d549cf036
[SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852)
getNode updates flags correctly for CSE. Calling setFlags after getNode
may set the flags where they don't apply.

I've added a Flags argument to getSelectCC and the signature of getNode that takes
an ArrayRef of EVTs.
2025-07-22 08:06:30 -07:00
Simon Pilgrim
c37942df00
[DAG] visitFREEZE - limit freezing of multiple operands (#149797)
This is a partial revert of #145939 (I've kept the BUILD_VECTOR(FREEZE(UNDEF), FREEZE(UNDEF), elt2, ...) canonicalization) as we're getting reports of infinite loops (#148084).

The issue appears to be due to deep chains of nodes and how visitFREEZE replaces all instances of an operand with a common frozen version - other users of the original frozen node then get added back to the worklist but might no longer be able to confirm a node isn't poison due to recursion depth limits on isGuaranteedNotToBeUndefOrPoison.

The issue still exists with the old implementation but by only allowing a single frozen operand it helps prevent cases of interdependent frozen nodes.

I'm still working on supporting multiple operands as its critical for topological DAG handling but need to get a fix in for trunk and 21.x.

Fixes #148084
2025-07-22 15:40:55 +01:00
Nikita Popov
a7a1df8f72
[CodeGen] Remove handling for lifetime.start/end on non-alloca (#149838)
After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed
that the argument is an alloca, so we don't need to look at underlying
objects (which was not a correct thing to do anyway).

This also drops the offset argument for lifetime nodes in SDAG. The
offset is fixed to zero now. (Peculiarly, while SDAG pretended to have
an offset, it just gets silently dropped during selection.)
2025-07-22 09:44:59 +02:00
Simon Pilgrim
92e2d4e9e1
[DAG] visitFREEZE - remove unused HadMaybePoisonOperands check. NFC. (#149517)
Redundant since #145939
2025-07-18 17:38:11 +01:00
Alex MacLean
f73e163278
[DAGCombiner] Fold [us]itofp of truncate (#149391) 2025-07-18 08:10:20 -07:00
Paul Walker
44cd5027f8
[LLVM][CodeGen][SVE] List MVTs that are desirable for extending loads. (#149153)
Extend AArch64TargetLowering::isVectorLoadExtDesirable to specify the
set of MVT for which load extension is desirable.

Fixes https://github.com/llvm/llvm-project/issues/148939
2025-07-18 15:34:48 +01:00
Piotr Fusik
9fa3971fac
[DAGCombiner] Fold vector subtraction if above threshold to umin (#148834)
This extends #134235 and #135194 to vectors.
2025-07-17 16:37:59 +02:00
Craig Topper
36e4174989
[DAGCombiner][AArch64] Prevent SimplifyVCastOp from creating illegal scalar types after type legalization. (#148970)
Fixes #148949
2025-07-15 18:22:25 -07:00
Paul Walker
bd4e7f5f5d
[LLVM][DAGCombiner] Fix size calculations in calculateByteProvider. (#148425)
calculateByteProvider only cares about scalars or a single element
within a vector. For the later there is the VectorIndex parameter to
identify the element. All other properties, and specificially Index, are
related to the underyling scalar type and thus when taking the size of a
type it's the scalar size that matters.

Fixes https://github.com/llvm/llvm-project/issues/148387
2025-07-15 11:05:38 +01:00
Craig Topper
eea5c291bb
[DAGCombiner] Pass SDNodeFlags to getNode instead of modifying nodes. (#148744)
getNode has logic to intersect flags correctly if the new node happens
to CSE with an existing node. Setting node flags after getNode bypasses
this logic and may change the node for other uses where the flags don't
hold.
2025-07-14 20:53:14 -07:00
Craig Topper
f07107337f
[DAGCombiner] Pass SDNodeFlags to getSelect instead of modifying the node returned. (#148733) 2025-07-14 16:50:10 -07:00
jjasmine
44481f5067
[DAGCombine] Change isBuildVectorAll* -> isConstantSplatVectorAll* for Vselect (#147305)
Change isBuildVectorAll* -> isConstantSplatVectorAll* in VSelect in case
the fold happens after BuildVector has been canonically transformed to
Splat or if the Splat is initially in vselect already

- Fixes #73454
- Update related test cases, add extra tests in wasm

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-07-11 10:13:05 +01:00
David Green
0736f330b0
[DAG] Handle truncated splat in isBoolConstant (#145473)
This allows truncated splat / buildvector in isBoolConstant, to allow
certain not instructions to be recognized post-legalization, and allow
vselect to optimize.

An override for x86 avx512 predicated vectors is required to avoid an
infinite recursion from the code that detects zero vectors. From:
```
  // Check if the first operand is all zeros and Cond type is vXi1.
  // If this an avx512 target we can improve the use of zero masking by
  // swapping the operands and inverting the condition.
```
2025-07-10 20:59:34 +01:00
Philip Reames
f00a7a49bd
[DAG] Fold insert_subvector (splat X), (splat X), N2 - > splat X (#147380)
If we're inserting a splat into a splat of the same value, then
regardless of the index, the result is simply a splat of that value.
2025-07-08 08:50:01 -07:00
woruyu
b0790e04a3
[DAG] combineVSelectWithAllOnesOrZeros - fold select Cond, 0, x -> and not(Cond), x (#147472)
### Summary
This patch extends the work from
[#145298](https://github.com/llvm/llvm-project/pull/145298) by removing
the now-unnecessary X86-specific combineVSelectWithLastZeros logic. That
combine is now correctly and more generally handled in the
target-independent combineVSelectWithAllOnesOrZeros.

This simplifies the X86 DAG combine logic and avoids duplication.

Fixes: [#144513](https://github.com/llvm/llvm-project/issues/144513)
Related for reference:
[#146831](https://github.com/llvm/llvm-project/pull/146831)
2025-07-08 14:45:40 +01:00
woruyu
c80fa2364b
[DAG] SDPatternMatch m_Zero/m_One/m_AllOnes have inconsistent undef h… (#147044)
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/146871 
This PR resolves https://github.com/llvm/llvm-project/issues/140745

Refactor m_Zero/m_One/m_AllOnes all use struct template function to
match and AllowUndefs=false as default.
2025-07-07 15:04:54 +01:00
Simon Pilgrim
52383956f8
[DAG] Replace DAGCombiner::ConstantFoldBITCASTofBUILD_VECTOR with SelectionDAG::FoldConstantBuildVector (#147037)
DAGCombiner can already constant fold build vectors of constants/undefs
to a new vector type, but it has to be incredibly careful after
legalization to not affect a target's canonicalized constants.

This patch proposes we move the implementation inside SelectionDAG to
make it easier for targets to manually use the constant folding whenever
it deems it safe to do so.

I've also altered the method to take the BuildVectorSDNode input
directly and consistently use the same SDLoc.
2025-07-07 10:44:03 +01:00
Simon Pilgrim
ba7d78ac45
[DAG] foldABSToABD - fallback to value tracking if the (ABS (SUB LHS, RHS)) operands aren't extended (#147053)
ISD::ABDS can be used if the signed subtraction will not overwrap (this
is an extension to handle cases where the NSW flag has been lost)

ISD::ABDU can be used if both operands have at least 1 zero sign bit.

Fixes #147049
2025-07-06 08:36:46 +01:00
Simon Pilgrim
740da004af [DAG] Fix static analyzer warning about mismatched argument comments in isConstOrConstSplat. NFC. 2025-07-04 15:00:08 +01:00
Simon Pilgrim
c79fcfee41 [DAG] combineVSelectWithAllOnesOrZeros - reusing existing VT. NFC. 2025-07-03 10:57:55 +01:00
woruyu
bbcebec3af
[DAG] Refactor X86 combineVSelectWithAllOnesOrZeros fold into a generic DAG Combine (#145298)
This PR resolves https://github.com/llvm/llvm-project/issues/144513

The modification include five pattern :
1.vselect Cond, 0, 0 → 0
2.vselect Cond, -1, 0 → bitcast Cond
3.vselect Cond, -1, x → or Cond, x
4.vselect Cond, x, 0 → and Cond, x
5.vselect Cond, 000..., X -> andn Cond, X

1-4 have been migrated to DAGCombine. 5 still in x86 code.

The reason is that you cannot use the andn instruction directly in
DAGCombine, you can only use and+xor, which will introduce optimization
order issues. For example, in the x86 backend, select Cond, 0, x →
(~Cond) & x, the backend will first check whether the cond node of
(~Cond) is a setcc node. If so, it will modify the comparison operator
of the condition.So the x86 backend cannot complete the optimization of
andn.In short, I think it is a better choice to keep the pattern of
vselect Cond, 000..., X instead of and+xor in combineDAG.

For commit, the first is code changes and x86 test(note 1), the second
is tests in other backend(node 2).

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-07-02 15:07:48 +01:00
Simon Pilgrim
38200e94f1
[DAG] visitFREEZE - always allow freezing multiple operands (#145939)
Always try to fold freeze(op(....)) -> op(freeze(),freeze(),freeze(),...).

This patch proposes we drop the opt-in limit for opcodes that are allowed to push a freeze through the op to freeze all its operands, through the tree towards the roots.

I'm struggling to find a strong reason for this limit apart from the DAG freeze handling being immature for so long - as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison it looks like the regressions are not as severe.

Hopefully this will help some of the regression issues in #143102 etc.
2025-07-02 11:28:37 +01:00
James Y Knight
ae2104897c
[SelectionDAG] Fix NaN regression in fma dag-combine. (#146592)
After 901e1390c9778a191256335d37802bc631c2d183 (#127770), the DAG
combine would transform `fma(x, 0.0, 1.0)` into `1.0` if
`-fp-contract=fast` was enabled, in addition to when 'x' is marked
nnan/ninf.

It's only valid in the latter case, not the former, so delete the extra
condition.
2025-07-01 18:10:30 -04:00
Simon Pilgrim
4e30f8101e
[DAG] visitFREEZE - remove isGuaranteedNotToBeUndefOrPoison assertion (#146490)
Although nice to have to prove the freeze can be moved, this can fail
immediately after freeze(op(...)) -> op(freeze(),freeze(),...) creation
if any of the new freeze nodes now prevents value tracking from seeing
through to the source values (e.g. shift amounts/element indices are in
bounds etc.).

This will allow us to remove the isGuaranteedNotToBeUndefOrPoison checks
inside canCreateUndefOrPoison that were discussed on #146361
2025-07-01 11:17:41 +01:00
paperchalice
613222ec33
[DAGCombiner] Remove UnsafeFPMath usage in visitFSUBForFMACombine etc. (#145637)
Remove `UnsafeFPMath` in `visitFMULForFMADistributiveCombine`,
`visitFSUBForFMACombine` and `visitFDIV`.
All affected tests are fixed by add fast math flags manually.
Propagate fast math flags when lowering fdiv in NVPTX backend, so it can
produce optimized dag when `unsafe-fp-math` is absent.
2025-06-30 08:41:23 +08:00