4078 Commits

Author SHA1 Message Date
Nikita Popov
ab1f6ce482
[IR][SDAG] Remove lifetime size handling from SDAG (#150944)
Split out from https://github.com/llvm/llvm-project/pull/150248:

Specify that the argument of lifetime.start/lifetime.end is ignored and
will be removed in the future.

Remove lifetime size handling from SDAG. The size was previously
discarded during isel, so was always ignored for stack coloring anyway.
Where necessary, obtain the size of the full frame index.
2025-07-29 09:53:59 +02:00
Craig Topper
8d549cf036
[SelectionDAG] Pass SDNodeFlags through getNode instead of setFlags. (#149852)
getNode updates flags correctly for CSE. Calling setFlags after getNode
may set the flags where they don't apply.

I've added a Flags argument to getSelectCC and the signature of getNode that takes
an ArrayRef of EVTs.
2025-07-22 08:06:30 -07:00
Simon Pilgrim
c37942df00
[DAG] visitFREEZE - limit freezing of multiple operands (#149797)
This is a partial revert of #145939 (I've kept the BUILD_VECTOR(FREEZE(UNDEF), FREEZE(UNDEF), elt2, ...) canonicalization) as we're getting reports of infinite loops (#148084).

The issue appears to be due to deep chains of nodes and how visitFREEZE replaces all instances of an operand with a common frozen version - other users of the original frozen node then get added back to the worklist but might no longer be able to confirm a node isn't poison due to recursion depth limits on isGuaranteedNotToBeUndefOrPoison.

The issue still exists with the old implementation but by only allowing a single frozen operand it helps prevent cases of interdependent frozen nodes.

I'm still working on supporting multiple operands as it's critical for topological DAG handling but need to get a fix in for trunk and 21.x.

Fixes #148084
2025-07-22 15:40:55 +01:00
Nikita Popov
a7a1df8f72
[CodeGen] Remove handling for lifetime.start/end on non-alloca (#149838)
After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed
that the argument is an alloca, so we don't need to look at underlying
objects (which was not a correct thing to do anyway).

This also drops the offset argument for lifetime nodes in SDAG. The
offset is fixed to zero now. (Peculiarly, while SDAG pretended to have
an offset, it just gets silently dropped during selection.)
2025-07-22 09:44:59 +02:00
Simon Pilgrim
92e2d4e9e1
[DAG] visitFREEZE - remove unused HadMaybePoisonOperands check. NFC. (#149517)
Redundant since #145939
2025-07-18 17:38:11 +01:00
Alex MacLean
f73e163278
[DAGCombiner] Fold [us]itofp of truncate (#149391) 2025-07-18 08:10:20 -07:00
Paul Walker
44cd5027f8
[LLVM][CodeGen][SVE] List MVTs that are desirable for extending loads. (#149153)
Extend AArch64TargetLowering::isVectorLoadExtDesirable to specify the
set of MVT for which load extension is desirable.

Fixes https://github.com/llvm/llvm-project/issues/148939
2025-07-18 15:34:48 +01:00
Piotr Fusik
9fa3971fac
[DAGCombiner] Fold vector subtraction if above threshold to umin (#148834)
This extends #134235 and #135194 to vectors.
2025-07-17 16:37:59 +02:00
Craig Topper
36e4174989
[DAGCombiner][AArch64] Prevent SimplifyVCastOp from creating illegal scalar types after type legalization. (#148970)
Fixes #148949
2025-07-15 18:22:25 -07:00
Paul Walker
bd4e7f5f5d
[LLVM][DAGCombiner] Fix size calculations in calculateByteProvider. (#148425)
calculateByteProvider only cares about scalars or a single element
within a vector. For the latter there is the VectorIndex parameter to
identify the element. All other properties, and specifically Index, are
related to the underlying scalar type and thus when taking the size of a
type it's the scalar size that matters.

Fixes https://github.com/llvm/llvm-project/issues/148387
2025-07-15 11:05:38 +01:00
Craig Topper
eea5c291bb
[DAGCombiner] Pass SDNodeFlags to getNode instead of modifying nodes. (#148744)
getNode has logic to intersect flags correctly if the new node happens
to CSE with an existing node. Setting node flags after getNode bypasses
this logic and may change the node for other uses where the flags don't
hold.
2025-07-14 20:53:14 -07:00
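
The CSE hazard described in #148744 can be sketched with a toy model. Everything below (`ToyDAG`, `ToyNode`, the flag bits) is hypothetical and only mimics the idea that a getNode-style factory intersects flags when it hands back an existing node; it is not the SelectionDAG API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>

// Toy stand-ins, not LLVM types: a flag bitmask and a node that is
// uniqued by (opcode, lhs, rhs), similar in spirit to CSE.
enum ToyFlags : uint8_t { NoFlags = 0, NoUnsignedWrap = 1, NoSignedWrap = 2 };

struct ToyNode {
  int Opcode, LHS, RHS;
  uint8_t Flags;
};

class ToyDAG {
  std::map<std::tuple<int, int, int>, ToyNode> CSEMap;

public:
  // Hypothetical analogue of getNode: on a CSE hit, keep only the flags
  // that both the existing node and the new request guarantee.
  ToyNode &getNode(int Opcode, int LHS, int RHS, uint8_t Flags) {
    auto Key = std::make_tuple(Opcode, LHS, RHS);
    auto It = CSEMap.find(Key);
    if (It != CSEMap.end()) {
      It->second.Flags &= Flags; // intersect instead of overwrite
      return It->second;
    }
    return CSEMap.emplace(Key, ToyNode{Opcode, LHS, RHS, Flags}).first->second;
  }
};
```

Requesting the same (opcode, operands) twice with different flags yields one node whose flags are the intersection. Writing flags onto the returned node afterwards would overwrite them for every existing user, which is the situation the commit avoids.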
Craig Topper
f07107337f
[DAGCombiner] Pass SDNodeFlags to getSelect instead of modifying the node returned. (#148733) 2025-07-14 16:50:10 -07:00
jjasmine
44481f5067
[DAGCombine] Change isBuildVectorAll* -> isConstantSplatVectorAll* for Vselect (#147305)
Change isBuildVectorAll* -> isConstantSplatVectorAll* in VSelect so the
fold still applies when the BuildVector has already been canonicalized
to a Splat, or when the vselect operand was a Splat from the start.

- Fixes #73454
- Update related test cases, add extra tests in wasm

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-07-11 10:13:05 +01:00
David Green
0736f330b0
[DAG] Handle truncated splat in isBoolConstant (#145473)
This allows truncated splat / buildvector in isBoolConstant, to allow
certain not instructions to be recognized post-legalization, and allow
vselect to optimize.

An override for x86 avx512 predicated vectors is required to avoid an
infinite recursion from the code that detects zero vectors. From:
```
  // Check if the first operand is all zeros and Cond type is vXi1.
  // If this an avx512 target we can improve the use of zero masking by
  // swapping the operands and inverting the condition.
```
2025-07-10 20:59:34 +01:00
Philip Reames
f00a7a49bd
[DAG] Fold insert_subvector (splat X), (splat X), N2 -> splat X (#147380)
If we're inserting a splat into a splat of the same value, then
regardless of the index, the result is simply a splat of that value.
2025-07-08 08:50:01 -07:00
woruyu
b0790e04a3
[DAG] combineVSelectWithAllOnesOrZeros - fold select Cond, 0, x -> and not(Cond), x (#147472)
### Summary
This patch extends the work from
[#145298](https://github.com/llvm/llvm-project/pull/145298) by removing
the now-unnecessary X86-specific combineVSelectWithLastZeros logic. That
combine is now correctly and more generally handled in the
target-independent combineVSelectWithAllOnesOrZeros.

This simplifies the X86 DAG combine logic and avoids duplication.

Fixes: [#144513](https://github.com/llvm/llvm-project/issues/144513)
Related for reference:
[#146831](https://github.com/llvm/llvm-project/pull/146831)
2025-07-08 14:45:40 +01:00
woruyu
c80fa2364b
[DAG] SDPatternMatch m_Zero/m_One/m_AllOnes have inconsistent undef h… (#147044)
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/146871 
This PR resolves https://github.com/llvm/llvm-project/issues/140745

Refactor m_Zero/m_One/m_AllOnes so that all three match through a struct
template, with AllowUndefs=false as the default.
2025-07-07 15:04:54 +01:00
Simon Pilgrim
52383956f8
[DAG] Replace DAGCombiner::ConstantFoldBITCASTofBUILD_VECTOR with SelectionDAG::FoldConstantBuildVector (#147037)
DAGCombiner can already constant fold build vectors of constants/undefs
to a new vector type, but it has to be incredibly careful after
legalization to not affect a target's canonicalized constants.

This patch proposes we move the implementation inside SelectionDAG to
make it easier for targets to manually use the constant folding whenever
it deems it safe to do so.

I've also altered the method to take the BuildVectorSDNode input
directly and consistently use the same SDLoc.
2025-07-07 10:44:03 +01:00
Simon Pilgrim
ba7d78ac45
[DAG] foldABSToABD - fallback to value tracking if the (ABS (SUB LHS, RHS)) operands aren't extended (#147053)
ISD::ABDS can be used if the signed subtraction will not overflow (this
is an extension to handle cases where the NSW flag has been lost).

ISD::ABDU can be used if both operands have at least 1 zero sign bit.

Fixes #147049
2025-07-06 08:36:46 +01:00
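
The two conditions in #147053 can be checked exhaustively on 8-bit values. This is a sketch under stated assumptions: the helper names are invented, and ABDS/ABDU are modelled here as "larger minus smaller" under a signed or unsigned comparison.

```cpp
#include <cassert>
#include <cstdint>

// 8-bit reference semantics. These helpers are illustrative, not LLVM APIs.

// ABDS: signed absolute difference, computed in a wider type so it is exact.
uint8_t abds8(int8_t A, int8_t B) {
  return uint8_t(A > B ? int(A) - int(B) : int(B) - int(A));
}

// ABDU: unsigned absolute difference.
uint8_t abdu8(uint8_t A, uint8_t B) {
  return uint8_t(A > B ? A - B : B - A);
}

// abs(sub(A, B)) with the subtraction wrapping at 8 bits.
uint8_t absOfSub8(uint8_t A, uint8_t B) {
  uint8_t Diff = uint8_t(A - B);
  return (Diff & 0x80) ? uint8_t(0 - Diff) : Diff; // negate if sign bit set
}

// True when A - B fits in a signed 8-bit value (no signed wrap).
bool subFitsSigned8(int8_t A, int8_t B) {
  int Wide = int(A) - int(B);
  return Wide >= INT8_MIN && Wide <= INT8_MAX;
}

// Exhaustively checks both conditions from the commit message.
bool abdFoldsHoldForAllInputs() {
  for (int A = INT8_MIN; A <= INT8_MAX; ++A)
    for (int B = INT8_MIN; B <= INT8_MAX; ++B) {
      uint8_t Abs = absOfSub8(uint8_t(A), uint8_t(B));
      // No signed overflow: abs(sub) matches ABDS.
      if (subFitsSigned8(int8_t(A), int8_t(B)) &&
          Abs != abds8(int8_t(A), int8_t(B)))
        return false;
      // Both sign bits zero: abs(sub) matches ABDU.
      if (A >= 0 && B >= 0 && Abs != abdu8(uint8_t(A), uint8_t(B)))
        return false;
    }
  return true;
}
```

When the subtraction does wrap (for example -100 and 100 in 8 bits), `absOfSub8` and `abds8` disagree, which is why the fold needs value tracking to prove the precondition.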
Simon Pilgrim
740da004af [DAG] Fix static analyzer warning about mismatched argument comments in isConstOrConstSplat. NFC. 2025-07-04 15:00:08 +01:00
Simon Pilgrim
c79fcfee41 [DAG] combineVSelectWithAllOnesOrZeros - reusing existing VT. NFC. 2025-07-03 10:57:55 +01:00
woruyu
bbcebec3af
[DAG] Refactor X86 combineVSelectWithAllOnesOrZeros fold into a generic DAG Combine (#145298)
This PR resolves https://github.com/llvm/llvm-project/issues/144513

The modification includes five patterns:
1. vselect Cond, 0, 0 → 0
2. vselect Cond, -1, 0 → bitcast Cond
3. vselect Cond, -1, x → or Cond, x
4. vselect Cond, x, 0 → and Cond, x
5. vselect Cond, 000..., X → andn Cond, X

Patterns 1-4 have been migrated to DAGCombine. Pattern 5 is still in the x86 code.

The reason is that DAGCombine cannot emit the andn instruction directly;
it can only emit and+xor, which introduces optimization-order issues.
For example, for select Cond, 0, x → (~Cond) & x, the x86 backend first
checks whether the cond node of (~Cond) is a setcc node. If so, it
modifies the comparison operator of the condition, so the x86 backend
can no longer complete the andn optimization. In short, I think it is a
better choice to keep the vselect Cond, 000..., X pattern instead of
and+xor in DAGCombine.

Of the commits, the first contains the code changes and x86 tests
(note 1), the second contains the tests for other backends (note 2).

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-07-02 15:07:48 +01:00
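
The five vselect patterns listed in #145298 can be checked lane by lane. A minimal sketch, assuming each condition lane is a full-width mask (all ones for true, all zeros for false); the function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// One lane of a vselect. The sketch assumes the condition lane is either
// all ones (true) or all zeros (false); the real combine has to check the
// target's boolean contents before relying on that.
constexpr uint32_t vselectLane(uint32_t Cond, uint32_t T, uint32_t F) {
  return Cond ? T : F;
}

// True when all five rewrites agree with the plain select for this lane.
constexpr bool foldsHold(uint32_t Cond, uint32_t X) {
  constexpr uint32_t AllOnes = ~uint32_t(0);
  return vselectLane(Cond, 0, 0) == 0 &&                // 1: -> 0
         vselectLane(Cond, AllOnes, 0) == Cond &&       // 2: -> bitcast Cond
         vselectLane(Cond, AllOnes, X) == (Cond | X) && // 3: -> or Cond, x
         vselectLane(Cond, X, 0) == (Cond & X) &&       // 4: -> and Cond, x
         vselectLane(Cond, 0, X) == (~Cond & X);        // 5: -> andn Cond, X
}

static_assert(foldsHold(0, 0x12345678u), "false lane");
static_assert(foldsHold(~uint32_t(0), 0x12345678u), "true lane");
```

With a condition lane that is not a full mask (for example `1`), `foldsHold` returns false, which shows why the rewrites depend on the mask assumption.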
Simon Pilgrim
38200e94f1
[DAG] visitFREEZE - always allow freezing multiple operands (#145939)
Always try to fold freeze(op(....)) -> op(freeze(),freeze(),freeze(),...).

This patch proposes we drop the opt-in limit for opcodes that are allowed to push a freeze through the op to freeze all its operands, through the tree towards the roots.

I'm struggling to find a strong reason for this limit apart from the DAG freeze handling being immature for so long - as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison it looks like the regressions are not as severe.

Hopefully this will help some of the regression issues in #143102 etc.
2025-07-02 11:28:37 +01:00
James Y Knight
ae2104897c
[SelectionDAG] Fix NaN regression in fma dag-combine. (#146592)
After 901e1390c9778a191256335d37802bc631c2d183 (#127770), the DAG
combine would transform `fma(x, 0.0, 1.0)` into `1.0` if
`-fp-contract=fast` was enabled, in addition to when 'x' is marked
nnan/ninf.

It's only valid in the latter case, not the former, so delete the extra
condition.
2025-07-01 18:10:30 -04:00
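
A quick check of why the fold in #146592 needs nnan/ninf: under IEEE-754 arithmetic, x * 0.0 is NaN when x is infinite or NaN, so `fma(x, 0.0, 1.0)` is not always `1.0`. The helper name is made up for the example, and the sketch assumes default floating-point settings (no fast-math compiler options).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Evaluates fma(x, 0.0, 1.0) as written, without assuming anything about x.
inline double fmaTimesZeroPlusOne(double X) { return std::fma(X, 0.0, 1.0); }

// True when replacing fma(x, 0.0, 1.0) with 1.0 would preserve the result.
inline bool foldToOneIsExact(double X) {
  return fmaTimesZeroPlusOne(X) == 1.0; // NaN compares unequal to everything
}
```

For finite inputs the fold is exact; for infinity or NaN the unfolded expression produces NaN, so the rewrite is only sound when flags rule those inputs out.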
Simon Pilgrim
4e30f8101e
[DAG] visitFREEZE - remove isGuaranteedNotToBeUndefOrPoison assertion (#146490)
Although nice to have to prove the freeze can be moved, this can fail
immediately after freeze(op(...)) -> op(freeze(),freeze(),...) creation
if any of the new freeze nodes now prevents value tracking from seeing
through to the source values (e.g. shift amounts/element indices are in
bounds etc.).

This will allow us to remove the isGuaranteedNotToBeUndefOrPoison checks
inside canCreateUndefOrPoison that were discussed on #146361
2025-07-01 11:17:41 +01:00
paperchalice
613222ec33
[DAGCombiner] Remove UnsafeFPMath usage in visitFSUBForFMACombine etc. (#145637)
Remove `UnsafeFPMath` in `visitFMULForFMADistributiveCombine`,
`visitFSUBForFMACombine` and `visitFDIV`.
All affected tests are fixed by adding fast math flags manually.
Propagate fast math flags when lowering fdiv in NVPTX backend, so it can
produce optimized dag when `unsafe-fp-math` is absent.
2025-06-30 08:41:23 +08:00
Fabian Ritter
215e61c088
[AMDGPU][SDAG] Add ISD::PTRADD DAG combines (#142739)
This patch focuses on generic DAG combines, plus an AMDGPU-target-specific one
that is closely connected.

The generic DAG combine is based on a part of PR #105669 by rgwott, which was
adapted from work by jrtc27, arichardson, davidchisnall in the CHERI/Morello
LLVM tree. I added some parts and removed several disjuncts from the
reassociation condition:
- `isNullConstant(X)`, since there are address spaces where 0 is a perfectly
  normal value that shouldn't be treated specially,
- `(YIsConstant && ZOneUse)` and `(N0OneUse && ZOneUse && !ZIsConstant)`, since
  they cause regressions in AMDGPU.

For SWDEV-516125.
2025-06-26 09:40:04 +02:00
paperchalice
901e1390c9
[SelectionDAG] Remove UnsafeFPMath check in visitFADDForFMACombine (#127770)
As requested in #127488, remove reference to `Options.UnsafeFPMath`,
which should be obsolete and `AllowFPOpFusion` also handles it.
2025-06-25 12:31:23 +08:00
David Green
825ad86aea
[DAG] Fold nested add(add(reduce(a), b), add(reduce(c), d)) (#115150)
This patch reassociates `add(add(vecreduce(a), b), add(vecreduce(c),
d))` into `add(vecreduce(add(a, c)), add(b, d))`, to combine the
reductions into a single node. This comes up after unrolling vectorized
loops.

There is another small change to move reassociateReduction inside fadd
outside of a AllowNewConst block, as new constants will not be created
and it should be OK to perform the combine later after legalization.
2025-06-24 13:08:59 +01:00
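
The reassociation in #115150 can be sketched for the integer case with wrapping 32-bit adds. The function names are invented; `std::vector` stands in for a fixed-width vector value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Integer vecreduce(add) with wrapping 32-bit arithmetic.
uint32_t reduceAdd(const std::vector<uint32_t> &V) {
  return std::accumulate(V.begin(), V.end(), uint32_t(0));
}

// Before: add(add(vecreduce(a), b), add(vecreduce(c), d)), two reductions.
uint32_t beforeReassociation(const std::vector<uint32_t> &A, uint32_t B,
                             const std::vector<uint32_t> &C, uint32_t D) {
  return (reduceAdd(A) + B) + (reduceAdd(C) + D);
}

// After: add(vecreduce(add(a, c)), add(b, d)), one vector add, one reduction.
uint32_t afterReassociation(const std::vector<uint32_t> &A, uint32_t B,
                            const std::vector<uint32_t> &C, uint32_t D) {
  assert(A.size() == C.size() && "both vectors need the same lane count");
  std::vector<uint32_t> Sum(A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    Sum[I] = A[I] + C[I];
  return reduceAdd(Sum) + (B + D);
}
```

Because unsigned addition is associative and commutative even when it wraps, both forms return the same value for any inputs of matching length.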
Craig Topper
97ad0f4b3d
[DAGCombiner][RISCV] Don't propagate the exact flag from udiv/sdiv to urem/srem. (#145387)
If we simplify a udiv/sdiv using the exact flag we shouldn't
propagate that simplification to any urem/srem that happens to
use the same operands. If the exact flag is wrong, the udiv/sdiv
will produce poison, but that doesn't mean we can make the urem/srem
simplify to 0.
    
Fixes #145360.
2025-06-23 14:29:17 -07:00
Kazu Hirata
76ae9aa4d2
[CodeGen] Use range-based for loops (NFC) (#145251) 2025-06-22 19:09:38 -07:00
Iris Shi
f2eb5d416e
[SelectionDAG] Handle fneg/fabs/fcopysign in SimplifyDemandedBits (#139239) 2025-06-22 22:48:59 +08:00
Simon Pilgrim
c734377544 [DAG] foldMaskedMerge - fix Wparentheses operator precedence warning. NFC. 2025-06-20 16:20:28 +01:00
Craig Topper
5eb24fde11
[SelectionDAG][RISCV] Preserve nneg flag when folding (trunc (zext X))->(zext X). (#144807)
If X is known non-negative, that's still true if we fold the truncate
to create a smaller zext.
    
In the i128 tests, SelectionDAGBuilder aggressively truncates the
`zext nneg` to i64 to match `getShiftAmountTy`. If we don't preserve
the `nneg` we can't see that the shift amount argument being `signext`
means we don't need to do any extension
2025-06-19 08:06:07 -07:00
woruyu
0018921148
[DAG] add (~a | x) & (a | y) -> (a & (x ^ y)) ^y for foldMaskedMerge (#144342)
### Summary
This PR resolves https://github.com/llvm/llvm-project/issues/143864

Add (~a | x) & (a | y) -> (a & (x ^ y)) ^y for foldMaskedMerge func
using SDPatternMatch

After adding this pattern and running `ninja check-llvm-codegen`, all
other cases remain unchanged, so I added a
test case (fold-masked-merge-demorgan.ll) for it.

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-06-18 17:22:53 +01:00
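
Both masked-merge forms and the folded result from #144342 can be compared directly as bitwise identities. The function names are invented for the example.

```cpp
#include <cassert>
#include <cstdint>

// Classic masked merge: take X where A has a 1 bit, Y elsewhere.
constexpr uint32_t maskedMerge(uint32_t A, uint32_t X, uint32_t Y) {
  return (A & X) | (~A & Y);
}

// The De Morgan form added by this patch.
constexpr uint32_t maskedMergeDeMorgan(uint32_t A, uint32_t X, uint32_t Y) {
  return (~A | X) & (A | Y);
}

// The folded form that both patterns are rewritten to.
constexpr uint32_t foldedMerge(uint32_t A, uint32_t X, uint32_t Y) {
  return (A & (X ^ Y)) ^ Y;
}

static_assert(maskedMerge(0x0F0F0F0Fu, 0x12345678u, 0x9ABCDEF0u) ==
                  foldedMerge(0x0F0F0F0Fu, 0x12345678u, 0x9ABCDEF0u),
              "and/or form matches the fold");
static_assert(maskedMergeDeMorgan(0x0F0F0F0Fu, 0x12345678u, 0x9ABCDEF0u) ==
                  foldedMerge(0x0F0F0F0Fu, 0x12345678u, 0x9ABCDEF0u),
              "or/and form matches the fold");
```

Per bit, a set bit in A selects the bit from X and a clear bit selects the bit from Y in all three expressions, so the identity holds for any operands.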
Rajveer Singh Bharadwaj
e07b1b26c3
[DAG] Implement SDPatternMatch m_Abs() matcher (#144512) 2025-06-18 12:59:27 +05:30
Simon Pilgrim
71f72f4d5d [DAG] Move foldMaskedMerge before visitAND. NFC.
Reduces diff in #144342
2025-06-17 11:21:56 +01:00
Xu Zhang
7c25db3fbf
[DAG] Fold (and X, (add (not Y), Z)) -> (and X, (not (sub Y, Z))). (#141476)
Fixes #140639

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-06-16 15:55:26 +01:00
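
The fold in #141476 rests on the two's-complement identity `~Y + Z == ~(Y - Z)`, since `~Y` equals `-Y - 1`. A small sketch with invented function names, using wrapping 32-bit arithmetic:

```cpp
#include <cassert>
#include <cstdint>

// Before the fold: and X, (add (not Y), Z)
constexpr uint32_t andOfAddNot(uint32_t X, uint32_t Y, uint32_t Z) {
  return X & (~Y + Z);
}

// After the fold: and X, (not (sub Y, Z))
constexpr uint32_t andOfNotSub(uint32_t X, uint32_t Y, uint32_t Z) {
  return X & ~(Y - Z);
}

static_assert(andOfAddNot(0xFFFFFFFFu, 5u, 3u) == 0xFFFFFFFDu, "~5 + 3");
static_assert(andOfNotSub(0xFFFFFFFFu, 5u, 3u) == 0xFFFFFFFDu, "~(5 - 3)");
// The identity also holds when the subtraction wraps.
static_assert(andOfAddNot(0xFFFFFFFFu, 0u, 1u) == andOfNotSub(0xFFFFFFFFu, 0u, 1u),
              "wrapping case");
```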
woruyu
41c8df147b
[DAG] Convert foldMaskedMerge to SDPatternMatch to match (m & x) | (~m & y) (#143855)
This PR resolves https://github.com/llvm/llvm-project/issues/143363

Remove foldMaskedMergeImpl entirely to use SDPatternMatch
2025-06-12 13:46:07 +01:00
Iris Shi
24d730b380
Reland "[SelectionDAG] Make (a & x) | (~a & y) -> (a & (x ^ y)) ^ y available for all targets" (#143651) 2025-06-11 15:56:37 +08:00
Iris Shi
8c890eaa3f
Revert "[SelectionDAG] Make (a & x) | (~a & y) -> (a & (x ^ y)) ^ y available for all targets" (#143648) 2025-06-11 10:19:12 +08:00
Matt Arsenault
505c550e4c
DAG: Assert fcmp uno runtime calls are boolean values (#142898)
This saves 2 instructions in the ARM soft float case for fcmp ueq.

This code is written in a confusingly overly general way. The point
of getCmpLibcallCC is to express that the compiler-rt implementations
of the FP compares are different aliases around functions which may
return -1 in some cases. This does not apply to the call for unordered,
which returns a normal boolean.

Also stop overriding the default value for the unordered compare for ARM.
This was setting it to the same value as the default, which is now assumed.
2025-06-10 10:46:29 +09:00
Philip Reames
939666380f
[SDAG] Add partial_reduce_sumla node (#141267)
We have recently added the partial_reduce_smla and partial_reduce_umla
nodes to represent Acc += ext(a) * ext(b) where the two extends have to
have the same source type, and have the same extend kind.

For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which
correspond to the existing nodes, but we also have vqdotsu which
represents the case where the two extends are sign and zero respectively
(i.e. not the same type of extend).

This patch adds a partial_reduce_sumla node which has sign extension for
A, and zero extension for B. The addition is somewhat mechanical.
2025-06-09 07:17:45 -07:00
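
A scalar reference model may help picture the mixed-extension case: A is sign-extended, B is zero-extended, and the products are accumulated. This is only a sketch. The function name is invented, and the grouping of four adjacent 8-bit products per 32-bit accumulator lane is an assumption chosen to resemble a dot-product instruction, not something the commit message specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference model: Acc[i] += sext(A[j]) * zext(B[j]) for the
// four inputs j assigned to lane i. The lane assignment is an assumption.
std::vector<int32_t> partialReduceSUMLA(std::vector<int32_t> Acc,
                                        const std::vector<int8_t> &A,
                                        const std::vector<uint8_t> &B) {
  assert(A.size() == B.size() && A.size() == 4 * Acc.size());
  for (std::size_t I = 0; I < A.size(); ++I) {
    int32_t SExtA = A[I]; // int8_t  -> int32_t sign-extends
    int32_t ZExtB = B[I]; // uint8_t -> int32_t zero-extends
    // Accumulate in unsigned arithmetic so the add wraps instead of
    // overflowing a signed type.
    Acc[I / 4] = int32_t(uint32_t(Acc[I / 4]) + uint32_t(SExtA * ZExtB));
  }
  return Acc;
}
```

Swapping the element types of A and B changes the result whenever an input has its top bit set, which is why the same-extension nodes cannot express this case.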
Iris Shi
bfb48363b0
[SelectionDAG] Make (a & x) | (~a & y) -> (a & (x ^ y)) ^ y available for all targets (#137641) 2025-06-09 17:57:15 +08:00
Harrison Hao
102dfa8a48
[DAGCombiner] Allow freeze to sink through fmul by adding it to AllowMultipleMaybePoisonOperands (#142250)
Allow freeze to sink through fmul by treating it as a
non-poison-generating op when operands are not poison.

Adding `ISD::FMUL` to `AllowMultipleMaybePoisonOperands` lets DAG combine
push freeze through fmul. This helps expose patterns like `fmul+fadd`
for `FMA` fusion.

When rebuilding the node, we drop flags like nnan/ninf/nsz that imply
poison, but keep contract, reassoc, afn, and arcp.


Closes: https://github.com/llvm/llvm-project/issues/141622
2025-06-08 19:38:57 +08:00
AZero13
7730853fa1
[SelectionDAG] Use DAG.getSelect (NFC) (#143276) 2025-06-08 10:27:10 +09:00
Craig Topper
b4b3be7faa
[DAGCombiner] Teach SearchForAndLoads to handle an AND with 2 constant operands. (#142062)
If opaque constants are involved we can have an AND with 2 constant
operands that hasn't been simplified. If this is the case, we need
to modify at least one of the constants if it is out of range.
    
Fixes #142004
2025-05-30 16:00:43 -07:00
Craig Topper
c5a17e6bea
[DAGCombiner] Use APInt::isSubsetOf. NFC (#142029) 2025-05-30 09:01:36 -07:00
Philip Reames
1651aa2943
[SDAG] Split the partial reduce legalize table by opcode [nfc] (#141970)
On its own, this change should be non-functional. This is a preparatory
change for https://github.com/llvm/llvm-project/pull/141267 which adds a
new form of PARTIAL_REDUCE_*MLA. As noted in the discussion on that
review, AArch64 needs a different set of legal and custom types for the
PARTIAL_REDUCE_SUMLA variant than the currently existing
PARTIAL_REDUCE_UMLA/SMLA.
2025-05-29 14:05:31 -07:00
Nicholas Guy
a5d97ebe8b
[AArch64][SelectionDAG] Add type legalization for partial reduce wide adds (#141075)
Based on work initially done by @JamesChesterman.
2025-05-29 14:42:23 +01:00