3993 Commits

Author SHA1 Message Date
Simon Pilgrim
9b32f3d096
[DAG] visitEXTRACT_SUBVECTOR - don't return early on failure of EXTRACT_SUBVECTOR(INSERT_SUBVECTOR()) -> BITCAST fold (#133695)
Always allow later folds to try to match as well.
2025-03-31 14:32:43 +01:00
Simon Pilgrim
666faa7fd9
[DAG] visitEXTRACT_SUBVECTOR - accumulate SimplifyDemandedVectorElts demanded elts across all EXTRACT_SUBVECTOR uses (REAPPLIED) (#133401)
Similar to what is done for visitEXTRACT_VECTOR_ELT - if all uses of a vector are EXTRACT_SUBVECTOR, then determine the accumulated demanded elts across all users and call SimplifyDemandedVectorElts in "AssumeSingleUse" use.

Second try after #133130 was reverted by #133331 due to it affecting reverted test files
2025-03-29 17:55:38 +00:00
Walter Lee
5b7fd708fe
Revert "[DAG] visitEXTRACT_SUBVECTOR - accumulate SimplifyDemandedVectorElts demanded elts across all EXTRACT_SUBVECTOR uses" (#133331)
Reverts llvm/llvm-project#133130

This touches a common file as #133083, which is causing failures
2025-03-27 18:36:38 -04:00
Philip Reames
c90a536bcf [CodeGen] Simplify code using TypeSize overloads of getMachineMemOperand [nfc]
These were added in d584cea.  This change runs through existing uses and
simplifies where obvious.
2025-03-27 11:47:51 -07:00
Simon Pilgrim
a8575b3ea8
[DAG] visitEXTRACT_SUBVECTOR - accumulate SimplifyDemandedVectorElts demanded elts across all EXTRACT_SUBVECTOR uses (#133130)
Similar to what is done for visitEXTRACT_VECTOR_ELT - if all uses of a
vector are EXTRACT_SUBVECTOR, then determine the accumulated demanded
elts across all users and call SimplifyDemandedVectorElts in
"AssumeSingleUse" use.
2025-03-27 15:31:06 +00:00
LU-JOHN
2df25a4733
Invalidate range metadata when folding bitcast into load (#133095) 2025-03-27 14:10:55 +07:00
Pierre van Houtryve
6e3c24fc0a
[DAG] Combine (sext (sext_in_reg x)) to (sext_in_reg (any_extend x)) (#132386) 2025-03-24 09:31:02 +01:00
Mikhail R. Gadelha
f138e36d52
[SelectionDAG][RISCV] Avoid store merging across function calls (#130430)
This patch improves DAGCombiner's handling of potential store merges by
detecting function calls between loads and stores. When a function call
exists in the chain between a load and its corresponding store, we avoid
merging these stores if the spilling is unprofitable.

We had to implement a hook on TLI, since TTI is unavailable in
DAGCombine. Currently, it's only enabled for riscv.

This is the DAG equivalent of PR #129258
2025-03-22 10:35:25 -03:00
Matthias Braun
e6382f2111
SelectionDAG: neg (and x, 1) --> SIGN_EXTEND_INREG x, i1 (#131239)
The pattern
```LLVM
%shl = shl i32 %x, 31
%ashr = ashr i32 %shl, 31
```
would be combined to `SIGN_EXTEND_INREG %x, ValueType:ch:i1` by
SelectionDAG.
However InstCombine normalizes this pattern to:
```LLVM
%and = and i32 %x, 1
%neg = sub i32 0, %and
```
This adds matching code to DAGCombiner to catch this variant as well.
2025-03-14 10:47:56 -07:00
LU-JOHN
95e186cadf
Reland "DAG: Preserve range metadata when load is narrowed" (#128144) (#130609)
Changes: Add guard to ensure truncation is strictly smaller than
original size.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-03-13 12:47:03 +07:00
sommersun
848ba3854c
[DAG] fold AVGFLOORS to AVGFLOORU for non-negative operand (#84746) (#129678)
Fold ISD::AVGFLOORS to ISD::AVGFLOORU for non-negative operand. Cover test is modified for uhadd with zero extension.

Fixes #84746
2025-03-10 13:01:08 +00:00
Paul Walker
a537724069
[LLVM][DAGCombine] Remove combiner-vector-fcopysign-extend-round. (#129878)
This option was added to improve test coverage for SVE lowering code
that is impossible to reach otherwise. Given it is not possible to
trigger a bug without it and the generated code is universally worse
with it, I figure the option has no value and should be removed.
2025-03-05 15:31:34 +00:00
James Chesterman
e3c8e17b07 Reland "[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083)"
This relands commit 7a06681398a33d53ba6d661777be8b4c1d19acb7.
2025-03-04 11:09:33 +00:00
Kazu Hirata
7a06681398 Revert "[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083)"
This reverts commit 2bef21f24ba932a757a644470358c340f4bcd113.

Multiple builtbot failures have been reported:
https://github.com/llvm/llvm-project/pull/127083
2025-03-04 01:44:09 -08:00
James Chesterman
2bef21f24b
[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083)
Add generic DAG combine for ISD::PARTIAL_REDUCE_U/SMLA nodes. Transforms
the DAG from:
PARTIAL_REDUCE_MLA(Acc, MUL(EXT(MulOpLHS), EXT(MulOpRHS)), Splat(1)) to
PARTIAL_REDUCE_MLA(Acc, MulOpLHS, MulOpRHS).
2025-03-04 09:09:15 +00:00
Huibin Wang
59138a603f
[DAGCombiner] Cleanup MatchFunnelPosNeg by using SDPatternMatch matchers (#129482)
Fixes issue: https://github.com/llvm/llvm-project/issues/129034
2025-03-03 14:35:38 +07:00
Yingwei Zheng
2709366f75
[DAGCombiner] Don't ignore N2's undef elements in foldVSelectOfConstants (#129272)
Since N2 will be reused in the fold, we cannot skip N2's undef elements
if the corresponding element in N1 is well-defined.
For example:
```
t2: v4i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
t24: v4i32 = BUILD_VECTOR undef:i32, undef:i32, Constant:i32<1>, undef:i32
t11: v4i32 = vselect t8, t2, t10
```
Before this patch, we fold t11 into:
```
t26: v4i32 = sign_extend t8
t27: v4i32 = add t26, t24
```
The last element of t27 is incorrect.

Closes https://github.com/llvm/llvm-project/issues/129181.
2025-03-01 20:21:28 +08:00
Simon Pilgrim
bae41127e2
[DAG] replaceShuffleOfInsert - add support for shuffle_vector(scalar_to_vector(x),y) -> insert_vector_elt(y,x,c) (#127210)
Begin extending replaceShuffleOfInsert to handle other forms of scalar insertion into a vector.

I've limited this to targets that have Custom/Legal ISD::INSERT_VECTOR_ELT handling for now - although we can probably always fold this before LegalOperations.
2025-02-27 08:41:58 +00:00
Daniel Thornburgh
02128342d2
Revert "DAG: Preserve range metadata when load is narrowed" (#128948)
Reverts llvm/llvm-project#128144

Breaks clang prod x64 build (seen in Fuchsia toolchain)
2025-02-26 14:14:55 -08:00
LU-JOHN
d8bcb53780
DAG: Preserve range metadata when load is narrowed (#128144)
In DAGCombiner.cpp preserve range metadata when load is narrowed to load
LSBs if original range metadata bounds can fit in the narrower type.

Utilize preserved range metadata to reduce 64-bit shl to 32-bit shl.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-02-26 15:40:49 +07:00
Vikash Gupta
352c48f278
[SelectionDAG] Utilizing target hook convertSelectOfConstantsToMath for SelectwithConstant (#127599)
The Target hook convertSelectOfConstantsToMath() needs to be used within
SimplifySelectCC helper combine function in SelectionDAG Isel, where
generic select folding with constants is happening into simple maths op
using the condition as it is.

It necessarily fixes #121145.
2025-02-25 20:32:24 +05:30
Yingwei Zheng
44d1dbd24c
[X86][DAGCombiner] Skip x87 fp80 values in combineFMulOrFDivWithIntPow2 (#128618)
f80 is not a valid IEEE floating-point type.
Closes https://github.com/llvm/llvm-project/issues/128528.
2025-02-25 22:03:17 +08:00
Yingwei Zheng
646e4f2eed
[DAGCombiner] visitFREEZE: Early exit when N is deleted (#128161)
`N` may get merged with existing nodes inside the loop. Early exit when
it is deleted to avoid the crash.
Alternative solution: use `DAGNodeDeletedListener` to refresh the value
of N.

Closes https://github.com/llvm/llvm-project/issues/128143.
2025-02-22 12:06:34 +08:00
Piotr Fusik
8b58cb853a
[SelectionDAG][NFC] Refactor duplicate code into SDNode::bitcastToAPInt() (#127503) 2025-02-20 13:23:00 +07:00
zhijian lin
1ac0db44fd
[NFC] using isUndef() instead of getOpcode() == ISD::UNDEF (#127713)
[NFC] using isUndef() instead of getOpcode() == ISD::UNDEF
2025-02-19 08:42:38 -05:00
Craig Topper
ef9f0b3c41
[DAGCombiner] Don't peek through truncates of shift amounts in takeInexpensiveLog2. (#126957)
Shift amounts in SelectionDAG don't have to match the result type
of the shift. SelectionDAGBuilder will aggressively truncate shift
amounts to the target's preferred type. This may result in a zero extend
that existed in IR being removed.
    
If we look through a truncate here, we can't guarantee the upper bits
of the truncate input are zero. There may have been a zext that was
removed. Unfortunately, this regresses tests where no truncate was
involved. The only way I can think to fix this is to add an assertzext
when SelectionDAGBuilder truncates a shift amount or remove the
early truncation of shift amounts from SelectionDAGBuilder all together.
    
Fixes #126889.
2025-02-17 20:26:05 -08:00
Jim Lin
31bfae35d2
[DAGCombiner] Add hasOneUse checks for folding (not (add X, -1)) to (neg X) (#126667)
To get more better codegen for AArch with bic, x86 with andn and riscv
with andn.
2025-02-12 12:24:29 +08:00
Matt Arsenault
c268a3f093 DAG: Fix extract of load combine with mismatched vector element type
Fix the case where the vector element type of the loaded extractelement
input does not match the result type of the extract.

This fixes a regression reported after
c55a7659b38946350315ac4a18d9805deb1f0a54
2025-02-06 22:56:56 +07:00
Matt Arsenault
c55a7659b3
DAG: Move scalarizeExtractedVectorLoad to TargetLowering (#122670)
SimplifyDemandedVectorElts should be able to use this on loads
2025-02-04 17:37:12 +07:00
Craig Topper
788bbd2ef6 [DAGCombiner] Improve chain handling in fold (fshl ld1, ld0, c) -> (ld0[ofs]) combine. (#124871)
Happened to notice some odd things related to chains in this code.

The code calls hasOneUse on LoadSDNode* which will check users
of the data and the chain. I think this was trying to check that
the data had one use so one of the loads would definitely be
removed by the transform. Load chains don't always have users so
our testing may not have noticed that the chains being used would
block the transform.

The code makes all users of ld1's chain use the new load's chain, but
we don't know that ld1 becomes dead. This can cause incorrect dependencies if
ld1's chain is used and it isn't deleted. I think the better thing to do
is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1
depend on the new load and the original loads. If the olds loads become
dead, their chain will be cleaned up later.

I'm having trouble getting a test for any ordering issue with the current code.
areNonVolatileConsecutiveLoads requires the two loads to have the same
input chain. Given that, I don't know how to use one of the load chain
results without also using the other. If they are both used we don't
do the transform because SDNode::hasOneUse will return false for both.
2025-02-03 11:48:41 -08:00
Matt Arsenault
97a1f494a6
DAG: Avoid breaking legal vector_shuffle with multiple uses (#123712)
Previously this combine would undo AMDGPU's new custom legalization of
wide vector shuffles into 2 element pieces. The comment also
states that this combine is only done before legalization,
but the case with a build_vector source was unconditional.

We probably don't want to do this if the multiple uses are full
scalarization of the vector, but this seems to work well enough.
Scalarizing extracts should have folded out pre-legalize.
2025-01-30 10:55:21 +07:00
Benjamin Maxwell
778138114e
[SDAG] Use BatchAAResults for querying alias analysis (AA) results (#123934)
Once we get to SelectionDAG the IR should not be changing anymore, so we
can use BatchAAResults rather than AAResults to cache AA queries.

This should be a NFC change for targets that enable AA during codegen
(such as AArch64), but also give a nice compile-time improvement in some
cases. See:
https://github.com/llvm/llvm-project/pull/123787#issuecomment-2606797041

Note: This follows Nikita's suggestion on #123787.
2025-01-23 09:16:09 +00:00
Matt Arsenault
5e79ae60a6
DAG: Fix vector_shuffle -> splat fold defining undef lanes (#123596)
For shuffle vector splats with undef lanes in the mask,
this was introducing real values. Filter out build_vector
results based on the undef elements in the mask.

This avoids AMDGPU test regressions in a future change.

test/CodeGen/X86/urem-seteq-illegal-types.ll looks worse
but I didn't investigate.
2025-01-21 23:55:50 +07:00
David Sherwood
50bfa85d79
[DAGCombiner] Fix scalarizeExtractedBinOp for some SETCC cases (#123071)
PR https://github.com/llvm/llvm-project/pull/118823 added a
DAG combine for extracting elements of a vector returned from
SETCC, however it doesn't correctly deal with the case where
the vector element type is not i1. In this case we have to
take account of the boolean contents, which are represented
differently between vectors and scalars. The code now
explicitly performs an inreg sign extend in order to get the
same result.

Fixes https://github.com/llvm/llvm-project/issues/121372
2025-01-21 10:31:56 +00:00
Simon Pilgrim
bacfdcd7e0
[DAG] Add SDPatternMatch::m_BitCast matcher (#123327)
Simplifies a future patch
2025-01-17 12:22:07 +00:00
Matt Arsenault
7475f0a345
DAG: Avoid forming shufflevector from a single extract_vector_elt (#122672)
This avoids regressions in a future AMDGPU commit. Previously we
would have a build_vector (extract_vector_elt x), undef with free
access to the elements bloated into a shuffle of one element + undef,
which has much worse combine support than the extract.

Alternatively could check aggressivelyPreferBuildVectorSources, but
I'm not sure it's really different than isExtractVecEltCheap.
2025-01-17 08:44:43 +07:00
Matt Arsenault
4431106630
DAG: Fix vector bin op scalarize defining a partially undef vector (#122459)
This avoids some of the pending regressions after AMDGPU implements
isExtractVecEltCheap.

In a case like shl <value, undef>, splat k, because the second operand
was fully defined, we would fall through and use the splat value for the
first operand, losing the undef high bits. This would result in an additional
instruction to handle the high bits. Add some reduced testcases for different
opcodes for one of the regressions.
2025-01-17 08:34:03 +07:00
Simon Pilgrim
bd768246da [DAG] replaceShuffleOfInsert - convert INSERT_VECTOR_ELT matching to use SDPatternMatch helpers. NFC. 2025-01-15 10:09:21 +00:00
Matt Arsenault
f4598194b5
DAG: Fold bitcast of scalar_to_vector to anyext (#122660)
scalar_to_vector is difficult to make appear and test,
but I found one case where this makes an observable difference.
It fires more often than this in the test suite, but most of them
have no net result in the final code. This helps reduce regressions
in a future commit.
2025-01-13 19:38:58 +07:00
Ryan Mansfield
67efbd0bf1
[LLVM] Fix various cl::desc typos and whitespace issues (NFC) (#121955) 2025-01-08 11:07:23 +01:00
Simon Pilgrim
1332db36ee [DAG] TransformFPLoadStorePair - early out if we're not loading a simple type
Its never going to transform into a legal integer type, so just bail - noticed while triaging the assertion reported in #121784
2025-01-07 13:37:23 +00:00
Matt Arsenault
d34f7ead88
DAG: Fix assuming f16 is the only 16-bit fp type in concat vector combine (#121637)
This would see if there are mixed integer and FP types and pick an
equivalently sized FP type to use as the vector element type, and only
cast if there were mixed integers. We need to insert a cast if the types
are mixed, which may include different FP types.

Fixes #121601
2025-01-06 10:38:54 +07:00
Min-Yih Hsu
2291d0aba9
[DAGCombiner] Turn (neg (max x, (neg x))) into (min x, (neg x)) (#120666)
This pattern was originally spotted in 429.mcf by @topperc.

We already have a DAGCombiner pattern to turn `(neg (abs x))` into `(min
x, (neg x))`. But in some cases `(neg (max x, (neg x)))` is formed by an
expanded `abs` followed by a `neg` that is generated only after the
`abs` expansion. This patch adds a separate pattern to match cases like
this, as well as its inverse pattern: `(neg (min X, (neg X))) --> (max
X, (neg X))`.

This pattern is applicable to both signed and unsigned min/max.
2025-01-02 16:28:55 -08:00
Simon Pilgrim
b3a7ab6f1f [DAG] Don't allow implicit truncation in extract_element(bitcast(scalar_to_vector(X))) -> trunc(srl(X,C)) fold
Limits #117900 to only fold when scalar_to_vector doesn't perform implicit truncation, as the scaled shift calculation doesn't currently account for this - this can be addressed in a future update.

Fixes #121306
2024-12-30 16:08:35 +00:00
Craig Topper
f139bde8d8
[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536)
This allows us to write more range based for loops because we no
longer need the iterator. It also matches IR's Use class.
2024-12-19 09:07:42 -08:00
Craig Topper
e6b2495545
[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531)
SDNode::use_iterator now returns an SDUse& when dereferenced.
SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses
work on use_iterator. SDNode::user_begin/user_end/users work on
user_iterator.

We can now write range based for loops using SDUse& and SDNode::uses().
I've converted many of these in this patch. I didn't update loops that
have additional variables updated in their for statement.

Some loops use SDNode::use_iterator::getOperandNo() which also prevents
using range based for loops. I plan to move this into SDUse in a follow
up patch.
2024-12-19 08:35:32 -08:00
Craig Topper
bd261ecc5a
[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509)
Most of these are just places that want the first user and aren't
iterating over the whole list.

While there I changed some use_size() == 1 to hasOneUse() which
is more efficient.

This is part of an effort to rename use_iterator to user_iterator
and provide a use_iterator that dereferences to SDUse&. This patch
helps reduce the diff on later patches.
2024-12-18 22:13:04 -08:00
Craig Topper
104ad9258a
[SelectionDAG] Rename SDNode::uses() to users(). (#120499)
This function is most often used in range based loops or algorithms
where the iterator is implicitly dereferenced. The dereference returns
an SDNode * of the user rather than SDUse * so users() is a better name.

I've long beeen annoyed that we can't write a range based loop over
SDUse when we need getOperandNo. I plan to rename use_iterator to
user_iterator and add a use_iterator that returns SDUse& on dereference.
This will make it more like IR.
2024-12-18 20:09:33 -08:00
Benjamin Maxwell
a7dafea384
[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117)
This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to
check that it is safe to fold a store into a node that will expand to a
library call that takes output pointers. This requires checking for two
(independent) properties:

1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
* If it is, the expansion would lead to nested call sequences (which is
invalid)
2. The node does not appear as a predecessor to the store
* If it does, attempting to merge the store into the call would result
in a cycle in the DAG

These two properties are checked as part of the same traversal in
`canFoldStoreIntoLibCallOutputPointers()`
2024-12-17 10:54:17 +00:00
Simon Pilgrim
0954c67d7a [DAG] visitFREEZE - only fold integer types to an all ones constant
ISD::isBuildVectorAllOnes can peek through bitcasts, so this can match against FP NAN (ish) data (e.g. double (bitcast i64 -1)) under certain circumstances - bail if the type isn't an integer and let bitcast folding handle it first.

Fixes #120093
2024-12-16 16:46:38 +00:00