3807 Commits

Author SHA1 Message Date
Patrick O'Neill
4ab2ac22d0
[DAGCombiner] Mark vectors as not AllAddOne/AllSubOne on type mismatch (#92195)
Fixes #92193.
2024-05-15 12:39:28 -07:00
Simon Pilgrim
7f3e3785d0 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. 2024-05-10 22:40:23 +01:00
Simon Pilgrim
7e6879b245 [X86] scalarizeExtractedBinop - reuse existing SDLoc. NFC. 2024-05-10 22:40:23 +01:00
David Green
8fc9e3d577
[DAG] Lower frem of power-2 using div/trunc/mul+sub (#91148)
If we are lowering a frem and the divisor is known to be an integer power-2, we
can use the formula 'frem = x - trunc(x / d) * d'. This avoids the more
expensive call to fmod. The results are identical to fmod so long as d is a
power-2 (so the mul does not round incorrectly), and the sign of the return is
either always positive or not important for zeroes (nsz).

Unfortunately Alive2 does not handle this well at the moment. I was using
exhaustive checking to test this:
(https://gist.github.com/davemgreen/6078015f30d3bacd1e9572f8db5d4b64).

I found this in CPython's implementation of float_pow. I have currently added it
as a DAG combine for frem with power-2 fp constants.
2024-05-10 14:58:48 +01:00
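As a rough illustration of the formula in #91148, the sketch below compares `x - trunc(x / d) * d` with `math.fmod` for a few integer power-of-2 divisors. It is written in Python rather than as a DAG combine, and the helper name `frem_via_trunc` is hypothetical.

```python
import math

def frem_via_trunc(x: float, d: float) -> float:
    """Expand frem as x - trunc(x / d) * d (hypothetical helper, finite inputs only)."""
    return x - math.trunc(x / d) * d

# Compare against fmod for divisors that are integer powers of two.
for d in (1.0, 2.0, 8.0):
    for x in (5.5, -5.5, 7.0, 0.75, -4.0):
        assert frem_via_trunc(x, d) == math.fmod(x, d)
```

Python's `==` treats `0.0` and `-0.0` as equal, which hides the one difference the commit message calls out: for `x = -4.0, d = 2.0` the expansion yields `0.0` while `fmod` yields `-0.0`, so the sign of zero has to be unimportant (nsz) or known positive.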
David Green
23b673e5b4
[DAG][AArch64] Handle vscale addressing modes in reassociationCanBreakAddressingModePattern (#89908)
reassociationCanBreakAddressingModePattern tries to prevent bad add
reassociations that would break addressing mode patterns. This adds
support for vscale offset addressing modes, making sure we don't break
patterns that already exist. It does not optimize _to_ the correct
addressing modes yet, but prevents us from optimizing _away_ from
them.
2024-05-10 09:27:02 +01:00
David Green
fcf945f4ed
[DAG] Fold add(mul(add(A, CA), CM), CB) -> add(mul(A, CM), CM*CA+CB) (#90860)
This is useful when the inner add has multiple uses, and so cannot be
canonicalized by pushing the constants down through the mul. This patch
adds patterns for both `add(mul(add(A, CA), CM), CB)` and with an extra add
`add(add(mul(add(A, CA), CM), B), CB)` as the second can come up when
lowering geps.
2024-05-08 22:11:18 +01:00
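The fold in #90860 relies on distributing the multiply over the inner add, which holds in wrapping integer arithmetic. A minimal Python check, assuming 32-bit wrapping integers (the width and the function names are illustrative only, not taken from the patch):

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit wrapping arithmetic, for illustration only

def before_fold(a: int, ca: int, cm: int, cb: int) -> int:
    """add(mul(add(A, CA), CM), CB) with wrap-around at each step."""
    inner = (a + ca) & MASK32
    return (inner * cm + cb) & MASK32

def after_fold(a: int, ca: int, cm: int, cb: int) -> int:
    """add(mul(A, CM), CM*CA+CB), with the constant folded ahead of time."""
    folded_const = (cm * ca + cb) & MASK32
    return (a * cm + folded_const) & MASK32
```

Both forms compute the same value modulo 2^32, so the inner `add(A, CA)` can keep its other uses while the constants are merged into one.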
Craig Topper
ef84452571
[DAGCombiner] Be more careful about looking through extends and truncates in mergeTruncStores. (#91375)
Previously we recursively looked through extends and truncates on both
SourceValue and WideVal.

SourceValue is the largest source found for each of the stores we are
combining. WideVal is the source for the current store.

Previously we could incorrectly look through a (zext (trunc X)) pair and
incorrectly believe X to be a good source.

I think we could also look through a zext on one store and a sext on
another store and arbitrarily pick one of the extends as the final
source.

With this patch we only look through one level of extend or truncate.
And we don't look through extends/truncs on both SourceValue and WideVal
at the same time.

This may lose some optimization cases, but keeps everything we had tests
for.

Fixes #90936.
2024-05-07 21:17:50 -07:00
Simon Pilgrim
522b4bfe5b
[DAG] Fold bitreverse(shl/srl(bitreverse(x),y)) -> srl/shl(x,y) (#89897)
Noticed while investigating GFNI per-element vector shifts (we can form SHL but not SRL/SRA)

Alive2: https://alive2.llvm.org/ce/z/fSH-rf
2024-05-06 11:13:05 +01:00
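The identity behind #89897 can be checked exhaustively for a small width. The sketch below assumes 8-bit values and in-range shift amounts; the helper names are hypothetical and the code only models the arithmetic, not the DAG nodes.

```python
WIDTH = 8  # assumed width, small enough to check every value
MASK = (1 << WIDTH) - 1

def bitreverse(x: int) -> int:
    """Reverse the order of the low WIDTH bits of x."""
    return int(format(x & MASK, f"0{WIDTH}b")[::-1], 2)

def shl(x: int, y: int) -> int:
    """Logical shift left, discarding bits shifted past WIDTH."""
    return (x << y) & MASK

def srl(x: int, y: int) -> int:
    """Logical shift right, shifting in zeros."""
    return (x & MASK) >> y

# bitreverse(shl(bitreverse(x), y)) == srl(x, y), and the mirrored form.
for x in range(1 << WIDTH):
    for y in range(WIDTH):
        assert bitreverse(shl(bitreverse(x), y)) == srl(x, y)
        assert bitreverse(srl(bitreverse(x), y)) == shl(x, y)
```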
Simon Pilgrim
caacf8685a
[DAG] Fold freeze(shuffle(x,y,m)) -> shuffle(freeze(x),freeze(y),m) (#90952)
If the shuffle mask contains no undef elements, then we can move the freeze through a shuffle node.

This requires special case handling to create a new ShuffleVectorSDNode.

Includes VECTOR_SHUFFLE support for isGuaranteedNotToBeUndefOrPoison / canCreateUndefOrPoison.
2024-05-04 12:03:10 +01:00
Craig Topper
3563af6c06
[DAGCombiner] In mergeTruncStore, make sure we aren't storing shifted in bits. (#90939)
When looking through a right shift, we need to make sure that all of
the bits we are using from the shift come from the shift input and
not the sign or zero bits that are shifted in.
    
Fixes #90936.
2024-05-03 09:59:33 -07:00
Simon Pilgrim
91c52b966a [DAG] Pull out repeated SDLoc() from SHL/SRL/SRA combines. NFC.
We were always calling SDLoc(N) at the top of each visitSHL/SRL/SRA for the FoldConstantArithmetic call, so just reuse this as much as possible.
2024-04-30 17:30:43 +01:00
Luke Lau
5e03c0af47
[DAGCombiner] Fix mayAlias not accounting for scalable MMOs with offsets (#90573)
In #70452 DAGCombiner::mayAlias was taught to handle scalable sizes, but
when it checks via AA->isNoAlias it didn't take into account the case
where the size is scalable but there was an offset too.

For the fixed length case the offset was just accounted for by adding to
the LocationSize, but for the scalable case there doesn't seem to be a
way to represent both a scalable and fixed part in it. So this patch
works around it by bailing if there is an offset.

Fixes #90559
2024-04-30 20:20:40 +08:00
Bjorn Pettersson
55c6bda01e Revert "Revert "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)" and more..."
This reverts commit 16bd10a38730fed27a3bf111076b8ef7a7e7b3ee.

Re-applies:
    b3c55b707110084a9f50a16aade34c3be6fa18da - "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)"
    8e2f6495c0bac1dd6ee32b6a0d24152c9c343624 - "[DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)"
    73472c5996716cda0dbb3ddb788304e0e7e6a323 - "[SelectionDAG] Treat CopyFromReg as freezing the value (#85932)"

with a fix in DAGCombiner::visitFREEZE.
2024-04-29 13:08:52 +02:00
David Spickett
16bd10a387 Revert "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)" and more...
This reverts:
b3c55b707110084a9f50a16aade34c3be6fa18da - "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)"
(because it updates a test case that I don't know how to resolve the conflict for)
8e2f6495c0bac1dd6ee32b6a0d24152c9c343624 - "[DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)"
73472c5996716cda0dbb3ddb788304e0e7e6a323 - "[SelectionDAG] Treat CopyFromReg as freezing the value (#85932)"

Due to a test suite failure on AArch64 when compiling for SVE.
https://lab.llvm.org/buildbot/#/builders/197/builds/13955

clang: ../llvm/llvm/include/llvm/CodeGen/ValueTypes.h:307: MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed.
2024-04-29 09:47:41 +01:00
Björn Pettersson
b3c55b7071
[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)
Handle SELECT_CC similarly as SETCC.

Handle these operations that only propagate poison/undef based on the
input operands:
  SADDSAT, UADDSAT, SSUBSAT, USUBSAT, MULHU, MULHS,
  SMIN, SMAX, UMIN, UMAX

These operations may create poison based on shift amount and exact
flag being violated:
  SRL, SRA

One goal here is to allow pushing freeze through these operations
when allowed, as well as letting analyses such as
isGuaranteedNotToBeUndefOrPoison not break on such operations.

Since some problems have been observed with pushing freeze through
SRA/SRL we block that explicitly in DAGCombiner::visitFreeze now.
That way we can still model SRA/SRL properly in
SelectionDAG::canCreateUndefOrPoison, e.g. when used by
isGuaranteedNotToBeUndefOrPoison, even if we do not want to push
freeze through those instructions.
2024-04-29 07:56:49 +02:00
Matt Arsenault
405c018c71
DAG: Simplify demanded bits for truncating atomic_store (#90113)
It's really unfortunate that STORE and ATOMIC_STORE are separate
opcodes. This duplicates a basic simplify demanded for the truncating
case. This avoids some AMDGPU lit regressions in a future patch.

I'm not sure how to craft a test that exposes this without first
introducing the regressions by promoting half to i16.
2024-04-26 15:21:44 +02:00
Simon Pilgrim
55d85c84ac
[DAG] visitORCommutative - fold build_pair(not(x),not(y)) -> not(build_pair(x,y)) style patterns (#90050)
(Sorry, not an actual build_pair node just a similar pattern).

For cases where we're concatenating 2 integers into a double width integer, see if both integer sources are NOT patterns.

We could take this further and handle all logic ops with constant operands, but I just wanted to handle the case reported on #89533 initially.

Fixes #89533
2024-04-26 14:11:03 +01:00
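The pattern in #90050 concatenates two integers into one of double width, so inverting both halves is the same as inverting the concatenation. A small Python check, assuming 16-bit halves (the width and the helper names are illustrative, not from the patch):

```python
HALF = 16  # assumed width of each half
HALF_MASK = (1 << HALF) - 1
FULL_MASK = (1 << (2 * HALF)) - 1

def concat(hi: int, lo: int) -> int:
    """Model or(zext(lo), shl(zext(hi), HALF)): a build_pair-like concatenation."""
    return ((hi & HALF_MASK) << HALF) | (lo & HALF_MASK)

def not_half(v: int) -> int:
    """Bitwise NOT at half width."""
    return ~v & HALF_MASK

def not_full(v: int) -> int:
    """Bitwise NOT at full (double) width."""
    return ~v & FULL_MASK
```

For any `hi` and `lo`, `concat(not_half(hi), not_half(lo))` equals `not_full(concat(hi, lo))`, which replaces two NOTs with one.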
Bjorn Pettersson
8e2f6495c0 [DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)
Avoid turning a BUILD_VECTOR that can be recognized as "all zeros",
"all ones" or "constant" into something that depends on
freeze(undef), as that would destroy those properties.

Instead we replace undef by 0/-1 in such vectors, making it possible
to fold away the freeze. We typically use -1 if the BUILD_VECTOR
would identify as "all ones", and otherwise we use the value 0.
2024-04-26 13:41:21 +02:00
Simon Pilgrim
d51a17f684 [DAG] visitORCommutative - pull out repeated SDLoc(). NFC. 2024-04-25 14:23:36 +01:00
Björn Pettersson
f9b419b7a0
[DAGCombiner] Fix miscompile bug in combineShiftOfShiftedLogic (#89616)
Ensure that the sum of the shift amounts does not overflow the
shift amount type when combining shifts in combineShiftOfShiftedLogic.

Solves a miscompile bug found when testing the C23 BitInt feature.

Targets like X86 that only use an i8 for shift amounts after
legalization seem to be extra susceptible to bugs like this, as it
isn't legal to shift more than 255 steps.
2024-04-23 14:11:34 +02:00
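To make the overflow in #89616 concrete, the sketch below models a 512-bit value whose shift amount is held in an 8-bit type. The width, the helper names, and the `None` return convention are assumptions for illustration; this is not the check used in DAGCombiner.

```python
WIDE = 512  # hypothetical _BitInt-like value width
WIDE_MASK = (1 << WIDE) - 1

def shl(x: int, amt: int) -> int:
    """Shift left within WIDE bits."""
    return (x << amt) & WIDE_MASK

def combined_shift_amount(c0: int, c1: int, amt_bits: int):
    """Return c0 + c1 if it fits in an unsigned amt_bits-wide type, else None."""
    total = c0 + c1
    return total if total < (1 << amt_bits) else None

# Two shifts by 200 and 100 are a shift by 300, but 300 wraps to 44 in 8 bits.
assert shl(shl(1, 200), 100) == shl(1, 300)
assert (200 + 100) & 0xFF == 44
assert combined_shift_amount(200, 100, 8) is None  # must not combine
```

A combine that adds the two amounts in the narrow type would shift by 44 instead of 300, which is the kind of miscompile the patch guards against.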
Simon Pilgrim
ca9a44ef47 [DAG] visitORCommutative - use sd_match to reduce the need for commutative operand matching. NFCI.
Use sd_match to match commutative inner AND/OR/XOR node arguments instead of some messy manual matching of each commutation.
2024-04-22 10:41:57 +01:00
Simon Pilgrim
c88b84d467 [DAG] visitOR/visitORLike - merge repeated SDLoc calls. 2024-04-22 10:28:02 +01:00
Craig Topper
ce48f43f05
[SelectionDAG] Require UADDO_CARRY carryin and carryout to have the same type. (#89255)
This requires type legalization to keep them the same. This means we no
longer need to legalize the operand since it will be legalized when we
legalize the second result.
2024-04-19 12:38:53 -07:00
Simon Pilgrim
2e68ba99de [DAG] visitADDLike - update "(x - y) + -1 -> add (xor y, -1), x" fold to accept UNDEF in a splat vector of -1
Make sure we use getNOT instead of reusing the allones (with undefs) vector
2024-04-19 13:47:29 +01:00
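The "(x - y) + -1 -> add (xor y, -1), x" fold follows from the two's complement identity ~y = -y - 1. A short Python check, assuming 32-bit wrapping integers (the width and function names are illustrative only):

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit wrapping arithmetic
ALL_ONES = MASK32    # the constant -1 at this width

def sub_then_add_minus_one(x: int, y: int) -> int:
    """(x - y) + -1 with wrap-around."""
    return (((x - y) & MASK32) + ALL_ONES) & MASK32

def add_of_not(x: int, y: int) -> int:
    """add (xor y, -1), x with wrap-around."""
    return ((y ^ ALL_ONES) + x) & MASK32
```

Both functions return the same value for every pair of inputs, since `xor y, -1` is `not y`.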
Craig Topper
ba1158813d
[DAGCombiner][AArch64] Make combineCarryDiamond avoid creating UADDO_CARRY with carry in larger than setcc result type. (#89121)
In the attached test case we were creating a UADDO_CARRY with i1 carry out
and i41 carry in. i41 is larger than the setcc result type for
AArch64 which is i32. i41 needs to be promoted to i64 since it is larger
than i32. The type legalizer tried to use promoteTargetBoolean, but that
can only promote from a type smaller than setcc result type.

The easiest fix here is to force the carryin type to match the carryout
type at the time of creation. This should ensure the node won't exceed the
setcc result type as long as the output type doesn't.

I think we should explore requiring the types to match for this node.

Fixes #88966
2024-04-18 08:34:51 -07:00
Simon Pilgrim
73b255c9f8 [DAG] Ensure extract_subvector(insert_subvector(x,y,c1),c2) --> extract_subvector(y,c2-c1) is working on fixed vector types
#87925 failed to ensure we weren't removing the extracted subvector from a scalable vector type

Thanks to @antmox for the headsup.
2024-04-18 13:21:52 +01:00
Simon Pilgrim
c18a3b6bd3 [DAG] Fold extract_subvector(insert_subvector(x,y,c1),c2) --> extract_subvector(y,c2-c1) (#87925) (REAPPLIED)
If the extract_subvector is cheap, attempt to extract directly from an inserted subvector

Reapplied with a check to ensure we only attempt this for fixed vectors
2024-04-16 12:30:27 +01:00
Alina Sbirlea
40bbdb609f Revert "[DAG] Fold extract_subvector(insert_subvector(x,y,c1),c2) --> extract_subvector(y,c2-c1) (#87925)"
This reverts commit 8c0f52e9d5a99bf96bb64ac23b5893482c292527.

Reverting to green, reproducer attached in the PR/revision comments.
2024-04-15 17:38:52 -07:00
Yingwei Zheng
4d28d3f93b
[SDAG] Turn umin into smin if the saturation pattern is broken (#88505)
As we canonicalize smin with non-negative operands into umin in the
middle-end, the saturation pattern will be broken.
This patch reverts the transform in DAGCombine to fix the regression on
ARM.

Fixes https://github.com/llvm/llvm-project/issues/85706.
2024-04-16 01:28:28 +08:00
fengfeng
36230f90ee
[SelectionDAG] Propagate Disjoint flag. (#88370)
Signed-off-by: feng.feng <feng.feng@iluvatar.com>
2024-04-15 11:01:15 +02:00
Simon Pilgrim
8c0f52e9d5
[DAG] Fold extract_subvector(insert_subvector(x,y,c1),c2) --> extract_subvector(y,c2-c1) (#87925)
If the extract_subvector is cheap, attempt to extract directly from an inserted subvector
2024-04-12 11:23:51 +01:00
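The extract_subvector(insert_subvector(x,y,c1),c2) fold can be modelled with plain lists. The sketch below assumes fixed-length vectors and an extracted range that lies entirely inside the inserted subvector; the helper names mirror the node names but are otherwise hypothetical.

```python
def insert_subvector(vec, sub, idx):
    """Return a copy of vec with sub written over elements idx .. idx+len(sub)-1."""
    out = list(vec)
    out[idx:idx + len(sub)] = sub
    return out

def extract_subvector(vec, idx, n):
    """Return n consecutive elements of vec starting at idx."""
    return list(vec[idx:idx + n])

x = list(range(8))          # stand-in for the wide vector
y = [100, 101, 102, 103]    # stand-in for the inserted subvector
c1, c2, n = 4, 6, 2         # insert at 4, extract 2 elements at 6

# The extract only reads lanes that came from y, so x can be skipped entirely.
assert extract_subvector(insert_subvector(x, y, c1), c2, n) == extract_subvector(y, c2 - c1, n)
```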
Feng Zou
6b6f272f35
[DAGCombiner] Require same type of splat & element for build_vector (#88284)
Only allow changing build_vector to concat_vector when the splat type
and the vector element type are the same. This fixes an assertion failure
when bitcasting types of different sizes.
2024-04-12 10:13:56 +08:00
Simon Pilgrim
759422c6df [DAG] visitEXTRACT_SUBVECTOR - pull out repeated SDLoc. NFC. 2024-04-11 10:45:01 +01:00
David Green
9fd2e2c2fd
[DAG][AArch64] Support masked loads/stores with nontemporal flags (#87608)
SVE has some non-temporal masked loads and stores. The metadata coming
from the nodes is not copied to the MMO at the moment though, meaning it
will generate a normal instruction. This patch ensures that the right
flags are set if the instruction has non-temporal metadata.
2024-04-08 08:53:27 +01:00
Piotr Sobczak
5b59ae423a
[DAG] Preserve NUW when reassociating (#87621)
Similarly to the generic case below, preserve the NUW flag when
reassociating adds with constants.
2024-04-04 16:47:25 +02:00
Simon Pilgrim
2d0087424f
[DAG] Remove extract_vector_elt(freeze(x)), idx -> freeze(extract_vector_elt(x), idx) fold (#87480)
Reverse the fold with handling inside canCreateUndefOrPoison for cases where we know that the extract index is in bounds.

This exposed a number of regressions, and required some initial freeze handling of SCALAR_TO_VECTOR, which will require us to properly improve demandedelts support to handle its undef upper elements.

There is still one outstanding regression to be addressed in the future - how do we want to handle folds involving frozen loads?

Fixes #86968
2024-04-04 11:10:55 +01:00
Simon Pilgrim
39eedfded4 [DAG] visitADDLikeCommutative - convert (add x, shl(0 - y, n)) fold to SDPatternMatch. NFC. 2024-04-03 15:37:38 +01:00
AinsleySnow
52b18430ae
[VP][DAGCombine] Use simplifySelect when combining vp.select. (#87342)
Hi all,

This patch is a follow-up of #79101. It migrates logic from
`visitVSELECT` to `visitVP_SELECT` to simplify `vp.select`. With this
patch we can do the following combinations:

```
vp.select undef, T, F --> T (if T is a constant), F otherwise
vp.select <condition>, undef, F --> F
vp.select <condition>, T, undef --> T
vp.select false, T, F --> F
vp.select <condition>, T, T --> T
```

I'm a total newbie to llvm and I'm sure there's room for improvements in
this patch. Please let me know if you have any advice. Thank you in
advance!
2024-04-03 07:45:50 -04:00
Jonas Paulsson
94b5c118b3
[ISel] Move handling of atomic loads from SystemZ to DAGCombiner (NFC). (#86484)
The folding of sign/zero extensions into an atomic load by specifying an
extension type is not target specific, and therefore belongs in the
DAGCombiner rather than in the SystemZ backend.

- Handle atomic loads similarly to regular loads by adding
AtomicLoadExtActions with set/get methods.
- Move SystemZ extendAtomicLoad() to DAGCombiner.cpp.
2024-03-28 16:14:35 +01:00
Luke Lau
856e815ca1
[DAGCombiner] Set disjoint flag in add->or and xor->or combines (#86925)
We check DAG.haveNoCommonBitsSet so the operands will be known to be
disjoint.

I couldn't think of a codegen test case since most targets aren't
checking hasDisjoint yet, apart from RISCV in the or_is_add pattern, but
it also falls back to computeKnownBits.
2024-03-28 18:08:59 +08:00
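The reason the disjoint flag is safe in #86925 is that add, xor, and or all agree when the operands share no set bits, because no carries can occur. A minimal Python illustration (the function name is hypothetical and only models the property that DAG.haveNoCommonBitsSet establishes):

```python
def have_no_common_bits(a: int, b: int) -> bool:
    """True when no bit position is set in both a and b."""
    return (a & b) == 0

a, b = 0b10100000, 0b00001011
assert have_no_common_bits(a, b)
# With disjoint operands, add and xor can both be rewritten as or.
assert a + b == a | b == a ^ b
```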
Simon Pilgrim
9247f3185c [DAG] foldAddSubOfSignBit - reuse existing SDLoc instead of regenerating it. NFC. 2024-03-27 12:22:31 +00:00
Simon Pilgrim
51388fbab1 [DAG] visitSub - reuse existing SDLoc instead of regenerating it. NFC. 2024-03-27 12:22:30 +00:00
Simon Pilgrim
1c9d5c25ae [DAG] foldAddSubBoolOfMaskedVal - reuse existing SDLoc instead of regenerating it. NFC. 2024-03-26 18:33:30 +00:00
Simon Pilgrim
5fc619b5ee [DAG] Update ISD::AVG folds to use hasOperation to allow Custom matching prior to legalization
Fixes issue where AVX1 targets weren't matching 256-bit AVGCEILU cases.
2024-03-26 10:41:07 +00:00
Simon Pilgrim
c7198e0af3
[DAG] Fold insert_subvector(N0, extract_subvector(N0, N2), N2) --> N0 (#86487)
Handle the case where we've ended up inserting back into the source vector we extracted the subvector from.
2024-03-26 10:03:42 +00:00
houndlord
9632e1515c
Match fixed width ISD::AVGFLOORS + ISD::AVGCEILS patterns (#86222) 2024-03-24 15:33:16 +00:00
Harvin Iriawan
57146daeaa
[CodeGen] Update for scalable MemoryType in MMO (#70452)
Remove getSizeOrUnknown call when MachineMemOperand is created.  For Scalable
TypeSize, the MemoryType created becomes a scalable_vector.

2 MMOs that have scalable memory access can then use the updated BasicAA that
understands scalable LocationSize.

Original Patch by Harvin Iriawan
Co-authored-by: David Green <david.green@arm.com>
2024-03-23 12:56:25 +00:00
Simon Pilgrim
ceabaa7e7a [DAG] Fix some missing formatting when I rewrote the SUB(MAX,MIN) -> ABD patterns. NFC. 2024-03-22 11:48:03 +00:00
XChy
cb4453dc69
[SelectionDAG] Prevent combination on inconsistent type in combineCarryDiamond (#84888)
Fixes #84831
When matching a carry pattern with `getAsCarry`, it may produce a different
type of carryout. This patch checks for such a case and exits early.

I'm new to DAG, any suggestion is appreciated.
2024-03-22 16:05:20 +05:30
Simon Pilgrim
6942927609 [DAG] combineConcatVectorOfScalars - stop always creating UNDEF nodes. NFC.
Noticed in debug logs - most calls to visitVECTOR_SHUFFLE resulted in wasteful UNDEF node creations, despite almost never being used.
2024-03-21 16:37:48 +00:00