3319 Commits

Author SHA1 Message Date
jacquesguan
e60eb7053d recommit "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."
With fix for AArch64 and Hexgon test cases.
2022-07-21 17:34:34 +08:00
David Truby
4c82f56d8f [llvm][SVE] Remove redundant and when comparing against extending load
When determining if an `and` should be merged into an extending load
the constant argument to the `and` is currently not checked if the
argument requires truncation. This prevents the combine happening when
the vector width is half the normal available vector width for SVE VLA
vectors.

Reviewed By: c-rhodes

Differential Revision: https://reviews.llvm.org/D129281
2022-07-19 17:08:32 +01:00
Simon Pilgrim
71c502cbca [DAG] Call SimplifyDemandedBits from ISD::MUL nodes
Noticed while triaging D129765.
2022-07-19 14:11:04 +01:00
Max Kazantsev
69b284aaf6 Revert "[DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat."
This reverts commit 58dfaaaace4ea75ab3588a6e738f2cf58ebf77c2.

Massive AARCH test failures in buildbot.
2022-07-19 13:41:52 +07:00
jacquesguan
58dfaaaace [DAGCombiner] Teach scalarizeBinOpOfSplats handle scalable splat.
This revision supports to scalarize a binary operation of two scalable splat vectors.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122791
2022-07-19 11:20:51 +08:00
Itay Bookstein
2570f226d1 [SDAG] Remove single-result restriction on commutative CSE
The DAG Combiner unnecessarily restricts commutative CSE
to nodes with a single result value. This commit removes
that restriction.

Signed-off-by: Itay Bookstein <ibookstein@gmail.com>

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D129666
2022-07-18 19:19:13 +03:00
Simon Pilgrim
53b90dd372 [DAG] Fold (or (and X, C1), (and (or X, Y), C2)) -> (or (and X, C1|C2), (and Y, C2))
Pulled out of D77804

Alive2: https://alive2.llvm.org/ce/z/g61VRe
2022-07-17 18:51:41 +01:00
Kazu Hirata
9e6d1f4b5d [CodeGen] Qualify auto variables in for loops (NFC) 2022-07-17 01:33:28 -07:00
Sanjay Patel
7ca3e23f25 [SDAG] narrow truncated sign_extend_inreg
trunc (sign_ext_inreg X, iM) to iN --> sign_ext_inreg (trunc X to iN), iM

There are improvements on existing tests from this, and there are a pair
of large regressions in D127115 for Thumb2 caused by not folding this
pattern.

Differential Revision: https://reviews.llvm.org/D129890
2022-07-16 16:29:15 -04:00
Simon Pilgrim
a44bdf9bc1 [DAG] visitINSERT_VECTOR_ELT - refactor BUILD_VECTOR creation from INSERT_VECTOR_ELT chain.
D127595 added the ability to recurse up a (one-use) INSERT_VECTOR_ELT chain to create a BUILD_VECTOR before other combines manage to break the chain, something that is particularly bad in D127115.

The patch generalises this so it doesn't have to build the chain starting from the last element insertion, instead it can now start from any insertion and will recurse up the chain until it finds all elements or finds a UNDEF/BUILD_VECTOR/SCALAR_TO_VECTOR which represents that start of the chain.

Fixes several regressions in D127115
2022-07-16 16:37:31 +01:00
Simon Pilgrim
52b6168c16 [DAG] visitINSERT_VECTOR_ELT - remove duplicate VT.getVectorNumElements() call. NFC. 2022-07-16 16:20:49 +01:00
Simon Pilgrim
2bb6b03d71 Fix signed/unsigned mismatch 2022-07-16 11:48:41 +01:00
Simon Pilgrim
a5d0122f75 [DAG] Canonicalize non-inlane shuffle -> AND if all non-inlane referenced elements are known zero
As mentioned on D127115, this patch that attempts to recognise shuffle masks that could be simplified to a AND mask - we already have a similar transform that will fold AND -> 'clear mask' shuffle, but this patch handles cases where the referenced elements are not from the same lane indices but are known to be zero.

Differential Revision: https://reviews.llvm.org/D129150
2022-07-16 11:38:24 +01:00
Simon Pilgrim
1cb7416ee3 [DAG] combineShiftAnd1ToBitTest - match "and (srl (not X), C)), 1 --> (and X, 1<<C) == 0" patterns
combineShiftAnd1ToBitTest already matches "and (not (srl X, C)), 1 --> (and X, 1<<C) == 0" patterns, but we can end up with situations where the not is before the shift.

Part of some yak shaving for D127115 to generalise the "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold.
2022-07-16 11:00:07 +01:00
Simon Pilgrim
3c8bf29696 [DAG] Move "xor (X logical_shift ShiftC), XorC --> (not X) logical_shift ShiftC" fold into SimplifyDemandedBits
SimplifyDemandedBits is called slightly later which allows the not(sext(x)) -> sext(not(x)) fold to occur via foldLogicOfShifts

As mentioned on D127115, we should be able to further generalise this based off the demanded bits.
2022-07-15 13:10:15 +01:00
Simon Pilgrim
d172842b51 [DAG] SimplifyDemandedVectorElts - adjust demanded elements for selection mask for known zero results
If an element is known zero from both selections then it shouldn't matter what the selection mask element is.
2022-07-13 17:36:05 +01:00
Philip Reames
fd67992f9c [DAGCombine] fold (urem x, (lshr pow2, y)) -> (and x, (add (lshr pow2, y), -1))
We have the same fold in InstCombine - though implemented via OrZero flag on isKnownToBePowerOfTwo. The reasoning here is that either a) the result of the lshr is a power-of-two, or b) we have a div-by-zero triggering UB which we can ignore.

Differential Revision: https://reviews.llvm.org/D129606
2022-07-13 08:34:38 -07:00
Sanjay Patel
d0eec5f7e7 [SDAG] enhance sub->xor fold to ignore signbit
As suggested in the post-commit feedback for D128123,
we can ease the mask constraint to ignore the MSB
(and make the code easier to read by adjusting the check).

https://alive2.llvm.org/ce/z/bbvqWv
2022-07-11 12:37:50 -04:00
Kazu Hirata
1fd6611fc8 [SelectionDAG] Restore calls to has_value (NFC)
This patch restores calls to has_value to make it clear that we are
checking the presence of an optional value, not the underlying value.

This patch partially reverts d08f34b592ff06ccb1f36da88ec09aa926427a4d.

Differential Revision: https://reviews.llvm.org/D129454
2022-07-10 14:37:23 -07:00
Craig Topper
40866b74bd [DAGCombiner][X86] Fold sra (sub AddC, (shl X, N1C)), N1C --> sext (sub AddC1',(trunc X to (width - N1C)))
We already handled this case for add with a constant RHS. A
similar pattern can occur for sub with a constant left hand side.

Test cases use add and a mul representing (neg (shl X, C)) because
that's what I saw in the wild. The mul will be decomposed and then
the new transform can kick in.

Tests have not been committed, but this patch shows the changes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D128769
2022-07-09 11:53:44 -07:00
Sanjay Patel
8b75671314 [SDAG] try to replace subtract-from-constant with xor
This is almost the same as the abandoned D48529, but it
allows splat vector constants too.

This replaces the x86-specific code that was added with
the alternate patch D48557 with the original generic
combine.

This transform is a less restricted form of an existing
InstCombine and the proposed SDAG equivalent for that
in D128080:
https://alive2.llvm.org/ce/z/OUm6N_

Differential Revision: https://reviews.llvm.org/D128123
2022-07-08 08:14:24 -04:00
Simon Pilgrim
7068c843d2 [DAG] visitREM - use isAllOnesOrAllOnesSplat instead of isConstOrConstSplat
We were only using the N1C scalar/splat value once, so for clarity use isAllOnesOrAllOnesSplat instead if we actually need it.
2022-07-05 16:44:31 +01:00
Simon Pilgrim
e7a0fa4df0 [DAG] foldAddSubOfSignBit - don't bother creating the new shift node unless constant folding succeeds
Noticed by inspection - the new shift is only ever used if the constant fold occurs
2022-07-05 16:44:31 +01:00
Simon Pilgrim
cce64e7a9c [DAG] visitTRUNCATE - move GetDemandedBits AFTER SimplifyDemandedBits.
Another cleanup step before removing GetDemandedBits entirely.
2022-07-04 11:25:40 +01:00
Kazu Hirata
94460f5136 Don't use Optional::hasValue (NFC)
This patch replaces x.hasValue() with x where x is contextually
convertible to bool.
2022-06-26 19:54:41 -07:00
Kazu Hirata
d08f34b592 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-26 18:31:51 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
chenglin.bi
8c74205642 [SelectionDAG][DAGCombiner] Reuse exist node by reassociate
When already have (op N0, N2), reassociate (op (op N0, N1), N2) to (op (op N0, N2), N1) to reuse the exist (op N0, N2)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122539
2022-06-24 23:15:06 +08:00
chenglin.bi
9c2bf534f5 Revert "[SelectionDAG][DAGCombiner] Reuse exist node by reassociate"
This reverts commit 6c951c5ee6d0b848877cb8ac7a9cb2a9ef9ebbb5.
2022-06-23 13:21:51 +08:00
Simon Pilgrim
1c2b756cd6 [DAG] visitTRUNCATE - move TRUNCATE(ADDE/ADDCARRY) folds to switch statement handling the other binops. NFC. 2022-06-21 22:07:41 +01:00
Kazu Hirata
7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
chenglin.bi
6c951c5ee6 [SelectionDAG][DAGCombiner] Reuse exist node by reassociate
When already have (op N0, N2), reassociate (op (op N0, N1), N2) to (op (op N0, N2), N1) to reuse the exist (op N0, N2)

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D122539
2022-06-21 09:45:19 +08:00
Kazu Hirata
e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Simon Pilgrim
e4a124dda5 [DAG] Fold (srl (shl x, c1), c2) -> and(shl/srl(x, c3), m)
Similar to the existing (shl (srl x, c1), c2) fold

Part of the work to fix the regressions in D77804

Differential Revision: https://reviews.llvm.org/D125836
2022-06-20 08:37:38 +01:00
Craig Topper
314dbde12c [DAGCombiner][ARM][RISCV] Teach ShrinkLoadReplaceStoreWithStore to use truncstore.
The VT we want to shrink to may not be legal especially after type
legalization.

Fixes PR56110.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D128135
2022-06-19 15:50:15 -07:00
Benjamin Kramer
8c4a07c61f [DAGCombiner] Fold fold (fp_to_bf16 (bf16_to_fp op)) -> op 2022-06-15 19:54:39 +02:00
Simon Pilgrim
f096d5926d [DAG] Fix SDLoc mismatch in (shl (srl x, c1), c2) -> and(shift(x,c3)) fold
Noticed by @craig.topper on D125836 which uses a tweaked copy of the same code.

Differential Revision: https://reviews.llvm.org/D127772
2022-06-15 11:07:59 +01:00
Simon Pilgrim
7d8fd4f5db [DAG] visitINSERT_VECTOR_ELT - attempt to reconstruct BUILD_VECTOR before other fold interfere
Another issue unearthed by D127115

We take a long time to canonicalize an insert_vector_elt chain before being able to convert it into a build_vector - even if they are already in ascending insertion order, we fold the nodes one at a time into the build_vector 'seed', leaving plenty of time for other folds to alter it (in particular recognising when they come from extract_vector_elt resulting in a shuffle_vector that is much harder to fold with).

D127115 makes this particularly difficult as we're almost guaranteed to have the lost the sequence before all possible insertions have been folded.

This patch proposes to begin at the last insertion and attempt to collect all the (oneuse) insertions right away and create the build_vector before its too late.

Differential Revision: https://reviews.llvm.org/D127595
2022-06-13 11:48:18 +01:00
Simon Pilgrim
54ae4ca755 [DAG] visitSRL - pull out ShiftVT. NFC. 2022-06-12 14:02:23 +01:00
Simon Pilgrim
cf5c63d187 [DAG] visitVECTOR_SHUFFLE - fold splat(insert_vector_elt()) and splat(scalar_to_vector()) to build_vector splats
Addresses a number of regressions identified in D127115
2022-06-11 21:06:42 +01:00
Simon Pilgrim
44a0cd25df [DAG] visitINSERT_VECTOR_ELT - add <1 x ???> insert_vector_elt(v0,extract_vector_elt(v1,0),0) special case handling
Check if we're just replacing one v1x?? vector with another
2022-06-11 19:30:00 +01:00
Simon Pilgrim
a71ad6a3c8 [DAG] visitINSERT_VECTOR_ELT - fold insert_vector_elt(scalar_to_vector(x),v,i) -> build_vector()
Allow scalar_to_vector nodes to be used for the start of a build_vector creation
2022-06-11 15:29:22 +01:00
Simon Pilgrim
693f4db1ec [DAG] visitINSERT_VECTOR_ELT - refactor BUILD_VECTOR insertion to remove early-out. NFCI.
Remove the early-out cases so we can more easily add additional folds in the future.
2022-06-11 12:01:13 +01:00
Simon Pilgrim
7dbfcfa735 [DAG] combineInsertEltToShuffle - if EXTRACT_VECTOR_ELT fails to match an existing shuffle op, try to replace an undef op if there is one.
This should fix a number of shuffle regressions in D127115 where the re-ordered combines mean we fail to fold a EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT sequence into a BUILD_VECTOR if we extract from more than one vector source.
2022-06-09 14:56:14 +01:00
Simon Pilgrim
b84c10d4bc [DAG] visitVSELECT - don't wait for truncation of sub before attempting to match with getTruncatedUSUBSAT
Fixes some X86 PSUBUS regressions encountered in D127115 where the truncate was being replaced with a PACKSS/PACKUS before the fold got called again
2022-06-08 16:16:35 +01:00
Simon Pilgrim
a083f3caa1 [DAG] combineShuffleOfSplatVal - fold shuffle(splat,undef) -> splat, iff the splat contains no UNDEF elements
As noticed on D127115 - we were missing this fold, instead just having the shuffle(shuffle(x,undef,splatmask),undef) fold. We should be able to merge these into one using SelectionDAG::isSplatValue, but we'll need to match the shuffle's undef handling first.

This also exposed an issue in SelectionDAG::isSplatValue which was incorrectly propagating the undef mask across a bitcast (it was trying to just bail with a APInt::isSubsetOf if it found any undefs but that was actually the wrong way around so didn't fire for partial undef cases).
2022-06-07 16:42:24 +01:00
Guillaume Chatelet
0788186182 [Alignment][NFC] Remove usage of MemSDNode::getAlignment
I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.

Differential Revision: https://reviews.llvm.org/D126910
2022-06-07 13:52:20 +00:00
Nikita Popov
5a64bc207e [DAGCombiner] Remove overzealous assertion when folding assert+trunc+assert (PR55846)
These assert that there are no "useless" assertzext/assertsext nodes
(that assert a wider width than a following trunc), but I don't think
there is anything preventing such nodes from reaching this code.
I don't think the assertion is relevant for correctness of this
transform either -- if such an assert is present, then the other
one will always be to a smaller width, and we'll pick that one.
The assertion dates back to D37017.

Fixes https://github.com/llvm/llvm-project/issues/55846.

Differential Revision: https://reviews.llvm.org/D126952
2022-06-07 09:50:26 +02:00
Benjamin Kramer
e8e4b741dd [DAGCombiner] Add bf16 to the matrix of types that we don't promote to integer stores
Remove a few stray semicolons while there.
2022-06-03 13:28:34 +02:00