3605 Commits

Author SHA1 Message Date
Simon Pilgrim
e9caa37e9c [DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits
Inspired by some of the cases from D145468

Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free.
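
The narrowing condition can be illustrated with a small exhaustive model. This is only a sketch, not the SimplifyDemandedBits code: it narrows a 16-bit shift to 8 bits, the helper names (`wide_lshr`, `narrow_lshr`, `can_narrow`, `check_all`) are made up for illustration, and it assumes that "don't require the upper bits" means every demanded result bit is read from the low half of the source. Profitability and free zext/trunc are target questions and are not modeled.

```python
WIDE = 16          # model a 16-bit shift narrowed to 8 bits
HALF = WIDE // 2
WIDE_MASK = (1 << WIDE) - 1
HALF_MASK = (1 << HALF) - 1


def wide_lshr(x, k):
    """Logical shift right at full width."""
    return (x & WIDE_MASK) >> k


def narrow_lshr(x, k):
    """zext(lshr(trunc(x), k)): shift only the low half, then zero-extend."""
    return (x & HALF_MASK) >> k


def can_narrow(demanded, k):
    """True if every demanded result bit reads a source bit in the low half.

    Result bit i reads source bit i + k, so the demanded source bits are
    (demanded << k) truncated to WIDE bits.
    """
    return k < HALF and ((demanded << k) & WIDE_MASK) >> HALF == 0


def check_all(demanded, k):
    """Compare both forms on the demanded bits for every 16-bit input."""
    return all(
        wide_lshr(x, k) & demanded == narrow_lshr(x, k) & demanded
        for x in range(1 << WIDE)
    )
```

In this model `can_narrow(0x0F, 3)` holds and the two forms agree on all inputs, while a demanded mask of `0xFF` with the same shift amount reaches into the upper half and the forms diverge.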

A future patch will propose the equivalent shl narrowing combine.

Differential Revision: https://reviews.llvm.org/D146121
2023-07-17 15:50:09 +01:00
Noah Goldstein
74f0ec5e24 [DAGCombiner] Make it so that udiv can be folded with (select c, NonZero, 1)
This is done by allowing speculation of `udiv` if we can prove the
denominator is non-zero.
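
A minimal sketch of why the speculation is safe, assuming the fold pushes the `udiv` into both arms of the select. The function names are hypothetical and a Python exception stands in for the undefined behaviour of dividing by zero.

```python
def udiv(x, y):
    """Unsigned division; division by zero stands in for undefined behaviour."""
    if y == 0:
        raise ZeroDivisionError("udiv by zero is undefined")
    return x // y


def before_fold(x, c, nonzero):
    """udiv x, (select c, nonzero, 1)"""
    return udiv(x, nonzero if c else 1)


def after_fold(x, c, nonzero):
    """select c, (udiv x, nonzero), (udiv x, 1) with both arms evaluated eagerly."""
    speculated_true = udiv(x, nonzero)   # safe only because nonzero != 0
    speculated_false = udiv(x, 1)        # simplifies to x
    return speculated_true if c else speculated_false
```

Both divisions run unconditionally in `after_fold`, which is only acceptable because neither denominator can be zero.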

https://alive2.llvm.org/ce/z/VNCt_q

Differential Revision: https://reviews.llvm.org/D149198
2023-07-12 17:17:53 -05:00
Ivan Kosarev
15e7749e19 [Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154528
2023-07-12 11:55:19 +01:00
Amaury Séchet
ee2d10cd16 [NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each other. 2023-07-04 14:55:11 +00:00
Simon Pilgrim
4742715eb7 [DAG] Fold (*ext (*_extend_vector_inreg x)) -> (*_extend_vector_inreg x) 2023-06-30 14:42:49 +01:00
David Green
14f54a594e [DAG][AArch64] Fold shuffle_vector<4,5,6,7> to extract_subvector
During legalization, we can end up with shuffles that are identity masks, so
act like extract_subvector, but do not simplify to extract_subvector. This
adjusts the profitability heuristic in foldExtractSubvectorFromShuffleVector to
allow identity vectors that do not start at element 0. Undef mask elements are
excluded as it can be more useful to keep the undef elements.

Differential Revision: https://reviews.llvm.org/D153504
2023-06-30 11:13:39 +01:00
Luke Lau
742fb8b5c7 [DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x)
If we have a store of a load with no other uses in between it, it's
considered dead and is removed. So sometimes when legalizing a fixed
length vector store of an insert, we end up producing better code
through scalarization than without.
An example is the following:

  %a = load <4 x i64>, ptr %x
  %b = insertelement <4 x i64> %a, i64 %y, i32 2
  store <4 x i64> %b, ptr %x

If this is scalarized, then DAGCombine successfully removes 3 of the 4
stores which are considered dead, and on RISC-V we get:

  sd a1, 16(a0)

However if we make the vector type legal (-mattr=+v), then we lose the
optimisation because we don't scalarize it.

This patch attempts to recover the optimisation for vectors by
identifying patterns where we store a load with a single insert
in between, replacing it with a scalar store of the inserted element.
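
The equivalence behind the fold can be sketched with memory modeled as a Python list of lanes. This is an illustration only: the function names are hypothetical, and it assumes nothing else writes to the location between the load and the store.

```python
def vector_round_trip(memory, base, index, value, lanes=4):
    """load <lanes x T>, insertelement at index, store the whole vector back."""
    vec = memory[base:base + lanes]      # load (the slice is a copy)
    vec[index] = value                   # insertelement
    out = list(memory)
    out[base:base + lanes] = vec         # store
    return out


def scalar_store(memory, base, index, value):
    """Store only the inserted element at its lane offset."""
    out = list(memory)
    out[base + index] = value
    return out
```

With `index = 2` and 64-bit lanes the scalar store lands at byte offset 16, which matches the `sd a1, 16(a0)` in the RISC-V output above.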

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D152276
2023-06-28 22:45:04 +01:00
FLZ101
32e4013dd4 [AArch64][SelectionDAG] fix infinite loop caused by legalizing & combining CONCAT_VECTORS
Legalizing in `AArch64TargetLowering::LowerCONCAT_VECTORS()` and combining in `DAGCombiner::visitCONCAT_VECTORS()` could cause an infinite loop.
This commit fixes that issue by conditionally skipping the combining.

Fix https://github.com/llvm/llvm-project/issues/63322

Reviewed By: RKSimon, MaskRay

Differential Revision: https://reviews.llvm.org/D153316
2023-06-27 13:57:41 -07:00
Simon Pilgrim
1f006f5fb6 [DAG] mergeTruncStores - early out if we collect more than the maximum number of stores
If we have an excessive number of stores in a single chain then the candidate WideVT may exceed the maximum width of an EVT integer type (and will assert) - but since mergeTruncStores doesn't support anything wider than an i64 store we should just early-out if we've collected more stores than that.

Fixes #63306
2023-06-23 16:22:11 +01:00
David Green
589c940eb3 [DAG] Fix and expand fmin/fmax reassociation fold.
This call to reassociateReduction is used by both fminnum/fmaxnum and
fminimum/fmaximum. In adding support for fminimum/fmaximum we appear to be
fixing the use of an incorrect reduction type, which should have only applied
to minnum/maxnum.

I also believe that it doesn't need nsz and reassoc to perform the
reassociation. For float min/max it should always be valid.

Differential Revision: https://reviews.llvm.org/D153247
2023-06-23 14:45:14 +01:00
Amaury Séchet
34d8c5b9ce [DAG] Peek through trunc when combining select into shifts.
This fixes a regression in D127115

Depends on D127115

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D151916
2023-06-23 00:35:39 +00:00
Simon Pilgrim
43ad2e9c8b [DAG] Add getExtOrTrunc helper. NFC.
Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.
2023-06-20 16:03:18 +01:00
Simon Pilgrim
ff23856c1c [DAG] Fold (abds x, y) -> (abdu x, y) iff both args are known positive
This is a generic DAG combine version of D151055 which recognizes when a signed ABDS can be safely replaced with an unsigned ABDU instruction if it is legal.

Alive2: https://alive2.llvm.org/ce/z/pb5BjG
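
As a small cross-check alongside the Alive2 proof, the fold can be verified exhaustively on 8-bit values. This is a sketch with hypothetical helper names, and it reads "known positive" as "sign bit known to be zero".

```python
BITS = 8
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_signed(v):
    """Reinterpret an 8-bit pattern as a two's-complement integer."""
    return v - (1 << BITS) if v & SIGN else v


def abdu(x, y):
    """Unsigned absolute difference of two 8-bit patterns."""
    return (max(x, y) - min(x, y)) & MASK


def abds(x, y):
    """Signed absolute difference, returned as an 8-bit pattern."""
    sx, sy = to_signed(x), to_signed(y)
    return (max(sx, sy) - min(sx, sy)) & MASK


def fold_is_valid():
    """Check abds == abdu whenever both sign bits are clear."""
    return all(
        abds(x, y) == abdu(x, y)
        for x in range(SIGN)
        for y in range(SIGN)
    )
```

Once either operand has its sign bit set the two operations can disagree, for example on the bit patterns `0xFF` and `0x01`.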

Differential Revision: https://reviews.llvm.org/D153328
2023-06-20 15:31:22 +01:00
Jeffrey Byrnes
7972a6e126 [DAGCombiner][NFC] Factor out ByteProvider
Differential Revision: https://reviews.llvm.org/D143018

Change-Id: I3dc03787a3382c0c3fe6b869f869c2946f450874
2023-06-19 08:54:34 -07:00
Craig Topper
7163539466 [DAGCombiner] When combining (sext_inreg (zext X), VT) -> (sext X) don't pass along the sext_inreg VT.
ISD::SIGN_EXTEND is only supposed to have one operand, but we
were creating it with 2 operands.

Since we basically never check for extra operands this went
unnoticed.
2023-06-15 11:47:42 -07:00
Amara Emerson
f79b0333fc [DAGCombiner] Fix crash when trying to replace an indexed store with a narrow store.
rdar://108818859

Differential Revision: https://reviews.llvm.org/D152978
2023-06-15 01:54:38 -07:00
Anna Thomas
26bfbec5d2 [Intrinsic] Introduce reduction intrinsics for minimum/maximum
This patch introduces the reduction intrinsic for floating point minimum
and maximum which has the same semantics (for NaN and signed zero) as
llvm.minimum and llvm.maximum.

Reviewed-By: nikic

Differential Revision: https://reviews.llvm.org/D152370
2023-06-13 12:29:58 -04:00
David Green
14914fb157 [DAG][NFC] Update comment on min/max reduction fold.
As pointed out in D141870, this one was incorrectly referencing and.
2023-06-13 17:09:22 +01:00
Amaury Séchet
a70d5e25f3 [DAGCombine] Make sure combined nodes are added back to the worklist in topological order.
Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D127115
2023-06-13 09:14:37 +00:00
Nikita Popov
5c6ff3a602 [DAGCombine] Move setcc of freeze fold to brcond
This fold goes against the usual approach of pushing freeze into
operands. The idea behind the fold is that if the setcc feeds into
a brcond, the freeze can be dropped entirely.

Move the fold to brcond, where we can remove the freeze directly.
This ensures that there can be no infinite combine loops due to
conflicting transforms.

Differential Revision: https://reviews.llvm.org/D152544
2023-06-12 12:01:29 +02:00
Yeting Kuo
2fe2a6d4b8 [DAGCombiner] Use generalized pattern matcher in visitFMA to support vp.fma.
Note: Some patterns in visitFMA still need to be refined to support splats of constants.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D152260
2023-06-08 09:40:21 +08:00
Serge Pavlov
10e7899818 [FPEnv] Get rid of extra moves in fpenv calls
If intrinsic `get_fpenv` or `set_fpenv` is lowered to the form where FP
environment is represented as a region in memory, extra moves can
appear. For example the code:

  define void @func_01(ptr %ptr) {
    %env = call i256 @llvm.get.fpenv.i256()
    store i256 %env, ptr %ptr
    ret void
  }

produces DAG:

  ch = get_fpenv_mem ch, memory_region
  val: i256, ch = load ch, memory_region
  ch = store ch, ptr, val

In this case the extra moves can be avoided if `get_fpenv_mem` is given a
pointer to the memory where the FP environment should finally be placed.

This change implements that optimization for this use case.

Differential Revision: https://reviews.llvm.org/D150437
2023-06-06 14:54:52 +07:00
Matt Arsenault
a1422bf906 DAG: Reorder conditions 2023-06-05 18:44:17 -04:00
Amaury Séchet
7988725f65 [NFC][DAG] Move isTruncateOf so that it can be used in foldBinOpIntoSelect. 2023-06-05 15:33:59 +00:00
JP Lehr
c9998ec145 Revert "[DAGCombine] Make sure combined nodes are added back to the worklist in topological order."
This reverts commit e69fa03ddd85812be3143d79a0359c3e8d43bd45.

This patch led to build timeouts on the AMDGPU OpenMP runtime
buildbot.
2023-06-05 10:55:58 -04:00
Amaury Séchet
e69fa03ddd [DAGCombine] Make sure combined nodes are added back to the worklist in topological order.
Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D127115
2023-06-05 11:09:18 +00:00
Jay Foad
b7052fa329 [DAGCombiner] Do not fold fadd (fmul x, y), (fmul x, y) -> fma x, y, (fmul x, y)
Differential Revision: https://reviews.llvm.org/D151890
2023-06-01 16:32:24 +01:00
David Green
7740216f2e [DAG] Combine insert(shuffle(load), load, 0) into a single load
Given an insert of a scalar load into a vector shuffle with mask
u,0,1,2,3,4,5,6 or 1,2,3,4,5,6,7,u (depending on the insert index),
it can be more profitable to convert to a single load and avoid the
shuffles. This adds a DAG combine for it, providing the new load is
still fast.

Differential Revision: https://reviews.llvm.org/D151029
2023-05-31 19:48:57 +01:00
Dhruv Chawla
51572c2cd7 [NFC][DAGCombiner]: Only consider nodes with no uses for pruning when forming initial worklist
When the worklist is initially being formed, there is no need to
consider all nodes for pruning. This is because the first time calling
getNextWorklistEntry will only clear those nodes which have no uses,
with their operands being added to the worklist. However, when the worklist is
created for the first time all nodes are added anyways, so this operation
actually ends up adding no nodes.

This patch adds a parameter IsCandidateForPruning to AddToWorklist with a
default value of true to avoid having to update every call site.

Differential Revision: https://reviews.llvm.org/D151416
2023-05-25 19:48:30 +05:30
Amaury Séchet
87bf2bff05 [NFC][DAG] Simplify a giant expression in visitMul. 2023-05-18 18:58:07 +00:00
Philip Reames
0dc0c27989 [TLI] Add IsZero parameter to storeOfVectorConstantIsCheap [nfc]
Make the decision to consider zero constant stores cheap target specific.  Will be used in an upcoming change for RISCV.
2023-05-17 09:19:01 -07:00
Austin Chang
d069ac035a [DAGCombiner] Add bswap(logic_op(bswap(x), y)) optimization
This is the implementation of D149782

The patch implements a helper function that matches and folds the following cases in the DAGCombiner:

1. `bswap(logic_op(x, bswap(y))) -> logic_op(bswap(x), y)`
2. `bswap(logic_op(bswap(x), y)) -> logic_op(x, bswap(y))`
3. `bswap(logic_op(bswap(x), bswap(y))) -> logic_op(x, y)` in multiuse case, which still reduces the number of instructions.

The helper function accepts SDValue with BSWAP and BITREVERSE opcodes. This patch folds the BSWAP cases and leaves the BITREVERSE optimization for a future patch.
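
The three identities hold because a byte swap only permutes bits, so it distributes over bitwise logic ops. A quick sanity check on 32-bit values, written as a sketch with hypothetical names (`bswap`, `check_folds`) and assuming `logic_op` is one of and/or/xor:

```python
import operator

WIDTH_BYTES = 4  # model 32-bit values


def bswap(v):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(v.to_bytes(WIDTH_BYTES, "little"), "big")


LOGIC_OPS = (operator.and_, operator.or_, operator.xor)


def check_folds(samples):
    """Check the three bswap/logic_op identities on the given samples."""
    for op in LOGIC_OPS:
        for x in samples:
            for y in samples:
                if bswap(op(x, bswap(y))) != op(bswap(x), y):
                    return False
                if bswap(op(bswap(x), y)) != op(x, bswap(y)):
                    return False
                if bswap(op(bswap(x), bswap(y))) != op(x, y):
                    return False
    return True
```

Cases 1 and 2 keep the number of bswaps the same but move one off the result, while case 3 removes the outer bswap entirely.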

Reviewed By: RKSimon, goldstein.w.n

Differential Revision: https://reviews.llvm.org/D149783
2023-05-16 18:58:07 -05:00
Simon Pilgrim
8f82d8ee76 [DAG] visitSUBSAT - fold subsat(x,y) -> sub(x,y) if it never overflows 2023-05-06 15:55:04 +01:00
Simon Pilgrim
08c1150d4c [DAG] Add computeOverflowForSignedSub/computeOverflowForUnsignedSub/computeOverflowForSub
Match the addition variants (although computeOverflowForUnsignedSub is really just a placeholder), and use this in DAGCombiner::visitSUBO
2023-05-06 15:55:04 +01:00
Simon Pilgrim
3fb067f7ba [DAG] visitADDSAT - fold saddsat(x,y) -> add(x,y) if it never overflows
Extend existing uaddsat(x,y) fold
2023-05-06 14:18:23 +01:00
Simon Pilgrim
7395f6ae78 [DAG] Add computeOverflowForSignedAdd and computeOverflowForAdd wrapper
Add basic computeOverflowForSignedAdd helper to recognise that sadd overflow can't occur if both operands have more than one sign bit.

Add computeOverflowForAdd wrapper that calls computeOverflowForSignedAdd/computeOverflowForUnsignedAdd depending on the IsSigned argument, and use this in DAGCombiner::visitADDO
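
The sign-bit argument can be checked exhaustively on 8-bit values. This is a sketch only: `num_sign_bits` is a hypothetical stand-in that counts the leading bits equal to the sign bit of a concrete value, whereas the real helper reasons about values that are only partially known.

```python
BITS = 8
SIGN = 1 << (BITS - 1)


def to_signed(v):
    """Reinterpret an 8-bit pattern as a two's-complement integer."""
    return v - (1 << BITS) if v & SIGN else v


def num_sign_bits(v):
    """Number of leading bits equal to the sign bit (always at least 1)."""
    sign = (v >> (BITS - 1)) & 1
    count = 0
    for bit in range(BITS - 1, -1, -1):
        if (v >> bit) & 1 != sign:
            break
        count += 1
    return count


def sadd_overflows(x, y):
    """True if the signed sum does not fit in BITS bits."""
    total = to_signed(x) + to_signed(y)
    return not -SIGN <= total < SIGN


def never_overflows_with_two_sign_bits():
    """Check all pairs where both operands have more than one sign bit."""
    return not any(
        sadd_overflows(x, y)
        for x in range(1 << BITS)
        for y in range(1 << BITS)
        if num_sign_bits(x) > 1 and num_sign_bits(y) > 1
    )
```

Two sign bits restrict each 8-bit operand to the range -64..63, so the sum always fits.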
2023-05-06 13:33:14 +01:00
Simon Pilgrim
c7fce3f98b [DAG] Rename computeOverflowKind -> computeOverflowForUnsignedAdd. NFC.
Matches the naming convention for the equivalent ValueTracking helpers - further SelectionDAG computeOverflowFor*() helpers will be added soon.
2023-05-05 19:38:54 +01:00
Luo, Yuanke
ae1ca47bb4 [Coverity] Big parameter passed by value. 2023-05-05 09:50:38 +08:00
Craig Topper
fe9f557578 [DAGCombiner][RISCV] Enable reassociation for VP_FMA in visitFADDForFMACombine.
Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D149911
2023-05-04 17:20:58 -07:00
Yeting Kuo
287aa6c453 [DAGCombiner] Use generalized pattern match for visitFSUBForFMACombine.
The patch makes visitFSUBForFMACombine serve vp.fsub too. It helps DAGCombiner
to fuse vp.fsub and vp.fmul patterns to vp.fma.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D149821
2023-05-04 22:02:32 +08:00
Luo, Yuanke
d9b92c4d55 [Coverity] Improper use of negative value.
The `Iteration` value may be -1, which would cause an incorrect loop count
when the value is passed to buildSqrtNROneConst or buildSqrtNRTwoConst.
2023-05-04 21:11:49 +08:00
NAKAMURA Takumi
c1221251fb Restore CodeGen/MachineValueType.h from Support
This is a rework of:

  - rG13e77db2df94 (r328395; MVT)

Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h`
can be restored as well.

Depends on D148767

Differential Revision: https://reviews.llvm.org/D149024
2023-05-03 00:13:20 +09:00
Sergei Barannikov
e744e51b12 [SelectionDAG] Rename ADDCARRY/SUBCARRY to UADDO_CARRY/USUBO_CARRY (NFC)
This will make them consistent with other overflow-aware nodes.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D148196
2023-04-29 21:59:58 +03:00
Piyou Chen
8a3950510f [RISCV] Support scalar/fixed-length vector NTLH intrinsic with different domain
This commit implements the two NTLH intrinsic functions.

```
type __riscv_ntl_load (type *ptr, int domain);
void __riscv_ntl_store (type *ptr, type val, int domain);

```

```
enum {
  __RISCV_NTLH_INNERMOST_PRIVATE = 2,
  __RISCV_NTLH_ALL_PRIVATE,
  __RISCV_NTLH_INNERMOST_SHARED,
  __RISCV_NTLH_ALL
};
```

We encode the non-temporal domain into MachineMemOperand flags.

1. Create the RISC-V built-in function with custom semantic checking.
2. Assume the domain argument is a compile-time constant,
and emit it as LLVM IR metadata (nontemp_node).
3. Encode the domain value as a two-bit MachineMemOperand TargetMMOflag.
4. According to the MachineMemOperand TargetMMOflag, select the corresponding ntlh instruction.

Currently, it supports scalar type and fixed-length vector type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143364
2023-04-24 20:15:14 -07:00
Simon Pilgrim
b0832fca3f [DAG] Add ISD::isExtVecInRegOpcode helper.
Match ISD::ANY_EXTEND_VECTOR_INREG\ZERO_EXTEND_VECTOR_INREG\SIGN_EXTEND_VECTOR_INREG opcodes
2023-04-24 14:47:23 +01:00
Wang, Xin10
76cc949212 Clean some dead code
The deleted code is dead; it is never reached.
1. In AggressiveAntiDepBreaker.cpp, have assert AntiDepReg != 0.
2. IfConversion.cpp, Kind can only be one unique value, so isFalse && isRev
    can never be true.
3. DAGCombiner.cpp, at line 3675, we have considered the condition like
```
  // fold (sub x, c) -> (add x, -c)
  if (N1C) {
    return DAG.getNode(ISD::ADD, DL, VT, N0,
                       DAG.getConstant(-N1C->getAPIntValue(), DL, VT));
  }
```
4. ScheduleDAGSDNodes.cpp, we have Latency > 1 at line 663
5. MasmParser.cpp, code exists in a switch-case block which is decided by
    the value FirstTokenKind, at line 1621, FirstTokenKind could only be
    one of AsmToken::Dollar, AsmToken::At and AsmToken::Identifier.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D148610
2023-04-23 20:46:34 -04:00
David Green
33fe899cef [DAG][AArch64] Limit preferIncOfAddToSubOfNot until after legalization if the node has wrap flags
If the add node has wrap flags then they will be destroyed by converting to
sub/not. The flags can be useful in converting to rhadd, for example, but that
may be required late if the node types need to be legalized. This limits the
preferIncOfAddToSubOfNot fold until after legalize DAG if the node has flags
to allow more folding.

Differential Revision: https://reviews.llvm.org/D148809
2023-04-21 18:35:58 +01:00
David Green
bbc983d33a [DAG] Retain nuw flags when reassociating adds
Given two adds that are both nuw, they will still be nuw after being
reassociated. (They only increase in value and at no point wrap).
https://alive2.llvm.org/ce/z/JrYM6H
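
The same claim can be checked by brute force at a small bit width. This is a sketch with hypothetical names: `None` stands in for poison, and a 4-bit width keeps the search exhaustive.

```python
BITS = 4                      # small width so the check is exhaustive
LIMIT = 1 << BITS


def add_nuw(a, b):
    """Unsigned add; returns None (poison) if an input is poison or the sum wraps."""
    if a is None or b is None:
        return None
    total = a + b
    return total if total < LIMIT else None


def reassociation_keeps_nuw():
    """If (a +nuw b) +nuw c is not poison, a +nuw (b +nuw c) must match it."""
    for a in range(LIMIT):
        for b in range(LIMIT):
            for c in range(LIMIT):
                original = add_nuw(add_nuw(a, b), c)
                if original is None:
                    continue
                if add_nuw(a, add_nuw(b, c)) != original:
                    return False
    return True
```

The check only constrains inputs where the original expression is well defined, which is the direction the fold needs.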

Differential Revision: https://reviews.llvm.org/D148804
2023-04-20 19:05:45 +01:00
Simon Pilgrim
41053053e3 [DAG] SimplifyVCastOp - ensure we select the correct value type from an SDValue operand
As reported on Issue #62234 - we weren't correctly using the SDValue operand to get its value type, resulting in a failure when it came from a SDNode with multiple results

We haven't been able to create a suitable upstream regression test, but it's been confirmed by inspection by both myself and @topperc

Fixes #62234
2023-04-20 10:38:10 +01:00
Dinar Temirbulatov
e6096871fd [DAGCombine][AArch64] Allow transformable to legal vectors to be used for MULH lowering.
It looks like it is still profitable to accept a transformable to a legal vector
type, not just a legal vector, as long as vector elements are the same between
two of those types.

Differential Revision: https://reviews.llvm.org/D148229
2023-04-19 13:24:58 +00:00