2383 Commits

Author SHA1 Message Date
Amaury Séchet
9041e1fa29 [DAG] Peek through zext/trunc in haveNoCommonBitsSet.
This limitation was discovered thanks to some regression in D127115 .

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D147821
2023-04-11 11:44:15 +00:00
Craig Topper
65f3794111 [SelectionDAG] Use MemVT for FoldingSetNodeID in SelectionDAG::getLoadVP.
Return types and operands are put in the ID by AddNodeIDNode. I'm
pretty sure this was supposed to be the memory VT.
2023-04-03 15:15:48 -07:00
Simon Pilgrim
2434c8fcf9 [DAG] canCreateUndefOrPoison - add ISD::INSERT_VECTOR_ELT handling
If the inserted element index is guaranteed to be inbounds then a ISD::INSERT_VECTOR_ELT will not create poison/undef.
2023-04-02 16:28:26 +01:00
Simon Pilgrim
8153b92d9b [DAG] Add SelectionDAG::SplitScalar helper
Similar to the existing SelectionDAG::SplitVector helper, this helper creates the EXTRACT_ELEMENT nodes for the LO/HI halves of the scalar source.

Differential Revision: https://reviews.llvm.org/D147264
2023-03-31 18:35:40 +01:00
Yeting Kuo
84c8c2b4b4 [DAG][RISCV] Allow scalable vector ISD::STRICT_FP_ROUND and support vector ISD::STRICT_FP_ROUND for RISC-V.
The patch customized lower vector type ISD::STRICT_FP_ROUND to RISCVISD::STRICT_FP_ROUND.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147113
2023-03-30 08:20:02 +08:00
Kazu Hirata
9bb96fd874 [llvm] Use llvm::isNullConstant (NFC) 2023-03-22 00:31:48 -07:00
Simon Pilgrim
c1f81e7604 [DAG] mergeStore - peek through truncates when finding dead store(trunc(load())) patterns
Extend the existing store(load()) removal code to account for intermediate truncates that some targets won't remove with canCombineTruncStore - we only care about the load/store MemoryVT.

Fixes regression from D146121
2023-03-15 11:54:13 +00:00
pvanhout
1f1fea6c38 Reland: [DAG/AMDGPU] Use UniformityAnalysis in DAGISel
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D145918
2023-03-14 14:38:45 +01:00
pvanhout
0e79106fc9 Revert "[DAG/AMDGPU] Use UniformityAnalysis in DAGISel"
This reverts commit 0022b5803fd4f5a4e9fcf233267c0ffa1b88f763.
2023-03-14 11:48:58 +01:00
pvanhout
0022b5803f [DAG/AMDGPU] Use UniformityAnalysis in DAGISel
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D145918
2023-03-14 11:18:28 +01:00
Yeting Kuo
b2c48559c8 [IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND.
The patch mainly does two things. The first is allowing scalable vector
ISD::STRICT_FP_EXTEND. The second is making RISC-V customized lower
strict_fpextend to riscv_strict_fpextend_vl, the strict version of
riscv_fpextend_vl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145548
2023-03-09 17:37:59 +08:00
Marco Elver
bdb4353ae0 [SelectionDAG] Optimize copyExtraInfo deep copy
It turns out that there are relatively trivial, albeit rare, cases that
require a MaxDepth of more than 16 (see added test). However, we want to
avoid having to rely on a large fixed MaxDepth.

Since these cases are relatively rare, apply the following strategy:

  1. Start with a low MaxDepth of 16 - if the entry node was not
     reached, we can return (the common case).

  2. If the entry node was reached, exponentially increase MaxDepth up
     to some large limit that should cover all cases and guard against
     stack exhaustion.

This retains the better performance with a low MaxDepth in the common
case, and in complex cases backs off and retries. On a whole, this is
preferable vs. starting with a large MaxDepth which would unnecessarily
penalize the common case where a low MaxDepth is sufficient.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D145386
2023-03-06 17:29:53 +01:00
Marco Elver
7ecd2a23f5 [SelectionDAG] Fix missing lambda capture
Move MaxDepth into the lambda, since it is not needed outside. This
fixes some compilers that complain about missing capture:

  error C3493: 'MaxDepth' cannot be implicitly captured because no
  default capture mode has been specified

Fixes: f693932fbea7 ("[SelectionDAG] Transitively copy NodeExtraInfo on RAUW")
2023-03-02 23:47:36 +01:00
Marco Elver
f693932fbe [SelectionDAG] Transitively copy NodeExtraInfo on RAUW
During legalization of the SelectionDAG, some nodes are replaced with
arch-specific nodes. These may be complex nodes, where the root node no
longer corresponds to the node that should carry the extra info.

Fix the issue by copying extra info to the new node and all its new
transitive operands during RAUW. See code comments for more details.

This fixes the remaining pcsections-atomics.ll tests on X86.

v2: Optimize copyExtraInfo() deep copy. For now we assume that only
NodeExtraInfo that have PCSections set require deep copy. Furthermore,
limit the depth of graph search while pre-populating the visited set,
assuming the to-be-replaced subgraph 'From' has limited complexity. An
assertion catches if the maximum depth needs to be increased.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D144677
2023-03-02 23:07:19 +01:00
Craig Topper
06c6b787b2 [SelectionDAG][AArch64] Constant fold in SelectionDAG::getVScale if VScaleMin==VScaleMax.
Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D145113
2023-03-02 12:02:38 -08:00
Marco Elver
e0bc779000 Revert "[SelectionDAG] Transitively copy NodeExtraInfo on RAUW"
This reverts commit 7f635b90e7bdf1378fd9a65fc62b99e8e07d4aaf.

The current implementation causes pathological slowdowns in certain
cases: https://github.com/llvm/llvm-project/issues/61108
2023-03-02 09:39:44 +01:00
Marco Elver
7f635b90e7 [SelectionDAG] Transitively copy NodeExtraInfo on RAUW
During legalization of the SelectionDAG, some nodes are replaced with
arch-specific nodes. These may be complex nodes, where the root node no
longer corresponds to the node that should carry the extra info.

Fix the issue by copying extra info to the new node and all its new
transitive operands during RAUW. See code comments for more details.

This fixes the remaining pcsections-atomics.ll tests on X86.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D144677
2023-02-27 12:16:14 +01:00
Fangrui Song
e4f4f34e7a [SelectionDAG] Migrate away from soft-deprecated functions. NFC 2023-02-21 11:01:34 -08:00
Kazu Hirata
4a05edd410 [llvm] Use APInt::getZero instead of APInt::getNullValue (NFC)
Note that APInt::getNullValue has been soft-deprecated in favor of
APInt::getZero.
2023-02-19 22:42:01 -08:00
Kazu Hirata
f8f3db2756 Use APInt::count{l,r}_{zero,one} (NFC) 2023-02-19 22:04:47 -08:00
Kazu Hirata
cbde2124f1 Use APInt::popcount instead of APInt::countPopulation (NFC)
This is for consistency with the C++20-style bit manipulation
functions in <bit>.
2023-02-19 11:29:12 -08:00
Simon Pilgrim
ce63cd3bf1 [DAG] Fold freeze(concat_vectors(x,y,...)) -> concat_vectors(freeze(x),freeze(y),...)
Another of the cleanups necessary for D136529
2023-02-08 20:26:43 +00:00
Simon Pilgrim
b7deb71ef5 [DAG] Fold freeze(build_pair(x,y)) -> build_pair(freeze(x),freeze(y))
One of the cleanups necessary for D136529 - another being how we're going to handle moving freeze through multiple result nodes (like uaddo and subcarry)
2023-02-08 17:54:03 +00:00
Yeting Kuo
7bc2cd614e [VP][DAGCombiner] Introduce generalized pattern match for vp sdnodes.
The patch tries to solve duplicated combine work for vp sdnodes. The idea is to
introduce MatchConext that verifies specific patterns and generate specific node
infromation. There is two MatchConext in DAGCombiner. EmptyMatcher is for
normal nodes and VPMatcher is for vp nodes.

The idea of this patch is come form Simon Moll's proposal [0]. I only fixed some
minor issues and added few new features in this patch.

[0]: c38a14484a

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141891
2023-02-08 13:45:35 +08:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Samuel Parker
7bff37783f [SDAG] Check fminnum/fmaxnum for non-zero operand.
Currently, in TargetLowering, if the target does not support fminnum, we lower
to fminimum if neither operand could be a NaN. But this isn't quite correct
because fminnum and fminimum treat +/-0 differently; so, we need to prove that
one of the operands isn't a zero, or we don't have signed zeros.

Differential Revision: https://reviews.llvm.org/D143256
2023-02-07 10:54:23 +00:00
David Green
120ce83660 [DAG] Add visitABD optimizations
This adds basic a visitABD to optimize ABDS and ABDU nodes, similar to the
existing visitAVG method.

The fold I was initially interested in was folding shuffles though the binop.
This also:
- Marks ABDS and ABDU as commutative binops (https://alive2.llvm.org/ce/z/oCDogb
  and https://alive2.llvm.org/ce/z/7zrs86).
- Add reassociative folds.
- Add constant folding using max(x,y)-min(x,y)
- Canonicalizes constants to the RHS
- Folds abds x, 0 -> abs(x) (https://alive2.llvm.org/ce/z/4ZEibv)
- Folds abdu x, 0 -> x (https://alive2.llvm.org/ce/z/J_rKqx)
- Folds abd x, undef -> 0 (https://alive2.llvm.org/ce/z/NV6Nsv and
  https://alive2.llvm.org/ce/z/vs92hu).

Differential Revision: https://reviews.llvm.org/D143193
2023-02-05 10:28:54 +00:00
Marco Elver
98f0e4f611 Revert "[SelectionDAG] Add pcsections recursively on SDNode values"
Revert "[SelectionDAG] Add missing setValue calls in visitIntrinsicCall"

This reverts commit 0c64e1b68f36640ffe82fc90e6279c50617ad1cc.
This reverts commit 1142e6c7c795de7f80774325a07ed49bc95a48c9.

It spuriously added !pcsections where they shouldn't be. See added test
case in test/CodeGen/X86/pcsections.ll as an example. The reason is that
the SelectionDAG chains operations in a basic block as "operands"
pointing to preceding instructions. This resulted in setting the
metadata on _all_ instructions preceding the one that should have the
metadata.

Reverting for now because the semantics of !pcsections was completely
buggy now.
2023-02-03 18:57:34 +01:00
Martin Fink
0c64e1b68f [SelectionDAG] Add pcsections recursively on SDNode values
When adding pcsections to SDNodes, recursively add them to all values of
the node as well.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D141048
2023-01-26 16:13:46 +01:00
Matt Arsenault
778cf5431c IR: Add atomicrmw uinc_wrap and udec_wrap
These are essentially add/sub 1 with a clamping value.

AMDGPU has instructions for these. CUDA/HIP expose these as
atomicInc/atomicDec. Currently we use target intrinsics for these,
but those do no carry the ordering and syncscope. Add these to
atomicrmw so we can carry these and benefit from the regular
legalization processes.
2023-01-24 17:55:11 -04:00
Paul Walker
3f94a38388 [SVE] Fix invalid INSERT_SUBVECTOR creation when lowering fixed length fp-int conversions.
The original logic resulted in inserting an integer vector into
a floating point one and vice versa. Patch also adds the missing
assert that would have caught the issue.

Differential Revision: https://reviews.llvm.org/D142303
2023-01-24 12:29:25 +00:00
Kazu Hirata
5638156a1c [llvm] Use llvm::bit_width (NFC) 2023-01-21 13:56:47 -08:00
Simon Pilgrim
835cb9ff4d [DAG] getNode - add type assertion checks for ISD::ABDS/ABDU 2023-01-21 11:31:55 +00:00
Roman Lebedev
edf004e691
[NFC][TargetLowering] isSplatValueForTargetNode(): add DAG operand
Without it we can't recurse further.
2023-01-16 00:02:20 +03:00
Guillaume Chatelet
8fd5558b29 [NFC] Use TypeSize::geFixedValue() instead of TypeSize::getFixedSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:49:38 +00:00
Guillaume Chatelet
48f5d77eee [NFC] Use TypeSize::getKnownMinValue() instead of TypeSize::getKnownMinSize()
This change is one of a series to implement the discussion from
https://reviews.llvm.org/D141134.
2023-01-11 16:36:39 +00:00
Luke Lau
275658d1af [SelectionDAG] Implicitly truncate known bits in SPLAT_VECTOR
Now that D139525 fixes the Hexagon infinite loop, the stopgap can be
removed to provide more information about known bits in SPLAT_VECTOR
whose operands are smaller than the bit width (which is most of the
time)

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D141075
2023-01-06 15:43:47 +00:00
Roman Lebedev
62fc5f1640
[DAGCombiner] Add a most basic combineShuffleToZeroExtendVectorInReg()
Sometimes we end up with a shuffles in DAG that would be
better represented as a `ISD::ZERO_EXTEND_VECTOR_INREG`,
and a failure to do so causes suboptimal codegen in a number of cases,
especially when we will then cast vector to scalar.

I acknowledge, the test changes here are rather underwhelming,
but as with all of codegen, it's always a yak shawing,
and this is the most stripped down version of the patch
that shows *some* effect without having insurmountable amount
of fallout to deal with. The next change resolves this regression.

The transformation will be extended in follow-ups.
2022-12-26 22:54:03 +03:00
Roman Lebedev
1234754bbc
[DAGCombine] BUILD_VECTOR can not create undef or poison 2022-12-23 02:26:36 +03:00
Nemanja Ivanovic
cb3f415cd2 [PowerPC] Fix up memory ordering after combining BV to a load
The combiner for BUILD_VECTOR that merges consecutive
loads into a wide load had two issues:

- It didn't check that the input loads all have the
  same input chain
- It didn't update nodes that are chained to the original
  loads to be chained to the new load

This caused issues with bootstrap when
3c4d2a03968ccf5889bacffe02d6fa2443b0260f was committed.
This patch fixes the issue so it can unblock this commit.

Differential revision: https://reviews.llvm.org/D140046
2022-12-16 08:57:36 -06:00
Fangrui Song
67819a72c6 [CodeGen] llvm::Optional => std::optional 2022-12-13 09:06:36 +00:00
Kazu Hirata
f7dffc28b3 Don't include None.h (NFC)
I've converted all known uses of None to std::nullopt, so we no longer
need to include None.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-10 11:24:26 -08:00
OCHyams
1d1de7467c [Assignment Tracking][Analysis] Add analysis pass
The Assignment Tracking debug-info feature is outlined in this RFC:

https://discourse.llvm.org/t/
rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir

Add initial revision of assignment tracking analysis pass
---------------------------------------------------------
This patch squashes five individually reviewed patches into one:

    #1 https://reviews.llvm.org/D136320
    #2 https://reviews.llvm.org/D136321
    #3 https://reviews.llvm.org/D136325
    #4 https://reviews.llvm.org/D136331
    #5 https://reviews.llvm.org/D136335

Patch #1 introduces 2 new files: AssignmentTrackingAnalysis.h and .cpp. The
two subsequent patches modify those files only. Patch #4 plumbs the analysis
into SelectionDAG, and patch #5 is a collection of tests for the analysis as
a whole.

The analysis was broken up into smaller chunks for review purposes but for the
most part the tests were written using the whole analysis. It would be possible
to break up the tests for patches #1 through #3 for the purpose of landing the
patches seperately. However, most them would require an update for each
patch. In addition, patch #4 - which connects the analysis to SelectionDAG - is
required by all of the tests.

If there is build-bot trouble, we might try a different landing sequence.

Analysis problem and goal
-------------------------

Variables values can be stored in memory, or available as SSA values, or both.
Using the Assignment Tracking metadata, it's not possible to determine a
variable location just by looking at a debug intrinsic in
isolation. Instructions without any metadata can change the location of a
variable. The meaning of dbg.assign intrinsics changes depending on whether
there are linked instructions, and where they are relative to those
instructions. So we need to analyse the IR and convert the embedded information
into a form that SelectionDAG can consume to produce debug variable locations
in MIR.

The solution is a dataflow analysis which, aiming to maximise the memory
location coverage for variables, outputs a mapping of instruction positions to
variable location definitions.

API usage
---------

The analysis is named `AssignmentTrackingAnalysis`. It is added as a required
pass for SelectionDAGISel when assignment tracking is enabled.

The results of the analysis are exposed via `getResults` using the returned
`const FunctionVarLocs *`'s const methods:

    const VarLocInfo *single_locs_begin() const;
    const VarLocInfo *single_locs_end() const;
    const VarLocInfo *locs_begin(const Instruction *Before) const;
    const VarLocInfo *locs_end(const Instruction *Before) const;
    void print(raw_ostream &OS, const Function &Fn) const;

Debug intrinsics can be ignored after running the analysis. Instead, variable
location definitions that occur between an instruction `Inst` and its
predecessor (or block start) can be found by looping over the range:

    locs_begin(Inst), locs_end(Inst)

Similarly, variables with a memory location that is valid for their lifetime
can be iterated over using the range:

    single_locs_begin(), single_locs_end()

Further detail
--------------

For an explanation of the dataflow implementation and the integration with
SelectionDAG, please see the reviews linked at the top of this commit message.

Reviewed By: jmorse
2022-12-09 16:17:37 +00:00
David Green
f4cc734a33 [DAG] Teach isConstOrConstSplat about SPLAT_VECTORs
This teaches the DemandedElts version of isConstOrConstSplat about
SPLAT_VECTORS, in the same way as the non-DemandedElts version by
calling the demanded-bits version from the non-demanded-bits version.

Differential Revision: https://reviews.llvm.org/D128919
2022-12-08 11:53:25 +00:00
Matt Arsenault
47c68904a5 DAG: ComputeNumSignBits from load range metadata
The cases where the result type doesn't match the range type
are inadequately tested, but I'm not sure how to write such a
test. During the pre-legalize combine, any obviously optimizable
code gets handled so it's harder to test legalized extloads.
2022-12-05 11:57:13 -05:00
Philip Reames
7969ab85e0 [SDAG] Allow scalable vectors in ComputeKnownBits (try 2)
This was previously reverted due to a hang on a Hexagon bot.  This turned out to be a bug in the Hexagon backend around how splat_vectors are legalized (which they're using for fixed length vectors!).  I adjusted this patch to remove the implicit truncate support.  This hides the hexagon bug for now, and unblocks the rest of the change.

Original commit message:

This is the SelectionDAG equivalent of D136470, and is thus an alternate patch to D128159.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

This patch also includes an implementation for SPLAT_VECTOR as without it, the lane wise reasoning has no base case. The original patch which inspired this (D128159), also included STEP_VECTOR. I plan to do that as a separate patch.

Differential Revision: https://reviews.llvm.org/D137140
2022-12-05 08:52:37 -08:00
Kazu Hirata
9f252e5567 [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:31:17 -08:00
Kazu Hirata
998960ee1f [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:08 -08:00
Philip Reames
fc0efb7e78 [SDAG] Allow scalable vectors in ComputeNumSignBits (try 2)
I had reverted this before the holiday week because a problem was reported with a related change (D137140 - scalable vector known bits in DAG).  I had initially confused the two patches, and then decided to leave this reverted out an abundance of caution.  Now that we're through the holiday week, reapplying.

I also roled in fixes for several post commit review comments that hadn't landed with the original change.

Original commit message

This is a continuation of the series of patches adding lane wise support for scalable vectors in various knownbit-esq routines.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

Differential Revision: https://reviews.llvm.org/D137141
2022-11-29 08:25:05 -08:00
Simon Pilgrim
629f17c516 [DAG] isGuaranteedNotToBeUndefOrPoison - handle FrameIndex/TargetFrameIndex
Fixes #58904
2022-11-22 18:16:15 +00:00