This patch adds several (AMDGPU-)target-specific DAG combines for
ISD::PTRADD nodes that reproduce existing similar transforms for
ISD::ADD nodes. There is no functional change intended for the existing
target-specific PTRADD combine.
For SWDEV-516125.
Add special-case handling for when a new replacement node has the entry
node as an operand, i.e., it does not depend on any other nodes.
This can be observed with the existing X86/pcsections-atomics.ll test
case when targeting Haswell, where certain 128-bit atomics are
transformed into arch-specific instructions, with some operands having
no other dependencies.
calculateByteProvider only cares about scalars or a single element
within a vector. For the latter there is the VectorIndex parameter to
identify the element. All other properties, and specifically Index, are
related to the underlying scalar type, and thus when taking the size of a
type it's the scalar size that matters.
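A minimal sketch of the resulting check, assuming the usual shape of
calculateByteProvider in DAGCombiner (the surrounding recursion is elided):
```
// Sketch only: Index addresses a byte within the underlying *scalar*
// element, so bound it by the scalar size rather than the full value size.
unsigned BitWidth = Op.getScalarValueSizeInBits();
if (BitWidth % 8 != 0)
  return std::nullopt;
unsigned ByteWidth = BitWidth / 8;
assert(Index < ByteWidth && "invalid index requested");
```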
Fixes https://github.com/llvm/llvm-project/issues/148387
getNode has logic to intersect flags correctly if the new node happens
to CSE with an existing node. Setting node flags after getNode bypasses
this logic and may change the node for other uses where the flags don't
hold.
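As a hypothetical illustration of the safe pattern, the flags should be passed
into getNode so they participate in that intersection:
```
// Flags passed to getNode are intersected with those of any existing node
// that the new node CSEs to.
SDNodeFlags Flags;
Flags.setNoUnsignedWrap(true);
SDValue Add = DAG.getNode(ISD::ADD, DL, VT, LHS, RHS, Flags);

// By contrast, calling Add->setFlags(Flags) afterwards would overwrite the
// flags on a CSE'd node, even if they don't hold for its other uses.
```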
Hexagon currently has an untested global flag to control fast-math
variants of libcalls. Add the fast variants as explicit libcall
options so this can be a flag-based lowering decision, and implement
it. I have no idea what fast-math flags the Hexagon case requires,
so I picked the maximally potentially relevant set of flags, although
this is probably refinable per call. Looking in compiler-rt, I'm not
sure the fast variants are anything more than aliases.
Change isBuildVectorAll* -> isConstantSplatVectorAll* in the VSelect fold,
in case the fold happens after the BuildVector has been canonically
transformed to a Splat, or the Splat is in the vselect from the start.
- Fixes #73454
- Update related test cases, add extra tests for wasm
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
This allows truncated splats / build vectors in isBoolConstant, so that
certain `not` instructions can be recognized post-legalization and vselect
can be optimized.
An override for x86 avx512 predicated vectors is required to avoid an
infinite recursion from the code that detects zero vectors. From:
```
// Check if the first operand is all zeros and Cond type is vXi1.
// If this an avx512 target we can improve the use of zero masking by
// swapping the operands and inverting the condition.
```
Add an LLVMContext parameter to getOptimalMemOpType and
findOptimalMemOpLowering, so that we can use EVT::getVectorVT to generate
an EVT in getOptimalMemOpType.
Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
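A sketch of what having the context in the hook enables (illustrative only;
the exact signature change is in the patch):
```
// With an LLVMContext available inside getOptimalMemOpType, a target can
// build vector EVTs directly, e.g. a 16 x i8 vector type:
EVT VecVT = EVT::getVectorVT(Context, MVT::i8, /*NumElements=*/16);
```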
Previously we had a table of entries for every Libcall giving the
comparison to use against an integer 0 if it was a soft-float compare
function. This was only relevant to a handful of opcodes, so it was
wasteful. Now that we can distinguish the abstract libcall for the
compare from the concrete implementation, we can just directly hardcode
the comparison against the libcall impl without this configuration
system.
llvm/llvm-project#147560 changed the legacy SelectionDAG pass to always
require TargetTransformInfoWrapperPass (rather than only when assertions
are enabled). `SelectionDAGISelLegacy::getAnalysisUsage` was not updated
in that PR, which caused hard-to-track-down crashes on
assertions-disabled builds.
This makes the required update, which should avoid the crashes being seen
on some buildbots and by some users.
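The required update amounts to declaring the dependency in the legacy pass's
getAnalysisUsage; roughly (a sketch, not the exact diff, which may declare
further analyses):
```
void SelectionDAGISelLegacy::getAnalysisUsage(AnalysisUsage &AU) const {
  // Match the unconditional requirement introduced by #147560.
  AU.addRequired<TargetTransformInfoWrapperPass>();
  // ... other required/preserved analyses elided ...
}
```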
This PR takes the work previously done by @pawan-nirpal-031 on X86 in
#106370, and makes it available in common code. This should enable all
targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data
types.
Canonicalization is implemented here as multiplication by `1.0`, as
suggested in [the
docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
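In SelectionDAG terms the expansion is roughly the following (a sketch of the
idea, not the literal patch):
```
// Expand fcanonicalize(x) as x * 1.0, per the LangRef suggestion.
SDLoc DL(N);
EVT VT = N->getValueType(0);
SDValue One = DAG.getConstantFP(1.0, DL, VT);
SDValue Canon = DAG.getNode(ISD::FMUL, DL, VT, N->getOperand(0), One);
```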
DAGCombiner can already constant fold build vectors of constants/undefs
to a new vector type, but it has to be incredibly careful after
legalization not to affect a target's canonicalized constants.
This patch proposes we move the implementation inside SelectionDAG to
make it easier for targets to manually use the constant folding whenever
they deem it safe to do so.
I've also altered the method to take the BuildVectorSDNode input
directly and to consistently use the same SDLoc.
ISD::ABDS can be used if the signed subtraction cannot overflow (this
is an extension to handle cases where the NSW flag has been lost).
ISD::ABDU can be used if both operands have at least one known-zero sign
bit (i.e. both are known non-negative).
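A rough sketch of the two conditions as they might be checked in a DAG
combine (the helper names approximate SelectionDAG's overflow/known-bits
queries and are assumptions, not the literal patch):
```
// abs(sub(X, Y)) -> abds(X, Y) when the signed subtract cannot overflow.
if (DAG.computeOverflowForSignedSub(X, Y) == SelectionDAG::OFK_Never)
  return DAG.getNode(ISD::ABDS, DL, VT, X, Y);
// abs(sub(X, Y)) -> abdu(X, Y) when both operands are known non-negative.
if (DAG.SignBitIsZero(X) && DAG.SignBitIsZero(Y))
  return DAG.getNode(ISD::ABDU, DL, VT, X, Y);
```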
Fixes #147049
This reverts commit 8ac7210b7f0ad49ae7809bf6a9faf2f7433384b0.
It broke building the AArch64 backend, e.g. see
https://github.com/llvm/llvm-project/pull/144947
Revert to unbreak the build.
Also reverts the follow-up commit 1e76f012db3ccfaa05e238812e572b5b6d12c17e.
If a kernel is known to be executing only a single lane, IR
UniformityAnalysis will take note of that (via
GCNTTIImpl::hasBranchDivergence) and report that all values are uniform.
SelectionDAG's built-in divergence tracking should do the same.
When generating SDAG for a getelementptr with a vector result, we were
previously generating splats for each scalar operand. This essentially
has the effect of aggressively vectorizing the sequence, and leaving it
to later combines to scalarize if profitable.
Instead, we can keep the accumulating address as a scalar for as long as
the prefix of operands allows, before lazily converting to a vector on the
first vector operand. This both better fits hardware, which frequently
has a scalar base on its scatter/gather instructions, and reduces the
addressing cost, as otherwise we end up with a scalar-to-vector domain
crossing for each scalar operand.
Note that constant splat offsets are treated as scalar for the above;
only variable offsets force a conversion to vector.
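A sketch of the key decision, with hypothetical names (needsVectorAddr,
VecPtrVT, ScaledOffset) standing in for the real SelectionDAGBuilder logic:
```
// Addr starts as the scalar base pointer. Constant splat offsets are folded
// in while Addr is still scalar; the first truly variable vector index
// forces the accumulated address into the vector domain.
if (needsVectorAddr(IndexVal) && !Addr.getValueType().isVector())
  Addr = DAG.getSplatBuildVector(VecPtrVT, DL, Addr); // scalar -> vector
// ScaledOffset is assumed to already match Addr's (scalar or vector) type.
Addr = DAG.getNode(ISD::ADD, DL, Addr.getValueType(), Addr, ScaledOffset);
```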
---------
Co-authored-by: Craig Topper <craig.topper@sifive.com>
This PR resolves https://github.com/llvm/llvm-project/issues/144513
The modification includes five patterns:
1. vselect Cond, 0, 0 -> 0
2. vselect Cond, -1, 0 -> bitcast Cond
3. vselect Cond, -1, x -> or Cond, x
4. vselect Cond, x, 0 -> and Cond, x
5. vselect Cond, 000..., X -> andn Cond, X
Patterns 1-4 have been migrated to DAGCombine; pattern 5 stays in the x86 code.
The reason is that you cannot use the andn instruction directly in
DAGCombine; you can only use and+xor, which would introduce optimization
order issues. For example, in the x86 backend, for select Cond, 0, x ->
(~Cond) & x, the backend first checks whether the cond node of
(~Cond) is a setcc node. If so, it modifies the comparison operator
of the condition, so the x86 backend cannot complete the andn
optimization. In short, I think it is a better choice to keep the pattern
vselect Cond, 000..., X instead of and+xor in DAGCombine.
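For reference, a sketch of how folds 3 and 4 might look in generic
DAGCombine terms (assuming Cond has already been checked to be
bitwise-compatible with the result type, which the real combine must verify):
```
// vselect Cond, -1, X  ->  or (bitcast Cond), X
if (ISD::isConstantSplatVectorAllOnes(TVal.getNode()))
  return DAG.getNode(ISD::OR, DL, VT, DAG.getBitcast(VT, Cond), FVal);
// vselect Cond, X, 0   ->  and (bitcast Cond), X
if (ISD::isConstantSplatVectorAllZeros(FVal.getNode()))
  return DAG.getNode(ISD::AND, DL, VT, DAG.getBitcast(VT, Cond), TVal);
```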
Regarding the commits: the first contains the code changes and X86 tests
(note 1), the second contains tests for other backends (note 2).
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>