Reverse the fold with handling inside canCreateUndefOrPoison for cases where we know that the extract index is in bounds.
This exposed a number or regressions, and required some initial freeze handling of SCALAR_TO_VECTOR, which will require us to properly improve demandedelts support to handle its undef upper elements.
There is still one outstanding regression to be addressed in the future - how do we want to handle folds involving frozen loads?
Fixes#86968
On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1),
(splat_vector i64 0)), where the splat_vectors are implicitly
truncating.
When the vselect is used by a binop we want to pull the vselect out via
foldSelectWithIdentityConstant. But because vectors with an element size
< i64 will truncate, isNeutralConstant will return false.
This patch handles truncating splats by getting the APInt value and
truncating it. We almost don't need to do this since most of the neutral
elements are either one/zero/all ones, but it will make a difference for
smax and smin.
I wasn't able to figure out a way to write the tests in terms of select,
since we need the i1 zext legalization to create a truncating
splat_vector.
This supercedes #87236. Fixed vectors are unfortunately not handled by
this patch (since they get legalized to _VL nodes), but they don't seem
to appear in the wild.
Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable
TypeSize, the MemoryType created becomes a scalable_vector.
2 MMOs that have scalable memory access can then use the updated BasicAA that
understands scalable LocationSize.
Original Patch by Harvin Iriawan
Co-authored-by: David Green <david.green@arm.com>
Allow targets to rely on TargetLowering::isGuaranteedNotToBeUndefOrPoisonForTargetNode to test nodes for canCreateUndefOrPoisonForTargetNode + all arguments are isGuaranteedNotToBeUndefOrPoison.
Targets can still perform this themselves for specific special case nodes (e.g. target shuffles).
Matches the fallback in SelectionDAG::isGuaranteedNotToBeUndefOrPoison
- Instead of lowering float/double ISD::ATOMIC_LOAD / ISD::ATOMIC_STORE
nodes to regular LOAD/STORE nodes, make them legal and select those nodes
properly instead. This avoids exposing them to the DAGCombiner.
- AtomicExpand pass no longer casts float/double atomic load/stores to integer
(FP128 is still casted).
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
are not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
When I created APIntOps::absdiff, I totally missed that we already have ISD::ABDS/ABDU nodes, and we use this term in other places/targets as well.
I've added the APIntOps::abds implementation and renamed APIntOps::absdiff to APIntOps::abdu.
Given that APIntOps::absdiff is so young I don't think we need to create a deprecation wrapper, but I can if anyone thinks it important.
I'll do a KnownBits rename patch after this.
No nans/infs in SelectionDAG is complicated. Hopefully I've captured
all of the cases. I've only applied to ConsiderFlags to the SDNodeFlags
since those are the only ones that will be droped by hoisting. The
condition code and TargetOptions would still be in effect.
Recovers some regression from #84232.
Teach canCreateUndefOrPoison that ISD::SETCC with integer operands can
never create undef/poison. FP SETCC is more complicated and will be
handled in a future patch.
Teach isGuaranteedNotToBeUndefOrPoison that ISD::CONDCODE is not
poison/undef. Its a special constant only used by setcc/select_cc like
nodes. This is needed since the hoisting will only hoist if exactly one
operand might be poison. setcc has 3 operand including the condition
code.
Recovers some regression from #84232.
This extends the known bits for extending loads which have range
metadata, handling the range metadata on the original memory type,
extending that to the correct BitWidth.
The base case of these call InferPtrInfo. This is dangerous due to
#82657, but it turns out none of these are used.
It seemed best to reduce the surface area until these are needed.
This handles two cases where we can work out some known-zero bits for
ISD::STEP_VECTOR.
The first case handles when we know the low bits are zero because the
step
amount is a power of two. This is taken from
https://reviews.llvm.org/D128159,
and even though the original patch didn't end up landing this case due
to it
not having any test difference, I've included it here for completeness's
sake.
The second case handles the case when we have an upper bound on
vscale_range.
We can use this to work out the upper bound on the number of elements,
and thus
what the maximum step will be. From the maximum step we then know which
hi bits
are zero.
On its own, computing the known hi bits results in some small
improvements for
RVV with -mrvv-vector-bits=zvl across the llvm-test-suite. However I'm
hoping
to be able to use this later to reduce the LMUL in index calculations
for
vrgather/indexed accesses.
---------
Co-authored-by: Philip Reames <preames@rivosinc.com>
Relying on ComputeKnownBits to find a splat is causing miscompilations where a shift of zero is being assumed to give zero, but further simplification leads to a shift of zero by undef, resulting in an unexpected undef value.
Fixes#78109
Return true iff all of vector elements are constant AND not zero
Fixes#77805
Previously, it'd return `true` (as in - the value is known to be never
zero) for any build_vector/splat_vector with non-constant elements.
This is the logical equivalent for #76710 for APInt and uses the same
naming scheme.
Converted existing users through:
`git grep -l "cast<ConstantSDNode>\(.*\).*getAPIntValueValue" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
Keep the haveNoCommonBitsSet check because we haven't started inferring
the flag yet.
I've added tests for two transforms, but these are not the only
transforms that use isADDLike.
This copies the flag from IR to the SDNode in SelectionDAGBuilder, clears
the flag in SimplifyDemandedBits, and adds it to canCreateUndefOrPoison.
Uses of the flag will come in later patches.
The helper function allows examples like
`cast<ConstantSDNode>(Op.getOperand(0))->getAPIntValue();` to be changed
to `Op.getConstantOperandAPInt(0);`.
See #76708 for further context. Although there are far fewer
opportunities for replacement, I used a similar git grep and sed combo
as before, given I already had it to hand:
`git grep -l "cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getAPIntValue\(\)" | xargs sed -E -i 's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getAPIntValue\(\)/\1->getConstantOperandAPInt(\2)/'`
and
`git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getAPIntValue\(\)" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getAPIntValue\(\)/\1.getConstantOperandAPInt(\2)/'`
This helper function shortens examples like
`cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to
`Node->getConstantOperandVal(1);`.
Implemented with:
`git grep -l
"cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/`
and `git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`.
With a couple of simple manual fixes needed. Result then processed by
`git clang-format`.
we need first judge N1.getNumOperands() > 0.
If Lowering Generated SDNode like.
```
v2i32 t20: TargetOpNode.
i32 t21: extract_vector_elt t20 0
i32 t22: extract_vector_elt t20 1
```
will cause a error.
It seems TypeSize is currently broken in the sense that:
TypeSize::Fixed(4) + TypeSize::Scalable(4) => TypeSize::Fixed(8)
without failing its assert that explicitly tests for this case:
assert(LHS.Scalable == RHS.Scalable && ...);
The reason this fails is that `Scalable` is a static method of class
TypeSize,
and LHS and RHS are both objects of class TypeSize. So this is
evaluating
if the pointer to the function Scalable == the pointer to the function
Scalable,
which is always true because LHS and RHS have the same class.
This patch fixes the issue by renaming `TypeSize::Scalable` ->
`TypeSize::getScalable`, as well as `TypeSize::Fixed` to
`TypeSize::getFixed`,
so that it no longer clashes with the variable in
FixedOrScalableQuantity.
The new methods now also better match the coding standard, which
specifies that:
* Variable names should be nouns (as they represent state)
* Function names should be verb phrases (as they represent actions)
This is a follow-up to #68981, and fix for #72630, #72447.
We may end up in SelectionDAG::salvageDebugInfo() with indirect debug
values, and attempting to salvage ADD nodes with non-constant RHS would
lead us to try to turn those indirect debug values variadic, which is
not allowed.
This triggered the following assert in the SDDbgValue constructor:
Assertion `!(IsVariadic && IsIndirect)' failed.
This also adds a lit test for salvaging when having an indirect debug
value and constant RHS, as there seems like there was no such lit test.
However, I am not sure if the use of the stack_value operation is
correct in that case (which is existing behavior before #68981), but
that at least documents the current behavior.
This adds the nneg flag to SDNodeFlags and the node printing code.
SelectionDAGBuilder will add this flag to the node if the target doesn't
prefer sign extend.
A future RISC-V patch can remove the sign extend preference from
SelectionDAGBuilder.
I've also added the flag to the DAG combine that converts
ISD::SIGN_EXTEND to ISD::ZERO_EXTEND.
Teach SelectionDAG::salvageDebugInfo() to salvage debug information for
ADD nodes where the RHS is non-constant.
In the first try, the test case used the MIR output that was produced by
using -stop-before=sparc-isel. Running -start-before=sparc-isel on that
output resulted in the following verifier error with EXPENSIVE_CHECKS:
"Function has NoVRegs property but there are VReg operands".
In this re-attempt the test case has been rewritten to a .ll test by
extracting the IR from the original MIR file. The test still starts
before sparc-isel.
Co-authored-by: Mikael Holmen <mikael.holmen@ericsson.com>
Teach SelectionDAG::salvageDebugInfo() to salvage debug information for
ADD nodes where the RHS is non-constant.
Co-authored-by: Mikael Holmen <mikael.holmen@ericsson.com>
- [DebugInfo] Precommit testcase for pointer addition with unknown
offset
- [SelectionDAG] Salvage debug info for non-constant ADDs
---------
Co-authored-by: Mikael Holmen <mikael.holmen@ericsson.com>
The issue #55208 noticed that std::rint is vectorized by the
SLPVectorizer, but a very similar function, std::lrint, is not.
std::lrint corresponds to ISD::LRINT in the SelectionDAG, and
std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now,
neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant,
and the LangRef makes this clear in the documentation of llvm.lrint.*
and llvm.llrint.*.
This patch extends the LangRef to include vector variants of
llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of
scalarizing it for all targets. However, this patch would be devoid of
motivation unless we show the utility of these new vector variants.
Hence, the RISCV target has been chosen to implement a custom lowering
to the vfcvt.x.f.v instruction. The patch also includes a CostModel for
RISCV, and a trivial follow-up can potentially enable the SLPVectorizer
to vectorize std::lrint and std::llrint, fixing #55208.
The patch includes tests, obviously for the RISCV target, but also for
the X86, AArch64, and PowerPC targets to justify the addition of the
vector variants to the LangRef.