This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
are not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
Resolves#84745.
Based on SDPatternMatch introduced by #78654, this patch replaces some
of basic patterns in `visitADDLike` with corresponding patterns in
SDPatternMatch.
This patch only replaces original folds, instead of introducing new ones.
An EXTRACT_VECTOR_ELT can extend the element to the width of its result
type, leaving the high bits undefined. Previously if we attempted to
query the bytes in these high bits we would recurse and hit an
assertion. This fixes it by bailing if the index is outside of the
vector element size.
I think the assertion Index < ByteWidth may still be incorrect, since
ByteWidth is calculated from Op.getValueSizeInBits(). I believe this
should be Op.getScalarValueSizeInBits() whenever VectorIndex is set
since we're querying the element now, not the vector. But I couldn't
think of a test case to trigger it. It can be addressed in a follow-up
patch.
Fixes#83920
This is another smaller step of #70452, changing the signature of
computeAliasing() from optional<uint64_t> to LocationSize, and follow-up
changes in DAGCombiner::mayAlias(). There are some test change due to
the previous AA->isNoAlias call incorrectly using an unknown size
(~UINT64_T(0)). This should then be improved again in #70452 when the
types are known to be scalable.
This patch also pick the MatchContext framework from DAGCombiner to an
indiviual header file to make the framework be used from other files in
llvm/lib/CodeGen/SelectionDAG/.
If we have a sext and a zext nneg with the same types and operand
we should combine them into the sext. We can't go the other way
because the nneg flag may only be valid in the context of the uses
of the zext nneg.
This treats the zext nneg as sext if X is known to have sufficient sign
bits to allow the zext or truncate or both to removed. This code is
taken from the same optimization for sext.
We already have the PtrOff factored into MachinePointerInfo. Any calls
to getAlign on the new load with do commonAlignment with the
MachinePointerInfo offset and the base alignment.
The getAlign function for a load returns the commonAlignment of the
"base align" and the offset stored in the MachinePointerInfo.
We're splitting a load here, so we should take the base alignment from
the original load without any offset that may already exist in the
original load. The new load can then maintain its own alignment using
just the base alignment and its own offset.
Noticed by inspection.
The load combine replaces a number of original loads with one new loads
and also replaces the output chains of the original loads with the
output chain of the new load. This is incorrect if the original load is
retained (due to multi-use), as it may get incorrectly reordered.
Fix this by using makeEquivalentMemoryOrdering() instead, which will
create a TokenFactor with both chains.
Fixes https://github.com/llvm/llvm-project/issues/80911.
Don't call TLI.SimplifyDemandedVectorElts directly from every SimplifyDemandedBits call, use the more expressive wrappers instead first.
This reduces the number of places we call TLI.SimplifyDemandedVectorElts and CommitTargetLoweringOpt to make it easier to track.
Part of the work to process DAG nodes in topological order.
Fixes https://github.com/llvm/llvm-project/issues/80744. This transform
doesn't handled vectors at all, The fixed length ones pass the first
check, but would fail the constant operand checks which immediate follow.
This patch takes the simplest approach, and just guards the transform
for scalar integers.
If we have a shifted mask, we may be able to reduce the load width
to the width of the non-zero part of the mask and use an offset
to the base address to remove the srl. The offset is given by
C+trailingzeros(ShiftedMask).
Then we add a final shl to restore the trailing zero bits.
I've use the ARM test because that's where the existing (and (srl
(load))) tests were.
The X86 test was modified to keep the H register.
The use of SmallDenseMap saves 0.48% of heap allocations during the
compilation of a large preprocessed file, namely X86ISelLowering.cpp,
for the X86 target. During the experiment, the maximum size of
WorklistMap was 24 or less 74% of the time. (Note that DenseMap has
the maximum occupancy rate of 3/4.)
Fixes#78897 - although the test case still has a number of poor codegen issues (in particular for i686 triples) that will need addressing (combining the nodes in topological order should help).
BSWAP/BITREVERSE/CTPOP/CTLZ/CTLZ_ZERO_UNDEF/CTTZ/CTTZ_ZERO_UNDEF are all handled by FoldConstantArithmetic - so use directly instead of testing for isConstantIntBuildVectorOrConstantInt and relying on DAG.getNode() to perform the constant fold.
This is the logical equivalent for #76710 for APInt and uses the same
naming scheme.
Converted existing users through:
`git grep -l "cast<ConstantSDNode>\(.*\).*getAPIntValueValue" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
Keep the haveNoCommonBitsSet check because we haven't started inferring
the flag yet.
I've added tests for two transforms, but these are not the only
transforms that use isADDLike.
Similar to the existing isZExtFree(SDValue, EVT) wrapper, this will allow targets to override for specific cases (e.g. free truncation of an ext/extload node). But for now its just used to wrap the existing isTruncateFree(EVT, EVT) call.
Pass the original MMO instead of different individual values.
getAlign() was used before where actually getOriginalAlign() would have been
better, and this patch has the same effect.