The current implementation considers both iPTR+iN and everything else
all in one go. This leads to more special casing, when iPTR is present
in only one set, than is described in the comment block. Moreover, it
makes it very difficult to add any new iPTR-like wildcards, due to the
exponential combinatorial explosion that occurs.
Logically, iPTR+iN handling is entirely independent from everything
else, so rewrite the code to do them separately. This removes special
cases, making the core of the implementation more succinct, whilst more
clearly implementing exactly what is described in the comment block, and
allows for any number of (non-overlapping) wildcards to be added to the
list, as needed by CHERI LLVM downstream (due to having a new capability
type which, much like a normal integer pointer in LLVM, varies in size
between targets and modes).
In testing, this change results in identical TableGen output for all
in-tree backends (including those in LLVM_ALL_EXPERIMENTAL_TARGETS), and
it is intended that this implementation is entirely equivalent to the
old one.
Almost all uses of `TreePatternNode *` expect it to be non-null. There
was the occasional null check, which I have removed. Making them
references makes it clear that the nodes must exist.
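A minimal sketch of the shape of that change, with a generic Node type standing in for TreePatternNode:

```cpp
#include <string>

struct Node { std::string Name; };

// Before: a pointer parameter invites defensive null checks even
// though every caller passes a valid node.
std::string nameOf(const Node *N) {
  if (!N) // the kind of dead check this change removes
    return "";
  return N->Name;
}

// After: a reference states the "must exist" contract in the type.
std::string nameOf(const Node &N) { return N.Name; }
```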
This was attempted in 2018 (1b465767d6ca69f4b7201503f5f21e6125fe049a)
for `TreePatternNode::getChild()` but that was reverted.
This is the logical equivalent of #76710 for APInt and uses the same
naming scheme.
Converted existing users through:
`git grep -l "cast<ConstantSDNode>\(.*\).*getAPIntValue" | xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getAPIntValue/\1->getAsAPIntVal/'`
While working on DAGISelMatcherEmitter, I hit several runtime errors
caused by out-of-bounds accesses to TreePatternNode::Types. These were
difficult to debug because the switch from std::vector to unique_ptr
removed bounds checking.
I don't think the slight reduction in class size is worth the extra
debugging and memory safety problems, so I suggest we revert this.
This reverts commit d34125a1a825208b592cfa8f5fc3566303d691a4.
Differential Revision: https://reviews.llvm.org/D154781
Recent changes to RISC-V cause the same predicate to appear in the
predicate list multiple times in some cases. This patch filters the
duplicates to reduce the number of predicate string variations.
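A minimal sketch of such filtering, assuming string predicates (the actual record types differ):

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Keep only the first occurrence of each predicate, preserving the
// original order of the list.
std::vector<std::string>
uniquePredicates(const std::vector<std::string> &Preds) {
  std::unordered_set<std::string> Seen;
  std::vector<std::string> Result;
  for (const std::string &P : Preds)
    if (Seen.insert(P).second) // false for a repeated predicate
      Result.push_back(P);
  return Result;
}
```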
Bugs were found by the Svace static analysis tool: null pointers were
dereferenced right after error checks that do not return from the
function.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D147706
These vectors are resized in the constructor and never change size, so
we can manually allocate two arrays instead. This reduces the size of
TreePatternNode by removing the unneeded capacity-end pointer field
from each std::vector.
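A sketch of the technique, with hypothetical names (the real fields live in TreePatternNode):

```cpp
#include <memory>

struct TypeInfo {};

class NodeTypes {
  // std::vector<TypeInfo> would store begin, end, and capacity-end
  // pointers; a fixed-size allocation needs only one pointer.
  std::unique_ptr<TypeInfo[]> Types;
  unsigned NumTypes;

public:
  explicit NodeTypes(unsigned N)
      : Types(std::make_unique<TypeInfo[]>(N)), NumTypes(N) {}

  TypeInfo &getType(unsigned I) { return Types[I]; }
  unsigned getNumTypes() const { return NumTypes; }
};
```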
We reserved 16 AddrSpaces in every TypeSetByHwMode, but on targets
that make use of the AddrSpace feature we only ever use the first one.
The vector was populated by pushing one entry for each element of the
ArrayRef passed to the TypeSetByHwMode constructor. Each entry is a
ValueTypeByHwMode that stores one VT for each HwMode.
The vector is accessed by a loop in TypeSetByHwMode::getValueTypeByHwMode.
That loop iterates over the HwModes within the TypeSetByHwMode, which
is unrelated to how the vector was created; the entries in the vector
don't represent HwModes.
The targets that use AddrSpace don't make use of HwModes, so the loop
in getValueTypeByHwMode runs only one iteration, and only the first
entry in the vector is meaningfully used.
This patch simplifies things by storing only one AddrSpace in
TypeSetByHwMode, reducing the memory it uses.
More work will be needed to support HwModes with AddrSpace if we ever
need a different AddrSpace for each HwMode.
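A sketch of the layout change, with hypothetical field names (the real class has more state than shown):

```cpp
struct TypeSetSketch {
  // Before: a vector with 16 reserved slots, only slot 0 ever used.
  //   SmallVector<unsigned, 16> AddrSpaces;
  // After: a single address space per set.
  unsigned AddrSpace = 0;
};
```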
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D148194
An intrusive reference counter uses less memory than the control
block of std::shared_ptr.
This should allow some additional code simplifications if we
don't need to pass around shared_ptr in order to create new
shared_ptrs.
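A hand-rolled sketch of the pattern (LLVM's own llvm::IntrusiveRefCntPtr provides this): the count is embedded in the object, so the smart pointer is a single raw pointer with no separately allocated control block or weak count.

```cpp
#include <atomic>
#include <utility>

class RefCounted {
  mutable std::atomic<unsigned> RefCount{0};

public:
  void retain() const { RefCount.fetch_add(1, std::memory_order_relaxed); }
  void release() const {
    // Destroy the object when the last reference is dropped.
    if (RefCount.fetch_sub(1, std::memory_order_acq_rel) == 1)
      delete this;
  }
  virtual ~RefCounted() = default;
};

template <typename T> class IntrusivePtr {
  T *Ptr = nullptr;

public:
  IntrusivePtr() = default;
  explicit IntrusivePtr(T *P) : Ptr(P) {
    if (Ptr)
      Ptr->retain();
  }
  IntrusivePtr(const IntrusivePtr &O) : Ptr(O.Ptr) {
    if (Ptr)
      Ptr->retain();
  }
  // Copy-and-swap: the incoming copy holds a reference, and the old
  // pointer is released when O is destroyed.
  IntrusivePtr &operator=(IntrusivePtr O) {
    std::swap(Ptr, O.Ptr);
    return *this;
  }
  ~IntrusivePtr() {
    if (Ptr)
      Ptr->release();
  }
  T *operator->() const { return Ptr; }
};
```

sizeof(IntrusivePtr&lt;T&gt;) is one pointer, versus two for std::shared_ptr plus its heap-allocated control block.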
We don't need to copy the ChildVariants vector into a new vector. We
use std::move on every entry we copy, so the originals clearly aren't
needed again; we can swap entries within the existing vector and reuse
it instead.
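A simplified illustration of the idea (not the actual TableGen code): since the replaced entry is never needed again, swap in place rather than building a second vector.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using NodePtr = std::unique_ptr<int>; // stand-in for a pattern node

void replaceVariant(std::vector<NodePtr> &Variants, std::size_t I,
                    NodePtr New) {
  // No fresh vector and no per-entry moves: exchange one slot and let
  // the displaced value die with `New` at end of scope.
  std::swap(Variants[I], New);
}
```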
Previously we were passing 'this' and the std::shared_ptr version
of 'this'. This replaces all uses of 'this' with the shared_ptr and
makes the method static.
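A minimal sketch of that refactor, with hypothetical names:

```cpp
#include <memory>

struct Node {
  // Before: both the implicit 'this' and a shared_ptr to the same
  // object were in play inside the method.
  //   void process(std::shared_ptr<Node> Self);
  // After: static, so the shared_ptr is the only handle to the node.
  static void process(std::shared_ptr<Node> Self) { (void)Self; }
};
```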
We add entries to the map in two places. One already used
std::piecewise_construct with map::emplace. The other was using
map::insert. Change both to map::try_emplace.
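A small sketch of the cleanup, assuming a map from keys to default-constructed values:

```cpp
#include <map>
#include <string>
#include <vector>

using PatternMap = std::map<std::string, std::vector<int>>;

void addKey(PatternMap &M, const std::string &Key) {
  // Before, two styles coexisted:
  //   M.emplace(std::piecewise_construct,
  //             std::forward_as_tuple(Key), std::forward_as_tuple());
  //   M.insert({Key, std::vector<int>()});
  // try_emplace says the same thing more directly, constructing the
  // value in place only when the key is absent.
  M.try_emplace(Key);
}
```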
InfoByHwMode is the base class of TypeSetByHwMode, and the two methods
did the same thing, so there is no reason to have two ways to do it.
Also use the getSimple() accessor instead of Map.begin()->second.
Comparing types is quite expensive when hardware modes are being
used. Checking the operator first can let us detect mismatches
earlier without checking types.
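A simplified sketch of the ordering: compare the cheap discriminator before the HwMode-aware types.

```cpp
struct PatternNode {
  const void *Operator; // a single pointer: cheap to compare
  // Stub standing in for the costly HwMode-aware type comparison.
  bool typesMatch(const PatternNode &Other) const { return true; }
};

bool sameNode(const PatternNode &A, const PatternNode &B) {
  if (A.Operator != B.Operator) // most mismatches end here
    return false;
  return A.typesMatch(B); // expensive check only on operator match
}
```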
Use the predicate condition instead of checkFeatures in *GenDAGISel.inc.
This makes the code similar to isel pattern predicates.
checkFeatures is still used by code created by SubtargetEmitter so
we can't remove the string. Backends need to be careful to keep
the string and predicates in sync, but I don't think that's a big issue.
I haven't measured it, but this should be a compile time improvement
for isel since we don't have to do any of the string processing that's
inside checkFeatures.
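A hypothetical before/after of the generated check (shape only; real output is per-target and differs in detail):

```cpp
// Illustrative stand-in for a generated subtarget class.
struct SubtargetSketch {
  bool FeatureA = false, FeatureB = false;
  bool hasFeatureA() const { return FeatureA; }
  bool hasFeatureB() const { return FeatureB; }
};

bool predicateHolds(const SubtargetSketch &ST) {
  // Before: a feature string parsed at run time, e.g.
  //   return checkFeatures("+feature-a,+feature-b");
  // After: the predicate's condition, evaluated directly.
  return ST.hasFeatureA() && ST.hasFeatureB();
}
```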
Reviewed By: kparzysz
Differential Revision: https://reviews.llvm.org/D146012
RFC to add a way to ignore COPY instructions when pattern-matching MIR in GISel.
- Add a new "GISelFlags" class to TableGen. Both `Pattern` and `PatFrags` defs can use it to alter matching behaviour.
- Flags start at zero and are scoped: the setter returns a `SaveAndRestore` object so that when the current scope ends, the flags are restored to their previous values. This allows child patterns to modify the flags without affecting the parent pattern (see the sketch after this list).
- Child patterns always inherit the parent's flags, but they can override their values. For more examples, see the `GlobalISelEmitterFlags.td` tests.
- [AMDGPU] Use the IgnoreCopies flag in BFI patterns, which are known to be bothered by cross-regbank copies.
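A minimal sketch of the scoping mechanism using llvm::SaveAndRestore (the flag variable and bit are hypothetical):

```cpp
#include "llvm/Support/SaveAndRestore.h"
#include <cstdint>

static uint32_t CurrentFlags = 0;          // the emitter's active flags
constexpr uint32_t IgnoreCopies = 1u << 0; // hypothetical flag bit

void matchChildPattern() {
  // The setter hands back a SaveAndRestore; when it goes out of
  // scope, the parent's flags are restored automatically.
  llvm::SaveAndRestore<uint32_t> Scope(CurrentFlags,
                                       CurrentFlags | IgnoreCopies);
  // ... match the child pattern with the modified flags ...
} // CurrentFlags reverts to the parent's value here.
```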
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D136234
Simplifies the implementation of `TypeSize` while retaining its interface.
There is no need for abstract concepts like `LinearPolyBase`, `UnivariateLinearPolyBase` or `LinearPolySize`.
Differential Revision: https://reviews.llvm.org/D140263
The TableGen implementation was using a homegrown implementation of
FunctionModRefInfo. This switches it to use MemoryEffects instead.
This makes the code simpler, and will allow exposing the full
representational power of MemoryEffects in the future. Among other
things, this will allow us to map IntrHasSideEffects to an
inaccessiblemem readwrite, rather than just ignoring it entirely
in most cases.
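A sketch of how such a mapping can be expressed with the MemoryEffects factories from llvm/Support/ModRef.h (the helper and its parameters are hypothetical, not the emitter's actual code):

```cpp
#include "llvm/Support/ModRef.h"
using namespace llvm;

MemoryEffects effectsForIntrinsic(bool ReadMem, bool WriteMem,
                                  bool HasSideEffects) {
  MemoryEffects ME = MemoryEffects::none();
  if (ReadMem)
    ME = ME | MemoryEffects::readOnly();
  if (WriteMem)
    ME = ME | MemoryEffects::writeOnly();
  if (HasSideEffects)
    // The future direction mentioned above: model side effects as an
    // inaccessiblemem read-write instead of dropping them.
    ME = ME | MemoryEffects::inaccessibleMemOnly(ModRefInfo::ModRef);
  return ME;
}
```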
To avoid layering issues, this moves the ModRef.h header from IR
to Support, so that it can be included in the TableGen layer.
Differential Revision: https://reviews.llvm.org/D137641
SelectionDAG has a target hook, getExtendForAtomicOps, which it uses
in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty
ugly (as is having a separate load opcode for atomics), so instead
allow making use of atomic zextload. Enable this for AArch64 since the
DAG path defaults to the zext behavior.
The tablegen changes are pretty ugly, but they partially help migrate
SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with
atomic memory operands. For now the DAG emitter will emit matchers for
patterns which the DAG will not produce.
I'm still a bit confused by the intent of the isLoad/isStore/isAtomic
bits. The DAG implementation rejects trying to use any of these in
combination. For now I've opted to make the isLoad checks also check
isAtomic, although I think having isLoad and isAtomic set on these
makes the most sense.
This change introduces the HasNoUse builtin predicate in PatFrags,
which checks that the first result operand has no uses.
GlobalISelEmitter will allow source PatFrags with this predicate to be
matched with destination instructions with empty outs. This predicate is
required for selecting the no-return variant of atomic instructions in
AMDGPU.
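A sketch of what the predicate amounts to, phrased against GlobalISel's MachineRegisterInfo (illustrative, not the generated matcher code):

```cpp
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"

// The first result operand (operand 0 of a def) has no non-debug uses.
static bool firstResultHasNoUse(const llvm::MachineInstr &MI,
                                const llvm::MachineRegisterInfo &MRI) {
  return MRI.use_nodbg_empty(MI.getOperand(0).getReg());
}
```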
Differential Revision: https://reviews.llvm.org/D125212