- Instead of checking the default ops directly, this change queries DAG
default operands collected during patterns reading. It does not only
simplify the code but also handle few cases where integer values are
converted from convertible types, such as 'bits'.
- A test case is added GlobalISelEmitter.td as the regression test of
default 'bits' values.
Most users of AddImm and CheckConstantInt only use 1 byte immediates, so
I added an opcode variants for those. That way all those instructions
save 7 bytes.
Also added an opcode for AddTempRegister for the cases where there are
no register flags.
Space savings:
- AMDGPUGenGlobalISel: 470180 bytes to 422564 (-10%)
- AArch64GenGlobalISel.inc: 383893 bytes to 374046
If there is only one bit set in EmitNodeInfo, then we can encode it
implicitly to save one byte.
Overall this reduces the llc binary size with all in-tree targets by
about 168K.
The most common type is i32 or i64 so we add `OPC_CheckTypeI32` and
`OPC_CheckTypeI64` to save one byte.
Overall this reduces the llc binary size with all in-tree targets by
about 29K.
When importing instruction selection patterns into GlobalISel, the
operands matched in the "source" DAG are copied into corresponding
operands of the "destination" DAG according to their names (such as Rd).
If multiple operands in the source DAG share the same name, a
GIM_CheckIsSameOperand predicate makes instruction selector check the
corresponding operands for equality (at compiler run-time) as part of
matching the source pattern.
The Def operands of the root node of the destination DAG are handled
specially. The operands of the instruction corresponding to the root
node are taken and GIM_CheckRegBankForClass predicates are
tablegen-erated accordingly. If by coincidence the Def operand in
question has the same name as one of the named operands in the pattern,
a GIM_CheckIsSameOperand predicate is automatically added that is likely
to prevent matching the source of otherwise applicable selection pattern
at compiler run-time.
This patch mangles the Def operand names taken from the instruction
corresponding to the root of the destination DAG (for example, "Rd"
becomes "DstI[Rd]") preventing unexpected name clashes with pattern's
named operands.
The patch consists of three sets of changes:
* changes to the GlobalISelEmitter.cpp file are the actual fix
* a test case is added to GlobalISelEmitter.td file as a regression test
* everything else is the biggest and least interesting part - updates to
the existing test cases: renames of the form Rd -> DstI[Rd] inside the
inline comments in tablegen-erated code
KMOV is essential for copy between k-registers and GPRs.
R16-R31 was added into GPRs in #70958, so we extend KMOV for these new
registers first.
This patch
1. Promotes KMOV instructions from VEX space to EVEX space
2. Emits prefix {evex} for the EVEX variants
3. Prefers EVEX variant than VEX variant in ISEL and optimizations for
better RA
EVEX variants will be compressed to VEX variants by existing EVEX2VEX
pass if no EGPR is used.
RFC:
https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4
TAG: llvm-test-suite && CPU2017 can be built with feature egpr
successfully.
1. Rename the names of tables to simplify the print
2. Align the abbreviation in the same file Instr -> Inst
3. Clang-format
4. Capitalize the first char of the variable name
These two opcodes are used to be followed by a MVT operand, which is
always one of i8/i16/i32/i64.
We add instantiated `OPC_EmitInteger` and `OPC_EmitStringInteger` with
i8/i16/i32/i64 so that we can reduce one byte.
We reserve `OPC_EmitInteger` and `OPC_EmitStringInteger` in case that
we may need them someday, though I haven't found one usage after this
change.
Overall this reduces the llc binary size with all in-tree targets by
about 200K.
Also disables generation of MutateOpcode. It's almost never used in
combiners anyway.
If we really want to use it, it needs to be investigated & properly
fixed (see TODO)
Fixes#70780
The inference is trivial and leverages the MCOI OperandTypes encoded in
CodeGenInstructions to infer types across patterns in a CombineRule.
It's thus very limited and only supports CodeGenInstructions (but that's the
main use case so it's fine).
We only try to infer untyped operands in apply patterns when they're
temp reg defs, or immediates. Inference always outputs a `GITypeOf<$x>` where
$x is a named operand from a match pattern.
This allows us to drop the `GITypeOf` in most cases without any errors.
The Scratch buffer passed to getBinaryCodeForInst needs to be able to
hold any value returned by getMachineOpValue or other custom encoders.
It's better to let the caller of getBinaryCodeForInst set the size of
Scratch as it's impossible for VarLenCodeEmitterGen to know what the
smallest needed size is.
VarLenCodeEmitterGen now calculates its smallest needed Scratch bit
width based on the slice operations and zero extends Scratch if it's too
small. This only guarantees that Scratch has enough bits for the
generated code not for getMachineOpValue or custom encoders.
The smallest internal APInt representation uses one uint64_t word so
there is no point in using a smaller size.
VarLenCodeEmitterGen produced code that did not compile if using
alternative encoding in different HwModes. It's not possbile to assign
unsigned **Index = Index_<mode>[][2] = { ... };
As a fix, Index and InstBits where removed in favor of mode specific
getInstBits_<mode> functions since this is the only place the arrays are
accessed.
Handling of HwModes is now concentrated to the VarLenCodeEmitterGen::run
method reducing the overall amount of code and enabling other types of
alternative encodings not related to HwModes.
Added a test for VarLenCodeEmitterGen HwModes.
Make sure that HwModes are supported in the same way they are supported
for the standard CodeEmitter. It should be possible to define
instructions with universal encoding across modes, distinct encodings
for each mode or only define encodings for some modes.
Fixed indentation in generated code.
The keyword is intended for debugging purpose. It prints a message to
stderr.
This patch is based on code originally written by Adam Nemet, and on the
feedback received by the reviewers in
https://reviews.llvm.org/D157492.
Use `MachineIRBuilder::buildConstant` to emit typed immediates in
'apply' MIR patterns.
This adds flexibility, e.g. it allows us to seamlessly handle vector
cases, where a `G_BUILD_VECTOR` is needed to create a splat.
The MatchTableExecutor did not use the MachineIRBuilder but instead
created instructions ad-hoc.
Making it use a Builder has the benefit that any observer added by a
combine is now notified when instructions are created by MIR patterns.
Another benefit is that it allows me to improve how constants are
created in apply MIR patterns.
`MachineIRBuilder::buildConstant` automatically handles splats for us,
this means that we may change `addCImm` to use that and handle vector
cases automatically.
The !repr operator represents the content of a variable or of a record
as a string.
This patch is based on code originally written by Adam Nemet, and on the
feedback received by the reviewers in
https://reviews.llvm.org/D157492.
When `PredicateUsesOperands` is set to true, GlobalISelEmitter preserves
the original index of predicate operands and uses that information on
each predicate usage. However, previously it only looked up the original
index for "actual" operands (i.e. operands of a predicate usage) that
are leaf nodes, which is an incorrect assumption.
This patch fix it by generalizing the acceptable kinds of actual
operands for predicate as well as checking the existance of bound
predicate operands.
…the size of a pointer by HwMode.
This adds an equivalent of PtrValueType that can use a
ValueTypeByHwMode.
This allows the size of a pointer to vary based on HwMode. This is
needed for RISC-V to support XLen sized pointers.
X86 don't want to unfold RMW instrs to 1 load + 1 op + 1 store, because
RMW could save code size and benefit RA when reg pressure is high.
And from all the call position analysis, we could find we didn't unfold
RMW in current code.
We add a third argument `step` to `!range` bang operator to make it
with the same semantics as `range` in Python.
`step` can be negative. `step` is 1 by default and `step` can't be
0. If `start` < `end` and `step` is negative, or `start` > `end`
and `step` is positive, the result is an empty list.
We used to return `int` in `getAsInt`, while `IntInit::getValue`
returns `int64_t` and `utohexstr` needs `uint64_t`. The casting
causes the wrong hex value when printing bits value.
The call to `initOpcodeValuesMap` was missing, causing the MatchTable to
(unintentionally) not emit a `SwitchMatcher`. Also adds other code
imported from `GlobalISelEmitter.cpp` to ensure rules are sorted by
precedence as well.
Overall this improves GlobalISel compile-time performance by a
noticeable amount. See #66751
A field `FilterClassField` is added to `GenericTable` class, which
is an optional bit field of `FilterClass`. If specified, only those
records with this field being true will have corresponding entries
in the table.
We have already customized folding for VINSERTPS by 7e6606f4f1, which do
the folding when alignment >= 4 bytes.
We cannot arbitrarily fold it like others because we need to calculate
the source offset.
TableGen's lexer was unable to handle nested #ifndef when the outer
`#ifdef` / `#ifndef` scope is subject to skip. This was caused by returning
the canonicalized token when it should have returned the original one.
Fix#65100.
Differential Revision: https://reviews.llvm.org/D159236
Adds a new feature to MIR patterns: builtin instructions.
They offer some additional capabilities that currently cannot be expressed without falling back to C++ code.
There are two builtins added with this patch, but more can be added later as new needs arise:
- GIReplaceReg
- GIEraseRoot
Depends on D158714, D158713
Reviewed By: arsenm, aemerson
Differential Revision: https://reviews.llvm.org/D158975
Now that the old backend is gone, clean-up a few things that no longer make sense and tidy up the file a bit.
Depends on D158710
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158714
Irritatingly, atomic_store had operands in the opposite order from
regular store. This made it difficult to share patterns between
regular and atomic stores.
There was a previous incomplete attempt to move atomic_store into the
regular StoreSDNode which would be better.
I think it was a mistake for all atomicrmw to swap the operand order,
so maybe it's better to take this one step further.
https://reviews.llvm.org/D123143
The MatchTable-based GlobalISel Combiner backend is the new default. There are no in-tree users left of the old backend.
- Removed implementation of old MatchDAG-based Combiner, including tests, the backend itself and all supporting code.
- Renamed MatchTable backend to `GlobalISelCombinerEmitter.cpp` + removed "-matchtable" from its CL option.
- no need to have a verbose name as it's the only backend left now.
Reviewed By: aemerson
Differential Revision: https://reviews.llvm.org/D158710
D150312 added a TODO:
TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.
This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.
This commit as previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.
Differential Revision: https://reviews.llvm.org/D158568
D150312 added a TODO:
TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.
This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.
This commit as previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.
Differential Revision: https://reviews.llvm.org/D158568
D150312 added a TODO:
TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.
This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.
Differential Revision: https://reviews.llvm.org/D158568
I had to tighten the restrictions on PatFrags a bit to make it consistent: instructions that
define the root of a PF can only have one def.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D157700
PredicateBitset currently uses std::bitset, but std::bitset doesn't
have a constexpr constructor or any constexpr methods until C++23.
Each target that supports GlobalIsel has as an array of PredicateBitset
objects that currently use a global constructor.
SubtargetFeature used by the MC layer for feature bits, has its own
implementation of std::bitset that has constexpr constructor and methods
that provides all the capabilities that PredicateBitset needs.
This patch copies the implementation from SubtargetFeature, makes
it a template class, and puts it in ADT. I'll migrate SubtargetFeature
in a separate patch. Adapting all existing users to it being a template
was distracting from the goal of this patch.
This reduces the binary size of llc built with gcc 8.5.0 on my local
build by ~15k.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158576
Binary and decimal values were reconginzed by strtoll, which returns
error when the msb is 1, and the error was ignored, resulting to wrong
results.
This patch fixes the issue.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D157079