Since there are no opcodes for atomic loads and stores comparing to
SelectionDAG, we add `CheckMMOIsNonAtomic` predicate immediately after
the opcode predicate to make a logical combination of them. Otherwise
when `IPM_AtomicOrderingMMO` is inserted after `IPM_GenericPredicate`,
the patterns without predicates get a higher priority as
`IPM_AtomicOrderingMMO` has higher priority than `IPM_GenericPredicate`.
This is important to preserve an order of aligned/unaligned patterns on
X86 because aligned memory operations have an additional alignment
predicate and should be checked first according to their placement in td
file.
Closes#121446
Calling MCRegisterClass::contains with a Register does an implicit
conversion from Register to MCRegister. I think MCRegister is only
intended to be used for physical registers. We should protect this
implicit conversion by checking for physical registers first.
While I was here I removed some unnecessary parentheses from the output.
The signature of `CheckTemplateArgValues` implements error handling via
the `bool` return type, yet always returned false. The single possible
error case instead used `PrintFatalError,` which exits the program
afterward.
This behavior is undesirable: It prevents any further errors from being
printed and makes TableGen less usable as a library as it crashes the
entire process (e.g. `tblgen-lsp-server`).
This PR therefore fixes the issue by using `Error` instead and returning
true if an error occurred. All callers already perform proper error
handling.
As `llvm-tblgen` exits on error, a test was also added to the LSP to
ensure it exits normally despite the error.
When importing nested patterns, we create InsnMatcher for each pattern
and miss them if consider only the top level InsnMatcher. Iterate
PhysRegOperands instead.
Change the type of PhysRegOperands from DenseMap to SmallMapVector to
have stable generation. Also drop PhysRegInputs member from InsnMatcher
as there are no users of it.
Types used in the destination DAG of a pattern should not matter for
GlobalISel. All necessary checks are emitted in the form of matchers
when traversing the source DAG.
In particular, the check prevented importing patterns containing iPTR in
the middle of the destination DAG.
This reduces the number of skipped patterns on Mips and RISCV:
```
Mips 1270 -> 1212 (-58)
RISCV 42165 -> 42088 (-77)
```
Most of these patterns are for atomic operations.
Sub-instruction can have a def with the same name as a def in a
top-level instruction.
Previously this could result in both defs copied to the instruction
being built.
The existing test case is not representative. Even though TableGen
doesn't complain, the code generated from it is invalid and fails
verification with the message "Use not jointly dominated by defs.".
There is no way to magically transform `frameindex` to `tframeindex`
as it happens for some other leaf nodes. `frameindex` can only be
selected by custom C++ code or by using an `SDNodeXForm`.
This patch makes the test representative one and fixes the handling of
`G_FRAME_INDEX`, which shouldn't have set the operand's name.
It also fixes the type of the result of `G_FRAME_INDEX` in order to get
the correct type check (`GIM_CheckPointerToAny` instead of
`GIM_CheckType` with a scalar LLT argument).
The number of skipped patterns reduces for ARM from 4278 to 4257.
This is the only in-tree target that makes use of OptionalDefOperand.
Pull Request: https://github.com/llvm/llvm-project/pull/120470
A dead implicit def wasn't marked as dead if it is also an implicit use.
The new approach should also be more straightforward and simplifies
future changes for supporting optional defs and physical register defs.
Pull Request: https://github.com/llvm/llvm-project/pull/120426
There are cases (like in an upcoming patch to MLIR's `Property` class)
where the ? value is a useful null value. However, existing predicates
make ti difficult to test if the value in a record one is operating is ?
or not.
This commit adds the !initialized predicate, which is 1 on concrete,
non-? values and 0 on ?.
---------
Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>
Add some comments that hopefully clarify a few things.
This was supposed to be NFC, but there is a difference in the inferred
register class for EXTRACT_SUBREG.
Pull Request: https://github.com/llvm/llvm-project/pull/120135
Some clients do not want to emit a terminator after each sub-sequence
(they have other means of determining the length of sub-sequences).
This moves `Term` argument from `emit` method to the constructor and
makes it optional. It couldn't be made optional while still on the
`emit` method because if the terminator wasn't specified, it has to be
taken into account in `layout` method as well.
The fact that `layout` method was called is now recorded in a dedicated
member variable, `IsLaidOut`. `Entries != 0` can no longer be used to
reliably check if `layout` method was called because it may be zero for
a different reason: the terminator wasn't specified and all added
sequences (if any) were empty.
This reduces the size of `*LaneMaskLists` and `*SubRegIdxLists` a bit
and resolves the removed TODO.
Fixes#118490
Point to the value name, otherwise it implies that the part after the
'=' is the problem.
Before:
```
/tmp/test.td:2:27: error: Value 'FlattenedFeatures' unknown!
let FlattenedFeatures = [];
^
```
After:
```
/tmp/test.td:2:7: error: Value 'FlattenedFeatures' unknown!
let FlattenedFeatures = [];
^
```
The DAG has the same instructions: the signed and unsigned absolute
difference of it's input. For AArch64, they map to uabd and sabd for
Neon and SVE. The Neon and SVE instructions will require custom
patterns.
They are pseudo opcodes and are not imported by the IRTranslator. We
need combines to create them.
PowerPC, ARM, and AArch64 have native instructions.
/// i.e trunc(abs(sext(Op0) - sext(Op1))) becomes abds(Op0, Op1)
/// or trunc(abs(zext(Op0) - zext(Op1))) becomes abdu(Op0, Op1)
For GlobalISel, we are going to write the combines in MIR patterns.
see:
llvm/test/CodeGen/AArch64/abd-combine.ll
- [ ] combine into abd
- [ ] legalize and add td patterns
Previously the check comments indicated that [pi][0-9]+ would match as a
type suffix, however the check itself was looking for [pi][0-9]* and
hence an 'i' suffix in isolation was being considered as a type suffix
despite it not having a bitwidth.
This change makes the check consistent with the comment and looks for
[pi][0-9]+
This reverts commit b36fcf4f493ad9d30455e178076d91be99f3a7d8.
This reverts commit c11b6b1b8af7454b35eef342162dc2cddf54b4de.
This reverts commit 775148f2367600f90d28684549865ee9ea2f11be.
multiple bot build breakages, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/8076
Use uppercase in the subvector description ("32x2" -> "32X4" etc.) - matches what we already do in VBROADCAST??X?, and we try to use uppercase for all x86 instruction mnemonics anyway (and lowercase just for the arg description suffix).
FPCLASS is a unary instruction with an immediate operand - update the naming to match similar instructions (e.g. VPSHUFD) by only using the source reg/mem and immediate in the instruction name
The idea is that by preemptively simplifying the result of `!and` and `!or`, we can fold
some of the conditional operators, like `!if` or `!cond`, as early as
possible.
TableGen builds up a map of "SubRegIdx -> Subclass" where Subclass is
the largest class where all registers have SubRegIdx as a sub-register.
When SubRegIdx (vis-a-vis the sub-register) is artificial it should
still include it in the map. This map is used in various places,
including in the calculation of the Lanemask of a register class, which
otherwise calculates an incorrect lanemask.
When CoveredBySubRegs is true and a sub-register consists of two
parts; a regular subreg and an artificial subreg, then TableGen
should consider only concatenating the non-artificial subregs.
For example, S0_S1 is a concatenated subreg from D0_D1,
but S0_S1_HI should not be considered.
This is to address the latest lit regressions
https://lab.llvm.org/buildbot/#/builders/64/builds/1285 caused by using
the internal lit shell. This change will limit using the internal lit
shell to TableGen on AIX so we do not hit these regressions.
`NumDefs` only counts the number of registers in `(outs)`, not any
implicit defs specified with `Defs = [...]`
This causes patterns with physical register defs to fail to import here
instead of later where implicit defs are rendered.
Add on `ImplicitDefs.size()` to count both and create `DstExpDefs` to
count only explicit defs, used later on.
Check name conflicts between intrinsics caused by mangling suffix.
If the base name of an overloaded intrinsic is a proper prefix of
another intrinsic, check if the other intrinsic name suffix after the
proper prefix can match a mangled type and issue an error if it can.
Some organizations have added operators downstream, and the test match-table.td tends to fail with off-by-n errors (with n being the number of `added operators`) periodically. This patch will increase the test robust and reduce the impact of merge process.
Print assert message after the "assertion failed" message instead of
printing it as a separate note. This makes the assert failure reporting
less verbose and also more useful to see the failure message inline with
the "assertion failed" message.
When record assertions fail, print an error message with the record's
location, so it's easier to see where the record that caused the assert
to fail was instantiated. This is useful when the assert condition in a
class depends on a template parameter, so we need to know the context of
the definition to determine why the assert failed.
Also enhanced the assert.td test to check for these context messages,
and also add checks for some assert failures that were missing in the
test.
Fix source location for anonymous records to be the one of the locations
where that record is instantiated as opposed to the location of the
class that was anonymously instantiated.
Currently, when a record is anonymously instantiated (via
`VarDefInit::instantiate`), we use the location of the class for the
record, which is not correct. Instead, pass in the `SMLoc` for the
location where the anonymous instantiation happens and use that location
when the record is instantiated. If there are multiple anonymous
instantiations with the same parameters, the location for the (single)
record created will be one of these instantiation locations as opposed
to the class location.
Decrease code size of `Intrinsic::getAttributes` function by uniquing
the function and argument attributes separately and using the
`IntrinsicsToAttributesMap` to store argument attribute ID in low 8 bits
and function attribute ID in upper 8 bits.
This reduces the number of cases to handle in the generated switch from
368 to 131, which is ~2.8x reduction in the number of switch cases.
Also eliminate the fixed size array `AS` and `NumAttrs` variable, and
instead call `AttributeList::get` directly from each case, with an
inline array of the <index, AttribueSet> pairs.
Currently, type casts can only be used to pattern match for intrinsics
with a single overloaded return value. For instance:
```
def int_foo : Intrinsic<[llvm_anyint_ty], []>;
def : Pat<(i32 (int_foo)), ...>;
```
This patch extends type casts to support matching intrinsics with
multiple overloaded return values. As an example, the following defines
a pattern that matches only if the overloaded intrinsic call returns an
`i16` for the first result and an `i32` for the second result:
```
def int_bar : Intrinsic<[llvm_anyint_ty, llvm_anyint_ty], []>;
def : Pat<([i16, i32] (int_bar)), ...>;
```
Validate that for target independent intrinsics the second dotted
component of their name (after the `llvm.`) does not match any existing
target names (for which atleast one intrinsic has been defined). Doing
so is invalid as LLVM will search for that intrinsic in that target's
intrinsic table and not find it, and conclude that its an unknown
intrinsic.