I found this occasionally.
For outer let statements, if we want to override some bits, we specify
the range list in the form of `<n-m>`. But for inner let statements,
we use `{n-m}`.
This is inconsistent, and I can't find the reason why it is designed
as this. So here we make inner/outer let statements consistent and
remove the duplicated parsing functions.
There is only one in-tree usage so I think the impact is small.
## Motivation
LLVM TableGen currently lacks a way to **accumulate** field values
across class hierarchies. When a derived class sets a field via `let`,
it completely replaces the parent's value. This forces users into
verbose workarounds like:
```tablegen
class Op { // This is generic MLIR Base
code extraClassDeclaration = ?;
}
// Some Generic shared base
class MyShared1OpClass : Op {
code shared1ExtraClassDeclaration = [{ some generic code 1 }];
}
class MyShared2OpClass : MyShared1OpClass {
code shared2ExtraClassDeclaration = [{ some generic code 2 }];
}
def MyOp : MyShared2OpClass {
// need to manually concatenate shared code
let extraClassDeclaration =
shared1ExtraClassDeclaration
# shared2ExtraClassDeclaration
# [{ additional specialized code }];
}
```
Instead I propose a more natural incremental solution without
unnecessery intermediate definitions:
```
class Op {
code extraClassDeclaration = ?;
}
class MyShared1OpClass : Op {
let append extraClassDeclaration = [{ some generic code 1 }];
}
class MyShared2OpClass : MyShared1OpClass {
let append extraClassDeclaration = [{ some generic code 2 }];
}
def MyOp : MyShared2OpClass {
let append extraClassDeclaration = [{ additional specialized code }];
}
```
This is especially painful in MLIR, where dialect authors want base
op/type/attribute classes to inject shared C++ declarations into all
derived definitions. I attempted to solve this in PR
https://github.com/llvm/llvm-project/pull/182265 with MLIR-specific
`inheritableExtraClassDeclaration`/`Definition` fields, but as
@joker-eph [pointed
out](https://github.com/llvm/llvm-project/pull/182265#discussion_r2098718600),
this is ad-hoc -- the same inheritance problem exists for `traits`,
`arguments`, `results`, and any other list/string/dag field. Rather than
adding `inheritable*` variants per field, we should solve this at the
language level.
## Design
This PR adds two new modifiers to the `let` statement: **`append`** and
**`prepend`**.
```tablegen
class Base {
list<int> items = [1, 2];
string text = "hello";
dag d = (op);
}
def Example : Base {
let append items = [3, 4]; // items = [1, 2, 3, 4]
let prepend items = [0]; // items = [0, 1, 2]
let append text = " world"; // text = "hello world"
let prepend text = "say "; // text = "say hello"
let append d = (op 3:$a); // d = (op 3:$a)
}
```
### Supported types
| Field type | Operation | Concat operator |
|---|---|---|
| `list<T>` | append/prepend | `!listconcat` |
| `string` / `code` | append/prepend | `!strconcat` |
| `dag` | append/prepend | `!con` |
| Other (`bit`, `int`, `bits`) | -- | Error |
### Semantics
- **`let append`** concatenates the new value **after** the current
value
- **`let prepend`** concatenates the new value **before** the current
value
- If the current value is **unset** (`?`), the new value is used
directly
- A plain **`let`** (without modifier) still replaces, allowing opt-out
from accumulated values
- Works in both **body-level** (`def Foo { let append ... }`) and
**top-level** (`let append ... in { }`) contexts
### Multi-level inheritance
Accumulation works naturally across inheritance chains:
```tablegen
class Base {
list<int> items = [1, 2];
}
class Middle : Base {
let append items = [3]; // items = [1, 2, 3]
}
def Leaf : Middle {
let append items = [4]; // items = [1, 2, 3, 4]
}
```
### Multiple inheritance
TableGen supports multiple inheritance (`def D : A, B { ... }`), where
parent classes are processed left to right and the **last parent class's
value wins** for any shared field. `let append`/`let prepend` operates
on whatever value the field has *after* inheritance resolution — it does
not accumulate across sibling parents:
```tablegen
class A { list<int> items = [1, 2]; }
class B { list<int> items = [3, 4]; }
def D : A, B {
let append items = [5]; // items = [3, 4, 5] (A's value is lost)
}
```
This also applies to diamond inheritance:
```tablegen
class Base { list<int> items = [1]; }
class Left : Base { let append items = [2]; } // [1, 2]
class Right : Base { let append items = [3]; } // [1, 3]
def D : Left, Right {
let append items = [4]; // items = [1, 3, 4] (Left's [2] is lost)
}
```
This is consistent with how plain `let` works with multiple inheritance
— it is the standard last-writer-wins rule. Users who need accumulation
from multiple parents should use a single-inheritance chain instead.
## Backward compatibility
This proposal is **fully backward compatible**. The keywords `append`
and `prepend` are implemented as **context-sensitive keywords** — they
are only recognized as modifiers when they appear immediately after
`let` (in both body-level and top-level contexts). In all other
positions, `append` and `prepend` remain valid identifiers and can be
used as field names, class names, def names, etc. This means:
- No existing `.td` files (in-tree or out-of-tree) will break
- Fields named `append` or `prepend` continue to work: `let append
append = [5];` is valid (the first `append` is the modifier, the second
is the field name)
- The parser checks for the identifier string value after `let`, not for
a reserved token
RFC:
https://discourse.llvm.org/t/rfc-tablegen-add-let-append-prepend-syntax-for-field-concatenation/89924/
Allow specification of the underlying C++ data type for `GenericEnum`.
I ran into this because I was trying to use a TableGen-genered enum in
`DenseSet` which requires the underlying type be specified.
Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
Previously, the handling of the `cleanup` attribute had some checks
based on the type, but we were deducing the type after handling the
attribute.
This PR fixes the way the are dealing with type checks for the `cleanup`
attribute by delaying these checks after we are deducing the type.
It is also fixed in a way that the solution can be adapted for other
attributes that does some type based checks.
This is the list of C/C++ attributes that are doing type based checks
and will need to be fixed in additional PRs:
- CUDAShared
- MutualExclusions
- PassObjectSize
- InitPriority
- Sentinel
- AcquireCapability
- RequiresCapability
- LocksExcluded
- AcquireHandle
NB: Some attributes could have been missed in my shallow search.
Fixes#129631
Reduces memory usage compiling backend sources, most notably for
AMDGPU by ~98 MB per source on average.
AMDGPUGenRegisterInfo.inc is tens of megabytes in size now, and
is even larger downstream. At the same time, it is included in
nearly all backend sources, typically just for a small portion of
its content, resulting in compilation being unnecessarily
memory-hungry, which in turn stresses buildbots and wastes their
resources.
Splitting .inc files also helps avoiding extra ccache misses
where changes in .td files don't cause changes in all parts of
what previously was a single .inc file.
It is thought that rather than building on top of the current
single-output-file design of TableGen, e.g., using `split-file`,
it would be more preferable to recognise the need for multi-file
outputs and give it a proper first-class support directly in
TableGen.
There are currently no ways to add names to dag
operators other than when defining them. Furthermore a !con operation as
well as some others, drops the operator names.
This patch propagates the name from the LHS dag
for !con and adds a way to get and set the operator name for a dag
(!getdagopname, !setdagopname).
---------
Co-authored-by: Nemanja Ivanovic <nemanja@synopsys.com>
In Record only store the direct superclasses instead of all
superclasses. getSuperClasses recurses to find all superclasses when
necessary.
This gives a small reduction in memory usage. On lib/Target/X86/X86.td I
measured about 2.0% reduction in total bytes allocated (measured by
valgrind) and 1.3% reduction in peak memory usage (measured by
/usr/bin/time -v).
---------
Co-authored-by: Min-Yih Hsu <min@myhsu.dev>
The format is: `!instances<T>([regex])`.
This operator produces a list of records whose type is `T`. If
`regex` is provided, only records whose name matches the regular
expression `regex` will be included. The format of `regex` is ERE
(Extended POSIX Regular Expressions).
The grammar is `!match(str, regex)` and this operator produces 1
if the `str` matches the regular expression `regex`.
The format of `regex` is ERE (Extended POSIX Regular Expressions).
Previously the Type production did not include "code", which was only
accepted in one place in the grammar:
BodyItem: (`Type` | "code") `TokIdentifier` ["=" `Value`] ";"
However the parser implementation accepts "code" as a Type with only one
place where it is *not* allowed, corresponding to this production:
SimpleValue9: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
This patch changes the production for Type to include "code", thereby
fixing most occurrences of Type in the grammar, and documents the
restriction for BangOperator Types in the text instead of codifying it
in the grammar.
There are cases (like in an upcoming patch to MLIR's `Property` class)
where the ? value is a useful null value. However, existing predicates
make ti difficult to test if the value in a record one is operating is ?
or not.
This commit adds the !initialized predicate, which is 1 on concrete,
non-? values and 0 on ?.
---------
Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>
The idea is that by preemptively simplifying the result of `!and` and `!or`, we can fold
some of the conditional operators, like `!if` or `!cond`, as early as
possible.
Add a !listflatten operator that will transform an input list of type
`list<list<X>>` to `list<X>` by concatenating elements of the
constituent lists of the input argument.
In the RISC-V architecture, multiple vendor-specific Control and Status
Registers (CSRs) share the same encoding. However, the existing lookup
function, which currently returns only a single result, falls short.
During disassembly, it consistently returns the first CSR encountered,
which may not be the correct CSR for the subtarget.
To address this issue, we modify the function definition to return a
range of results. These results can then be iterated upon to identify
the CSR that best fits the subtarget’s feature requirements. The
behavior of this new definition is controlled by a variable named
`ReturnRange`, which defaults to `false`.
Specifically, this patch introduces support for emitting a new lookup
function for the primary key. This function returns a pair of iterators
pointing to the first and last values, providing a comprehensive range
of values that satisfy the query
Currently, for some tables involving a single primary key field which is
integral and densely numbered, a direct lookup is generated rather than
a binary search. This patch extends the direct lookup function
generation to instructions, where the integral value corresponds to the
instruction's enum value.
While this isn't as common as for other tables, it does occur in at
least one downstream backend and one in-tree backend.
Added a unit test and minimally updated the documentation.
We can use `deftype` (not using `typedef` here to be consistent
with `def`, `defm`, `defset`, `defvar`, etc) to define type aliases.
Currently, only primitive types and type aliases are supported to be
the source type and `deftype` statements can only appear at the top
level.
Reviewers: fpetrogalli, Artem-B, nhaehnle, jroelofs
Reviewed By: jroelofs, nhaehnle, Artem-B
Pull Request: https://github.com/llvm/llvm-project/pull/79570
This adds a link from the main docs page back to the README where
I have previously added a list of useful resources.
To that list, I've added a link to my recent llvm blog post.
The keyword is intended for debugging purpose. It prints a message to
stderr.
This patch is based on code originally written by Adam Nemet, and on the
feedback received by the reviewers in
https://reviews.llvm.org/D157492.
The !repr operator represents the content of a variable or of a record
as a string.
This patch is based on code originally written by Adam Nemet, and on the
feedback received by the reviewers in
https://reviews.llvm.org/D157492.
We add a third argument `step` to `!range` bang operator to make it
with the same semantics as `range` in Python.
`step` can be negative. `step` is 1 by default and `step` can't be
0. If `start` < `end` and `step` is negative, or `start` > `end`
and `step` is positive, the result is an empty list.
A field `FilterClassField` is added to `GenericTable` class, which
is an optional bit field of `FilterClass`. If specified, only those
records with this field being true will have corresponding entries
in the table.
We provide a way to specify arguments in the form of `name=value`
so that we don't have to specify all optional arguments before the
one we'd like to change. Required arguments can alse be specified
in this way.
Note that the argument can only be specified once regardless of
the way (named or positional) to specify and positional arguments
should be put before named arguments.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D152998
- This patch proposes to add `!setdagarg` and `!setdagname` bang
operators to produce a new DAG node after replacing the specified
argument value/name from the given input DAG node. E.g.,
`!setdagarg((foo 1, 2), 0, "x")` produces `(foo "x", 2)` and
`!setdagname((foo 1:$a, 2:$b), 1, "c")` produces `(foo 1:$a, 2:$c)`.
Reviewed By: simon_tatham
Differential Revision: https://reviews.llvm.org/D151842
- This patch proposes to add `!getdagarg` and `!getdagname` bang
operators as the inverse operation of `!dag`. They allow us to examine
arguments of a given dag.
Reviewed By: simon_tatham
Differential Revision: https://reviews.llvm.org/D151602
In D148197, we have made `defvar` statement able to refer to class
template arguments. However, the priority of class/multiclass
template argument is higher than variables defined by `defvar`, which
is a little counterintuitive.
In this patch, we unify the priority of variables. Each pair of
braces introduces a new scope, which may contain some additional
variables like template arguments, loop iterators, etc. We can
define local variables inside this scope via `defvar` and these
variables are of higher priority than additional variables. This
means that `defvar` will shadow additional variables with the same
name. The scope can be nested, and we use the innermost variable.
This make variables defined by `defvar` prior to class/multiclass
template arguments, loop iterators, etc. The shadow rules now are:
* `V` in a record body shadows a global `V`.
* `V` in a record body shadows template argument `V`.
* `V` in template arguments shadows a global `V`.
* `V` in a `foreach` statement list shadows any `V` in surrounding record or global scopes.
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D149016
This enables indexing in `!foreach` and permutation with `list[permlist]`.
Enhancements in syntax:
- `list<int>` is applicable as a slice element.
- `list[int,]` is evaluated as not `ElemType` but `list<ElemType>`
with a single element.
Part of D145872
FIXME: I didn't apply new semantics to BitSlice.
`Opt(flag, func, desc)` registers an option into `Action`.
`OptClass<EmitterC>` is also available if `EmitterC(RK).run(OS)` is capable.
`Action` is defined as `ManagedStatic<cl::opt>` to guarantee to be created
when each registration of emitter is invoked.
`llvm::TableGenMain(argv0, MainFn)` invokes `Action` instead of `MainFn`
Differential Revision: https://reviews.llvm.org/D144351