1079 Commits

Author SHA1 Message Date
Alan Li
5e0efc0f1d
Reland "[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107)" (#188502)
This is a reland of https://github.com/llvm/llvm-project/pull/155107
along with a fix for old gcc builds.

This patch was reverted in
https://github.com/llvm/llvm-project/pull/188344 due to compilation
failures described in
https://github.com/llvm/llvm-project/pull/155107#issuecomment-4121292756

The fix to old gcc builds is to remove `constexpr` modifiers in the
original patch in 0721d8e7768c011b8cf2d4d223ca6eca3392b1f9
2026-04-04 05:57:13 -07:00
Rahul Joshi
5781cc9bf6
[LLVM][Intrinsics] Refactor IIT encoding generation (#189790)
Refactor IIT encoding generation. The core change here is that when
generating IIT encodings, we pre-generate all the bits of the IIT
encoding except cases where a type needs to encode its own overload
index, which is patched in later in `TypeInfoGen`. In addition, this
change introduces a class hierarchy for dependent types, so that the
checks in `TypeInfoGen` are more meaningful, and renames/simplifies
several other pieces of code, as listed below.

1. Change the encoding for IIT_ARG's ArgInfo byte to encode the overload
slot index in lower 5 bits and the argument kind in upper 3 bits. This
enabled generating the same packed format for all other dependent types
that need to encode an overload slot index in the IIT encoding. Adjusted
the corresponding C++ code in `IITDescriptor::getArgumentNumber` and
`IITDescriptor::getArgumentKind`.
2. Introduce more descriptive classes to handle packing of the overload
index + arg kind into the IIT encoding. `OverloadIndexPlaceholder` is
used to generate a transient value in the type-signature that is patched
in `TypeInfoGen` with that type's overload index. `PackOverloadIndex` is
used to encapsulate the final packing of an overload index and argument
kind in a single byte, and `PatchOverloadIndex` is the class that does
the required patching of a `OverloadIndexPlaceholder` given the type's
overload index.
3. Delete `isAny`, `ArgCode` and `Number` from base `LLVMType` class.
Replace use of `isAny` with `isa<LLVMAnyType>`, `ArgCode` is not used
anymore, and move `Number`, which was used to represent the overload
index for a dependent type to the `LLVMDependentType` class and rename
it to `OverloadIndex`.
4. Introduce `LLVMDependentType` as a base class of all dependent types.
It holds the overload index of the type it depends on in its
`OverloadIndex` field. Also introduce 2 subclasses,
`LLVMFullyDependentType` to represent all fully dependent types (which
encode just the appropriate IIT code and the dependent type's overload
index) and `LLVMPartiallyDependentType` to represent partially dependent
types, that encode the appropriate IIT code and both this type's
overload index and the dependent type's overload index.
5. Change existing dependent type classes to derive from one of these
classes and rename the `num` class argument to `oidx` to better reflect
its meaning.
6. Rename various fields and classes used in `TypeInfoGen` to be more
meaningful. `AssignOverloadIndex` to do overload index assignment,
rename `ACIdxs` to `OverloadIdxs`, `ACTys` to `OverloadTypes` and use
the `DoPatchOverloadIndex` to patch in assigned overload slot indexes.
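
The packed ArgInfo byte from item 1 can be modeled as follows. This is only a sketch of the bit layout described above (overload slot index in the lower 5 bits, argument kind in the upper 3 bits); the Python function names are hypothetical stand-ins and are not the C++ or TableGen code touched by the patch.

```python
# Sketch of the ArgInfo byte layout described in item 1.
# All names here are illustrative, not LLVM's actual implementation.
INDEX_BITS = 5
INDEX_MASK = (1 << INDEX_BITS) - 1  # 0x1F: lower 5 bits hold the overload slot index
KIND_MASK = 0x7                     # upper 3 bits hold the argument kind


def pack_arg_info(overload_index, arg_kind):
    """Pack an overload slot index and an argument kind into one byte."""
    if not 0 <= overload_index <= INDEX_MASK:
        raise ValueError("overload index does not fit in 5 bits")
    if not 0 <= arg_kind <= KIND_MASK:
        raise ValueError("argument kind does not fit in 3 bits")
    return (arg_kind << INDEX_BITS) | overload_index


def get_argument_number(arg_info):
    """Extract the overload slot index (lower 5 bits)."""
    return arg_info & INDEX_MASK


def get_argument_kind(arg_info):
    """Extract the argument kind (upper 3 bits)."""
    return (arg_info >> INDEX_BITS) & KIND_MASK
```

With this layout a single byte can address at most 32 overload slots and 8 argument kinds.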
2026-04-01 06:39:36 -07:00
Rahul Joshi
c19e28d854
[NFC][LLVM] Simplify TypeInfoGen in Intrinsics.td (#189278)
Eliminate `MappingRIdx` by making it an identity function. Currently,
`MappingRIdx` is used to map the index of an `llvm_any*` type in an
intrinsic type signature to its overload index. Eliminating this mapping
means that dependent types in LLVM intrinsic definitions (like
`LLVMMatchType` and its subclasses) should use the overload index to
reference the overload type that it depends on (and not the index within
the llvm_any* subset of overloaded types).

See
https://discourse.llvm.org/t/rfc-simplifying-intrinsics-type-signature-iit-info-generation-encoding-in-intrinsicemitter-cpp/90383
2026-03-31 15:51:19 -07:00
Feng Zou
e1aef5ed5f
[X86][APX] Remove NF entries in X86CompressEVEXTable (#189308)
NF (No-Flags) instructions should not compress to non-NF instructions,
as this would incorrectly modify flags behavior. The compression table
is only intended for encoding optimizations that preserve semantics.

This removes the incorrect NF entries that could have led to
miscompilation if the compression logic were applied.
2026-03-31 09:30:49 +08:00
Pengcheng Wang
e6c89e8d6a
[TableGen] Improve the error report of getElementAsRecord (#189302) 2026-03-30 18:43:47 +08:00
Mehdi Amini
6a045c29a9
Revert "[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107)" (#188344)
This reverts commit b1aa6a45060bb9f89efded9e694503d6b4626a4a and commit
ce44d63e0d14039f1e8f68e6b7c4672457cabd4e.

This fails the build with some older gcc:

llvm/include/llvm/CodeGenTypes/LowLevelType.h:501:35: error: call to
non-constexpr function ‘static llvm::LLT llvm::LLT::integer(unsigned
int)’
     return integer(getSizeInBits());
                                   ^
2026-03-24 21:40:36 +00:00
Denis.G
b1aa6a4506
[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107)
Added extra information in LLT to support ambiguous fp types during
GlobalISel. Original idea by @tgymnich

Main differences from https://github.com/llvm/llvm-project/pull/122503
are:
* Do not deprecate LLT::scalar
* Allow targets to enable/disable IR translation with extended LLT via
`TargetOption::EnableGlobalISelExtendedLLT` (disabled by default)
* `IRTranslator` uses `TargetLoweringInfo` for appropriate `LLT`
generation.
* For this reason, added a flag in `GlobalISelMatchTable` to allow switching
between legacy and new extended LLT names
* Revert using stubs like `LLT::float32` for float types as they are
real now. Added `TODO` for such cases.

Also, MIRParser may now parse new type identifiers.

---------

Co-authored-by: Tim Gymnich <tim@gymni.ch>
Co-authored-by: Ryan Cowan <ryan.cowan@arm.com>
2026-03-24 08:40:39 -04:00
gonzalobg
ea8fb06f24
[atomicrmw] fminimumnum/fmaximumnum support (#187030)
Adds support for `atomicrmw` `fminimumnum`/`fmaximumnum` operations.
These were added to C++ in P3008, and are exposed in libc++ in #186716.
Adding LLVM IR support for these unblocks work in both backends with HW
support, and frontends.
2026-03-18 09:35:49 +01:00
azwolski
28ab5dddd9
[X86] Blocklist instructions that are unsafe for masked-load folding. (#178888)
This PR blocklists instructions that are unsafe for masked-load folding.

Folding with the same mask is only safe if every active destination
element reads only from source elements that are also active under the
same mask. These instructions perform element rearrangement or
broadcasting, which may cause active destination elements to read from
masked-off source elements.

VPERMILPD and VPERMILPS are safe only in the rrk form, the rik form
needs to be blocklisted. In the rrk form, the masked source operand is a
control mask, while in the rik form the masked source operand is the
data/value. This is also why VPSHUFB is safe to fold, while other
shuffles such as VSHUFPS are not.

Examples:
```
EVEX.128.66.0F.WIG 67 /r VPACKUSWB xmm1{k1}{z}, xmm2, xmm3/m128 
A: 00010203 7F000001 80000002 DEADBEEF  
E : 00000000 00000001 00000002 00000003  
D: 11111111 22222222 33333333 44444444  
k = 0x0400  
Masked_e = 00000000 00000000 00000000 00000000 (vmovdqu8{k}{z} Masked_e E) 
res1 = 00000000 00000000 00010000 00000000   (VPACKUSWB D{k}{z}, A, E) 
res2 =  00000000 00000000 00000000 00000000 (VPACKUSWB D{k}{z}, A, Masked_e) 

EVEX.128.66.0F38.W0 C4 /r VPCONFLICTD xmm1 {k1}{z}, xmm2/m128/m32bcst
A: DAA66D2B FFFFFFFC FFFFFFFC D9A0643C  
E : 7DDF743F 00000000 5FD99E73 4ED634C9  
D: 2629AB38 9E37782F 67BB800F AD66764A  
k = 0x0002 
Masked_e = (vmovdqu32 {k}{z} Masked_e E) 
res1 = 00000000 00000000 00000000 00000000 (VPCONFLICTD D{k}{z}, E) 
res2 = 00000000 00000001 00000000 00000000  (VPCONFLICTD D{k}{z}, Masked_e) 

EVEX.128.66.0F38.W1 8D /r VPERMW xmm1 {k1}{z}, xmm2, xmm3/m128 
A: 00010203 7F000001 80000002 DEADBEEF  
E : 00000000 00000001 00000002 00000003  
D: 11111111 22222222 33333333 44444444  
k = 0x0010 
Masked_e = 00000000 00000000 00000002 00000000 (vmovdqu16 {k}{z} Masked_e E) 
res1 = 00000000 00000000 00000001 00000000 (vpermw D{k}{z}, A, E) 
res2 =  00000000 00000000 00000000 00000000  (vpermw D{k}{z}, A, Masked_e) 

EVEX.128.66.0F38.W0 78 /r VPBROADCASTB xmm1{k1}{z}, xmm2/m8 
E : 7F4A7C15 6E490933 5D4C9659 4C433CE3  
D: F63F9D36 97F6E2B2 9432E8E6 FAEE7A3E  
k = 0x0002 
Masked_e = 00007C00 00000000 00000000 00000000 (vmovdqu8{k}{z} Masked_e E) 
res =  00001500 00000000 00000000 00000000 (vpbroadcastb D{k}{z}, E) 
res =  00000000 00000000 00000000 00000000 (vpbroadcastb D{k}{z}, Masked_e)
```
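
The broadcast case above can be reproduced with a small model. This is a hedged sketch, not the X86 folding logic: vectors are plain lists of byte values, element `i` is active when bit `i` of the mask is set, and all function names are hypothetical. `folding_is_safe` only compares the two results for one concrete input, so it can demonstrate that a fold is unsafe but cannot prove that one is safe.

```python
# Toy model of zero-masked operations on byte vectors (illustrative only).

def zero_masked_load(src, mask):
    """Zero-masked load: inactive elements read as 0."""
    return [v if (mask >> i) & 1 else 0 for i, v in enumerate(src)]


def zero_masked_broadcast(src, mask):
    """Every active destination element reads source element 0."""
    return [src[0] if (mask >> i) & 1 else 0 for i in range(len(src))]


def zero_masked_negate(src, mask):
    """Element-wise op: destination element i reads only source element i."""
    return [(-v) & 0xFF if (mask >> i) & 1 else 0 for i, v in enumerate(src)]


def folding_is_safe(op, src, mask):
    """Compare the op on the full source with the op on the masked load."""
    return op(src, mask) == op(zero_masked_load(src, mask), mask)


# Low bytes of E from the VPBROADCASTB example, padded with zeros.
E = [0x15, 0x7C, 0x4A, 0x7F] + [0] * 12
K = 0x0002  # only element 1 is active
```

With `K = 0x0002`, the broadcast reads element 0, which the masked load has zeroed, so the folded and unfolded results differ. The element-wise operation gives the same result either way.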

Baseline: https://github.com/llvm/llvm-project/pull/178411
2026-03-16 10:31:32 +00:00
Jay Foad
319808cec7
[TableGen] Fix MUL case in DAG default operands test (#185847)
The checks have been unused forever. This was an oversight in the patch
that introduced this test: https://reviews.llvm.org/D63814

Also fix the checks to match the actual output. This looks like another
oversight in the original patch, presumably because the checks were
never actually tested.
2026-03-11 17:09:58 +00:00
Jay Foad
79d2444dae
[TableGen] Let -register-info-debug dump the Artificial flag (#185899)
Dump the Artificial flag for RegisterClasses, SubRegIndices and
Registers. To avoid clutter it is only dumped when the flag is set (has
value 1).
2026-03-11 16:00:49 +00:00
Jay Foad
a257e1624d
[TableGen] Use CHECK-LABEL in artificial registers tests. NFC. (#185846) 2026-03-11 10:25:34 +00:00
Henrich Lauko
89d150a797
[TableGen] Add let append/prepend syntax for field concatenation (#182382)
## Motivation

LLVM TableGen currently lacks a way to **accumulate** field values
across class hierarchies. When a derived class sets a field via `let`,
it completely replaces the parent's value. This forces users into
verbose workarounds like:

```tablegen
class Op { // This is generic MLIR Base 
  code extraClassDeclaration = ?;
}

// Some Generic shared base
class MyShared1OpClass : Op {
  code shared1ExtraClassDeclaration = [{ some generic code 1 }];
}

class MyShared2OpClass : MyShared1OpClass {
  code shared2ExtraClassDeclaration = [{ some generic code 2 }];
}

def MyOp : MyShared2OpClass {
  // need to manually concatenate shared code
  let extraClassDeclaration =   
      shared1ExtraClassDeclaration
    # shared2ExtraClassDeclaration
    # [{ additional specialized code }]; 
}
```

Instead, I propose a more natural incremental solution without
unnecessary intermediate definitions:

```tablegen
class Op {
  code extraClassDeclaration = ?;
}

class MyShared1OpClass : Op {
  let append extraClassDeclaration = [{ some generic code 1 }];
}

class MyShared2OpClass : MyShared1OpClass {
  let append extraClassDeclaration = [{ some generic code 2 }];
}

def MyOp : MyShared2OpClass {
  let append extraClassDeclaration = [{ additional specialized code }]; 
}
```

This is especially painful in MLIR, where dialect authors want base
op/type/attribute classes to inject shared C++ declarations into all
derived definitions. I attempted to solve this in PR
https://github.com/llvm/llvm-project/pull/182265 with MLIR-specific
`inheritableExtraClassDeclaration`/`Definition` fields, but as
@joker-eph [pointed
out](https://github.com/llvm/llvm-project/pull/182265#discussion_r2098718600),
this is ad-hoc -- the same inheritance problem exists for `traits`,
`arguments`, `results`, and any other list/string/dag field. Rather than
adding `inheritable*` variants per field, we should solve this at the
language level.

## Design

This PR adds two new modifiers to the `let` statement: **`append`** and
**`prepend`**.

```tablegen
class Base {
  list<int> items = [1, 2];
  string text = "hello";
  dag d = (op);
}

def Example : Base {
  let append items = [3, 4];    // items = [1, 2, 3, 4]
  let prepend items = [0];      // items = [0, 1, 2]
  let append text = " world";   // text = "hello world"
  let prepend text = "say ";    // text = "say hello"
  let append d = (op 3:$a);     // d = (op 3:$a)
}
```

### Supported types

| Field type | Operation | Concat operator |
|---|---|---|
| `list<T>` | append/prepend | `!listconcat` |
| `string` / `code` | append/prepend | `!strconcat` |
| `dag` | append/prepend | `!con` |
| Other (`bit`, `int`, `bits`) | -- | Error |

### Semantics

- **`let append`** concatenates the new value **after** the current
value
- **`let prepend`** concatenates the new value **before** the current
value
- If the current value is **unset** (`?`), the new value is used
directly
- A plain **`let`** (without modifier) still replaces, allowing opt-out
from accumulated values
- Works in both **body-level** (`def Foo { let append ... }`) and
**top-level** (`let append ... in { }`) contexts
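
The rules listed above can be modeled in a few lines. This is a sketch under stated assumptions, not the TableGen parser: only `list` and `string` fields are modeled (with Python lists and strings), `dag` is left out, and `UNSET` and `apply_let` are hypothetical names.

```python
# Toy model of the `let` / `let append` / `let prepend` semantics.
UNSET = object()  # stands in for TableGen's `?`


def apply_let(current, new, mode="replace"):
    """Return the field value after `let [append|prepend] field = new`."""
    if not isinstance(new, (list, str)):
        # Mirrors the "Other (bit, int, bits) -> Error" row of the table.
        raise TypeError("append/prepend model supports only list and string")
    if mode == "replace" or current is UNSET:
        # A plain `let` replaces; an unset field takes the new value directly.
        return new
    if mode == "append":
        return current + new
    if mode == "prepend":
        return new + current
    raise ValueError(f"unknown mode: {mode}")
```

For example, `apply_let([1, 2], [3, 4], "append")` yields `[1, 2, 3, 4]`, and `apply_let(UNSET, [7], "append")` yields `[7]`.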

### Multi-level inheritance

Accumulation works naturally across inheritance chains:

```tablegen
class Base {
  list<int> items = [1, 2];
}

class Middle : Base {
  let append items = [3];    // items = [1, 2, 3]
}

def Leaf : Middle {
  let append items = [4];    // items = [1, 2, 3, 4]
}
```

### Multiple inheritance

TableGen supports multiple inheritance (`def D : A, B { ... }`), where
parent classes are processed left to right and the **last parent class's
value wins** for any shared field. `let append`/`let prepend` operates
on whatever value the field has *after* inheritance resolution — it does
not accumulate across sibling parents:

```tablegen
class A { list<int> items = [1, 2]; }
class B { list<int> items = [3, 4]; }

def D : A, B {
  let append items = [5];  // items = [3, 4, 5]  (A's value is lost)
}
```

This also applies to diamond inheritance:

```tablegen
class Base  { list<int> items = [1]; }
class Left  : Base { let append items = [2]; }  // [1, 2]
class Right : Base { let append items = [3]; }  // [1, 3]

def D : Left, Right {
  let append items = [4];  // items = [1, 3, 4]  (Left's [2] is lost)
}
```

This is consistent with how plain `let` works with multiple inheritance
— it is the standard last-writer-wins rule. Users who need accumulation
from multiple parents should use a single-inheritance chain instead.
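
The diamond case can be traced with a small model of last-writer-wins resolution. Records are modeled as plain dictionaries and the helper names are hypothetical; this only sketches the behavior described above and is not how TableGen stores records.

```python
# Toy model: parents are processed left to right, the last value wins,
# and `let append` then operates on the resolved value.

def resolve_fields(*parents):
    """Merge parent records left to right; later parents overwrite earlier ones."""
    fields = {}
    for parent in parents:
        fields.update(parent)
    return fields


def let_append(fields, name, value):
    """Return a copy of the record with `value` appended to field `name`."""
    updated = dict(fields)
    updated[name] = updated[name] + value
    return updated


base = {"items": [1]}
left = let_append(base, "items", [2])    # [1, 2]
right = let_append(base, "items", [3])   # [1, 3]
d = let_append(resolve_fields(left, right), "items", [4])
```

Here `d["items"]` is `[1, 3, 4]`: `Right` overwrote `Left`'s value before the final append ran, so `Left`'s `[2]` never reaches `D`.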

## Backward compatibility

This proposal is **fully backward compatible**. The keywords `append`
and `prepend` are implemented as **context-sensitive keywords** — they
are only recognized as modifiers when they appear immediately after
`let` (in both body-level and top-level contexts). In all other
positions, `append` and `prepend` remain valid identifiers and can be
used as field names, class names, def names, etc. This means:

- No existing `.td` files (in-tree or out-of-tree) will break
- Fields named `append` or `prepend` continue to work: `let append
append = [5];` is valid (the first `append` is the modifier, the second
is the field name)
- The parser checks for the identifier string value after `let`, not for
a reserved token

RFC:
https://discourse.llvm.org/t/rfc-tablegen-add-let-append-prepend-syntax-for-field-concatenation/89924/
2026-03-09 18:54:08 +01:00
Sam Elliott
babead29c3
[NFC] Move fusion- to start of Fusion Feature Name (#185146)
This makes it a lot easier to see all the available fusions, because
they appear together in the list.
2026-03-06 21:57:48 -08:00
Petr Vesely
a9eeb151fb
[Tablegen] Fix condition to report when lanemask overflows (#181810)
This PR:

Fixes a slight off-by-one error in the check for how many bits are
allocated for subreg lane masks. If 65 subreg lanes are used, it fails
later, but the error message is not clear as to what has occurred.
2026-03-06 09:55:06 +01:00
Craig Topper
1dc91cd620
[X86] Add (v)pcmpestr(m/i)q to set the W bit. (#184746)
These instructions don't ignore the W bit as we had previously marked.

Also support (v)pcmpestr(m/i)l as an alias for the W0 form to match
binutils.

Fixes part of #183635
2026-03-05 07:23:50 -08:00
Ivan Kosarev
21c1ba16ed
[TableGen] Complete the support for artificial registers (#183371)
Artificial registers were added in
eb0c510ecde667cd911682cc1e855f73f341d134
as a means of giving super-registers heavier weights than that
of their subregisters, even when they only contain a single
physical subregister.

Artificial registers thus do exist in code and participate in
register unit weight calculations, but are not supposed to be
available for register allocation.

This patch completes the support for artificial registers to:

- Ignore artificial registers when joining register unit uber
  sets. Artificial registers may be members of classes that
  together include registers and their sub-registers, making it
  impossible to compute normalised weights for uber sets they
  belong to.

  We have a use case downstream relying on this being supported,
  which avoids introducing a large number of additional
  register classes.

- Not generate purely artificial register class intersections.
  It is critical not to have such classes, as the common LLVM
  codegen infrastructure will try to use them to constrain
  classes of virtual registers instead of producing COPYs
  whenever both the source and target register classes contain
  the same artificial registers.

- Not generate sub-classes where classes with the same
  non-artificial members already exist. This is mostly for
  convenience. For example, the HI16-capable subset of AMDGPU's
  AV_32 is VGPR_32, except VGPR_32 also contains the artificial
  staging registers. If the staging registers are not ignored,
  we'll end up having an additional generated register class,
  AV_32_with_hi16_in_VGPR_16, -- harmless, but also useless.

Eliminates a few inferred AMDGPU register classes:
    - VS_32_with_hi16
    - VS_32_Lo256_with_hi16
    - VS_32_Lo128_with_hi16
    - VRegOrLds_32_and_VS_32_Lo256
    - VRegOrLds_32_and_VS_32_Lo128
    - SRegOrLds_32_and_VRegOrLds_32

Causes no register class changes for other targets.
2026-03-04 13:33:26 +00:00
Nick Sarnie
bc0af9901b
[TableGen] Allow specification of underlying type for GenericEnum (#183769)
Allow specification of the underlying C++ data type for `GenericEnum`.

I ran into this because I was trying to use a TableGen-generated enum in
`DenseSet` which requires the underlying type be specified.

Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>
2026-03-02 15:09:23 +00:00
Rahul Joshi
e77f11c270
[NFC][RegisterInfoEmitter] Add target name prefix for a few variables (#183074)
Add target name prefix for a few static global variables in the
generated code. Also rework the TargetRegisterInfo constructor a bit to
use an ArrayRef for the array of register classes and rename a few
constructor arguments to match the member names they initialize.
2026-02-25 07:46:12 -08:00
Rahul Joshi
6295903e12
[TableGen] Add asserts for a few register related checks (#182680)
Move some register file related error checking from C++ code to asserts
in Target.td file. Rename and extend the lit test to exercise these
errors.
2026-02-23 08:16:08 -08:00
Craig Topper
31daf5cd55
[TableGen] Add OPC_EmitIntegerByHwMode0 and OPC_CheckChildXTypeByHwMode0. NFC (#182686)
Add versions of these opcodes that implicitly call getValueTypeForHwMode
with index 0.

This reduces llc size by ~100K.
2026-02-21 23:03:56 -08:00
Rahul Joshi
a6416a8411
[NFC] Simplify a RegisterInfoEmitter lit test (#182672)
Eliminate SubRegIndex defs that are not used/required for the test.
2026-02-21 10:48:51 -08:00
Craig Topper
384bc40250
[TableGen][ISel] Add OPC_CheckTypeByHwMode0 to optimize the most frequent getValueTypeForHwMode index. (#182366)
Sort the unique ValueTypeByHwMode combinations by usage and add a
compressed opcode for the most common.

Reduces the RISCVGenDAGISel.inc table by about 12K. The most common
combination is XLenVT.
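
The idea of sorting by usage and giving index 0 a compressed opcode can be sketched as follows. This is an illustration only: the real emitter works on matcher tables, the un-suffixed opcode name `OPC_CheckTypeByHwMode` is assumed from context, the encoding as a Python list is invented for the example, and every type name other than `XLenVT` is a placeholder.

```python
from collections import Counter


def assign_indices(uses):
    """Give the most frequently used combination index 0, the next index 1, etc.

    Ties keep first-appearance order because sorted() is stable.
    """
    counts = Counter(uses)
    ordered = sorted(counts, key=lambda vt: -counts[vt])
    return {vt: i for i, vt in enumerate(ordered)}


def encode_check_type(vt, index_of):
    """Emit the compressed opcode for index 0, else opcode plus explicit index."""
    idx = index_of[vt]
    if idx == 0:
        return ["OPC_CheckTypeByHwMode0"]  # index is implicit, one entry saved
    return ["OPC_CheckTypeByHwMode", idx]


uses = ["XLenVT", "PtrVT", "XLenVT", "XLenVT", "PtrVT", "FPVT"]
index_of = assign_indices(uses)
```

Because the most common combination no longer carries an explicit index, each of its uses is one table entry shorter.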

I plan to add EmitIntegerByHwMode0 and EmitRegisterByHwMode0 in
subsequent patches.

Assisted-by: claude
2026-02-19 14:08:18 -08:00
Alexander Richardson
3459bb4f27
[TableGen] Introduce RegisterByHwMode
This is useful for `InstAlias` where a fixed register may depend on the
HwMode. The motivating use case for this is the RISC-V RVY ISA where
certain instruction mnemonics are remapped to take a different
register class depending on the HwMode and can be used as follows:
```
def NullReg : RegisterByHwMode<PtrRC, [RV32I, RV64I, RV64Y, RV64Y],
                                      [X0,    X0,    X0_Y,  X0_Y]>;
```

Pull Request: https://github.com/llvm/llvm-project/pull/175227
2026-02-18 17:23:10 -08:00
Ryan Mitchell
10ccf11ebb
[Tablegen] Patch RegUnitIntervals Initialization (#181173)
A few places were missing the code generation needed to properly
initialize RegUnitIntervals when enabled, and the sentinel value was also missing.
2026-02-18 12:09:48 +01:00
Jay Foad
1b0cbdb8e8
[TableGen] Use standard name for default mode in debug printing (#181739)
In comments in generated files and in -register-info-debug output, use
the standard name "DefaultMode" for consistency, instead of hard coding
an alternative name "Default".
2026-02-17 10:33:58 +00:00
Jay Foad
7e653d06c1
[TableGen] Simplify printing of simple InfoByHwModes (#181714)
For the -register-info-debug output, don't bother printing a brace
enclosed list for simple InfoByHwModes, where every entry is the
default.
2026-02-16 20:12:38 +00:00
sstipano
5ec5701db3
Reapply "[MC][TableGen] Expand Opcode field of MCInstrDesc" (#180321) (#180954)
Difference from the previous version is that this one doesn't actually
encode opcodes in matcher tables as 32 bits, but still as 16 bits.
2026-02-12 09:17:02 +01:00
Rahul Joshi
d8b87934f0
[NFC][TableGen] Adopt CodeGenHelpers in GlobalISel emitters (#180143)
Add specific emitters for `#ifdef` and `#ifndef` based guards and adopt
them and other CodeGenHelpers in Global ISel emitters.
2026-02-10 07:58:35 -08:00
Vladimir Vereschaka
19d681177f
Revert "[MC][TableGen] Expand Opcode field of MCInstrDesc" (#180321)
Reverts llvm/llvm-project#179652

This PR causes out-of-memory build failures on many Windows
builders.
2026-02-06 21:58:50 -08:00
Petr Vesely
734eb95402
[Tablegen] Don't emit decoder tables with islands larger than 64 bits (#179651)
I have a downstream target with 128-bit instructions, where some
instructions have large sections of the encoding that are determined ahead
of time. This causes the island calculations for decoder tables to
emit checks wider than 64 bits.

This change emits multiple separate checks when an island exceeds
64 bits.
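
Splitting an oversized island into several checks can be sketched like this. It is a minimal model, assuming an island is a `(start_bit, width)` pair and that chunks are cut greedily from the low end; the actual decoder emitter may choose boundaries differently, and `split_island` is a hypothetical name.

```python
def split_island(start_bit, width, max_bits=64):
    """Split one island into consecutive chunks of at most max_bits bits."""
    if width <= 0:
        raise ValueError("island width must be positive")
    chunks = []
    offset = 0
    while offset < width:
        chunk_width = min(max_bits, width - offset)
        chunks.append((start_bit + offset, chunk_width))
        offset += chunk_width
    return chunks
```

An island that already fits in 64 bits comes back unchanged as a single chunk, so existing targets would see no difference under this model.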
2026-02-06 11:46:40 -08:00
sstipano
13d8870d45
[MC][TableGen] Expand Opcode field of MCInstrDesc (#179652)
Increase width of Opcode to `int` from `short` to allow more capacity.
2026-02-06 20:21:48 +01:00
Rahul Joshi
55ecf62227
[NFC][TableGen] Adopt CodeGenHelpers in MacroFusion emitter (#180132) 2026-02-06 09:04:13 -08:00
Min-Yih Hsu
6441f1c9d5
[RISCV] Introduce a new syntax for processor-specific tuning feature strings (#175063)
This patch proposes a new tuning feature string format that helps users
to build a performance model by "configuring" an existing tune CPU,
along with its scheduling model. For example, this string
```
"sifive-x280:single-element-vec-fp64"
```
takes ``sifive-x280`` as the "base" tune CPU and configures it with
``single-element-vec-fp64``. This gives us a performance model that
looks exactly like that of ``sifive-x280``, except some of the 64-bit
vector floating point instructions now produce only a single element per
cycle due to ``single-element-vec-fp64``.

This string could eventually be used in places like ``-mtune`` at the
frontend. Right now, this patch only implements the parser part, which
is put under the TargetParser library.

The grammar for this string is:
```
    tune-cpu      ::= 'tuning CPU name in lower case'
    directive     ::= "[a-zA-Z0-9_-]+"
    tune-features ::= directive ["," directive]*
```
A *directive* can only _enable_ or _disable_ a certain tuning
feature from the tuning CPU. A **positive directive**, like the
``single-element-vec-fp64`` we just saw, enables an additional tuning
feature in the associated tuning model.

A **negative directive**, on the other hand, removes a certain tuning
feature. For example, ``sifive-x390`` already has the
``single-element-vec-fp64`` feature, and we can use
"sifive-x390:no-single-element-vec-fp64" to create a new performance
model that looks nearly the same as ``sifive-x390`` except that
``single-element-vec-fp64`` is removed. In this case,
``no-single-element-vec-fp64`` is a negative directive.

There are additional restrictions on what we can put in the list of
directives; please refer to the documentation for more details.

Right now, this string only accepts directives that are explicitly
supported by the tune CPU. For example, "sifive-x280:prefer-w-inst" is
not a valid string as ``prefer-w-inst`` is not supported by
``sifive-x280`` at this moment. Vendors of these processors are expected
to maintain the compatibility of their supported directives across
different versions.
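
A minimal parser for the string format described above might look like the following. It is a sketch, not the TargetParser implementation: it assumes the CPU name and the feature list are separated by the first `:` (as in the examples), treats a `no-` prefix as the marker of a negative directive (inferred from the `no-single-element-vec-fp64` example), and does not check whether the tune CPU actually supports a directive. `parse_tune_string` is a hypothetical name.

```python
import re

# Directive syntax taken from the grammar above.
DIRECTIVE_RE = re.compile(r"[a-zA-Z0-9_-]+")


def parse_tune_string(text):
    """Split 'cpu:dir1,dir2' into (cpu, [(feature, enabled), ...])."""
    cpu, sep, features = text.partition(":")
    if not sep or not cpu or not features:
        raise ValueError("expected '<tune-cpu>:<tune-features>'")
    directives = []
    for directive in features.split(","):
        if not DIRECTIVE_RE.fullmatch(directive):
            raise ValueError(f"malformed directive: {directive!r}")
        if directive.startswith("no-"):
            name, enabled = directive[3:], False  # negative directive
        else:
            name, enabled = directive, True       # positive directive
        if not name:
            raise ValueError("directive has no feature name")
        directives.append((name, enabled))
    return cpu, directives
```

For instance, `parse_tune_string("sifive-x390:no-single-element-vec-fp64")` returns `("sifive-x390", [("single-element-vec-fp64", False)])`.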

---------

Co-authored-by: Sam Elliott <aelliott@qti.qualcomm.com>
2026-02-05 15:22:07 -08:00
Jay Foad
7ea33e6848
[CodeGen] Remove unused first operand of SUBREG_TO_REG (#179690)
The first input operand of SUBREG_TO_REG was an immediate that most
targets set to 0. In practice it had no effect on codegen. Remove it.
2026-02-04 17:35:21 +00:00
Sam Elliott
e0181661dc
[TableGen][NFC] Use templated std::clamp (#179400) 2026-02-03 18:22:30 -08:00
Craig Topper
65cc6951d5
Reapply "[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)"
This includes a fix to use size_t instead of uint64_t in one place.
2026-02-03 14:20:42 -08:00
Craig Topper
19cf75c72f
Revert "[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)"
This reverts commit caab98284166784459a2fb76df7bca3f1d35e41e.

This is failing some build bots.
2026-02-03 13:45:00 -08:00
Rahul Joshi
6716acd588
[NFC][TableGen] Adopt IfDefEmitter in TargetLibraryInfoEmitter (#179388) 2026-02-03 13:18:08 -08:00
Rahul Joshi
078f6bde1c
[NFC][TableGen] Adopt CodeGenHelpers in RegInfoEmitter (#179017)
- Change `NamespaceEmitter` to allow emitting anonymous namespaces.
- Adopt IfDef and namespace emitters in RegInfoEmitter.
2026-02-03 13:16:58 -08:00
Craig Topper
caab982841
[SelectionDAGISel] Separate the operand numbers in OPC_EmitNode/MorphNodeTo into their own table. (#178722)
The operand lists for these opcodes require 1 byte per operand and are
usually small values that fit in 3-4 bits. This makes their storage
inefficient. In addition, many EmitNode/MorphNodeTo in the isel table
will use the same list of operand numbers.

This patch proposes to separate the operand lists into their own table
where they can be de-duplicated. The OPC_EmitNode/MorphNodeTo in the
main table will only store an index into this smaller table.
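
The de-duplication described here can be sketched as follows. This is only a model of the idea: operand lists are Python tuples, the side table is a list of unique tuples, and `build_operand_table` is a hypothetical name. The generated isel tables are byte arrays and may lay the data out differently.

```python
def build_operand_table(operand_lists):
    """De-duplicate operand lists.

    Returns (table, indices): `table` holds each distinct operand list once,
    and `indices[i]` is the position in `table` of the i-th node's list.
    """
    table = []
    index_of = {}
    indices = []
    for operands in operand_lists:
        key = tuple(operands)
        if key not in index_of:
            index_of[key] = len(table)
            table.append(key)
        indices.append(index_of[key])
    return table, indices


# Three nodes, two of which share the same operand numbers.
table, indices = build_operand_table([(0, 1), (2,), (0, 1)])
```

The main table then carries one index per node instead of the full operand list, which is where the size reduction comes from when many nodes share a list.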

This is a reduced version of a suggestion from this very old FIXME.
d8d4096c0b/llvm/utils/TableGen/DAGISelMatcherGen.cpp (L1070)

For RISC-V this reduces the main table from 1437353 bytes to 1276015
bytes plus a 929 byte operand list table. A savings of about 11%.

For X86 this reduces the main table from 719237 bytes to 623612 bytes
plus a 1042 byte operand list table. A savings of about 11%.

I expect further savings could be had by moving more bytes over.
2026-02-03 13:13:13 -08:00
Rahul Joshi
f0c519d2c1
[NFC][TableGen] Adopt CodeGenHelpers in IntrinsicEmitter (#179310)
- Adopt IfDefEmitter in IntrinsicEmitter.
- Remove #undef for various flags in Intrinsics.cpp/Intrinsics.h as the
TableGen generated code does that now.
2026-02-03 10:56:00 -08:00
Craig Topper
5396f7951c
[SelectionDAGISel][TableGen] Remove trailing 0 from isel table. NFC (#178744)
I suspect this was here to prevent a trailing comma. If we actually
reach this byte in isel, it will be treated as OPC_Scope not a
terminator.
2026-01-29 13:05:27 -08:00
woruyu
170ad2335e
[TableGen][AsmMatcher] Fix optional operand mask indexing when HasMnemonicFirst is false (#176868)
### Summary
Fix optional operand mask indexing in the generated asm matcher when
HasMnemonicFirst is false.
2026-01-27 13:54:26 +08:00
Mirko Brkušanin
c9e0cf139c
[AMDGPU] Update patterns for v_cvt_flr and v_cvt_rpi (#177962)
Support GlobalISel and switch to checking `nnan` flag on instruction
instead of TargetOptions.
    
Instructions are renamed to v_cvt_floor and v_cvt_nearest on gfx11+
so add gfx11 tests as well.
2026-01-26 19:33:00 +01:00
Matt Arsenault
51d58657b8
TableGen: Emit subtarget generated methods as final (#177781) 2026-01-24 20:27:29 +01:00
Ryan Mitchell
7edf4e15ff
[TableGen] Allow targets to enforce regunits assignment as intervals (#175823)
General tablegen infrastructure for #174888
2026-01-23 19:30:26 +01:00
Sam Elliott
7184229fea
[NFC][MI] Tidy Up RegState enum use (2/2) (#177090)
This change makes `RegState` into an enum class, with bitwise operators.
It also:
- Updates declarations of flag variables/arguments/returns from
`unsigned` to `RegState`.
- Updates empty RegState initializers from 0 to `{}`.

If this is causing problems in downstream code:
- Adopt the `RegState getXXXRegState(bool)` functions instead of using a
ternary operator such as `bool ? RegState::XXX : 0`.
- Adopt the `bool hasRegState(RegState, RegState)` function instead of
using a bitwise check of the flags.
2026-01-23 00:19:03 -08:00
Prerona Chaudhuri
8f1427d269
[TableGen] Gracefully error out in ParseTreePattern when DAG has zero operands so that llvm-tblgen doesn't crash (#161417)
Also handle the case when Pat->Child(i) is null in
CodeGenDAGPatterns::FindPatternInputsAndOutputs().
Fixes issue #157619 : TableGen asserts on invalid cast
2026-01-21 17:31:59 +00:00
Krzysztof Parzyszek
5875490909
Revert "[TableGen] Emit constexpr versions of some directive/clause functions (#176253)" (#177161)

This reverts commit cf68af690ba7f98943e5f0f5cb39a91868d62098.

It increased the compilation time for a number of clang source files.
See comments in https://github.com/llvm/llvm-project/pull/176253 for
more information.
2026-01-21 08:06:55 -06:00