87 Commits

Author SHA1 Message Date
Rahul Joshi
633728f3b5
[NFC][TableGen][DecoderEmitter] Eliminate indent for a few functions (#148718)
Eliminate the `indent` argument for functions which are always called
with `indent(0)`.
2025-07-14 15:23:41 -07:00
Rahul Joshi
23b4f4eb9b
[NFC][TableGen] Change DecoderEmitter insertBits to use integer types only (#147613)
The `insertBits` templated function generated by DecoderEmitter is
called with variable `tmp` of type `TmpType` which is:

```
using TmpType = std::conditional_t<std::is_integral<InsnType>::value, InsnType, uint64_t>;
```

That is, `TmpType` is always an integral type. Change the generated
`insertBits` to be valid only for integer types, and eliminate the
unused `insertBits` function from `DecoderUInt128` in
AMDGPUDisassembler.h

Additionally, drop some of the requirements `InsnType` must support as
they no longer seem to be required.
2025-07-09 08:56:07 -07:00
Rahul Joshi
5f2e88a125
[NFC][TableGen] Rename CodeGenTarget instruction accessors (#146767)
Rename `getXYZInstructionsByEnumValue()` to just `getXYZInstructions`
and drop the `ByEnumValue` in the name.
2025-07-07 08:01:14 -07:00
Rahul Joshi
d7b8b65e23
[LLVM][TableGen][DecoderEmitter] Add wrapper struct for bit_value_t (#146248)
Add a convenience wrapper struct for the `bit_value_t` enum type to host
various constructors, query, and printing support. Also refactor related
code in several places. In `getBitsField`, use `llvm::append_range` and
`SmallVector::append()` and eliminate manual loops. Eliminate
`emitNameWithID` and instead use the `operator <<` that does the same
thing as this function. Have `BitValue::getValue()` (replacement for
`Value`) return std::optional<> instead of -1 for unset bits. Terminate
with a fatal error when a decoding conflict is encountered.
2025-07-01 07:36:17 -07:00
Rahul Joshi
92b50959da
[NFC][TableGen] Capitalize to in UseFnTableInDecodetoMCInst. (#146419) 2025-06-30 16:12:15 -07:00
Rahul Joshi
ed5f8f238d
[LLVM][DecoderEmitter] Add option to use function table in decodeToMCInst (#144814)
Add option `use-fn-table-in-decode-to-mcinst` to use a table of function
pointers instead of a switch case in the generated `decodeToMCInst`
function.

When the number of switch cases in this function is large, the generated
code takes a long time to compile in release builds. Using a table of
function pointers instead improves the compile time significantly (~3x
speedup in compiling the code in a downstream target). This option will
allow targets to opt into this mode if they desire for better build
times.

Tested with `check-llvm-mc` with the option enabled by default.
2025-06-24 18:49:05 -07:00
Rahul Joshi
376b71442d
[NFC][TableGen][DecoderEmitter] Use structured binding in range for loop (#144890)
Also assign variable names to different elements of `OpMap` for better
readibility, and eliminate `NumberedEncodingsRef` as `std::vector` will
automatically get converted to an `ArrayRef`.
2025-06-20 06:41:48 -07:00
Rahul Joshi
816ab1af0d
[NFCI][TableGen][DecoderEmitter] Cull Op handling when possible (#142974)
TryDecode/CheckPredicate/SoftFail MCD ops are not used by many targets.
Track the set of opcodes that were emitted and emit code for handling
TryDecode/CheckPredicate/SoftFail ops when decoding only if there were
emitted. This is purely eliminating dead code in the generated
`decodeInstruction` function.

This results in the following reduction in the size of the Disassembler
.so files with a release x86_64 release build on Linux:

```
Target                                                   Old Size        New Size  %  reduction
build/lib/libLLVMAArch64Disassembler.so.21.0git             256656          256656          0.00
build/lib/libLLVMAMDGPUDisassembler.so.21.0git              813000          808168          0.59
build/lib/libLLVMARCDisassembler.so.21.0git                  44816           43536          2.86
build/lib/libLLVMARMDisassembler.so.21.0git                 281744          278808          1.04
build/lib/libLLVMAVRDisassembler.so.21.0git                  36040           34496          4.28
build/lib/libLLVMBPFDisassembler.so.21.0git                  26248           23168         11.73
build/lib/libLLVMCSKYDisassembler.so.21.0git                 55960           53632          4.16
build/lib/libLLVMHexagonDisassembler.so.21.0git             115952          113416          2.19
build/lib/libLLVMLanaiDisassembler.so.21.0git                24360           21008         13.76
build/lib/libLLVMLoongArchDisassembler.so.21.0git            58584           56168          4.12
build/lib/libLLVMM68kDisassembler.so.21.0git                 57264           53880          5.91
build/lib/libLLVMMSP430Disassembler.so.21.0git               28896           28440          1.58
build/lib/libLLVMMipsDisassembler.so.21.0git                123128          120568          2.08
build/lib/libLLVMPowerPCDisassembler.so.21.0git              80656           78096          3.17
build/lib/libLLVMRISCVDisassembler.so.21.0git               154080          150200          2.52
build/lib/libLLVMSparcDisassembler.so.21.0git                42040           39568          5.88
build/lib/libLLVMSystemZDisassembler.so.21.0git              97056           94552          2.58
build/lib/libLLVMVEDisassembler.so.21.0git                   83944           81352          3.09
build/lib/libLLVMWebAssemblyDisassembler.so.21.0git          25280           25280          0.00
build/lib/libLLVMX86Disassembler.so.21.0git                2920624         2920624          0.00
build/lib/libLLVMXCoreDisassembler.so.21.0git                48320           44288          8.34
build/lib/libLLVMXtensaDisassembler.so.21.0git               42248           35840         15.17
```
2025-06-17 06:21:21 -07:00
Jay Foad
39ad3151e0
[TableGen] Use default member initializers. NFC. (#144349)
Automated with clang-tidy -fix -checks=-*,modernize-use-default-member-init
2025-06-16 15:26:47 +01:00
Rahul Joshi
7005a76638
[NFC][TableGen] Print DecodeIdx for DecodeOps in DecoderEmitter (#142963)
Print DecodeIdx associated with Decode MCD ops in the generated decoder
tables. This can help in debugging decode failures by first mapping the
Op -> DecodeIdx and then inspecting the code in `decodeToMCInst`
associated with that DecodeIdx.
2025-06-05 21:57:26 -07:00
Rahul Joshi
e53ccb78e4
[LLVM][MC] Introduce OrFail variants of MCD ops (#138614)
Introduce `OrFail` variants for all MCD Decoder Ops that have
`NumToSKip` encoded with them. This is intended to capture the common
case of jumps to the end of the decoder table which has a `OP_Fail` at
the end. Using the `OrFail` variants of these ops avoid encoding the
`NumToSkip` jump offset for these cases, resulting in a reduction in the
size of the decoder tables (from 5 - 17%). Additionally, for the AArch64
target, the table size reduces enough to switch to using 2-byte
`NumToSkip` encoding instead of existing 3-bytes, resulting in a net 30%
reduction in the size of the decoder table.

The total reduction in the size of the decoder tables for different
targets is as follows (computed using the following command: `for i in
*.inc; do echo -n ``basename $i: ``; grep "MCD::OPC_Fail," $i | awk
'{sum += $2} END { print sum}'; done`)

```
Target         Old Size   New Size   % Reduction
================================================
AArch64           153268     106987       30.20
AMDGPU            412056     340856       17.28
ARC                 5061       4605        9.01
ARM                73831      60847       17.59
AVR                 1306       1158       11.33
BPF                 1927       1795        6.85
CSKY                8692       6922       20.36
Hexagon            41965      34759       17.17
Lanai                982        924        5.91
LoongArch          21629      20035        7.37
M68k               13461      11689       13.16
MSP430              3716       3384        8.93
Mips               31415      25771       17.97
PPC                28931      24771       14.38
RISCV              34800      28352       18.53
Sparc               7432       6236       16.09
SystemZ            32248      29716        7.85
VE                 42873      36923       13.88
XCore               2316       2196        5.18
Xtensa              3443       2793       18.88
```
2025-06-05 06:17:50 -07:00
Kazu Hirata
252bd80871
[TableGen] Remove unused includes (NFC) (#141356)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-24 09:37:40 -07:00
Rahul Joshi
b5e3d8ec08
[LLVM][TableGen] Use StringRef for various members CGIOperandList::OperandInfo (#140625)
- Change `Name`, `SubopNames`, `PrinterMethodName`, and
`EncoderMethodNames` to be stored as StringRef.
- Also changed `CheckComplexPatMatcher::Name` to StringRef as a fallout
from the above.

Verified that all the tablegen generated files within LLVM are
unchanged.
2025-05-21 06:23:01 -07:00
Kazu Hirata
294eb7670f [TableGen] Fix a warning
This patch fixes an unused parameter warning with gcc7 under the
release configuration.
2025-05-12 23:18:30 -07:00
Rahul Joshi
9981afc5f9
[NFC][TableGen] Use StringRef::str() instead of casting (#139332)
- Also eliminate unneeded std::string() around some literal strings.
2025-05-12 15:41:27 -07:00
Craig Topper
afbd2ce80f
[TableGen] Use StringRef::empty() instead of comparing to an empty string. NFC (#137673) 2025-04-28 16:47:52 -07:00
Kazu Hirata
f4d3a0cb6a
[TableGen] Simplify insertBits (NFC) (#137538)
We can use "constexpr if" to combine the two variants of functions.
2025-04-27 12:35:54 -07:00
Rahul Joshi
ecb0daa72c
[NFC][LLVM][TableGen] Eliminate inheritance from std::vector (#136573) 2025-04-23 08:44:35 -07:00
Rahul Joshi
e1bb7f6dde
[LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter (#136456)
- Add command line option `num-to-skip-size` to parameterize the size of
`NumToSkip` bytes in the decoder table. Default value will be 2, and
targets that need larger size can use 3.
- Keep all existing targets, except AArch64, to use size 2, and change
AArch64 to use size 3 since it run into the "disassembler decoding table
too large" error with size 2.
- Additional fixes on top of earlier revert: mark `decodeNumToSkip` as
static (not necessary anymore as the generated code is now in anonymous
namespace, but doing it for consistency) and incorporate Bazel build
changes from https://github.com/llvm/llvm-project/pull/136212
- Following is a rough reduction in size for the decoder tables by
switching to size 2.

```
Target         Old Size   New Size   % Reduction
================================================
AArch64           153254     153254        0.00
AMDGPU            471566     412805       12.46
ARC                 5724       5061       11.58
ARM                84936      73831       13.07
AVR                 1497       1306       12.76
BPF                 2172       1927       11.28
CSKY               10064       8692       13.63
Hexagon            47967      41965       12.51
Lanai               1108        982       11.37
LoongArch          24446      21621       11.56
MSP430              4200       3716       11.52
Mips               36330      31415       13.53
PPC                31897      28098       11.91
RISCV              37979      32790       13.66
Sparc               8331       7252       12.95
SystemZ            36722      32248       12.18
VE                 48296      42873       11.23
XCore               2590       2316       10.58
Xtensa              3827       3316       13.35
```
2025-04-21 08:15:08 -07:00
Rahul Joshi
6182015698
[NFC][LLVM][TableGen] Adjust pointer increments in DecoderEmitter (#136230)
- In both `emitTable` and the generated `decodeInstruction` function
increment the pointer to the decoder op as a part of the switch
statement instead of later on in each case.
2025-04-18 10:08:00 -07:00
Rahul Joshi
c244daec1c
[LLVM][TableGen] Fix Windows failure in DecoderEmitter (#136310)
- Avoid dereferencing the end() iterator to get the end pointer, instead
calculate it explicitly
- Fixes a regression introduced in
https://github.com/llvm/llvm-project/pull/136220.
- The windows build failure shows the following call stack:

```
 | Exception Code: 0x80000003
 |  #0 0x00007ff74bc05897 std::_Vector_const_iterator<class std::_Vector_val<struct std::_Simple_types<unsigned char>>>::operator*(void) const C:\Program Files\Microsoft Visual Studio\2022\Professional\VC\Tools\MSVC\14.37.32822\include\vector:52:0
 |  #1 0x00007ff74bbd3d64 `anonymous namespace'::DecoderEmitter::emitTable D:\buildbot\llvm-worker\clang-cmake-x86_64-avx512-win\llvm\llvm\utils\TableGen\DecoderEmitter.cpp:852:0
```
2025-04-18 10:05:40 -07:00
Rahul Joshi
3ed83630b2
[NFC][LLVM][TableGen] Use decodeULEB128 for OPC_SoftFail emission (#136220)
- Use `decodeULEB128` to decode +ve/-ve mask in OPC_SoftFail case.
- Use current `I`/`E` iterators as inputs to `decodeULEB128`.
2025-04-18 05:12:35 -07:00
Rahul Joshi
6c4caae449
[LLVM][TableGen] Move DecoderEmitter output to anonymous namespace (#136214)
- Move the code generated by DecoderEmitter to anonymous namespace.
- Move AMDGPU's usage of this code from header file to .cpp file.

Note, we get build errors like "call to function 'decodeInstruction'
that is neither visible in the template definition nor found by
argument-dependent lookup" if we do not change AMDGPU.
2025-04-18 04:35:05 -07:00
Rahul Joshi
6d8bf3cf3d
Revert "Reapply "[LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter" (#136017)" (#136068)
Reverts llvm/llvm-project#136019

Expensive checks tests are failing, so reverting.
2025-04-16 18:24:10 -07:00
Rahul Joshi
8ebdd9d8a1
Reapply "[LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter" (#136017) (#136019)
This reverts commit 7fd0c8acd4659ccd0aef5486afe32c8ddf0f2957, and fixes
the assert condition in `patchNumToSkip`.
2025-04-16 15:40:34 -07:00
Rahul Joshi
7fd0c8acd4
Revert "[LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter" (#136017)
Reverts llvm/llvm-project#135882

Causing assert failures for AArch64 backend
2025-04-16 13:16:32 -07:00
Rahul Joshi
598ec8ce2d
[LLVM][TableGen] Parameterize NumToSkip in DecoderEmitter (#135882)
- Add command line option `num-to-skip-size` to parameterize the size of
`NumToSkip` bytes in the decoder table. Default value will be 2, and
targets that need larger size can use 3.
- Keep all existing targets, except AArch64, to use size 2, and change
AArch64 to use size 3 since it run into the "disassembler decoding table
too large" error with size 2.
- Following is a rough reduction in size for the decoder tables by
switching to size 2.

```
Target         Old Size   New Size   % Reduction
================================================
AArch64           153254     153254        0.00
AMDGPU            471566     412805       12.46
ARC                 5724       5061       11.58
ARM                84936      73831       13.07
AVR                 1497       1306       12.76
BPF                 2172       1927       11.28
CSKY               10064       8692       13.63
Hexagon            47967      41965       12.51
Lanai               1108        982       11.37
LoongArch          24446      21621       11.56
MSP430              4200       3716       11.52
Mips               36330      31415       13.53
PPC                31897      28098       11.91
RISCV              37979      32790       13.66
Sparc               8331       7252       12.95
SystemZ            36722      32248       12.18
VE                 48296      42873       11.23
XCore               2590       2316       10.58
Xtensa              3827       3316       13.35
```
2025-04-16 13:07:58 -07:00
Rahul Joshi
4780658823
[NFC][TableGen] DecoderEmitter optimize scope stack in Filter::emitTableEntry (#135693)
- Create a new stack scope only in the fallthrough case.
- For the non-fallthrough cases, any fixup entries will naturally be
added to the existing scope without needing to copy them manually.
- Verified that the generated `GenDisassembler` files are identical with
and without this change.
2025-04-15 07:49:01 -07:00
Rahul Joshi
9ba65cbcb5
[NFC][TableGen] Refactor DecoderEmitter.cpp (#135510)
- Add helper functions to insert ULEB128 encoded value and NumToSkip.
- Use ArrayRef<> instead of const vector references as function
  arguments.
- Return `OpHasCompleteDecoder` by value instead of by reference.
- Use range for loops.
- Remove {} around single line if/else bodies.
- In `emitSoftFailTableEntry`, unconditionally emit the Positive and
  Negative mask values, instead of explicitly emitting a 0 byte when
  the mask is not needed.
2025-04-14 14:09:00 -07:00
Craig Topper
40c859a704
[TableGen] Use size returned by encodeULEB128 to simplify some code. NFC (#133750)
We can use the length to insert all the bytes at once instead of
partially decoding them to insert one byte at a time.
2025-03-31 15:58:36 -07:00
Kazu Hirata
2c73711995
[TableGen] Use llvm::append_range (NFC) (#133649) 2025-03-30 12:21:38 -07:00
Craig Topper
fd21d35178
[TableGen] Reduce the number of vectors passed to getIslands. NFC (#130402)
Combine the StartBits, EndBits, and FieldVals vectors into a single
vector of a struct that contains all 3 pieces of information.

Instead of storing EndBits, we store NumBits since that's what the users
want.

I've removed the BitNo variable as it was easy to construct calculate
from StartBit. I've also removed Num in favor of Islands.size().
2025-03-10 21:02:09 -07:00
Craig Topper
f2607df291 [TableGen] Use uint8_t for bit_value_t enum. NFC
This reduces the amount of space needed for vectors of bit_value_t
and allows the user of memset.

Also reorder the enum values so BIT_FALSE is 0 and BIT_TRUE is 1.
2025-03-07 23:22:45 -08:00
Craig Topper
d65719fab3 [TableGen] Use isUInt to simplify some asserts. NFC 2025-03-07 22:51:43 -08:00
Craig Topper
8370ac88af [TableGen] Remove push_back from loop. NFC
We can initialize the vector to the right size and then assign
over some entries in the loop.
2025-03-07 22:01:15 -08:00
Craig Topper
f578982490 [TableGen] Remove unnecessary const_cast. NFC 2025-03-07 21:52:29 -08:00
Craig Topper
ff033d1f28 [TableGen] Use reference instead of pointer for FilterChooser in Filter. NFC 2025-03-07 19:11:31 -08:00
Craig Topper
6a42dc694c
[TableGen] Simplify emitULEB128 in DecoderEmitter.cpp. NFC (#130214)
Instead of returning the number of bytes emitted, just take the iterator
by reference so the increments in emitULEB128 will update the copy in
the caller.

Also pass the iterator by reference to emitNumToSkip so we don't need a
separate I += 3 in the caller.
2025-03-07 11:09:34 -08:00
Craig Topper
efb880de11 [TableGen] Fix incorrect comment. NFC 2025-03-03 14:37:37 -08:00
Craig Topper
3ce67a81fa [TableGen] Remove unnecessary use of utostr to print a byte. NFC
We can cast to unsigned instead.
2025-03-03 14:37:37 -08:00
chrisPyr
71f4c7dabe
[NFC]Make file-local cl::opt global variables static (#126486)
#125983
2025-03-03 13:46:33 +07:00
Craig Topper
4059faf613 [TableGen] Update comment for size of NumToSkip field in DecoderEmitter. NFC
NumToSkip is 24 bits. It used to be 16 bits.
2025-02-26 10:12:38 -08:00
Jay Foad
4e8c9d2813
[TableGen] Use std::pair instead of std::make_pair. NFC. (#123174)
Also use brace initialization and emplace to avoid explicitly 
constructing std::pair, and the same for std::tuple.
2025-01-16 13:20:41 +00:00
abhishek-kaushik22
943b212d56
[TableGen] Use std::move to avoid copy (#123088) 2025-01-15 22:50:00 +05:30
abhishek-kaushik22
31ce47b5d6
[TableGen] Use std::move to avoid copy (#113061) 2024-11-21 11:48:46 -08:00
Rahul Joshi
62e2c7fb2d
[LLVM][TableGen] Change all Init pointers to const (#112705)
This is a part of effort to have better const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-10-18 07:50:22 -07:00
Rahul Joshi
708567ab0b
[LLVM][TableGen] Adopt indent for indentation (#109275)
Adopt `indent` for indentation DAGISelMatcher and DecoderEmitter.
2024-09-20 04:28:01 -07:00
Rahul Joshi
b594b93024
[LLVM][TableGen] Change DisassemblerEmitter to use const RecordKeeper (#109177)
Change DisassemblerEmitter to use const RecordKeeper.

This is a part of effort to have better const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-09-20 04:22:37 -07:00
Rahul Joshi
3e24dd42dd
[NFC] Rename variables to conform to LLVM coding standards (#109166)
Rename `indent` to `Indent` and `o` to `OS`.
Rename `Indentation` to `Indent`.
Remove unused argument from `emitPredicateMatch`.
Change `Indent` argument to `emitBinaryParser` to by value.
2024-09-19 04:49:12 -07:00
Rahul Joshi
2bb3621faa
[LLVM][TableGen] Change DecoderEmitter to use const RecordKeeper (#109040)
Change DecoderEmitter to use const RecordKeeper.

This is a part of effort to have better const correctness in TableGen
backends:


https://discourse.llvm.org/t/psa-planned-changes-to-tablegen-getallderiveddefinitions-api-potential-downstream-breakages/81089
2024-09-18 05:35:26 -07:00