As the FIXME says, we might generate the wrong code to decode an
instruction if it had an operand with no encoding bits. An example is
M68k's `MOV16ds` that is defined as follows:
```
dag OutOperandList = (outs MxDRD16:$dst);
dag InOperandList = (ins SRC:$src);
list<Register> Uses = [SR];
string AsmString = "move.w\t$src, $dst"
dag Inst = (descend { 0, 1, 0, 0, 0, 0, 0, 0, 1, 1 },
(descend { 0, 0, 0 }, (operand "$dst", 3)));
```
The `$src` operand is not encoded, but what we see in the decoder is:
```C++
tmp = fieldFromInstruction(insn, 0, 3);
if (!Check(S, DecodeDR16RegisterClass(MI, tmp, Address, Decoder)))
{ return MCDisassembler::Fail; }
if (!Check(S, DecodeSRCRegisterClass(MI, insn, Address, Decoder)))
{ return MCDisassembler::Fail; }
return S;
```
This calls DecodeSRCRegisterClass passing it `insn` instead of the value
of a field that doesn't exist. DecodeSRCRegisterClass has an
unconditional llvm_unreachable inside it.
New decoder looks like:
```C++
tmp = fieldFromInstruction(insn, 0, 3);
if (!Check(S, DecodeDR16RegisterClass(MI, tmp, Address, Decoder)))
{ return MCDisassembler::Fail; }
return S;
```
We're still not disassembling this instruction right, but at least we no
longer have to provide a weird operand decoder method that accepts
instruction bits instead of operand bits.
See #154477 for the origins of the FIXME.
If an encoding has a custom decoder, the decoder is assumed to be
"complete" (always succeed) if hasCompleteDecoder field is true. We
determine this when constructing InstructionEncoding.
If the decoder for an encoding is *generated*, it always succeeds if
none of the operand decoders can fail. The latter is determined based on
the value of operands' DecoderMethod/hasCompleteDecoder. This happens
late, at table construction time, making the code harder to follow.
This change moves this logic to the InstructionEncoding constructor.
It is going to grow, so it makes sense to move its definition
out of class. Instead, inline `populateInstruction()` into it.
Also, rename a couple of methods to better convey their meaning.
We used to abuse Operands list to store instruction encoding's
DecoderMethod there. Let's store it in the InstructionEncoding class
instead, where it belongs.
This is where they belong, no need to maintain a separate map keyed by
encoding ID.
`populateInstruction()` has been made a member of `InstructionEncoding`
and is now called from the constructor.
Follow-up to #154288.
With HwModes involved, we used to analyze the same encoding multiple
times (unless `-suppress-per-hwmode-duplicates=O2` is specified). This
affected the build time and made the statistics inaccurate.
From the point of view of the generated code, this is an NFC.
When HwModes are involved, we can duplicate an instruction encoding that
does not belong to any HwMode multiple times. We can do better by
mapping HwMode to a list of encoding IDs it contains. (That is,
duplicate IDs instead of encodings.)
The encodings that were duplicated are still processed multiple times
(e.g., we call an expensive populateInstruction() on each instance).
This is going to be fixed in subsequent patches.
Previously, HW mode name was appended to decoder namespace name when
enumerating encodings, and then emitTable appended the bit width to it
to form the final table name. Let's do this all in one place.
A nice side effect is that this allows us to avoid having to deal with
std::string.
The changes in the tests are caused by the different order of tables.
This reverts commit b20bbd48e8b1966731a284b4208e048e060e97c2.
Reverted due to greendragon failures:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/DecoderEmitter.cpp:14:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/Common/CodeGenHwModes.h:14:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/DenseMap.h:20:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/STLExtras.h:21:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/Hashing.h:53:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/algorithm:1913:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/chrono:746:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__chrono/convert_to_tm.h:19:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__chrono/statically_widen.h:17:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__format/concepts.h:17:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__format/format_parse_context.h:15:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/string_view:1027:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/functional:515:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__functional/boyer_moore_searcher.h:26:
20:34:43 /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/vector:1376:19: error: object of type 'llvm::const_set_bits_iterator_impl<llvm::SmallBitVector>' cannot be assigned because its copy assignment operator is implicitly deleted
20:34:43 __mid = __first;
20:34:43 ^
20:34:43 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/DecoderEmitter.cpp:2404:13: note: in instantiation of function template specialization 'std::vector<unsigned int>::assign<llvm::const_set_bits_iterator_impl<llvm::SmallBitVector>, 0>' requested here
20:34:43 HwModeIDs.assign(BV.set_bits_begin(), BV.set_bits_end());
20:34:43 ^
20:34:43 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/BitVector.h:35:21: note: copy assignment operator of 'const_set_bits_iterator_impl<llvm::SmallBitVector>' is implicitly deleted because field 'Parent' is of reference type 'const llvm::SmallBitVector &'
20:34:43 const BitVectorT &Parent;
20:34:43 ^
20:34:43 1 warning and 1 error generated.
`NO_FIXED_SEGMENTS_SENTINEL` has a value that is actually a valid field
encoding and so it cannot be used as a sentinel.
Replace the sentinel with a new member variable, `VariableFC`, that
contains the value previously stored in `FilterChooserMap` with
`NO_FIXED_SEGMENTS_SENTINEL` key.
Most of the time we don't need instruction opcode. There is no need to
carry it around all the time, we can easily get it by other means.
Rename affected variables accordingly.
Part of an effort to simplify DecoderEmitter code.
* Print filter stack in non-reversed order.
* Print encoding name to the right of encoding bits to deal with
alignment issues.
* Use the correct bit width when printing encoding bits.
Example of old output:
```
01000100........
01000...........
0100............
................
tADDhirr 000000000000000001000100________
tADDrSP 000000000000000001000100_1101___
tADDspr 0000000000000000010001001____101
```
New output:
```
................
0100............
01000...........
01000100........
01000100________ tADDhirr
01000100_1101___ tADDrSP
010001001____101 tADDspr
```
Only one element of the `Filters` vector (see `BestIndex`) is used
outside the method that fills it. Localize the vector to the method,
replacing the member variable with the only used element.
Part of an effort to simplify DecoderEmitter code.
The containers passed by reference are always empty on entry to the
functions that fill them. Return them by value instead and let the
compiler do the return value optimization.
The `insertBits` templated function generated by DecoderEmitter is
called with variable `tmp` of type `TmpType` which is:
```
using TmpType = std::conditional_t<std::is_integral<InsnType>::value, InsnType, uint64_t>;
```
That is, `TmpType` is always an integral type. Change the generated
`insertBits` to be valid only for integer types, and eliminate the
unused `insertBits` function from `DecoderUInt128` in
AMDGPUDisassembler.h
Additionally, drop some of the requirements `InsnType` must support as
they no longer seem to be required.
Add a convenience wrapper struct for the `bit_value_t` enum type to host
various constructors, query, and printing support. Also refactor related
code in several places. In `getBitsField`, use `llvm::append_range` and
`SmallVector::append()` and eliminate manual loops. Eliminate
`emitNameWithID` and instead use the `operator <<` that does the same
thing as this function. Have `BitValue::getValue()` (replacement for
`Value`) return std::optional<> instead of -1 for unset bits. Terminate
with a fatal error when a decoding conflict is encountered.
Add option `use-fn-table-in-decode-to-mcinst` to use a table of function
pointers instead of a switch case in the generated `decodeToMCInst`
function.
When the number of switch cases in this function is large, the generated
code takes a long time to compile in release builds. Using a table of
function pointers instead improves the compile time significantly (~3x
speedup in compiling the code in a downstream target). This option will
allow targets to opt into this mode if they desire for better build
times.
Tested with `check-llvm-mc` with the option enabled by default.
Also assign variable names to different elements of `OpMap` for better
readibility, and eliminate `NumberedEncodingsRef` as `std::vector` will
automatically get converted to an `ArrayRef`.
TryDecode/CheckPredicate/SoftFail MCD ops are not used by many targets.
Track the set of opcodes that were emitted and emit code for handling
TryDecode/CheckPredicate/SoftFail ops when decoding only if there were
emitted. This is purely eliminating dead code in the generated
`decodeInstruction` function.
This results in the following reduction in the size of the Disassembler
.so files with a release x86_64 release build on Linux:
```
Target Old Size New Size % reduction
build/lib/libLLVMAArch64Disassembler.so.21.0git 256656 256656 0.00
build/lib/libLLVMAMDGPUDisassembler.so.21.0git 813000 808168 0.59
build/lib/libLLVMARCDisassembler.so.21.0git 44816 43536 2.86
build/lib/libLLVMARMDisassembler.so.21.0git 281744 278808 1.04
build/lib/libLLVMAVRDisassembler.so.21.0git 36040 34496 4.28
build/lib/libLLVMBPFDisassembler.so.21.0git 26248 23168 11.73
build/lib/libLLVMCSKYDisassembler.so.21.0git 55960 53632 4.16
build/lib/libLLVMHexagonDisassembler.so.21.0git 115952 113416 2.19
build/lib/libLLVMLanaiDisassembler.so.21.0git 24360 21008 13.76
build/lib/libLLVMLoongArchDisassembler.so.21.0git 58584 56168 4.12
build/lib/libLLVMM68kDisassembler.so.21.0git 57264 53880 5.91
build/lib/libLLVMMSP430Disassembler.so.21.0git 28896 28440 1.58
build/lib/libLLVMMipsDisassembler.so.21.0git 123128 120568 2.08
build/lib/libLLVMPowerPCDisassembler.so.21.0git 80656 78096 3.17
build/lib/libLLVMRISCVDisassembler.so.21.0git 154080 150200 2.52
build/lib/libLLVMSparcDisassembler.so.21.0git 42040 39568 5.88
build/lib/libLLVMSystemZDisassembler.so.21.0git 97056 94552 2.58
build/lib/libLLVMVEDisassembler.so.21.0git 83944 81352 3.09
build/lib/libLLVMWebAssemblyDisassembler.so.21.0git 25280 25280 0.00
build/lib/libLLVMX86Disassembler.so.21.0git 2920624 2920624 0.00
build/lib/libLLVMXCoreDisassembler.so.21.0git 48320 44288 8.34
build/lib/libLLVMXtensaDisassembler.so.21.0git 42248 35840 15.17
```
Print DecodeIdx associated with Decode MCD ops in the generated decoder
tables. This can help in debugging decode failures by first mapping the
Op -> DecodeIdx and then inspecting the code in `decodeToMCInst`
associated with that DecodeIdx.
Introduce `OrFail` variants for all MCD Decoder Ops that have
`NumToSKip` encoded with them. This is intended to capture the common
case of jumps to the end of the decoder table which has a `OP_Fail` at
the end. Using the `OrFail` variants of these ops avoid encoding the
`NumToSkip` jump offset for these cases, resulting in a reduction in the
size of the decoder tables (from 5 - 17%). Additionally, for the AArch64
target, the table size reduces enough to switch to using 2-byte
`NumToSkip` encoding instead of existing 3-bytes, resulting in a net 30%
reduction in the size of the decoder table.
The total reduction in the size of the decoder tables for different
targets is as follows (computed using the following command: `for i in
*.inc; do echo -n ``basename $i: ``; grep "MCD::OPC_Fail," $i | awk
'{sum += $2} END { print sum}'; done`)
```
Target Old Size New Size % Reduction
================================================
AArch64 153268 106987 30.20
AMDGPU 412056 340856 17.28
ARC 5061 4605 9.01
ARM 73831 60847 17.59
AVR 1306 1158 11.33
BPF 1927 1795 6.85
CSKY 8692 6922 20.36
Hexagon 41965 34759 17.17
Lanai 982 924 5.91
LoongArch 21629 20035 7.37
M68k 13461 11689 13.16
MSP430 3716 3384 8.93
Mips 31415 25771 17.97
PPC 28931 24771 14.38
RISCV 34800 28352 18.53
Sparc 7432 6236 16.09
SystemZ 32248 29716 7.85
VE 42873 36923 13.88
XCore 2316 2196 5.18
Xtensa 3443 2793 18.88
```