It can never be reached. It could be reached if we emitted an opcode
that could fall outside the outermost scope, but emission of all such
opcodes is guarded by `!isOutermostScope()`.
That also means we never add fixups to the outermost scope, so avoid
pushing an entry for it onto the stack.
There is no target named Thumb, so there is no need to make a special
case for it.
As part of this change, pass CodeGenTarget instead of DecoderEmitter
to FilterChooser to remove dependency between the latter two.
Change `InstructionEncoding` to use `Size` field to derive the BitWidth
for fixed length instructions as opposed to the number of bits in the
`Inst` field. For some backends, `Inst` has more bits than `Size`, but
`Size` is the true size of the instruction.
Also add validation that `Inst` has at least `Size * 8` bits and any
bits in `Inst` beyond that are either 0 or unset.
Verified no change in generated *GenDisassembler.inc files before/after.
* Inline two small functions so that `emitTableEntries()` calls itself
directly rather than through other functions.
* Peel the last iteration of the loop as it is special.
This should make the code easier to follow.
So we can see the changes in table sizes after making changes to
DecoderEmitter by simply running `grep DecoderTable`.
Also, remove an unnecessary terminating 0 from the end of the tables.
There was a lot of confusion about the responsibilities of Filter and
FilterChooser. They created instances of each other and called each
other's methods. Some of the methods had similar names and did similar
things.
This change moves most of the Filter members to FilterChooser and turns
Filter into a supplementary class with short lifetime. FilterChooser
constructs an array of (candidate) Filters, chooses the best performing
one, and applies it to the given set of encodings, creating inferior
FilterChoosers as necessary. The Filter array is then destroyed. All
responsibility for generating the decoder table now lies with
FilterChooser.
Extract `findBestFilter() const` searching for the best filter and
move calls to `recurse()` out of it to a single place.
Extract `dump()` as well, it is useful for debugging.
See the added test. Before this change the decoder would first read
the second byte, despite the fact that there are 1-byte instructions
that could match:
```
/* 0 */ MCD::OPC_ExtractField, 8, 8, // Inst{15-8} ...
/* 3 */ MCD::OPC_FilterValue, 0, 4, 0, // Skip to: 11
/* 7 */ MCD::OPC_Decode, 186, 2, 0, // Opcode: I16_0, DecodeIdx: 0
/* 11 */ MCD::OPC_FilterValue, 1, 4, 0, // Skip to: 19
/* 15 */ MCD::OPC_Decode, 187, 2, 0, // Opcode: I16_1, DecodeIdx: 0
/* 19 */ MCD::OPC_FilterValue, 2, 4, 0, // Skip to: 27
/* 23 */ MCD::OPC_Decode, 188, 2, 0, // Opcode: I16_2, DecodeIdx: 0
/* 27 */ MCD::OPC_ExtractField, 0, 1, // Inst{0} ...
/* 30 */ MCD::OPC_FilterValue, 0, 4, 0, // Skip to: 38
/* 34 */ MCD::OPC_Decode, 189, 2, 1, // Opcode: I8_0, DecodeIdx: 1
/* 38 */ MCD::OPC_FilterValueOrFail, 1,
/* 40 */ MCD::OPC_Decode, 190, 2, 1, // Opcode: I8_1, DecodeIdx: 1
/* 44 */ MCD::OPC_Fail,
```
There are no changes in the generated files. The only in-tree target
that uses variable length decoder is M68k, which is free of decoding
conflicts that could result in the decoder doing OOB access.
This also fixes misaligned "Decoding Conflict" dump,
prettified example output is shown in the second test.
`NumFiltered` is the number of elements in all vectors in a map.
It is ever compared to 1, which is equivalent to checking if the map
contains exactly one vector with exactly one element.
Parse the `Inst` and `SoftField` fields once and store them in
`InstructionEncoding` so that we don't parse them every time
`getMandatoryEncodingBits()` is called.
As the FIXME says, we might generate the wrong code to decode an
instruction if it had an operand with no encoding bits. An example is
M68k's `MOV16ds` that is defined as follows:
```
dag OutOperandList = (outs MxDRD16:$dst);
dag InOperandList = (ins SRC:$src);
list<Register> Uses = [SR];
string AsmString = "move.w\t$src, $dst"
dag Inst = (descend { 0, 1, 0, 0, 0, 0, 0, 0, 1, 1 },
(descend { 0, 0, 0 }, (operand "$dst", 3)));
```
The `$src` operand is not encoded, but what we see in the decoder is:
```C++
tmp = fieldFromInstruction(insn, 0, 3);
if (!Check(S, DecodeDR16RegisterClass(MI, tmp, Address, Decoder)))
{ return MCDisassembler::Fail; }
if (!Check(S, DecodeSRCRegisterClass(MI, insn, Address, Decoder)))
{ return MCDisassembler::Fail; }
return S;
```
This calls DecodeSRCRegisterClass passing it `insn` instead of the value
of a field that doesn't exist. DecodeSRCRegisterClass has an
unconditional llvm_unreachable inside it.
New decoder looks like:
```C++
tmp = fieldFromInstruction(insn, 0, 3);
if (!Check(S, DecodeDR16RegisterClass(MI, tmp, Address, Decoder)))
{ return MCDisassembler::Fail; }
return S;
```
We're still not disassembling this instruction right, but at least we no
longer have to provide a weird operand decoder method that accepts
instruction bits instead of operand bits.
See #154477 for the origins of the FIXME.
If an encoding has a custom decoder, the decoder is assumed to be
"complete" (always succeed) if hasCompleteDecoder field is true. We
determine this when constructing InstructionEncoding.
If the decoder for an encoding is *generated*, it always succeeds if
none of the operand decoders can fail. The latter is determined based on
the value of operands' DecoderMethod/hasCompleteDecoder. This happens
late, at table construction time, making the code harder to follow.
This change moves this logic to the InstructionEncoding constructor.
It is going to grow, so it makes sense to move its definition
out of class. Instead, inline `populateInstruction()` into it.
Also, rename a couple of methods to better convey their meaning.
We used to abuse Operands list to store instruction encoding's
DecoderMethod there. Let's store it in the InstructionEncoding class
instead, where it belongs.
This is where they belong, no need to maintain a separate map keyed by
encoding ID.
`populateInstruction()` has been made a member of `InstructionEncoding`
and is now called from the constructor.
Follow-up to #154288.
With HwModes involved, we used to analyze the same encoding multiple
times (unless `-suppress-per-hwmode-duplicates=O2` is specified). This
affected the build time and made the statistics inaccurate.
From the point of view of the generated code, this is an NFC.
When HwModes are involved, we can duplicate an instruction encoding that
does not belong to any HwMode multiple times. We can do better by
mapping HwMode to a list of encoding IDs it contains. (That is,
duplicate IDs instead of encodings.)
The encodings that were duplicated are still processed multiple times
(e.g., we call an expensive populateInstruction() on each instance).
This is going to be fixed in subsequent patches.
Previously, HW mode name was appended to decoder namespace name when
enumerating encodings, and then emitTable appended the bit width to it
to form the final table name. Let's do this all in one place.
A nice side effect is that this allows us to avoid having to deal with
std::string.
The changes in the tests are caused by the different order of tables.
This reverts commit b20bbd48e8b1966731a284b4208e048e060e97c2.
Reverted due to greendragon failures:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/DecoderEmitter.cpp:14:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/Common/CodeGenHwModes.h:14:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/DenseMap.h:20:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/STLExtras.h:21:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/Hashing.h:53:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/algorithm:1913:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/chrono:746:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__chrono/convert_to_tm.h:19:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__chrono/statically_widen.h:17:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__format/concepts.h:17:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__format/format_parse_context.h:15:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/string_view:1027:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/functional:515:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__functional/boyer_moore_searcher.h:26:
20:34:43 /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/vector:1376:19: error: object of type 'llvm::const_set_bits_iterator_impl<llvm::SmallBitVector>' cannot be assigned because its copy assignment operator is implicitly deleted
20:34:43 __mid = __first;
20:34:43 ^
20:34:43 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/DecoderEmitter.cpp:2404:13: note: in instantiation of function template specialization 'std::vector<unsigned int>::assign<llvm::const_set_bits_iterator_impl<llvm::SmallBitVector>, 0>' requested here
20:34:43 HwModeIDs.assign(BV.set_bits_begin(), BV.set_bits_end());
20:34:43 ^
20:34:43 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/BitVector.h:35:21: note: copy assignment operator of 'const_set_bits_iterator_impl<llvm::SmallBitVector>' is implicitly deleted because field 'Parent' is of reference type 'const llvm::SmallBitVector &'
20:34:43 const BitVectorT &Parent;
20:34:43 ^
20:34:43 1 warning and 1 error generated.
`NO_FIXED_SEGMENTS_SENTINEL` has a value that is actually a valid field
encoding and so it cannot be used as a sentinel.
Replace the sentinel with a new member variable, `VariableFC`, that
contains the value previously stored in `FilterChooserMap` with
`NO_FIXED_SEGMENTS_SENTINEL` key.
Most of the time we don't need instruction opcode. There is no need to
carry it around all the time, we can easily get it by other means.
Rename affected variables accordingly.
Part of an effort to simplify DecoderEmitter code.