llvm-project

Author	SHA1	Message	Date
Rahul Joshi	4eeeb8a01e	[NFC][MC][Decoder] Fix off-by-one indentation in generated code (#154855 )	2025-08-21 17:20:05 -07:00
Sergei Barannikov	c74afaac6c	[TableGen][DecoderEmitter] Use KnownBits for filters/encodings (NFCI) (#154691 ) `KnownBits` is faster and smaller than `std::vector<BitValue>`. It is also more convenient to use.	2025-08-22 01:37:47 +03:00
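`KnownBits` keeps the known-zero and known-one bits as two masks rather than one element per bit. A rough sketch of that idea, assuming encodings of at most 64 bits; `KnownBitsSketch`, `parsePattern` and `matches` are hypothetical names used only for illustration and are not the LLVM API:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for a known-bits pair (64 bits max).
struct KnownBitsSketch {
  uint64_t Zero = 0; // bits known to be 0
  uint64_t One = 0;  // bits known to be 1

  bool isKnown(unsigned Bit) const { return ((Zero | One) >> Bit) & 1; }
};

// Parses an MSB-first pattern such as "0100____"; any character other
// than '0' or '1' is treated as an unknown bit.
KnownBitsSketch parsePattern(const std::string &Pattern) {
  assert(Pattern.size() <= 64 && "sketch only handles up to 64 bits");
  KnownBitsSketch KB;
  const unsigned Width = static_cast<unsigned>(Pattern.size());
  for (unsigned I = 0; I < Width; ++I) {
    const unsigned Bit = Width - 1 - I;
    if (Pattern[I] == '0')
      KB.Zero |= uint64_t(1) << Bit;
    else if (Pattern[I] == '1')
      KB.One |= uint64_t(1) << Bit;
  }
  return KB;
}

// An instruction word matches when it agrees with every known bit.
bool matches(const KnownBitsSketch &KB, uint64_t Insn) {
  return (Insn & KB.Zero) == 0 && (Insn & KB.One) == KB.One;
}
```

With two masks, testing a word against an encoding takes two bitwise operations instead of a per-bit loop.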
Sergei Barannikov	33f6b10c17	[TableGen][DecoderEmitter] Resolve a FIXME in emitDecoder (#154649 ) As the FIXME says, we might generate the wrong code to decode an instruction if it had an operand with no encoding bits. An example is M68k's `MOV16ds` that is defined as follows:
```
dag OutOperandList = (outs MxDRD16:$dst);
dag InOperandList = (ins SRC:$src);
list<Register> Uses = [SR];
string AsmString = "move.w\t$src, $dst"
dag Inst = (descend { 0, 1, 0, 0, 0, 0, 0, 0, 1, 1 }, (descend { 0, 0, 0 }, (operand "$dst", 3)));
```
The `$src` operand is not encoded, but what we see in the decoder is:
```C++
tmp = fieldFromInstruction(insn, 0, 3);
if (!Check(S, DecodeDR16RegisterClass(MI, tmp, Address, Decoder))) { return MCDisassembler::Fail; }
if (!Check(S, DecodeSRCRegisterClass(MI, insn, Address, Decoder))) { return MCDisassembler::Fail; }
return S;
```
This calls DecodeSRCRegisterClass passing it `insn` instead of the value of a field that doesn't exist. DecodeSRCRegisterClass has an unconditional llvm_unreachable inside it. New decoder looks like:
```C++
tmp = fieldFromInstruction(insn, 0, 3);
if (!Check(S, DecodeDR16RegisterClass(MI, tmp, Address, Decoder))) { return MCDisassembler::Fail; }
return S;
```
We're still not disassembling this instruction right, but at least we no longer have to provide a weird operand decoder method that accepts instruction bits instead of operand bits. See #154477 for the origins of the FIXME.
	2025-08-21 22:22:16 +00:00
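The `fieldFromInstruction(insn, 0, 3)` call in the message above reads a 3-bit field starting at bit 0. A minimal sketch of that kind of extraction, assuming a 64-bit instruction word; `extractField` is a hypothetical name, not the generated helper, and the word `0x40C5` used in the note below is only an illustrative value:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns NumBits bits of Insn starting at StartBit
// (bit 0 is the least significant bit).
uint64_t extractField(uint64_t Insn, unsigned StartBit, unsigned NumBits) {
  assert(NumBits > 0 && StartBit + NumBits <= 64 && "field out of range");
  // Avoid shifting by the full width, which is undefined behaviour.
  const uint64_t Mask =
      NumBits == 64 ? ~uint64_t(0) : (uint64_t(1) << NumBits) - 1;
  return (Insn >> StartBit) & Mask;
}
```

For example, `extractField(0x40C5, 0, 3)` yields 5. An operand with no encoding bits has no field to extract, which is why the new decoder emits no call for it.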
Rahul Joshi	22f8693248	[NFC][MC][Decoder] Extract fixed pieces of decoder code into new header file (#154802 ) Extract fixed functions generated by decoder emitter into a new MCDecoder.h header.	2025-08-21 15:06:43 -07:00
Sergei Barannikov	2421929ca6	[TableGen][DecoderEmitter] Infer encoding's HasCompleteDecoder earlier (NFCI) (#154644 ) If an encoding has a custom decoder, the decoder is assumed to be "complete" (always succeed) if hasCompleteDecoder field is true. We determine this when constructing InstructionEncoding. If the decoder for an encoding is generated, it always succeeds if none of the operand decoders can fail. The latter is determined based on the value of operands' DecoderMethod/hasCompleteDecoder. This happens late, at table construction time, making the code harder to follow. This change moves this logic to the InstructionEncoding constructor.	2025-08-21 21:35:30 +00:00
Sergei Barannikov	b96d5c2452	[TableGen][DecoderEmitter] Outline InstructionEncoding constructor (NFC) (#154673 ) It is going to grow, so it makes sense to move its definition out of class. Instead, inline `populateInstruction()` into it. Also, rename a couple of methods to better convey their meaning.	2025-08-21 06:08:57 +00:00
Sergei Barannikov	46343ca374	[TableGen][DecoderEmitter] Add DecoderMethod to InstructionEncoding (NFC) (#154477 ) We used to abuse Operands list to store instruction encoding's DecoderMethod there. Let's store it in the InstructionEncoding class instead, where it belongs.	2025-08-20 21:59:59 +00:00
Sergei Barannikov	19ac1ff56e	[TableGen][DecoderEmitter] Factor populateFixedLenEncoding (NFC) (#154511 ) Also drop the debug code under `#if 0` and a seemingly outdated comment.	2025-08-20 11:34:59 +00:00
Sergei Barannikov	9ae0bd2c9f	[TableGen][DecoderEmitter] Move Operands to InstructionEncoding (NFCI) (#154456 ) This is where they belong, no need to maintain a separate map keyed by encoding ID. `populateInstruction()` has been made a member of `InstructionEncoding` and is now called from the constructor.	2025-08-20 07:10:34 +03:00
Sergei Barannikov	8666ffdd15	[TableGen][DecoderEmitter] Rename some variables (NFC) And change references to pointers, to make the future diff smaller.	2025-08-20 04:55:07 +03:00
Sergei Barannikov	6462223853	[TableGen] Make ParseOperandName method const (NFC) Also change its name to start with a lowercase letter and update the doxygen comment to conform to the coding standard.	2025-08-20 03:21:15 +03:00
Sergei Barannikov	803edce6f7	[TableGen][DecoderEmitter] Analyze encodings once (#154309 ) Follow-up to #154288. With HwModes involved, we used to analyze the same encoding multiple times (unless `-suppress-per-hwmode-duplicates=O2` is specified). This affected the build time and made the statistics inaccurate. From the point of view of the generated code, this is an NFC.	2025-08-19 23:17:12 +00:00
Sergei Barannikov	07a6323c32	[TableGen][DecoderEmitter] Turn EncodingAndInst into a class (NFC) (#154230 ) The class will get more methods in follow-up patches.	2025-08-20 01:29:26 +03:00
Sergei Barannikov	56ce40bc73	[TableGen][DecoderEmitter] Stop duplicating encodings (NFC) (#154288 ) When HwModes are involved, we can duplicate an instruction encoding that does not belong to any HwMode multiple times. We can do better by mapping HwMode to a list of encoding IDs it contains. (That is, duplicate IDs instead of encodings.) The encodings that were duplicated are still processed multiple times (e.g., we call an expensive populateInstruction() on each instance). This is going to be fixed in subsequent patches.	2025-08-19 09:02:22 +00:00
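The "duplicate IDs instead of encodings" idea can be sketched as follows. This is an illustration under assumed names (`EncodingSketch`, `EncodingTableSketch`, `buildExample` are all hypothetical), not the DecoderEmitter data structures:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical encoding record; the real one carries far more state.
struct EncodingSketch {
  std::string Name;
};

struct EncodingTableSketch {
  std::vector<EncodingSketch> Encodings;               // each stored once
  std::map<unsigned, std::vector<unsigned>> IDsByMode; // mode -> encoding IDs

  unsigned addEncoding(std::string Name) {
    Encodings.push_back({std::move(Name)});
    return static_cast<unsigned>(Encodings.size() - 1);
  }

  void addToMode(unsigned Mode, unsigned ID) {
    assert(ID < Encodings.size() && "unknown encoding ID");
    IDsByMode[Mode].push_back(ID);
  }
};

// "ADD" belongs to no particular mode, so its ID is listed under both
// modes; "MUL" exists only in mode 1.
EncodingTableSketch buildExample() {
  EncodingTableSketch T;
  const unsigned Add = T.addEncoding("ADD");
  const unsigned Mul = T.addEncoding("MUL");
  T.addToMode(0, Add);
  T.addToMode(1, Add);
  T.addToMode(1, Mul);
  return T;
}
```

Only the small integer ID is repeated per mode; the encoding object itself exists once.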
Sergei Barannikov	cded128009	[TableGen][DecoderEmitter] Extract encoding parsing into a method (NFC) (#154271 ) Call it from the constructor so that we can make `run` method `const`. Turn a couple of related functions into methods as well.	2025-08-19 06:35:59 +00:00
Sergei Barannikov	6c3a0ab51a	[TableGen][DecoderEmitter] Shorten a few variable names (NFC) These "Numbered"-prefixed names were more confusing than helpful.	2025-08-19 08:05:02 +03:00
Sergei Barannikov	f84ce1e1d0	[TableGen][DecoderEmitter] Extract a couple of loop invariants (NFC)	2025-08-19 07:47:15 +03:00
Sergei Barannikov	c8c2218c00	[TableGen][DecoderEmitter] Synthesize decoder table name in emitTable (#154255 ) Previously, HW mode name was appended to decoder namespace name when enumerating encodings, and then emitTable appended the bit width to it to form the final table name. Let's do this all in one place. A nice side effect is that this allows us to avoid having to deal with std::string. The changes in the tests are caused by the different order of tables.	2025-08-19 06:19:54 +03:00
Sergei Barannikov	61a859bf6f	Use llvm::copy instead of append_range to work around MacOS build failure	2025-08-19 01:43:22 +03:00
Sergei Barannikov	0cd4ae9be0	Reland "[TableGen][DecoderEmitter] Store HW mode ID instead of name (NFC) (#154052 )" (#154212 ) This reverts commit 5612dc533a9222a0f5561b2ba7c897115f26673f. Reland with MacOS build fixed.	2025-08-18 22:28:20 +00:00
Shubham Sandeep Rastogi	5612dc533a	Revert "[TableGen][DecoderEmitter] Store HW mode ID instead of name (NFC) (#154052 )" This reverts commit b20bbd48e8b1966731a284b4208e048e060e97c2. Reverted due to greendragon failures:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/DecoderEmitter.cpp:14:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/Common/CodeGenHwModes.h:14:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/DenseMap.h:20:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/STLExtras.h:21:
20:34:43 In file included from /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/Hashing.h:53:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/algorithm:1913:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/chrono:746:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__chrono/convert_to_tm.h:19:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__chrono/statically_widen.h:17:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__format/concepts.h:17:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__format/format_parse_context.h:15:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/string_view:1027:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/functional:515:
20:34:43 In file included from /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/__functional/boyer_moore_searcher.h:26:
20:34:43 /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/usr/include/c++/v1/vector:1376:19: error: object of type 'llvm::const_set_bits_iterator_impl<llvm::SmallBitVector>' cannot be assigned because its copy assignment operator is implicitly deleted
20:34:43 __mid = __first;
20:34:43 ^
20:34:43 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/utils/TableGen/DecoderEmitter.cpp:2404:13: note: in instantiation of function template specialization 'std::vector<unsigned int>::assign<llvm::const_set_bits_iterator_impl<llvm::SmallBitVector>, 0>' requested here
20:34:43 HwModeIDs.assign(BV.set_bits_begin(), BV.set_bits_end());
20:34:43 ^
20:34:43 /Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/BitVector.h:35:21: note: copy assignment operator of 'const_set_bits_iterator_impl<llvm::SmallBitVector>' is implicitly deleted because field 'Parent' is of reference type 'const llvm::SmallBitVector &'
20:34:43 const BitVectorT &Parent;
20:34:43 ^
20:34:43 1 warning and 1 error generated.
	2025-08-18 14:36:54 -07:00
Sergei Barannikov	13dd65096b	[TableGen][DecoderEmitter] Rename some variables for clarity (NFC)	2025-08-19 00:16:56 +03:00
Sergei Barannikov	b20bbd48e8	[TableGen][DecoderEmitter] Store HW mode ID instead of name (NFC) (#154052 ) This simplifies code a bit.	2025-08-18 22:53:09 +03:00
Sergei Barannikov	bad02e38c8	[TableGen][DecoderEmitter] Avoid using a sentinel value (#153986 ) `NO_FIXED_SEGMENTS_SENTINEL` has a value that is actually a valid field encoding and so it cannot be used as a sentinel. Replace the sentinel with a new member variable, `VariableFC`, that contains the value previously stored in `FilterChooserMap` with `NO_FIXED_SEGMENTS_SENTINEL` key.	2025-08-18 08:25:17 +03:00
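The problem with a sentinel key is that a real field value can collide with it. A minimal sketch of the replacement pattern, keeping the "variable" entry in its own member; `FilterSketch`, its members, the fallback in `lookup` and the all-ones key are assumptions made for illustration, not the actual `FilterChooserMap` / `VariableFC` code:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical filter: strings stand in for the child filter choosers.
struct FilterSketch {
  std::map<uint64_t, std::string> ByFieldValue; // every key is a real value
  std::optional<std::string> Variable;          // was stored under a sentinel

  // Assumed lookup policy for this sketch: exact field value first,
  // then the variable entry if there is one.
  const std::string *lookup(uint64_t FieldValue) const {
    auto It = ByFieldValue.find(FieldValue);
    if (It != ByFieldValue.end())
      return &It->second;
    return Variable ? &*Variable : nullptr;
  }
};

FilterSketch buildFilter() {
  FilterSketch F;
  F.ByFieldValue[0] = "zero-filter";
  // The all-ones value is an ordinary key here; nothing is reserved.
  F.ByFieldValue[~uint64_t(0)] = "all-ones-filter";
  F.Variable = "variable-filter";
  assert(F.lookup(0) != nullptr);
  return F;
}
```

Because the variable entry no longer shares the key space, no field value has to be reserved.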
Sergei Barannikov	9ddc043538	[TableGen] Use structured binding in one more place (NFC)	2025-08-18 06:15:44 +03:00
Sergei Barannikov	6947fb4556	[TableGen] Use structured binding in one place (NFC)	2025-08-17 23:50:23 +03:00
Sergei Barannikov	a10773c864	[TableGen][DecoderEmitter] Remove EncodingIDAndOpcode struct (NFC) (#154028 ) Most of the time we don't need instruction opcode. There is no need to carry it around all the time, we can easily get it by other means. Rename affected variables accordingly. Part of an effort to simplify DecoderEmitter code.	2025-08-17 20:13:48 +00:00
Sergei Barannikov	ea4325f174	[TableGen][DecoderEmitter] Improve conflicts dump (#154001 )
* Print filter stack in non-reversed order.
* Print encoding name to the right of encoding bits to deal with alignment issues.
* Use the correct bit width when printing encoding bits.
Example of old output:
```
01000100........
01000...........
0100............
................
tADDhirr 000000000000000001000100________
tADDrSP 000000000000000001000100_1101___
tADDspr 0000000000000000010001001____101
```
New output:
```
................
0100............
01000...........
01000100........
01000100________ tADDhirr
01000100_1101___ tADDrSP
010001001____101 tADDspr
```
	2025-08-17 06:42:25 +00:00
Sergei Barannikov	05f1673e75	[TableGen] Make a function static (NFC) Also, modernize the return value to std::optional.	2025-08-17 09:31:28 +03:00
Sergei Barannikov	05827e7ccb	[TableGen][DecoderEmitter] Dump conflicts earlier Dump a conflict as soon as we discover it, no need to wait until we start building the decoder table. This improves debugging experience.	2025-08-17 08:20:31 +03:00
Sergei Barannikov	fc6024d895	[TableGen][DecoderEmitter] Shrink lifetime of `Filters` vector (NFC) (#153998 ) Only one element of the `Filters` vector (see `BestIndex`) is used outside the method that fills it. Localize the vector to the method, replacing the member variable with the only used element. Part of an effort to simplify DecoderEmitter code.	2025-08-17 04:02:16 +00:00
Sergei Barannikov	7bb73455f7	[TableGen][DecoderEmitter] Add helpers for working with scopes (NFC) (#153979 ) Part of an effort to simplify DecoderEmitter code.	2025-08-16 21:49:17 +00:00
Sergei Barannikov	3acb679bda	[TableGen] Remove redundant variable (NFC)	2025-08-16 23:11:53 +03:00
Sergei Barannikov	56681c94f3	[TableGen][DecoderEmitter] Compute bit attribute once (NFC) (#153530 ) Pull the logic to compute bit attributes from `filterProcessor()` to its caller to avoid recomputing them on the second call.	2025-08-15 13:28:38 +03:00
Sergei Barannikov	a73403ba8a	[TableGen] Use `empty()` instead of `size() == 0` (NFC)	2025-08-14 06:36:24 +03:00
Sergei Barannikov	6abb6264ea	[TableGen] Declare loop induction variables in the loop header (NFC)	2025-08-14 05:48:16 +03:00
Sergei Barannikov	8f3254aa4a	[TableGen][DecoderEmitter] Returns insn_t / std::vector<Islands> by value (NFC) (#153354 ) The containers passed by reference are always empty on entry to the functions that fill them. Return them by value instead and let the compiler do the return value optimization.	2025-08-13 07:09:13 +00:00
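The out-parameter to return-by-value change follows a common C++ pattern. A small sketch, assuming a hypothetical `collectSetBits` function rather than the actual DecoderEmitter code:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Before (out-parameter style, shown for comparison only):
//   void collectSetBits(uint32_t Word, unsigned Width,
//                       std::vector<unsigned> &Out);
//
// After: the container is created locally and returned by value.
std::vector<unsigned> collectSetBits(uint32_t Word, unsigned Width) {
  assert(Width <= 32 && "sketch only handles 32-bit words");
  std::vector<unsigned> Positions;
  for (unsigned Bit = 0; Bit < Width; ++Bit)
    if ((Word >> Bit) & 1)
      Positions.push_back(Bit);
  // Returning a local by name lets the compiler elide the copy (or move).
  return Positions;
}
```

Callers can then write `auto Bits = collectSetBits(Word, 16);` without declaring an empty container first.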
Sergei Barannikov	1ffc38ca49	[TableGen][DecoderEmitter] Remove unused variables (NFC) (#153262 )	2025-08-12 20:21:01 +00:00
Sergei Barannikov	2f9f92ad01	[TableGen] Use getValueAsOptionalDef to simplify code (NFC) (#153170 )	2025-08-12 17:44:01 +03:00
Rahul Joshi	633728f3b5	[NFC][TableGen][DecoderEmitter] Eliminate `indent` for a few functions (#148718 ) Eliminate the `indent` argument for functions which are always called with `indent(0)`.	2025-07-14 15:23:41 -07:00
Rahul Joshi	23b4f4eb9b	[NFC][TableGen] Change DecoderEmitter `insertBits` to use integer types only (#147613 ) The `insertBits` templated function generated by DecoderEmitter is called with variable `tmp` of type `TmpType` which is:
```
using TmpType = std::conditional_t<std::is_integral<InsnType>::value, InsnType, uint64_t>;
```
That is, `TmpType` is always an integral type. Change the generated `insertBits` to be valid only for integer types, and eliminate the unused `insertBits` function from `DecoderUInt128` in AMDGPUDisassembler.h. Additionally, drop some of the requirements `InsnType` must support as they no longer seem to be required.
	2025-07-09 08:56:07 -07:00
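A sketch of how such a constraint might look, assuming C++14 or later. `insertBitsSketch`, `Wide128` and `insertExample` are hypothetical names, and the body is an illustration rather than the generated code; only the `TmpType` alias is taken from the message above:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

template <typename InsnType>
using TmpType =
    std::conditional_t<std::is_integral<InsnType>::value, InsnType, uint64_t>;

// Hypothetical non-integral instruction type, e.g. a 128-bit wrapper.
struct Wide128 {
  uint64_t Lo, Hi;
};

// Integral instruction types are used as-is; anything else maps to uint64_t.
static_assert(std::is_same<TmpType<uint32_t>, uint32_t>::value, "");
static_assert(std::is_same<TmpType<Wide128>, uint64_t>::value, "");

// Only participates in overload resolution for integer types.
template <typename IntType>
std::enable_if_t<std::is_integral<IntType>::value>
insertBitsSketch(IntType &Field, IntType Bits, unsigned StartBit,
                 unsigned NumBits) {
  constexpr unsigned Width = sizeof(IntType) * 8;
  assert(NumBits > 0 && StartBit + NumBits <= Width && "field out of range");
  const IntType Mask =
      NumBits == Width ? ~IntType(0) : (IntType(1) << NumBits) - 1;
  Field |= (Bits & Mask) << StartBit;
}

uint32_t insertExample() {
  uint32_t Field = 0;
  insertBitsSketch<uint32_t>(Field, 0x5, 0, 3); // bits [2:0] = 101
  insertBitsSketch<uint32_t>(Field, 0x3, 4, 2); // bits [5:4] = 11
  return Field;
}
```

Since `TmpType` is always integral, a wrapper type such as `Wide128` never needs to provide its own bit-insertion member.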
Rahul Joshi	5f2e88a125	[NFC][TableGen] Rename `CodeGenTarget` instruction accessors (#146767 ) Rename `getXYZInstructionsByEnumValue()` to just `getXYZInstructions` and drop the `ByEnumValue` in the name.	2025-07-07 08:01:14 -07:00
Rahul Joshi	d7b8b65e23	[LLVM][TableGen][DecoderEmitter] Add wrapper struct for `bit_value_t` (#146248 ) Add a convenience wrapper struct for the `bit_value_t` enum type to host various constructors, query, and printing support. Also refactor related code in several places. In `getBitsField`, use `llvm::append_range` and `SmallVector::append()` and eliminate manual loops. Eliminate `emitNameWithID` and instead use the `operator <<` that does the same thing as this function. Have `BitValue::getValue()` (replacement for `Value`) return std::optional<> instead of -1 for unset bits. Terminate with a fatal error when a decoding conflict is encountered.	2025-07-01 07:36:17 -07:00
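Returning `std::optional` instead of -1 for unset bits can be sketched like this. The types and names below (`BitKind`, `BitValueSketch`, `requireValue`) are hypothetical and only illustrate the pattern, not the actual `BitValue` wrapper:

```cpp
#include <cassert>
#include <optional>

// Hypothetical three-state bit.
enum class BitKind { Zero, One, Unset };

struct BitValueSketch {
  BitKind Kind = BitKind::Unset;

  // Unset bits yield std::nullopt, so no magic -1 can leak into arithmetic.
  std::optional<unsigned> getValue() const {
    switch (Kind) {
    case BitKind::Zero:
      return 0u;
    case BitKind::One:
      return 1u;
    case BitKind::Unset:
      break;
    }
    return std::nullopt;
  }
};

// Callers that need a concrete bit must check for presence explicitly.
unsigned requireValue(const BitValueSketch &B) {
  const std::optional<unsigned> V = B.getValue();
  assert(V.has_value() && "bit is unset");
  return *V;
}
```

The type now forces callers to handle the unset case instead of comparing against a reserved integer.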
Rahul Joshi	92b50959da	[NFC][TableGen] Capitalize `to` in `UseFnTableInDecodetoMCInst`. (#146419 )	2025-06-30 16:12:15 -07:00
Rahul Joshi	ed5f8f238d	[LLVM][DecoderEmitter] Add option to use function table in decodeToMCInst (#144814 ) Add option `use-fn-table-in-decode-to-mcinst` to use a table of function pointers instead of a switch case in the generated `decodeToMCInst` function. When the number of switch cases in this function is large, the generated code takes a long time to compile in release builds. Using a table of function pointers instead improves the compile time significantly (~3x speedup in compiling the code in a downstream target). This option allows targets to opt into this mode if they want better build times. Tested with `check-llvm-mc` with the option enabled by default.	2025-06-24 18:49:05 -07:00
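The two dispatch styles can be compared in a small sketch. Everything here (`Status`, `decode0`, `decode1`, `decodeWithSwitch`, `decodeWithTable`, `dispatchersAgree`) is hypothetical and far simpler than the generated `decodeToMCInst`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum Status { Fail, Success };
using DecodeFn = Status (*)(uint32_t Insn, uint32_t &Out);

// Two toy per-index decoders standing in for the generated case bodies.
static Status decode0(uint32_t Insn, uint32_t &Out) {
  Out = Insn & 0x7;
  return Success;
}
static Status decode1(uint32_t Insn, uint32_t &Out) {
  Out = (Insn >> 3) & 0x1F;
  return Success;
}

// Switch-based dispatch: one case per decoder index.
Status decodeWithSwitch(unsigned Idx, uint32_t Insn, uint32_t &Out) {
  switch (Idx) {
  case 0:
    return decode0(Insn, Out);
  case 1:
    return decode1(Insn, Out);
  default:
    return Fail;
  }
}

// Table-based dispatch: the index selects a function pointer.
Status decodeWithTable(unsigned Idx, uint32_t Insn, uint32_t &Out) {
  static constexpr DecodeFn Table[] = {decode0, decode1};
  constexpr std::size_t N = sizeof(Table) / sizeof(Table[0]);
  if (Idx >= N)
    return Fail;
  assert(Table[Idx] != nullptr && "hole in decoder table");
  return Table[Idx](Insn, Out);
}

// Both dispatchers should produce the same status and value.
bool dispatchersAgree(unsigned Idx, uint32_t Insn) {
  uint32_t A = 0, B = 0;
  const Status SA = decodeWithSwitch(Idx, Insn, A);
  const Status SB = decodeWithTable(Idx, Insn, B);
  return SA == SB && A == B;
}
```

With the table, each case body becomes its own small function, so the compiler no longer has to optimize one very large function.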
Rahul Joshi	376b71442d	[NFC][TableGen][DecoderEmitter] Use structured binding in range for loop (#144890 ) Also assign variable names to different elements of `OpMap` for better readability, and eliminate `NumberedEncodingsRef` as `std::vector` will automatically get converted to an `ArrayRef`.	2025-06-20 06:41:48 -07:00
Rahul Joshi	816ab1af0d	[NFCI][TableGen][DecoderEmitter] Cull Op handling when possible (#142974 ) TryDecode/CheckPredicate/SoftFail MCD ops are not used by many targets. Track the set of opcodes that were emitted and emit code for handling TryDecode/CheckPredicate/SoftFail ops when decoding only if they were emitted. This is purely eliminating dead code in the generated `decodeInstruction` function. This results in the following reduction in the size of the Disassembler .so files with an x86_64 release build on Linux:
```
Target                                              Old Size  New Size  % reduction
build/lib/libLLVMAArch64Disassembler.so.21.0git       256656    256656         0.00
build/lib/libLLVMAMDGPUDisassembler.so.21.0git        813000    808168         0.59
build/lib/libLLVMARCDisassembler.so.21.0git            44816     43536         2.86
build/lib/libLLVMARMDisassembler.so.21.0git           281744    278808         1.04
build/lib/libLLVMAVRDisassembler.so.21.0git            36040     34496         4.28
build/lib/libLLVMBPFDisassembler.so.21.0git            26248     23168        11.73
build/lib/libLLVMCSKYDisassembler.so.21.0git           55960     53632         4.16
build/lib/libLLVMHexagonDisassembler.so.21.0git       115952    113416         2.19
build/lib/libLLVMLanaiDisassembler.so.21.0git          24360     21008        13.76
build/lib/libLLVMLoongArchDisassembler.so.21.0git      58584     56168         4.12
build/lib/libLLVMM68kDisassembler.so.21.0git           57264     53880         5.91
build/lib/libLLVMMSP430Disassembler.so.21.0git         28896     28440         1.58
build/lib/libLLVMMipsDisassembler.so.21.0git          123128    120568         2.08
build/lib/libLLVMPowerPCDisassembler.so.21.0git        80656     78096         3.17
build/lib/libLLVMRISCVDisassembler.so.21.0git         154080    150200         2.52
build/lib/libLLVMSparcDisassembler.so.21.0git          42040     39568         5.88
build/lib/libLLVMSystemZDisassembler.so.21.0git        97056     94552         2.58
build/lib/libLLVMVEDisassembler.so.21.0git             83944     81352         3.09
build/lib/libLLVMWebAssemblyDisassembler.so.21.0git    25280     25280         0.00
build/lib/libLLVMX86Disassembler.so.21.0git          2920624   2920624         0.00
build/lib/libLLVMXCoreDisassembler.so.21.0git          48320     44288         8.34
build/lib/libLLVMXtensaDisassembler.so.21.0git         42248     35840        15.17
```
	2025-06-17 06:21:21 -07:00
Jay Foad	39ad3151e0	[TableGen] Use default member initializers. NFC. (#144349 ) Automated with clang-tidy -fix -checks=-*,modernize-use-default-member-init	2025-06-16 15:26:47 +01:00
Rahul Joshi	7005a76638	[NFC][TableGen] Print DecodeIdx for DecodeOps in DecoderEmitter (#142963 ) Print DecodeIdx associated with Decode MCD ops in the generated decoder tables. This can help in debugging decode failures by first mapping the Op -> DecodeIdx and then inspecting the code in `decodeToMCInst` associated with that DecodeIdx.	2025-06-05 21:57:26 -07:00
Rahul Joshi	e53ccb78e4	[LLVM][MC] Introduce `OrFail` variants of MCD ops (#138614 ) Introduce `OrFail` variants for all MCD Decoder Ops that have `NumToSkip` encoded with them. This is intended to capture the common case of jumps to the end of the decoder table, which has an `OPC_Fail` at the end. Using the `OrFail` variants of these ops avoids encoding the `NumToSkip` jump offset for these cases, resulting in a reduction in the size of the decoder tables (from 5 - 17%). Additionally, for the AArch64 target, the table size reduces enough to switch to using a 2-byte `NumToSkip` encoding instead of the existing 3-byte one, resulting in a net 30% reduction in the size of the decoder table. The total reduction in the size of the decoder tables for different targets is as follows (computed using the following command: ``for i in *.inc; do echo -n `basename $i: `; grep "MCD::OPC_Fail," $i | awk '{sum += $2} END { print sum}'; done``)
```
Target     Old Size  New Size  % Reduction
================================================
AArch64      153268    106987        30.20
AMDGPU       412056    340856        17.28
ARC            5061      4605         9.01
ARM           73831     60847        17.59
AVR            1306      1158        11.33
BPF            1927      1795         6.85
CSKY           8692      6922        20.36
Hexagon       41965     34759        17.17
Lanai           982       924         5.91
LoongArch     21629     20035         7.37
M68k          13461     11689        13.16
MSP430         3716      3384         8.93
Mips          31415     25771        17.97
PPC           28931     24771        14.38
RISCV         34800     28352        18.53
Sparc          7432      6236        16.09
SystemZ       32248     29716         7.85
VE            42873     36923        13.88
XCore          2316      2196         5.18
Xtensa         3443      2793        18.88
```
	2025-06-05 06:17:50 -07:00
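The size saving can be illustrated with a toy table interpreter. This is a sketch under invented assumptions: the opcode names, the one-byte operand layout and the helpers `classicTable`, `compactTable` and `runTable` are all hypothetical and do not match the real MCD table encoding:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for a toy decoder table.
enum : uint8_t { SK_CheckField, SK_CheckFieldOrFail, SK_Decode, SK_Fail };

// Returns the decoded opcode, or -1 on failure.
int runTable(const std::vector<uint8_t> &Table, uint32_t Insn) {
  std::size_t Ptr = 0;
  while (Ptr < Table.size()) {
    const uint8_t Opcode = Table[Ptr++];
    switch (Opcode) {
    case SK_CheckField:
    case SK_CheckFieldOrFail: {
      const bool HasSkip = Opcode == SK_CheckField;
      assert(Ptr + (HasSkip ? 4 : 3) <= Table.size() && "truncated table");
      const unsigned Start = Table[Ptr++];
      const unsigned Len = Table[Ptr++];
      const unsigned Expected = Table[Ptr++];
      // Only the non-OrFail variant carries a skip operand.
      const unsigned Skip = HasSkip ? Table[Ptr++] : 0;
      assert(Len > 0 && Len < 32 && "sketch supports 1..31-bit fields");
      const uint32_t Field = (Insn >> Start) & ((1u << Len) - 1);
      if (Field == Expected)
        break;       // match: continue with the next op
      if (!HasSkip)
        return -1;   // OrFail variant: fail without a jump
      Ptr += Skip;   // classic variant: jump forward by Skip bytes
      break;
    }
    case SK_Decode:
      assert(Ptr < Table.size() && "truncated table");
      return Table[Ptr];
    default: // SK_Fail or an unknown opcode
      return -1;
    }
  }
  return -1;
}

// Classic form: on mismatch, skip 2 bytes to land on the trailing SK_Fail.
std::vector<uint8_t> classicTable() {
  return {SK_CheckField, 4, 4, 0xA, 2, SK_Decode, 7, SK_Fail};
}

// OrFail form: same behaviour without the skip byte.
std::vector<uint8_t> compactTable() {
  return {SK_CheckFieldOrFail, 4, 4, 0xA, SK_Decode, 7, SK_Fail};
}
```

Both tables decode the same inputs to the same result, but the `OrFail` form drops the skip operand. In this toy format that saves one byte per check.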


126 Commits