llvm-project

Author	SHA1	Message	Date
Sergei Barannikov	60bdf09654	[TableGen][DecoderEmitter] Rework table construction/emission (#155889) ### Current state We have FilterChooser class, which can be thought of as a tree of encodings. Tree nodes are instances of FilterChooser itself, and come in two types: * A node containing single encoding that has constant bits in the specified bit range, a.k.a. singleton node. * A node containing only child nodes, where each child represents a set of encodings that have the same constant bits in the specified bit range. Either of these nodes can have an additional child, which represents a set of encodings that have some unknown bits in the same bit range. As can be seen, the data structure is very high level. The encoding tree represented by FilterChooser is then converted into a finite-state machine (FSM), represented as byte array. The translation is straightforward: for each node of the tree we emit a sequence of opcodes that check encoding bits and predicates for each encoding. For a singleton node we also emit a terminal "decode" opcode. The translation is done in one go, and this has negative consequences: * We miss optimization opportunities. * We have to use "fixups" when encoding transitions in the FSM since we don't know the size of the data we want to jump over in advance. We have to emit the data first and then fix up the location of the jump. This means the fixup size has to be large enough to encode the longest jump, so most of the transitions are encoded inefficiently. * Finally, when converting the FSM into human readable form, we have to decode the byte array we've just emitted. This is also done in one go, so we can't do any pretty printing. ### This PR We introduce an intermediary data structure, decoder tree, that can be thought of as an AST of the decoder program. This data structure is low level and as such allows for optimization and analysis. It resolves all the issues listed above. We now can: * Emit more optimal opcode sequences. * Compute the size of the data to be emitted in advance, avoiding fixups. * Do pretty printing. Serialization is done by a new class, DecoderTableEmitter, which converts the AST into an FSM in textual form, streamed right into the output file. ### Results * The new approach immediately resulted in 12% total table size savings across all in-tree targets, without implementing any optimizations on the AST. Many tables observe ~20% size reduction. * The generated file is much more readable. * The implementation is arguably simpler and more straightforward (the diff is only +150~200 lines, which feels rather small for the benefits the change gives).	2025-09-20 01:58:53 +00:00
Sergei Barannikov	7f4c297e94	[TableGen][CodeGen] Remove feature string from HwMode (#157600) `Predicates` and `Features` fields serve the same purpose. They should be kept in sync, but not all predicates are based on features. This resulted in introducing dummy features for that reason only. This patch removes `Features` field and changes TableGen emitters to use `Predicates` instead. Historically, predicates were written with the assumption that the checking code will be used in `SelectionDAGISel` subclasses, meaning they will have access to the subclass variables, such as `Subtarget`. There are no such variables in the generated `GenSubtargetInfo::getHwModeSet()`, so we need to provide them. This can be achieved by subclassing `HwModePredicateProlog`, see an example in `Hexagon.td`.	2025-09-10 12:39:47 +03:00
Sergei Barannikov	2a586a8118	[TableGen][DecoderEmitter] Remove dead OPC_Fail (#155229) It can never be reached. It could be reached if we emitted an opcode that could fall outside the outermost scope, but emission of all such opcodes is guarded by `!isOutermostScope()`. That also means we never add fixups to the outermost scope, so avoid pushing an entry for it onto the stack.	2025-08-25 19:15:35 +03:00
Sergei Barannikov	6ae0d9591e	[TableGen][DecoderEmitter] Print the size of the decoder tables (#155139) So we can see the changes in table sizes after making changes to DecoderEmitter by simply running `grep DecoderTable`. Also, remove an unnecessary terminating 0 from the end of the tables.	2025-08-24 09:09:31 +03:00
Jason Eckhardt	2ed0aacf97	[TableGen] Fixes for per-HwMode decoding problem (#82201) Today, if any instruction uses EncodingInfos/EncodingByHwMode to override the default encoding, the opcode field of the decoder table is generated incorrectly. This causes failed disassemblies and other problems. Specifically, the main correctness issue is that the EncodingID is inadvertently stored in the table rather than the actual opcode. This is caused by having set up the IndexOfInstruction map incorrectly during the loop to populate NumberedEncodings, which is then propagated around when OpcMap is set up with a bad EncodingIDAndOpcode. Instead, do away with IndexOfInstruction altogether and use opcode value queried from CodeGenTarget::getInstrIntValue to set up OpcMap. This itself exposed another problem where emitTable was using the decoded opcode to index into NumberedEncodings. Instead pass in the EncodingIDAndOpcode vector, and create the reverse mapping from Opcode to EncodingID, which is then used to index NumberedEncodings. This problem is not currently exposed upstream since no in-tree targets yet use the per-HwMode feature. It does show up in at least two downstream targets.	2024-02-19 13:14:22 +08:00
Craig Topper	81a150656b	[TableGen][RISCV][Hexagon][LoongArch] Add a list of Predicates to HwMode. Use the predicate condition instead of checkFeatures in *GenDAGISel.inc. This makes the code similar to isel pattern predicates. checkFeatures is still used by code created by SubtargetEmitter so we can't remove the string. Backends need to be careful to keep the string and predicates in sync, but I don't think that's a big issue. I haven't measured it, but this should be a compile time improvement for isel since we don't have to do any of the string processing that's inside checkFeatures. Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D146012	2023-03-14 13:00:38 -07:00
Paul C. Anagnostopoulos	93b4f85382	Update TableGen test files to use the new '...' range punctuation.	2020-09-12 16:26:32 -04:00
James Molloy	9948fe6997	[TableGen] Fix crash when using HwModes in CodeEmitterGen When an instruction has an encoding definition for only a subset of the available HwModes, ensure we just avoid generating an encoding rather than crash. llvm-svn: 374150	2019-10-09 09:15:34 +00:00
James Molloy	88a5fbfcea	[TableGen] Support encoding per-HwMode Much like ValueTypeByHwMode/RegInfoByHwMode, this patch allows targets to modify an instruction's encoding based on HwMode. When the EncodingInfos field is non-empty the Inst and Size fields of the Instruction are ignored and taken from EncodingInfos instead. As part of this promote getHwMode() from TargetSubtargetInfo to MCSubtargetInfo. This is NFC for all existing targets - new code is generated only if targets use EncodingByHwMode. llvm-svn: 372320	2019-09-19 13:39:54 +00:00

9 Commits