A processor definition shall not include a default wave size feature that
would have to be switched off when a different wave size is selected. This
avoids having to write -mattr=-wavefrontsize32,+wavefrontsize64 in tests.
These mostly check for various reserved bits being set. The diagnostics
for GPU-dependent reserved bits carry a bit more context, since they seem the
most likely ones to be observed in practice.
This commit also improves the error handling mechanism for
MCDisassembler::onSymbolStart(). Previously it had a comment stream parameter
that was simply ignored by llvm-objdump; now it returns errors using
Expected<T>.
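As a rough illustration of the new error flow (the names below are invented
stand-ins, not the exact MCDisassembler signatures):

    #include "llvm/Support/Error.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    // Stand-in for the new-style hook: failure is an Error the caller must
    // consume, instead of text dumped to an often-ignored comment stream.
    static Expected<bool> onSymbolStartSketch(bool BadDescriptor) {
      if (BadDescriptor)
        return createStringError(inconvertibleErrorCode(),
                                 "invalid kernel descriptor");
      return true; // the symbol start was handled
    }

    static bool caller() {
      Expected<bool> Handled = onSymbolStartSketch(false);
      if (!Handled) {
        // llvm-objdump can now report this instead of dropping it.
        logAllUnhandledErrors(Handled.takeError(), errs(), "warning: ");
        return false;
      }
      return *Handled;
    }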
Speed up disassembly by only calling tryDecodeInst for DecoderTables
that make sense for the current subtarget.
This gives a 1.3x speed-up on check-llvm-mc-disassembler-amdgpu in my
Release+Asserts build.
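The gist, as a hedged sketch (the real code keys off subtarget features
rather than this invented predicate):

    #include <vector>

    // Invented stand-ins for a decoder table and its applicability test.
    struct TableSketch {
      bool (*AppliesTo)(unsigned Gen);
      unsigned Id;
    };
    static bool tryDecodeInstSketch(unsigned TableId) {
      (void)TableId; // stand-in for the real per-table decode attempt
      return false;
    }

    static bool decodeSketch(const std::vector<TableSketch> &Tables,
                             unsigned Gen) {
      for (const TableSketch &T : Tables) {
        if (!T.AppliesTo(Gen))
          continue; // previously tryDecodeInst ran even on hopeless tables
        if (tryDecodeInstSketch(T.Id))
          return true;
      }
      return false;
    }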
Remove all the code that set and tested Res. Change all convert*
functions to return void, since none of them can fail. getInstruction
now has only one main point of failure, after all calls to tryDecodeInst
have failed.
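Schematically, the resulting shape (names invented):

    enum class StatusSketch { Success, Fail };

    // convert* helpers now return void: none of them can fail, so there
    // is no Res value to set and re-test after each call.
    static void convertInstSketch() {}

    static StatusSketch getInstructionSketch(bool AnyTableMatched) {
      if (AnyTableMatched) {
        convertInstSketch();
        return StatusSketch::Success;
      }
      // The one main point of failure: every tryDecodeInst call failed.
      return StatusSketch::Fail;
    }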
Now that there is no special checking for valid DPP encodings, these
instructions can use the same DecoderNamespace as other 64- or 96-bit
instructions.
Also clean up setting DecoderNamespace: in most cases it should be set
as a pair with AssemblerPredicate.
Split Dpp8FI and Dpp16FI into two different operands sharing an
AsmOperandClass. They are parsed and rendered identically as fi:1 but
the encoding is different: for DPP16 FI is a single bit, but for DPP8 it
uses two different special values in the src0 field. Having a dedicated
decoder for Dpp8FI allows it to reject other (non-special) src0 values
so that AMDGPUDisassembler::getInstruction no longer needs to call
isValidDPP8 to do post hoc validation of decoded DPP8 instructions.
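A hedged sketch of what a dedicated Dpp8FI decoder buys (the special src0
values below are placeholders, not the documented encodings):

    #include <cstdint>
    #include <optional>

    // For DPP8, FI is conveyed by two special values in the src0 field;
    // any other src0 value means this is not a valid DPP8 FI operand.
    constexpr uint32_t Dpp8FiOff = 0xE9; // placeholder special value
    constexpr uint32_t Dpp8FiOn  = 0xEA; // placeholder special value

    static std::optional<unsigned> decodeDpp8FISketch(uint32_t Src0) {
      if (Src0 == Dpp8FiOff) return 0;
      if (Src0 == Dpp8FiOn)  return 1;
      return std::nullopt; // reject here; no post hoc isValidDPP8 pass
    }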
64-bit SDWA encodings have to be checked first because their first 32
bits are a special case of the corresponding 32-bit non-SDWA encoding of
the same instruction. However, all 64-bit encodings are now checked first,
so we no longer need special handling for SDWA.
AMDGPUDisassembler::getInstruction tries decoding instructions using
different DecoderTables in a confusing order: first 96-bit instructions,
then some 64-bit, then 32-bit, then some more 64-bit.
This patch changes it to always try longer encodings first. The
motivation is to make getInstruction easier to understand, and to pave
the way for combining some 64-bit tables that do not need to be
separate.
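In outline, assuming an invented per-width helper:

    static bool tryDecodeWidthSketch(unsigned BitWidth) {
      (void)BitWidth; // stand-in for the per-width decoder tables
      return false;
    }

    static bool decodeLongestFirstSketch() {
      // Longer encodings first: a 64-bit instruction whose first 32 bits
      // alias a valid 32-bit encoding (the SDWA case above) must be
      // matched before the shorter interpretation is tried.
      for (unsigned Width : {96u, 64u, 32u})
        if (tryDecodeWidthSketch(Width))
          return true;
      return false;
    }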
Set DecoderNamespace and AssemblerPredicate in the base class for Real
instructions for each subtarget. This avoids some ad hoc "let" around
groups of instruction definitions, and fixes some missed cases like
BUFFER_GL0_INV_gfx10 which was missing DecoderNamespace.
For wave64 WMMA instructions, putting W64 in the DecoderNamespace is
more descriptive than WMMA, and matches other uses for GFX12
GLOBAL_LOAD_TR instructions.
Add a new assembler directive named '.amdhsa_code_object_version'. This
directive sets e_ident[ABIVERSION] in the ELF header, and its value is used
as the assumed COV for the rest of the asm file.
This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
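The new precedence, as a hypothetical sketch (helper and parameter names are
illustrative; the real helpers live in AMDGPUBaseInfo.h):

    #include <optional>

    static unsigned resolveCOVSketch(std::optional<unsigned> DirectiveOrIR,
                                     std::optional<unsigned> CLFlag,
                                     unsigned Default) {
      if (DirectiveOrIR)        // .amdhsa_code_object_version or IR flag
        return *DirectiveOrIR;  // now wins over the command-line flag
      if (CLFlag)               // --amdhsa-code-object-version
        return *CLFlag;         // demoted to a fallback
      return Default;
    }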
Support new amdgcn_global_load_tr instructions for load with transpose.
* MC layer support for GLOBAL_LOAD_TR_B64/GLOBAL_LOAD_TR_B128
* Intrinsic int_amdgcn_global_load_tr
* Clang builtins amdgcn_global_load_tr*
Consistently treat packed 16-bit operands as 32-bit values, because
that's really what they are. The attempt to treat them differently was
ultimately incorrect and led to miscompiles, e.g. when using non-splat
constants such as (1, 0) as operands.
Recognize 32-bit float constants for i/u16 instructions. This is a bit
odd conceptually, but it matches HW behavior and SP3.
Remove isFoldableLiteralV216; there was too much magic in the dependency
between it and its use in SIFoldOperands. Instead, we now simply rely on
checking whether a constant is an inline constant, and trying a bunch of
permutations of the low and high halves. This is more obviously correct
and leads to some new cases where inline constants are used, as shown by
tests.
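Roughly, the permutation idea looks like this (invented helper; the real
code also has to account for op_sel modifiers):

    #include <cstdint>
    #include <optional>

    static bool isInlineConstantSketch(uint32_t V) {
      return V == 0; // placeholder for the real inline-constant check
    }

    static std::optional<uint32_t> tryPackedPermsSketch(uint32_t Packed) {
      uint32_t Lo = Packed & 0xFFFFu, Hi = Packed >> 16;
      const uint32_t Candidates[] = {
          Packed,          // as-is
          (Lo << 16) | Hi, // halves swapped
          (Lo << 16) | Lo, // splat of the low half
          (Hi << 16) | Hi, // splat of the high half
      };
      for (uint32_t C : Candidates)
        if (isInlineConstantSketch(C))
          return C; // a permutation that encodes as an inline constant
      return std::nullopt;
    }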
Move the logic for switching packed add vs. sub into SIFoldOperands.
This has two benefits: all logic that optimizes for inline constants in
packed math is now in one place; and it applies to both SelectionDAG and
GISel paths.
Disable the use of opsel with v_dot* instructions on gfx11. They are
documented to ignore opsel on src0 and src1. It may be interesting to
re-enable the use of opsel on src2 as a future optimization.
A similar "proper" fix of what inline constants mean could potentially
be applied to unpacked 16-bit ops. However, it's less clear what the
benefit would be, and there are surely places where we'd have to
carefully audit whether values are properly sign- or zero-extended. It
is best to keep such a change separate.
Fixes: Corruption in FSR 2.0 (latent bug exposed by an LLPC change)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
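For example:

    #include "llvm/ADT/StringRef.h"

    static bool isGCNTriple(llvm::StringRef T) {
      // was: T.startswith("amdgcn")
      return T.starts_with("amdgcn"); // mirrors std::string_view in C++20
    }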
In GFX12 the exp instruction is renamed to export, but exp is still
accepted as an alias.
Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com>
- Be explicit about which program resource register is supported by
  which target:
  - RSRC1:
    - FP16_OVFL is GFX9+
    - WGP_MODE is GFX10+
    - MEM_ORDERED is GFX10+
    - FWD_PROGRESS is GFX10+
  - RSRC3:
    - INST_PREF_SIZE is GFX11+
    - TRAP_ON_START is GFX11+
    - TRAP_ON_END is GFX11+
    - IMAGE_OP is GFX11+
- Do not emit GFX11+ fields when disassembling GFX10 code objects
- Tighten enforcement of reserved bits in the disassembler (see the sketch
  after this list)
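A hedged sketch of the per-target gating (helper names invented; field
availability as listed above):

    struct TargetSketch { unsigned Gfx; }; // e.g. 9, 10, or 11

    // RSRC1 fields
    static bool hasFP16Ovfl(TargetSketch T)     { return T.Gfx >= 9; }
    static bool hasWGPMode(TargetSketch T)      { return T.Gfx >= 10; }
    // RSRC3 fields
    static bool hasInstPrefSize(TargetSketch T) { return T.Gfx >= 11; }
    static bool hasImageOp(TargetSketch T)      { return T.Gfx >= 11; }

    // Disassembling a GFX10 object: GFX11+ fields are never printed, and
    // a set reserved bit is diagnosed rather than silently ignored.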
Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>
A 64-bit literal can be used as a 32-bit zero- or sign-extended operand.
In the case of a double, zeroes are added to the low 32 bits. Currently
the asm parser stores only the high 32 bits of a double in an operand. To
support codegen as requested in
https://github.com/llvm/llvm-project/issues/67781, we need to change the
representation to store the full 64-bit value, so that codegen can simply
add immediates to an instruction.
There is some code to support compatibility with existing tests and asm
kernels: we still allow short hex strings that represent only the high 32
bits of a double value as a valid literal.
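To illustrate (helper names invented):

    #include <cstdint>
    #include <cstring>

    // Store the full 64-bit pattern so codegen can attach the literal
    // directly to the instruction.
    static uint64_t encodeDoubleLiteralSketch(double D) {
      uint64_t Bits;
      std::memcpy(&Bits, &D, sizeof(Bits));
      return Bits; // previously only the high 32 bits were kept
    }

    // Compatibility: a short hex string still denotes the high 32 bits of
    // a double, with zeroes implied in the low half.
    static uint64_t widenShortHexSketch(uint32_t High32) {
      return uint64_t(High32) << 32;
    }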
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
support::endianness::{big,little,native} with
llvm::endianness::{big,little,native}.
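For example:

    #include "llvm/ADT/bit.h"

    constexpr bool IsLittleEndianHost =
        llvm::endianness::native == llvm::endianness::little;
    // was: llvm::support::endianness::native == ...::little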
Reverts 6cb3866b1ce9d835402e414049478cea82427cf1.
Analysis of failures on buildbots with expensive checks enabled showed
that the problem was triggered by changes in another commit,
469b3bfad20550968ac428738eb1f8bb8ce3e96d, and was caused by the bug
addressed in #67245.