llvm-project

Author	SHA1	Message	Date
Ivan Kosarev	2aa8945d59	[AMDGPU][NFC] Use templates to decode AV operands. (#79313 ) Eliminates the need to define them manually. Part of <https://github.com/llvm/llvm-project/issues/62629>.	2024-01-25 11:30:04 +00:00
Ivan Kosarev	2e81ac25b4	[AMDGPU][NFC] Simplify AGPR/VGPR load/store operand definitions. (#79289 ) Part of <https://github.com/llvm/llvm-project/issues/62629>.	2024-01-24 15:38:16 +00:00
Mirko Brkušanin	7fdf608cef	[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795 ) Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com> Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>	2024-01-24 13:43:07 +01:00
Mariusz Sikora	cfddb59be2	[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (#78414 ) …bf8 instructions Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16 instructions that were supported on GFX940 (MI300): - V_CVT_F32_FP8 - V_CVT_F32_BF8 - V_CVT_PK_F32_FP8 - V_CVT_PK_F32_BF8 - V_CVT_PK_FP8_F32 - V_CVT_PK_BF8_F32 - V_CVT_SR_FP8_F32 - V_CVT_SR_BF8_F32 --------- Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com> Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com>	2024-01-24 12:21:15 +01:00
Emma Pilkington	bc82cfb38d	[AMDGPU] Add an asm directive to track code_object_version (#76267 ) Named '.amdhsa_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file. This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.	2024-01-21 11:54:47 -05:00
Piotr Sobczak	57f6a3f7ea	[AMDGPU] Add global_load_tr for GFX12 (#77772 ) Support new amdgcn_global_load_tr instructions for load with transpose. * MC layer support for GLOBAL_LOAD_TR_B64/GLOBAL_LOAD_TR_B128 * Intrinsic int_amdgcn_global_load_tr * Clang builtins amdgcn_global_load_tr*	2024-01-18 15:14:42 +01:00
Nicolai Hähnle	49b492048a	AMDGPU: Fix packed 16-bit inline constants (#76522 ) Consistently treat packed 16-bit operands as 32-bit values, because that's really what they are. The attempt to treat them differently was ultimately incorrect and lead to miscompiles, e.g. when using non-splat constants such as (1, 0) as operands. Recognize 32-bit float constants for i/u16 instructions. This is a bit odd conceptually, but it matches HW behavior and SP3. Remove isFoldableLiteralV216; there was too much magic in the dependency between it and its use in SIFoldOperands. Instead, we now simply rely on checking whether a constant is an inline constant, and trying a bunch of permutations of the low and high halves. This is more obviously correct and leads to some new cases where inline constants are used as shown by tests. Move the logic for switching packed add vs. sub into SIFoldOperands. This has two benefits: all logic that optimizes for inline constants in packed math is now in one place; and it applies to both SelectionDAG and GISel paths. Disable the use of opsel with v_dot* instructions on gfx11. They are documented to ignore opsel on src0 and src1. It may be interesting to re-enable to use of opsel on src2 as a future optimization. A similar "proper" fix of what inline constants mean could potentially be applied to unpacked 16-bit ops. However, it's less clear what the benefit would be, and there are surely places where we'd have to carefully audit whether values are properly sign- or zero-extended. It is best to keep such a change separate. Fixes: Corruption in FSR 2.0 (latent bug exposed by an LLPC change)	2024-01-04 00:10:15 +01:00
Jay Foad	c01e844a7e	[AMDGPU] Update compute program resource registers for GFX12 (#75911 ) Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>	2024-01-02 13:24:42 +00:00
Ivan Kosarev	8c6172b0ac	[AMDGPU][True16] Don't use the VGPR_LO/HI16 register classes. (#76440 ) Removing the classes requires updating tests and so is planned to be done with a separate change.	2023-12-28 11:48:25 +00:00
Jay Foad	8fdfd34cd2	[AMDGPU] Remove GDS and GWS for GFX12 (#76148 )	2023-12-21 15:27:08 +00:00
Mirko Brkušanin	569ef8ddd9	[AMDGPU] Add pseudo scalar trans instructions for GFX12 (#75204 )	2023-12-15 10:41:05 +01:00
Mirko Brkušanin	c1a6974d6b	[AMDGPU][MC] Add GFX12 SMEM encoding (#75215 )	2023-12-15 09:00:54 +01:00
Mariusz Sikora	7f55d7de1a	[AMDGPU] GFX12: Add Split Workgroup Barrier (#74836 ) Co-authored-by: Vang Thao <Vang.Thao@amd.com>	2023-12-13 15:01:13 +01:00
Piotr Sobczak	fac093dd08	[AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (#75030 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2023-12-13 13:52:40 +01:00
Mariusz Sikora	a97028ac51	[AMDGPU] Update VOP instructions for GFX12 (#74853 ) Co-authored-by: Mirko Brkusanin <Mirko.Brkusanin@amd.com>	2023-12-12 11:38:24 +01:00
Kazu Hirata	586ecdf205	[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956 ) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-11 21:01:36 -08:00
Jay Foad	19f4cec676	[AMDGPU] Add GFX12 encoding for VINTERP instructions (#74616 )	2023-12-07 10:15:31 +00:00
Jay Foad	44ff904d21	[AMDGPU] Add VEXPORT encoding for GFX12 (#74615 ) In GFX12 the exp instruction is renamed to export, but exp is still accepted as an alias. Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com>	2023-12-07 10:06:20 +00:00
Mariusz Sikora	19c9f9c0bf	[AMDGPU] GFX12: Add s_prefetch_inst/data instructions (#74448 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2023-12-06 10:01:29 +01:00
Jay Foad	1c55b227fe	[AMDGPU] Add GFX12 encoding and aliases for existing SOP (SALU) instructions (#74305 )	2023-12-05 10:07:06 +00:00
Mirko Brkušanin	f5868cb6a6	[AMDGPU][MC] Add GFX12 VIMAGE and VSAMPLE encodings (#74062 )	2023-12-04 13:04:42 +01:00
Konstantin Zhuravlyov	6cfb64276d	AMDGPU: Minor updates to program resource registers (#69525 ) - Be explicit about which program resource register is supported by which target - RSRC1 - FP16_OVFL is GFX9+ - WGP_MODE is GFX10+ - MEM_ORDERED is GFX10+ - FWD_PROGRESS is GFX10+ - RSRC3 - INST_PREF_SIZE is GFX11+ - TRAP_ON_START is GFX11+ - TRAP_ON_END is GFX11+ - IMAGE_OP is GFX11+ - Do not emit GFX11+ fields when disassembling GFX10 code objects - Tighten enforcement of reserved bits in disassembler --------- Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>	2023-10-19 12:40:19 -04:00
Stanislav Mekhanoshin	ab6c3d5034	[AMDGPU] Change the representation of double literals in operands (#68740 ) A 64-bit literal can be used as a 32-bit zero or sign extended operand. In case of double zeroes are added to the low 32 bits. Currently asm parser stores only high 32 bits of a double into an operand. To support codegen as requested by the https://github.com/llvm/llvm-project/issues/67781 we need to change the representation to store a full 64-bit value so that codegen can simply add immediates to an instruction. There is some code to support compatibility with existing tests and asm kernels. We allow to use short hex strings to represent only a high 32 bit of a double value as a valid literal.	2023-10-12 14:45:45 -07:00
Kazu Hirata	b05dbc4d5f	[llvm] Use llvm::endianness::{big,little,native} (NFC) Now that llvm::support::endianness has been renamed to llvm::endianness, we can use the shorter form. This patch replaces support::endianness::{big,little,native} with llvm::endianness::{big,little,native}.	2023-10-10 20:14:20 -07:00
Kazu Hirata	7c9d7b73e4	Revert "[AMDGPU] Add [[maybe_unused]] to several unused functions (NFC)" This reverts commit fff16807c2e6c64d671b2f6b7b3ae76f5e16e38d. We no longer need this workaround after 053478bbd0ae5329ea7993261225e6541f728858.	2023-09-25 11:16:47 -07:00
Kazu Hirata	fff16807c2	[AMDGPU] Add [[maybe_unused]] to several unused functions (NFC) Ivan is planning to introduce actual uses of these functions in near future.	2023-09-25 11:13:01 -07:00
Ivan Kosarev	9310baa596	[AMDGPU][NFC] Add True16 operand definitions. Reviewed By: Joe_Nash Differential Revision: https://reviews.llvm.org/D156103	2023-09-25 16:48:46 +01:00
Ivan Kosarev	fab28e0e14	Reapply "[AMDGPU] Introduce real and keep fake True16 instructions." Reverts 6cb3866b1ce9d835402e414049478cea82427cf1. Analysis of failures on buildbots with expensive checks enabled showed that the problem was triggered by changes in another commit, 469b3bfad20550968ac428738eb1f8bb8ce3e96d, and was caused by the bug addressed in #67245.	2023-09-23 22:07:41 +01:00
Ivan Kosarev	6cb3866b1c	Revert "[AMDGPU] Introduce real and keep fake True16 instructions." This reverts commit 0f864c7b8bc9323293ec3d85f4bd5322f8f61b16 due to failures on expensive checks.	2023-09-22 15:40:26 +01:00
Ivan Kosarev	0f864c7b8b	[AMDGPU] Introduce real and keep fake True16 instructions. The existing fake True16 instructions using 32-bit VGPRs are supposed to co-exist with real ones until all the necessary True16 functionality is implemented and relevant tests are updated. Reviewed By: arsenm, Joe_Nash Differential Revision: https://reviews.llvm.org/D156101	2023-09-22 10:57:56 +01:00
Mirko Brkušanin	923285b139	[AMDGPU] Add gfx1150 SALU Float instructions (#66884 )	2023-09-21 11:22:15 +02:00
Austin Kerbow	69447d6afe	[AMDGPU] Add ASM and MC updates for preloading kernargs Add assembler directives for preloading kernel arguments that correspond to new fields in the kernel descriptor for the length and offset of arguments that will be placed in SGPRs prior to kernel launch. Alignment of the arguments in SGPRs is equivalent to the kernarg segment when accessed via the kernarg_segment_ptr. Kernarg SGPRs are allocated directly after other user SGPRs. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D159459	2023-09-19 15:45:16 -07:00
Ivan Kosarev	289ae6525d	[AMDGPU][MC] Fix handling of A16 operands in intersect_ray instructions. The patch adds the support for 'noa16' operands in non-A16 variants of the instructions, fixes validation of A16 operands and eliminates the custom conversion to MCInst. Part of <https://github.com/llvm/llvm-project/issues/62629>. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D155057	2023-07-13 19:46:03 +01:00
Scott Linder	986001c827	[AMDGPU] Improve assembler + disassembler handling of kernel descriptors * Relax the AsmParser to accept `.amdhsa_wavefront_size32 0` when the `.amdhsa_shared_vgpr_count` directive is present. * Teach the KD disassembler to respect the setting of KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32 when calculating the value of `.amdhsa_next_free_vgpr`. * Teach the KD disassembler to disassemble COMPUTE_PGM_RSRC3 for gfx90a and gfx10+. * Include "pseudo directive" comments for gfx10 fields which are not controlled by any assembler directive. * Fix disassembleObject failure diagnostic in llvm-objdump to not hard-code a comment string, and to follow the convention of not capitalizing the first sentence. Reviewed By: rochauha Differential Revision: https://reviews.llvm.org/D128014	2023-07-06 21:20:51 +00:00
Ivan Kosarev	dac5957ca9	[AMDGPU][AsmParser][NFC] Clean up the implementation of KImmFP operands. addKImmFPOperands() duplicates the KImmFP-specific logic implemented in addLiteralImmOperand() and therefore can be removed. Part of <https://github.com/llvm/llvm-project/issues/62629>. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D154427	2023-07-05 10:43:41 +01:00
Ivan Kosarev	b7e8a55a83	[AMDGPU][AsmParser][NFC] Simplify parsing of sopp_brtarget operands. Also refine the definitions while there. Part of <https://github.com/llvm/llvm-project/issues/62629>. Reviewed By: mbrkusanin Differential Revision: https://reviews.llvm.org/D154061	2023-06-30 16:35:46 +01:00
Scott Linder	ede070a28d	[NFC][AMDGPU] Refactor AMDGPUDisassembler Clean up ahead of a patch to fix bugs in the AMDGPUDisassembler. Use split-file to simplify and extend existing kernel-descriptor disassembly tests. Add a comment to AMDHSAKernelDescriptor.h, as at least one small set towards keeping all kernel-descriptor sensitive code in sync. Reviewed By: MaskRay, kzhuravl, arsenm Differential Revision: https://reviews.llvm.org/D130105	2023-06-29 16:06:09 +00:00
Ivan Kosarev	e23891a382	[AMDGPU][Disassembler] Fix a spurious error message in an instruction comment. The patch prevents pollution of instruction comments with error messages generated during unsuccessful decoding attempts. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D149049	2023-04-26 12:50:17 +01:00
Mirko Brkusanin	b3dc0e69cf	[AMDGPU][MC][GFX11] Add Partial NSA format for image sample instructions Image sample instructions that need more than 5 VGPRs for VAddr can use partial NSA for NSA encoding format. VGPRs that can not fit into the encoding are sequential after the last one. This patch adds assembly and disassembly parts. Differential Revision: https://reviews.llvm.org/D144033	2023-02-23 13:33:34 +01:00
Fangrui Song	432caca39a	Simplify with hasFeature. NFC	2023-02-17 18:22:24 -08:00
Kazu Hirata	64dad4ba9a	Use llvm::bit_cast (NFC)	2023-02-14 01:22:12 -08:00
Changpeng Fang	7ca3444fba	AMDGPU: Use module flag to get code object version at IR level folow-up Summary: This is part of the leftover work for https://reviews.llvm.org/D143138. In this work, we pass code object version as an argument to initialize target ID and use it for targetID dump. Reviewers: arsenm Differential Revision https://reviews.llvm.org/D143293	2023-02-10 11:16:38 -08:00
Petar Avramovic	13512f84f3	AMDGPU/MC: Fix decoders for VSrc_v2b32 and VSrc_v2f32 RegisterOperands Decoder should make 32 bit value when decoding immediates, not 64 bit. Differential Revision: https://reviews.llvm.org/D143574	2023-02-09 12:16:46 +01:00
Petar Avramovic	a256e1d97a	AMDGPU/MC: Fix indentation and remove unused macro after D142636	2023-02-06 13:19:03 +01:00
Petar Avramovic	b0c1a45ba5	AMDGPU/MC: Refactor decoders. Rework decoders for float immediates decodeFPImmed creates immediate operand using register operand width, but size of created immediate should correspond to OperandType for RegisterOperand. e.g. OPW128 could be used for RegisterOperands that use v2f64 v4f32 and v8f16. Each RegisterOperands would have different OperandType and require that immediate is decoded using 64, 32 and 16 bit immediate respectively. decodeOperand_<RegClass> only provides width for register decoding, introduce decodeOperand_<RegClass>_Imm<ImmWidth> that also provides width for immediate decoding. Refactor RegisterOperands: - decoders get _Imm<ImmWidth> suffix in some cases - removed unused RegisterOperands defined via multiclass - use different RegisterOperand in a few places, new RegisterOperand's decoder corresponds to the number of bits used for operand's encoding Refactor decoder functions: - add asserts for the size of encoding that will be decoded - regroup them according to the method of decoding decodeOperand_<RegClass> (register only, no immediate) decoders can now create immediate of consistent size, use it for better diagnostic of 'invalid immediate'. Differential Revision: https://reviews.llvm.org/D142636	2023-02-01 16:52:57 +01:00
Jay Foad	768aed1378	[MC] Make more use of MCInstrDesc::operands. NFC. Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A future patch will remove opInfo_begin and opInfo_end. Also use it instead of raw access to the OpInfo pointer. A future patch will remove this pointer. Differential Revision: https://reviews.llvm.org/D142213	2023-01-23 11:31:41 +00:00
Kazu Hirata	caa99a01f5	Use llvm::popcount instead of llvm::countPopulation(NFC)	2023-01-22 12:48:51 -08:00
Fangrui Song	f4c16c4473	[MC] llvm::Optional => std::optional https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 21:36:08 +00:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Mateja Marjanovic	595a08847a	[AMDGPU] Add support for new LLVM vector types Add VReg, AReg and SReg on AMDGPU for bit widths: 288, 320, 352 and 384. Differential Revision: https://reviews.llvm.org/D138205	2022-11-29 17:02:04 +01:00

1 2 3 4 5

210 Commits