llvm-project

Author	SHA1	Message	Date
Ivan Kosarev	3c3f6d8776	[AMDGPU][AsmParser][NFC] Eliminate Match_PreferE32. (#92159 ) Was added in 88e0b251815563016ad50241dd592e304bc03ee5 and is unused since fcef407aa21ad5a79d66a088e6f2a66a5745725d.	2024-05-15 11:53:38 +01:00
Janek van Oirschot	d86b68afd7	MCExpr-ify SIProgramInfo (#88257 ) Convert members in SIProgramInfo affected by variables provided by AMDGPUResourceUsageAnalysis into MCExprs.	2024-05-09 13:02:32 +01:00
Emma Pilkington	dcc7ef3ce8	[AMDGPU][MC] Disable sendmsg SYSMSG_OP_HOST_TRAP_ACK on gfx9+ (#90203 ) This is no longer supported as of gfx9. Fixes #52903 This commit also includes some refactoring of sendmsg operand parsing: - Use CustomOperand for sendmsg operations, this allows them to be conditionally available based on a STI check (and automatically in sync with SIDefines.h). - Move CustomOperand table lookups from AMDGPUBaseInfo to AMDGPUAsmUtils. This cleans up an awkward interface where AMDGPUAsmUtils defined a table/size as globals that AMDGPUBaseInfo had to loop over. - Clean up a few of the operand lookup functions while moving them.	2024-05-07 07:38:58 -04:00
luolent	a98a6e95be	Add clarifying parentheses around non-trivial conditions in ternary expressions. (#90391 ) Fixes [#85868](https://github.com/llvm/llvm-project/issues/85868) Parentheses are added as requested on ternary operators with non-trivial conditions. I used this [precedence table](https://en.cppreference.com/w/cpp/language/operator_precedence) for reference, to make sure we get the expected behavior on each change.	2024-05-04 18:38:45 +01:00
Stanislav Mekhanoshin	57216f7bd6	[AMDGPU] Support byte_sel modifier for v_cvt_f32_fp8 and v_cvt_f32_bf8 (#90887 )	2024-05-02 12:03:51 -07:00
Ivan Kosarev	9bebf25ecb	[AMDGPU][AsmParser][NFC] Generate NamedIntOperand predicates automatically. (#90576 ) Part of <https://github.com/llvm/llvm-project/issues/62629>.	2024-05-01 10:02:02 +01:00
Ivan Kosarev	e5c92c51e9	[AMDGPU][AsmParser] Do not use predicates for validation of NamedIntOperands. (#90251 ) Their job is to discriminate between different types of operands, not to check if they are valid. For validation we can use conversion functions. Clears the road to generating predicates automatically. Part of <https://github.com/llvm/llvm-project/issues/62629>.	2024-04-29 14:06:39 +01:00
Stanislav Mekhanoshin	6e722bbe30	[AMDGPU] Support byte_sel modifier on v_cvt_sr_fp8_f32 and v_cvt_sr_bf8_f32 (#90244 )	2024-04-26 13:02:57 -07:00
Joe Nash	6a13bbf92f	[AMDGPU][MC] Enables sgpr or imm src1 for float VOP3 DPP, but excluding VOPC. (#87382 ) Fixes support on GFX1150 and GFX12 where src1 of e64_dpp instructions should allow sgpr and imm operands. PR #67461 added support for this with int operands, but it was missing a piece for float. Changing VOPC e64_dpp will be in a different patch because there is a bug preventing that change.	2024-04-03 11:34:12 -04:00
Janek van Oirschot	1103a2a337	Reland [AMDGPU] MCExpr-ify MC layer kernel descriptor (#86494 ) Kernel descriptor attributes, with their respective emit and asm parse functionality, converted to MCExpr. Relands #80855 with fixes	2024-03-27 11:59:56 +00:00
Sergei Barannikov	5e5b656102	[MC] Make `MCParsedAsmOperand::getReg()` return `MCRegister` (#86444 )	2024-03-25 05:13:48 +03:00
Janek van Oirschot	797336b127	Revert "[AMDGPU] MCExpr-ify MC layer kernel descriptor" (#86151 ) Reverts llvm/llvm-project#80855	2024-03-21 10:19:54 -07:00
Janek van Oirschot	857161c367	[AMDGPU] MCExpr-ify MC layer kernel descriptor (#80855 ) Kernel descriptor attributes, with their respective emit and asm parse functionality, converted to MCExpr.	2024-03-21 13:57:10 +00:00
Janek van Oirschot	f7bebc1914	Reland [AMDGPU] Add AMDGPU specific variadic operation MCExprs (#84562 ) Adds AMDGPU specific variadic MCExpr operations 'max' and 'or'. Relands #82022 with fixes	2024-03-14 14:31:00 +00:00
Shilei Tian	e963d0740e	[AMDGPU] Replace `isInlinableLiteral16` with specific version (#84402 ) The current implementation of `isInlinableLiteral16` assumes a 16-bit inlinable literal is either an `i16` or a `fp16`. This is not always true because of `bf16`. However, we can't tell `fp16` and `bf16` apart by just looking at the value. This patch splits `isInlinableLiteral16` into three versions, `i16`, `fp16`, `bf16` respectively, and calls the corresponding version.	2024-03-08 14:49:52 -05:00
Diana Picus	0086cc95b3	[AMDGPU] Rename getNumVGPRBlocks. NFC (#84161 ) Rename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's using the encoding granule. This is used to program the hardware. In practice, the hardware will use the alloc granule instead, so this patch also adds a new helper, getAllocatedNumVGPRBlocks, which can be useful when driving heuristics.	2024-03-07 12:46:42 +01:00
Florian Mayer	0083c3eb83	Revert "[AMDGPU] Add AMDGPU specific variadic operation MCExprs" (#84273 ) Reverts llvm/llvm-project#82022 Fails on hwasan build bot: https://lab.llvm.org/buildbot/#/builders/236/builds/9874/steps/10/logs/stdio	2024-03-06 19:37:49 -08:00
Janek van Oirschot	bec2d105c7	[AMDGPU] Add AMDGPU specific variadic operation MCExprs (#82022 ) Adds AMDGPU specific variadic MCExpr operations 'max' and 'or'.	2024-03-06 21:01:54 +00:00
Shilei Tian	e9c1dbb408	Revert "[AMDGPU] Replace `isInlinableLiteral16` with specific version (#81345 )" This reverts commit 530f0e64ec11327879c44f2fd55c7c28efdbaa2d because it breaks downstream.	2024-03-06 08:42:54 -05:00
Shilei Tian	530f0e64ec	[AMDGPU] Replace `isInlinableLiteral16` with specific version (#81345 )	2024-03-04 08:40:42 -05:00
Jay Foad	53f89a0bb7	[AMDGPU] Remove AtomicNoRet class and getAtomicNoRetOp table (#83593 )	2024-03-01 17:18:55 +00:00
Ivan Kosarev	680c780a36	[AMDGPU][AsmParser] Support structured HWREG operands. (#82805 ) Symbolic values are to be supported separately.	2024-02-28 14:44:34 +00:00
Ivan Kosarev	dfa1d9b027	[AMDGPU][NFC] Have helpers to deal with encoding fields. (#82772 ) These are hoped to provide more convenient and less error prone facilities to encode and decode fields than manually defined constants and functions.	2024-02-23 17:34:55 +00:00
Stanislav Mekhanoshin	98db8d0cb7	[AMDGPU] Fix v_dot2_f16_f16/v_dot2_bf16_bf16 operands (#82423 ) src0 and src1 are packed f16/bf16, we are printing literals like 0x40002000, but we cannot parse it.	2024-02-20 16:34:40 -08:00
Stanislav Mekhanoshin	030d07574f	[AMDGPU] Fix bf16 inv2pi inline constant handling (#82283 ) Inline constant 1/(2pi) has the truncated value 0x3e22. According to the spec it is not rounded. A bf16 value in a nutshell is an fp32 value with the low 16 bits of the mantissa cleared. The value 0x3e22 converted to fp32 is 0.158203125 and the next representable value 0x3e23 means 0.1591796875. The fp32 value of 1/(2pi) = 0.15915494 cannot be represented in bf16. However, since bf16 values are essentially truncated fp32 values, we can use 0.15915494 as an idiomatic representation of the 1/(2*pi) inline constant. This is also consistent with sp3 behaviour. The patch fixes the problem that the value we print for the inv2pi inline constant is not parsed as inv2pi by the asm parser and gets rounded.	2024-02-19 15:34:09 -08:00
Shilei Tian	46734aa1e5	[AMDGPU] Use `bf16` instead of `i16` for bfloat (#80908 ) Currently we generally use `i16` to represent `bf16` in those tablegen files. This patch is trying to use `bf16` directly. Fix #79369.	2024-02-16 15:58:30 -05:00
Konstantin Zhuravlyov	fcef407aa2	AMDGPU/NFC: Remove some bits from TSFlags (#81525 ) - AMDGPU/NFC: Purge SOPK_ZEXT from TSFlags - Moved to helper function in SIInstInfo - AMDGPU/NFC: Purge VOPAsmPrefer32Bit from TSFlags - This flag did not make sense / remnants of something else I think	2024-02-12 16:43:48 -05:00
Ivan Kosarev	7d19dc50de	[AMDGPU][True16] Support VOP3 source DPP operands. (#80892 )	2024-02-08 16:23:00 +00:00
Shilei Tian	09fc333ec0	[NFC] Fold an `if` statement into `return` of bool expression	2024-01-31 13:55:22 -05:00
Kazu Hirata	292b508eba	[AMDGPU] Use StringRef::consume_front (NFC)	2024-01-30 22:12:10 -08:00
Shilei Tian	6a21e00e39	[AMDGPU][AsmParser] Allow `v_writelane_b32` to use SGPR and M0 as source operands at the same time (#78827 ) Currently the asm parser takes `v_writelane_b32 v1, s13, m0` as an illegal instruction for pre-gfx11 because it uses two constant buses while the hardware can only allow one. However, based on the comment of `AMDGPUInstructionSelector::selectWritelane`, it is allowed to have M0 as lane selector and a SGPR used as SRC0 because the lane selector doesn't count as a use of constant bus. In fact, codegen can already generate this form, but this inconsistency is not exposed because the validation of the constant bus limitation only happens when parsing assembly, and we don't have a test case where both SGPR and M0 are used as source operands for the instruction.	2024-01-30 15:39:31 -05:00
Nico Weber	184ca39529	[llvm] Move CodeGenTypes library to its own directory (#79444 ) Finally addresses https://reviews.llvm.org/D148769#4311232 :) No behavior change.	2024-01-25 12:01:31 -05:00
Ivan Kosarev	b0b7be2701	[AMDGPU][NFC] Rename the reg-or-imm operand predicates to match their class names. (#79439 ) No need to have two names for the same thing. Also simplifies operand definitions. Part of <https://github.com/llvm/llvm-project/issues/62629>.	2024-01-25 13:29:54 +00:00
Mirko Brkušanin	7fdf608cef	[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795 ) Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com> Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>	2024-01-24 13:43:07 +01:00
Mariusz Sikora	cfddb59be2	[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/bf8 instructions (#78414 ) Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16 instructions that were supported on GFX940 (MI300): - V_CVT_F32_FP8 - V_CVT_F32_BF8 - V_CVT_PK_F32_FP8 - V_CVT_PK_F32_BF8 - V_CVT_PK_FP8_F32 - V_CVT_PK_BF8_F32 - V_CVT_SR_FP8_F32 - V_CVT_SR_BF8_F32 --------- Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com> Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com>	2024-01-24 12:21:15 +01:00
Ivan Kosarev	5a458767dd	[AMDGPU][True16] Support source DPP operands. (#79025 )	2024-01-23 09:52:49 +00:00
Emma Pilkington	bc82cfb38d	[AMDGPU] Add an asm directive to track code_object_version (#76267 ) Named '.amdhsa_code_object_version'. This directive sets the e_ident[ABIVERSION] in the ELF header, and should be used as the assumed COV for the rest of the asm file. This commit also weakens the --amdhsa-code-object-version CL flag. Previously, the CL flag took precedence over the IR flag. Now the IR flag/asm directive take precedence over the CL flag. This is implemented by merging a few COV-checking functions in AMDGPUBaseInfo.h.	2024-01-21 11:54:47 -05:00
Mariusz Sikora	28b7e498b6	AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions (#77892 ) Encoding is VOP3P. Tagged as deep/machine learning instructions. i32 type (v4fp8 or v4bf8 packed in i32) is used for src0 and src1. src0 and src1 have no src_modifiers. src2 is f32 and has src_modifiers: f32 fneg(neg_lo[2]) and f32 fabs(neg_hi[2]). --------- Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>	2024-01-18 14:00:27 +01:00
Ivan Kosarev	084f1c2ee0	[AMDGPU][True16] Support V_CEIL_F16. (#73108 ) As not all fake instructions have their real counterparts implemented yet, we specify no AssemblerPredicate for UseFakeTrue16Insts to allow both fake and real True16 instructions in assembler and disassembler tests in the -mattr=+real-true16 mode during the transition period. Source DPP and destination VOPDstOperand_t16 operands are still not supported and will be addressed separately.	2024-01-10 08:46:19 +00:00
Nicolai Hähnle	49b492048a	AMDGPU: Fix packed 16-bit inline constants (#76522 ) Consistently treat packed 16-bit operands as 32-bit values, because that's really what they are. The attempt to treat them differently was ultimately incorrect and led to miscompiles, e.g. when using non-splat constants such as (1, 0) as operands. Recognize 32-bit float constants for i/u16 instructions. This is a bit odd conceptually, but it matches HW behavior and SP3. Remove isFoldableLiteralV216; there was too much magic in the dependency between it and its use in SIFoldOperands. Instead, we now simply rely on checking whether a constant is an inline constant, and trying a bunch of permutations of the low and high halves. This is more obviously correct and leads to some new cases where inline constants are used as shown by tests. Move the logic for switching packed add vs. sub into SIFoldOperands. This has two benefits: all logic that optimizes for inline constants in packed math is now in one place; and it applies to both SelectionDAG and GISel paths. Disable the use of opsel with v_dot* instructions on gfx11. They are documented to ignore opsel on src0 and src1. It may be interesting to re-enable the use of opsel on src2 as a future optimization. A similar "proper" fix of what inline constants mean could potentially be applied to unpacked 16-bit ops. However, it's less clear what the benefit would be, and there are surely places where we'd have to carefully audit whether values are properly sign- or zero-extended. It is best to keep such a change separate. Fixes: Corruption in FSR 2.0 (latent bug exposed by an LLPC change)	2024-01-04 00:10:15 +01:00
Mirko Brkušanin	82e33d6203	[AMDGPU] Add VDSDIR instructions for GFX12 (#75197 )	2024-01-03 16:32:00 +01:00
Jay Foad	c01e844a7e	[AMDGPU] Update compute program resource registers for GFX12 (#75911 ) Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>	2024-01-02 13:24:42 +00:00
Mirko Brkušanin	c1a6974d6b	[AMDGPU][MC] Add GFX12 SMEM encoding (#75215 )	2023-12-15 09:00:54 +01:00
Mirko Brkušanin	47615ddc84	[AMDGPU][MC] Add GFX12 VFLAT, VSCRATCH and VGLOBAL encodings (#75193 )	2023-12-14 14:22:04 +01:00
Mirko Brkušanin	ac406b4817	[AMDGPU][MC] Add GFX12 VBUFFER encoding (#75195 )	2023-12-14 12:58:18 +01:00
Mariusz Sikora	7f55d7de1a	[AMDGPU] GFX12: Add Split Workgroup Barrier (#74836 ) Co-authored-by: Vang Thao <Vang.Thao@amd.com>	2023-12-13 15:01:13 +01:00
Piotr Sobczak	fac093dd08	[AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (#75030 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2023-12-13 13:52:40 +01:00
Mariusz Sikora	a97028ac51	[AMDGPU] Update VOP instructions for GFX12 (#74853 ) Co-authored-by: Mirko Brkusanin <Mirko.Brkusanin@amd.com>	2023-12-12 11:38:24 +01:00
Kazu Hirata	586ecdf205	[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956 ) This patch replaces uses of StringRef::{starts,ends}with with StringRef::{starts,ends}_with for consistency with std::{string,string_view}::{starts,ends}_with in C++20. I'm planning to deprecate and eventually remove StringRef::{starts,ends}with.	2023-12-11 21:01:36 -08:00
Mariusz Sikora	19c9f9c0bf	[AMDGPU] GFX12: Add s_prefetch_inst/data instructions (#74448 ) Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>	2023-12-06 10:01:29 +01:00

581 Commits