553 Commits

Author SHA1 Message Date
Shilei Tian
09fc333ec0 [NFC] Fold an if statement into return of bool expression 2024-01-31 13:55:22 -05:00
Kazu Hirata
292b508eba [AMDGPU] Use StringRef::consume_front (NFC) 2024-01-30 22:12:10 -08:00
Shilei Tian
6a21e00e39
[AMDGPU][AsmParser] Allow v_writelane_b32 to use SGPR and M0 as source operands at the same time (#78827)
Currently the asm parser takes `v_writelane_b32 v1, s13, m0` as illegal
instruction for pre-gfx11 because it uses two constant buses while the
hardware
can only allow one. However, based on the comment of
`AMDGPUInstructionSelector::selectWritelane`,
it is allowed to have M0 as lane selector and a SGPR used as SRC0
because the
lane selector doesn't count as a use of constant bus. In fact, codegen
can already
generate this form, but this inconsistency is not exposed because the
validation
of constant bus limitation only happens when paring an assembly but we
don't have
a test case when both SGPR and M0 used as source operands for the
instruction.
2024-01-30 15:39:31 -05:00
Nico Weber
184ca39529
[llvm] Move CodeGenTypes library to its own directory (#79444)
Finally addresses https://reviews.llvm.org/D148769#4311232 :)

No behavior change.
2024-01-25 12:01:31 -05:00
Ivan Kosarev
b0b7be2701
[AMDGPU][NFC] Rename the reg-or-imm operand predicates to match their class names. (#79439)
No need to have two names for the same thing. Also simplifies operand
definitions.

Part of <https://github.com/llvm/llvm-project/issues/62629>.
2024-01-25 13:29:54 +00:00
Mirko Brkušanin
7fdf608cef
[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795)
Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>
Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>
2024-01-24 13:43:07 +01:00
Mariusz Sikora
cfddb59be2
[AMDGPU][GFX12] VOP encoding and codegen - add support for v_cvt fp8/… (#78414)
…bf8 instructions

    Add VOP1, VOP1_DPP8, VOP1_DPP16, VOP3, VOP3_DPP8, VOP3_DPP16
    instructions that were supported on GFX940 (MI300):
    - V_CVT_F32_FP8
    - V_CVT_F32_BF8
    - V_CVT_PK_F32_FP8
    - V_CVT_PK_F32_BF8
    - V_CVT_PK_FP8_F32
    - V_CVT_PK_BF8_F32
    - V_CVT_SR_FP8_F32
    - V_CVT_SR_BF8_F32

---------

Co-authored-by: Mateja Marjanovic <mateja.marjanovic@amd.com>
Co-authored-by: Mirko Brkušanin <Mirko.Brkusanin@amd.com>
2024-01-24 12:21:15 +01:00
Ivan Kosarev
5a458767dd
[AMDGPU][True16] Support source DPP operands. (#79025) 2024-01-23 09:52:49 +00:00
Emma Pilkington
bc82cfb38d
[AMDGPU] Add an asm directive to track code_object_version (#76267)
Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
2024-01-21 11:54:47 -05:00
Mariusz Sikora
28b7e498b6
AMDGPU/GFX12: Add new dot4 fp8/bf8 instructions (#77892)
Endoding is VOP3P. Tagged as deep/machine learning instructions. i32
type (v4fp8 or v4bf8 packed in i32) is used for src0 and src1. src0 and
src1 have no src_modifiers. src2 is f32 and has src_modifiers: f32
fneg(neg_lo[2]) and f32 fabs(neg_hi[2]).

---------

Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>
2024-01-18 14:00:27 +01:00
Ivan Kosarev
084f1c2ee0
[AMDGPU][True16] Support V_CEIL_F16. (#73108)
As not all fake instructions have their real counterparts implemented
yet, we specify no AssemblerPredicate for UseFakeTrue16Insts to allow
both fake and real True16 instructions in assembler and disassembler
tests in the -mattr=+real-true16 mode during the transition period.

Source DPP and desitnation VOPDstOperand_t16 operands are still not
supported and will be addressed separately.
2024-01-10 08:46:19 +00:00
Nicolai Hähnle
49b492048a
AMDGPU: Fix packed 16-bit inline constants (#76522)
Consistently treat packed 16-bit operands as 32-bit values, because
that's really what they are. The attempt to treat them differently was
ultimately incorrect and lead to miscompiles, e.g. when using non-splat
constants such as (1, 0) as operands.

Recognize 32-bit float constants for i/u16 instructions. This is a bit
odd conceptually, but it matches HW behavior and SP3.

Remove isFoldableLiteralV216; there was too much magic in the dependency
between it and its use in SIFoldOperands. Instead, we now simply rely on
checking whether a constant is an inline constant, and trying a bunch of
permutations of the low and high halves. This is more obviously correct
and leads to some new cases where inline constants are used as shown by
tests.

Move the logic for switching packed add vs. sub into SIFoldOperands.
This has two benefits: all logic that optimizes for inline constants in
packed math is now in one place; and it applies to both SelectionDAG and
GISel paths.

Disable the use of opsel with v_dot* instructions on gfx11. They are
documented to ignore opsel on src0 and src1. It may be interesting to
re-enable to use of opsel on src2 as a future optimization.

A similar "proper" fix of what inline constants mean could potentially
be applied to unpacked 16-bit ops. However, it's less clear what the
benefit would be, and there are surely places where we'd have to
carefully audit whether values are properly sign- or zero-extended. It
is best to keep such a change separate.

Fixes: Corruption in FSR 2.0 (latent bug exposed by an LLPC change)
2024-01-04 00:10:15 +01:00
Mirko Brkušanin
82e33d6203
[AMDGPU] Add VDSDIR instructions for GFX12 (#75197) 2024-01-03 16:32:00 +01:00
Jay Foad
c01e844a7e
[AMDGPU] Update compute program resource registers for GFX12 (#75911)
Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>
2024-01-02 13:24:42 +00:00
Mirko Brkušanin
c1a6974d6b
[AMDGPU][MC] Add GFX12 SMEM encoding (#75215) 2023-12-15 09:00:54 +01:00
Mirko Brkušanin
47615ddc84
[AMDGPU][MC] Add GFX12 VFLAT, VSCRATCH and VGLOBAL encodings (#75193) 2023-12-14 14:22:04 +01:00
Mirko Brkušanin
ac406b4817
[AMDGPU][MC] Add GFX12 VBUFFER encoding (#75195) 2023-12-14 12:58:18 +01:00
Mariusz Sikora
7f55d7de1a
[AMDGPU] GFX12: Add Split Workgroup Barrier (#74836)
Co-authored-by: Vang Thao <Vang.Thao@amd.com>
2023-12-13 15:01:13 +01:00
Piotr Sobczak
fac093dd08
[AMDGPU] Update IEEE and DX10_CLAMP for GFX12 (#75030)
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
2023-12-13 13:52:40 +01:00
Mariusz Sikora
a97028ac51
[AMDGPU] Update VOP instructions for GFX12 (#74853)
Co-authored-by: Mirko Brkusanin <Mirko.Brkusanin@amd.com>
2023-12-12 11:38:24 +01:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00
Mariusz Sikora
19c9f9c0bf
[AMDGPU] GFX12: Add s_prefetch_inst/data instructions (#74448)
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
2023-12-06 10:01:29 +01:00
Mirko Brkušanin
f5868cb6a6
[AMDGPU][MC] Add GFX12 VIMAGE and VSAMPLE encodings (#74062) 2023-12-04 13:04:42 +01:00
Ivan Kosarev
c9cdaffe49
[AMDGPU] Fix operand definitions for atomic scalar memory instructions. (#71799)
CPol and CPol_GLC1 operand classes have identical predicates, which
means AsmParser cannot differentiate between the RTN and non-RTN
variants of the instructions. When it currently selects the wrong
instruction, a hack in cvtSMEMAtomic() corrects the op-code. Using the
new predicated-value operands makes this hack and the whole conversion
function not needed.

Other uses of CPol_GLC1 operands are to be addressed separately.

Resolves about half of the remaining ~1000 pairs of ambiguous
instructions.

Part of <https://github.com/llvm/llvm-project/issues/69256>.
2023-11-10 14:00:46 +00:00
Stanislav Mekhanoshin
f7ab79f33e
[AMDGPU] Allow lit() on operands which do not accept modifiers (#69527) 2023-10-19 10:41:44 -07:00
Konstantin Zhuravlyov
6cfb64276d
AMDGPU: Minor updates to program resource registers (#69525)
- Be explicit about which program resource register is supported by
which target
    - RSRC1
      - FP16_OVFL is GFX9+
      - WGP_MODE is GFX10+
      - MEM_ORDERED is GFX10+
      - FWD_PROGRESS is GFX10+
    - RSRC3
      - INST_PREF_SIZE is GFX11+
      - TRAP_ON_START is GFX11+
      - TRAP_ON_END is GFX11+
      - IMAGE_OP is GFX11+
  - Do not emit GFX11+ fields when disassembling GFX10 code objects
  - Tighten enforcement of reserved bits in disassembler

---------

Co-authored-by: Konstantin Zhuravlyov <kzhuravl@amd.com>
2023-10-19 12:40:19 -04:00
Ivan Kosarev
509b5708e9
[AMDGPU][AsmParser] Eliminate custom predicates for named-bit operands. (#69243)
isGDS() and isTFE() need special treatment, because they may be both
named-bit and token operands.

Part of #62629.
2023-10-17 13:59:42 +01:00
Stanislav Mekhanoshin
06cd6485ae
[AMDGPU] Make ubsan happy (#68959) 2023-10-13 00:41:35 -07:00
Stanislav Mekhanoshin
ab6c3d5034
[AMDGPU] Change the representation of double literals in operands (#68740)
A 64-bit literal can be used as a 32-bit zero or sign extended operand.
In case of double zeroes are added to the low 32 bits. Currently asm
parser stores only high 32 bits of a double into an operand. To support
codegen as requested by the
https://github.com/llvm/llvm-project/issues/67781 we need to change the
representation to store a full 64-bit value so that codegen can simply
add immediates to an instruction.

There is some code to support compatibility with existing tests and asm
kernels. We allow to use short hex strings to represent only a high 32
bit of a double value as a valid literal.
2023-10-12 14:45:45 -07:00
Stanislav Mekhanoshin
e724c7e841
[AMDGPU] Add lit() asm operand modifier for SP3 compatibility. (#68839)
A literal can be optionally wrapped in a lit() expression. This does not
force literal encoding for an inline immediate, but neither does SP3.
The syntax is dummy for compatibility only.
2023-10-12 00:39:56 -07:00
Mirko Brkušanin
2cd2445c21
[AMDGPU] Src1 of VOP3 DPP instructions can be SGPR on supported subtargets (#67461)
In order to avoid duplicating every dpp pseudo opcode that has src1, we
allow it for all opcodes and add manual checks on subtargets that do not
support it.
2023-09-29 11:54:49 +02:00
Ivan Kosarev
9310baa596 [AMDGPU][NFC] Add True16 operand definitions.
Reviewed By: Joe_Nash

Differential Revision: https://reviews.llvm.org/D156103
2023-09-25 16:48:46 +01:00
Pierre van Houtryve
fe2f67e4ba
[AMDGPU] Remove Code Object V2 (#65715)
Code Object V2 has been deprecated for more than a year now. We can
safely remove it from LLVM.

- [clang] Remove support for the `-mcode-object-version=2` option.
- [lld] Remove/refactor tests that were still using COV2
- [llvm] Update AMDGPUUsage.rst
- Code Object V2 docs are left for informational purposes because those
code objects may still be supported by the runtime/loaders for a while.
- [AMDGPU] Remove COV2 emission capabilities.
- [AMDGPU] Remove `MetadataStreamerYamlV2` which was only used by COV2
- [AMDGPU] Update all tests that were still using COV2 - They are either
deleted or ported directly to code object v4 (as v3 is also planned to
be removed soon).
2023-09-21 12:00:45 +02:00
Austin Kerbow
69447d6afe [AMDGPU] Add ASM and MC updates for preloading kernargs
Add assembler directives for preloading kernel arguments that correspond
to new fields in the kernel descriptor for the length and offset of
arguments that will be placed in SGPRs prior to kernel launch. Alignment
of the arguments in SGPRs is equivalent to the kernarg segment when
accessed via the kernarg_segment_ptr. Kernarg SGPRs are allocated
directly after other user SGPRs.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D159459
2023-09-19 15:45:16 -07:00
jwanggit86
37b538819b
[AMDGPU] Incorrect error message regarding SCC modifier (#65660)
For the AMD GFX90A GPU, the SCC instruction modifier is allowed for
certain classes of instructions. However, the current assembler
generates an error message, "scc is not supported on this GPU",
regardless of the instruciton. This fix modifies the message as well as
the logic for generating the message. Related tests are moved from
gfx90a_err.s to gfx90a_asm_features.s.

Co-authored-by: Jun Wang <jun.wang7@amd.com>
2023-09-08 13:07:06 -07:00
Sergei Barannikov
a479be0f39 [MC] Change tryParseRegister to return ParseStatus (NFC)
This finishes the work of replacing OperandMatchResultTy with
ParseStatus, started in D154101.
As a drive-by change, rename some RegNo variables to just Reg
(a leftover from the days when RegNo had 'unsigned' type).
2023-09-06 10:28:12 +03:00
Stanislav Mekhanoshin
cfe9a134bb [AMDGPU] Rename 64BitDPP feature and fix the checks
Names '64BitDPP' and especially 'DPP64' were found misleading, and
DPP64 can easily be mixed with DPP16 and DPP8 while these are
different concepts. DPP16 and DPP8 refers to lanes where DPP64
refers to the operand size.

In fact the essential part here is that these instructions are
executed on the DP ALU, so rename the feature accordingly.

I have also found a bug in a check for these instructions, which is
fixed here and a common utility function is now used.

Differential Revision: https://reviews.llvm.org/D158465
2023-08-22 11:00:10 -07:00
Jay Foad
18919ee759 [AMDGPU] Validate GDS in the assembler
GFX90A DS instructions cannot use the gds modifier, except for the
DS_GWS_* instructions where it is still mandatory.

Differential Revision: https://reviews.llvm.org/D157100
2023-08-11 09:21:55 +01:00
Ivan Kosarev
6b693f5e34 [AMDGPU][AsmParser][NFC] Translate parsed DS instructions to MCInsts automatically.
Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D155620
2023-07-19 11:30:56 +01:00
Ivan Kosarev
7b6e606dac [AMDGPU][AsmParser][NFC] Translate parsed MIMG instructions to MCInsts automatically.
Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D155061
2023-07-13 19:47:31 +01:00
Ivan Kosarev
289ae6525d [AMDGPU][MC] Fix handling of A16 operands in intersect_ray instructions.
The patch adds the support for 'noa16' operands in non-A16 variants of
the instructions, fixes validation of A16 operands and eliminates the
custom conversion to MCInst.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D155057
2023-07-13 19:46:03 +01:00
Scott Linder
986001c827 [AMDGPU] Improve assembler + disassembler handling of kernel descriptors
* Relax the AsmParser to accept `.amdhsa_wavefront_size32 0` when the
  `.amdhsa_shared_vgpr_count` directive is present.
* Teach the KD disassembler to respect the setting of
  KERNEL_CODE_PROPERTY_ENABLE_WAVEFRONT_SIZE32 when calculating the
  value of `.amdhsa_next_free_vgpr`.
* Teach the KD disassembler to disassemble COMPUTE_PGM_RSRC3 for gfx90a
  and gfx10+.
* Include "pseudo directive" comments for gfx10 fields which are not
  controlled by any assembler directive.
* Fix disassembleObject failure diagnostic in llvm-objdump to not
  hard-code a comment string, and to follow the convention of not
  capitalizing the first sentence.

Reviewed By: rochauha

Differential Revision: https://reviews.llvm.org/D128014
2023-07-06 21:20:51 +00:00
Ivan Kosarev
12460cf90f [AMDGPU][AsmParser] Simplify the implementation of SWZ operands.
Those are implicit helper operands and therefore don't need any parsers
or printers.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: piotr, foad

Differential Revision: https://reviews.llvm.org/D154432
2023-07-05 10:45:12 +01:00
Ivan Kosarev
dac5957ca9 [AMDGPU][AsmParser][NFC] Clean up the implementation of KImmFP operands.
addKImmFPOperands() duplicates the KImmFP-specific logic implemented in
addLiteralImmOperand() and therefore can be removed.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D154427
2023-07-05 10:43:41 +01:00
Ivan Kosarev
ee165cdb1b [AMDGPU] Eliminate SIMCCodeEmitter and de-virtualise encoding methods.
Simplifies some future changes needed for
<https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154337
2023-07-05 10:13:33 +01:00
Sergei Barannikov
d73c8c8fab [AMDGPU] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows to do something like:
```
  return Error(L, "msg");
```
when with MatchOperandResultTy it had to be:
```
  Error(L, "msg");
  return MatchOperand_ParseFail;
```
It also has more appropriate name since parse* methods are not only for
parsing operands.

Reviewed By: kosarev

Differential Revision: https://reviews.llvm.org/D154293
2023-07-04 22:29:56 +03:00
Ivan Kosarev
1ef47165fd [AMDGPU][AsmParser][NFC] Remove an unused function.
Was added in <https://reviews.llvm.org/D63293>, but never used.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D154331
2023-07-04 11:33:17 +01:00
Ivan Kosarev
b7e8a55a83 [AMDGPU][AsmParser][NFC] Simplify parsing of sopp_brtarget operands.
Also refine the definitions while there.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: mbrkusanin

Differential Revision: https://reviews.llvm.org/D154061
2023-06-30 16:35:46 +01:00
Ivan Kosarev
59fd48d71e [AMDGPU][AsmParser][NFC] Simplify instruction operand definitions.
This addresses the trivial cases that only require removing the
operand classes and renaming related entities.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D153965
2023-06-29 10:51:44 +01:00
Ivan Kosarev
5183ca8779 [AMDGPU][AsmParser] Eliminate cvtMtbuf().
Now that we have proper support for optional operands, the standard LLVM
machinery can take care of converting parsed instructions to MCInsts.
There are likely more cases where the conversion can be done
automatically, probably with some additional treatment. The plan is to
address them separately.

Part of <https://github.com/llvm/llvm-project/issues/62629>.

Reviewed By: arsenm, foad

Differential Revision: https://reviews.llvm.org/D153565
2023-06-23 12:43:52 +01:00