694 Commits

Author SHA1 Message Date
Mirko Brkušanin
45b037cf7a
[AMDGPU] Add fp8/bf8 conversion instructions for gfx1170 (#180191) 2026-02-09 13:56:43 +01:00
Matt Arsenault
8554ed738f
AMDGPU: Add syntax for s_wait_event values (#180272)
Previously this would just print hex values. Print names for the
recognized values, matching the sp3 syntax.
2026-02-09 08:29:55 +00:00
Mirko Brkušanin
65cdee827b
[AMDGPU] Move magic strings used for WMMA modifiers (NFC) (#180201) 2026-02-06 16:36:50 +01:00
Jun Wang
b9b7b31e23
[AMDGPU][MC] Allow nodone etc. in exp instructions (#172749)
This patch allows nodone, nocompr, novm, and norow_en to be used in exp
instructions to indicate the corresponding modifiers are not present.
2026-02-04 10:29:07 -08:00
Mariusz Sikora
3c0f5045e1
[AMDGPU] Add FeatureGFX13 and SMEM encoding for gfx13 (#177567)
For now list of features is based on gfx12 and gfx1250

---------

Co-authored-by: Jay Foad <jay.foad@amd.com>
2026-01-26 14:16:36 +01:00
Shilei Tian
c253b9f9ca
[AMDGPU] Fix inline constant encoding for v_pk_fmac_f16 (#176659)
This PR handles`v_pk_fmac_f16` inline constant encoding/decoding
differences between pre-GFX11 and GFX11+ hardware.

- Pre-GFX11: fp16 inline constants produce `(f16, 0)` - value in low 16
bits, zero in high.
- GFX11+: fp16 inline constants are duplicated to both halves `(f16,
f16)`.

Fixes #94116.
2026-01-20 19:14:59 -05:00
Matt Arsenault
2ee12f191a
AMDGPU: Use RegClassByHwMode to manage GWS operand special case (#169373)
On targets that require even aligned 64-bit VGPRs, GWS operands
require even alignment of a 32-bit operand. Previously we had a hacky
post-processing which added an implicit operand to try to manage
the constraint. This would require special casing in other passes
to avoid breaking the operand constraint. This moves the handling
into the instruction definition, so other passes no longer need
to consider this edge case. MC still does need to special case this,
to print/parse as a 32-bit register. This also still ends up net
less work than introducing even aligned 32-bit register classes.

This also should be applied to the image special case.
2025-11-25 18:55:34 +00:00
Jun Wang
306f49a254
[AMDGPU][MC] Disallow nogds in ds_gws_* instructions (#166873)
The ds_gws_* instructions require gds as an operand. However, when nogds
is given, it is treated the same as gds. This patch fixes this to
disallow nogds.
2025-11-14 10:14:12 -08:00
Craig Topper
ee41ab3dea
[AMDGPU] Use MCRegister instead of unsigned. NFC (#167558) 2025-11-11 18:39:31 +00:00
Kewen Meng
9a0000b150
Revert "[AMDGPU][MC] GFX9 - Support NV bit in FLAT instructions in pre-GFX90A" (#166693)
Reverts llvm/llvm-project#154237

It breaks bot: https://lab.llvm.org/buildbot/#/builders/123/builds/30172
2025-11-05 20:07:33 -08:00
Jun Wang
db6231b4c2
[AMDGPU][MC] GFX9 - Support NV bit in FLAT instructions in pre-GFX90A (#154237) targets
This patch enables support of the NV (non-volatile) bit in FLAT
instructions in GFX9 (pre-GFX90A) targets.
2025-11-05 11:52:56 -08:00
Jeffrey Byrnes
66b481556e
[AMDGPU][AsmParser]: Use dummy operand for parsing buffer_ SWZ operand. (#165305)
`MCInstrDesc` counts the SWZ operand for `NumOperands` -- thus, since we
do not parse this into the `MCInst` operands, there will be a mismatch
between `MCInst.getNumOperands` and `MCInstrDesc.getNumOperands` .

`llvm-mca` assumes that each operand counted by
`MCInstrDesc.getNumOperands` will be present in `MCInst.operands`
263377a175/llvm/lib/MCA/InstrBuilder.cpp (L324)

This parses a dummy operand for the buffer_loads as a placeholder for
the implicit SWZ operand. This is similar to the parsing of `tbuffer_`
variants which automatically parse the dummy operand
263377a175/llvm/utils/TableGen/AsmMatcherEmitter.cpp (L1853)
2025-10-27 18:39:59 -07:00
Stanislav Mekhanoshin
997af95fac
[AMDGPU] Remove validation of s_set_vgpr_msb range (#164888)
We will need the full 16-bit range of the operand to record
previous mode.
2025-10-23 15:45:35 -07:00
Ivan Kosarev
3d4da1ee81
[AMDGPU][MC] Do not inline lit()/lit64() operands. (#162137)
For now treat the modifiers synonymous to each other.

The disassembler side is to be addressed separately.
2025-10-08 12:43:42 +01:00
Ivan Kosarev
20f41ed8c1
[AMDGPU][MC] Avoid creating lit64() operands unless asked or needed. (#161191)
There should normally be no need to generate implicit lit64()
modifiers on the assembler side. It's the encoder's responsibility
to recognise literals that are implicitly 64 bits wide.

The exceptions are where we rewrite floating-point operand values
as integer ones, which would not be assembled back to the original
values unless wrapped into lit64().

Respect explicit lit() modifiers for non-inline values as
necessary to avoid regressions in MC tests. This change still
doesn't prevent use of inline constants where lit()/lit64 is
specified; subject to a separate patch.

On disassembling, only create lit64() operands where necessary for
correct round-tripping.

Add round-tripping tests where useful and feasible.
2025-10-08 10:51:55 +01:00
Matt Arsenault
1a5494ca4a
AMDGPU: Use RegClassByHwMode to manage operand VGPR operand constraints (#158272)
This removes special case processing in TargetInstrInfo::getRegClass to
fixup register operands which depending on the subtarget support AGPRs,
or require even aligned registers.

This regresses assembler diagnostics, which currently work by hackily
accepting invalid cases and then post-rejecting a validly parsed
instruction.
On the plus side this now emits a comment when disassembling unaligned
registers for targets with the alignment requirement.
2025-10-08 11:19:54 +09:00
Jun Wang
64190462f7
[AMDGPU][MC] GFX9 - allow op_sel in v_interp_p2_f16 (#150712)
AMDGPU documentation states op_sel[3] can be used in v_interp_p2_f16 in
GFX9.
2025-10-06 10:58:13 -07:00
Rahul Joshi
2a72522c52
[NFC][LLVM] Pass/return SMLoc by value instead of const reference (#160797)
SMLoc itself encapsulates just a pointer, so there is no need to pass or
return it by reference.
2025-09-26 08:58:35 -07:00
Matt Arsenault
0a80631142
AMDGPU: Ensure both wavesize features are not set (#159234)
Make sure we cannot be in a mode with both wavesizes. This
prevents assertions in a future change. This should probably
just be an error, but we do not have a good way to report
errors from the MCSubtargetInfo constructor.
2025-09-25 09:46:34 +00:00
Ivan Kosarev
9e55d81c68
[AMDGPU][AsmParser] Introduce MC representation for lit() and lit64(). (#160316)
And rework the lit64() support to use it.

The rules for when to add lit64() can be simplified and
improved. In this change, however, we just follow the existing
conventions on the assembler and disassembler sides.

In codegen we do not (and normally should not need to) add explicit
lit() and lit64() modifiers, so the codegen tests lose them. The change
is an NFCI otherwise.

Simplifies printing operands.
2025-09-24 12:35:50 +01:00
Ivan Kosarev
f3762b7b71
[AMDGPU][AsmParser][NFC] Combine the Lit and Lit64 modifier flags. (#160315)
They represent mutually exclusive values of the same attribute.
2025-09-23 17:49:18 +01:00
Matt Arsenault
7dc7d0e129
AMDGPU: Remove subtarget feature hacking in AsmParser (#159227)
The wavesize hacking part was already done in
createAMDGPUMCSubtargetInfo, and we can move the default
target hack there too.
2025-09-17 12:09:53 +09:00
Ivan Kosarev
7ba7021951
[AMDGPU][MC] Keep MCOperands unencoded. (#158685)
We have proper encoding facilities to encode operands and instructions;
there's no need to pollute the MC representation with encoding details.

Supposed to be an NFCI, but happens to fix some re-encoded instruction
codes in disassembler tests.

The 64-bit operands are to be addressed in following patches introducing
MC-level representation for lit() and lit64() modifiers, to then be
respected by both the assembler and disassembler.
2025-09-16 09:01:01 +01:00
Ivan Kosarev
ed165237cd
[AMDGPU][AsmParser] Simplify getting source locations of operands. (#158323)
Remember indexes of MCOperands in MCParsedAsmOperands as we add them to
instructions. Then use the indexes to find locations by known MCOperands
indexes.

Happens to fix some reported locations in tests. NFCI otherwise.

getImmLoc() is to be eliminated as well; there's enough work for another
patch.
2025-09-15 17:13:31 +01:00
Pierre van Houtryve
dcaa29c8ed
Revert "[AMDGPU][gfx1250] Add cu-store subtarget feature (#150588)" (#157639)
This reverts commit be17791f2624f22b3ed24a2539406164a379125d.

This is not necessary for gfx1250 anymore.
2025-09-10 10:20:59 +02:00
Matt Arsenault
811538eb50
AMDGPU: Check aligned vgpr feature in assembler (#156997)
Use the new feature instead of listing the two separate cases.
2025-09-06 08:28:35 +09:00
Aleksandar Spasojevic
1b47135c9d
[AMDGPU] Ensure positive InstOffset for buffer operations (#145504)
GFX12+ buffer ops require positive InstOffset per AMD hardware spec.
Modified assembler/disassembler to reject negative buffer offsets.
2025-09-04 15:37:46 +02:00
Jay Foad
837a706fb6
[AMDGPU] Fix source location for assembler warnings (#156621)
Call MCInst::setLoc earlier so it is available for warnings generated
during MatchInstructionImpl.
2025-09-04 10:43:47 +01:00
Stanislav Mekhanoshin
6aebbb0a85
[AMDGPU] Define 1024 VGPRs on gfx1250 (#156765)
This is a baseline support, it is not useable yet.
2025-09-03 16:25:18 -07:00
Stanislav Mekhanoshin
cc9acb9df7
[AMDGPU] Add s_set_vgpr_msb gfx1250 instruction (#156524) 2025-09-02 14:22:57 -07:00
Jeremy Kun
be2f0205b6
NFC: remove some instances of deprecated capture (#154884)
```
 warning: implicit capture of 'this' with a capture default of '=' is deprecated [-Wdeprecated-this-capture]
```

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
2025-08-26 20:29:26 +00:00
Stanislav Mekhanoshin
438c099c23
[AMDGPU] gfx1250 kernel descriptor update (#155008) 2025-08-22 12:58:41 -07:00
Gang Chen
60dbde69cd
[AMDGPU] report named barrier cnt part2 (#154588) 2025-08-20 12:00:45 -07:00
Stanislav Mekhanoshin
57c1e01e48
[AMDGPU] Don't allow wgp mode on gfx1250 (#153680)
- gfx1250 only supports cu mode
2025-08-14 15:16:56 -07:00
Stanislav Mekhanoshin
ea14834966
[AMDGPU] Per-subtarget DPP instruction classification (#153096)
This is NFCI at this point.
2025-08-11 15:41:02 -07:00
Stanislav Mekhanoshin
dddeb07c2e
[AMDGPU] Restrict packed math FP32 instructions to read only one SGPR per operand on gfx12+ (#152465)
Sec. 4.6.7.1 of the gfx1250 SPG states that if an SGPR is used
as an operand, only one SGPR will be read for both the low and high
operations. As a result, the corresponding bits in `op_sel` and
`op_sel_hi` must be the same when the operand is an SGPR.

Co-authored-by: Tian, Shilei <Shilei.Tian@amd.com>

Co-authored-by: Tian, Shilei <Shilei.Tian@amd.com>
2025-08-07 16:13:34 -07:00
Stanislav Mekhanoshin
d08c2977e8
[AMDGPU] Add MC support for new gfx1250 src_flat_scratch_base_lo/hi (#152203) 2025-08-05 14:35:48 -07:00
Stanislav Mekhanoshin
dd0737bd99
[AMDGPU] gfx1250 v_wmma_ld_scale instructions (#152010) 2025-08-04 11:36:48 -07:00
Stanislav Mekhanoshin
c7bb105e97
[AMDGPU] Add v_cvt_scale_pk8_* gfx1250 instructions (#151616) 2025-07-31 18:55:59 -07:00
Stanislav Mekhanoshin
49d89bc9f4
[AMDGPU] Add gfx1250 cvt_pk|sr_fp8|bf8_f32 instructions (#151595) 2025-07-31 16:04:46 -07:00
Stanislav Mekhanoshin
ce40863209
[AMDGPU] Add v_cvt_sr|pk_bf8|fp8_f16 gfx1250 instructions (#151415) 2025-07-30 17:24:45 -07:00
Pierre van Houtryve
be17791f26
[AMDGPU][gfx1250] Add cu-store subtarget feature (#150588)
Determines whether we can use `SCOPE_CU` stores (on by default), or
whether all stores must be done at `SCOPE_SE` minimum.
2025-07-29 11:38:43 +02:00
Stanislav Mekhanoshin
a0b854d576
[AMDGPU] MC support for gfx1250 scale_offset modifier (#149881) 2025-07-21 15:04:59 -07:00
Stanislav Mekhanoshin
b66084acd9
[AMDGPU] Verify asm VGPR alignment on gfx1250 (#149880)
Co-authored-by: Shilei Tian <Shilei.Tian@amd.com>
2025-07-21 14:23:27 -07:00
Changpeng Fang
d6094370cb
AMDGPU: Support v_wmma_f32_16x16x128_f8f6f4 on gfx1250 (#149684)
Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
2025-07-21 10:09:42 -07:00
Stanislav Mekhanoshin
6d8e53d4af
[AMDGPU] Support nv memory instructions modifier on gfx1250 (#149582) 2025-07-18 14:38:46 -07:00
Kazu Hirata
ff5f355d9b
[AMDGPU] Use a range-based for loop (NFC) (#148767) 2025-07-15 08:02:46 -07:00
Changpeng Fang
b80b02536b
AMDGPU: Implement MC layer support for gfx1250 wmma instructions. (#148570)
Regular wmma/swmmac plus matrix reuse only.

---------

Co-authored-by: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Co-authored-by: Shilei Tian <Shilei.Tian@amd.com>
2025-07-15 00:48:57 -07:00
Stanislav Mekhanoshin
f090554359
[AMDGPU] MC support for v_fmaak_f64/v_fmamk_f64 gfx1250 intructions (#148282) 2025-07-11 14:17:03 -07:00
Stanislav Mekhanoshin
7920dff394
[AMDGPU] VOPD/VOPD3 changes for gfx1250 (#147602) 2025-07-10 14:15:01 -07:00