6 Commits

Author SHA1 Message Date
Jay Foad
0f94d2b385 [AMDGPU] GFX11: automatically release VGPRs at the end of the shader
GFX11 has a new message type MSG_DEALLOC_VGPRS which can be used to
release a shader's VGPRs. Sending this at the end of a shader (just
before the s_endpgm) can help overall system performance in cases where
the s_endpgm would have to wait for outstanding VMEM stores to complete
before releasing the VGPRs.

Differential Revision: https://reviews.llvm.org/D128442
2022-06-30 20:55:14 +01:00
Jay Foad
c155a944fb [AMDGPU] GFX11 CodeGen support for MIMG instructions
This includes:
- New llvm.amdgcn.image.msaa.load.* intrinsics
- NSA changes, because MIMG-NSA is now limited to 3 dwords
- Split CD forms of IMAGE_SAMPLE instructions out into separate
  test files since they are no longer supported in GFX11

Differential Revision: https://reviews.llvm.org/D127837
2022-06-16 18:23:14 +01:00
Jay Foad
d28624a209 [AMDGPU] Stop adding an implicit def of vcc_hi for wave32
This doesn't seem to be needed for anything.

Differential Revision: https://reviews.llvm.org/D92400
2020-12-02 10:11:42 +00:00
Sebastian Neubauer
a343b9b032 Revert "[AMDGPU] Insert waitcnt after returning from call"
This reverts commit ca907bfb57d8ad3ec3bcc2cff2abab7b1b933af6.

According to michel.daenzer,
> This completely broke the Mesa radeonsi driver on Navi 14. Xorg +
> xterm come up with major corruption & psychedelic colours.
2020-09-23 17:16:39 +02:00
Sebastian Neubauer
ca907bfb57 [AMDGPU] Insert waitcnt after returning from call
When memory operations are outstanding on function calls, either the
caller or the callee can insert a waitcnt to ensure that all reads are
finished.
Calls need some time to be executed, so if the callee inserts the
waitcnt, filling the instruction buffer and waiting for memory will be
interleaved, hiding some latency. This comes at the cost of having a
waitcnt inside functions that may not be needed as no memory operations
are outstanding.

For function calls, this is already implemented. The same principal
applies to returns: If the caller inserts a waitcnt after the call, the
callee does not have to wait and the return and memory operation can be
run in parallel.

This commit implements waiting in the caller after returning from a
function call.

Differential Revision: https://reviews.llvm.org/D87674
2020-09-23 12:17:59 +02:00
Sebastian Neubauer
8756869170 [AMDGPU] Add a16 feature to gfx10
Based on D72931

This adds a new feature called A16 which is enabled for gfx10.
gfx9 keeps the R128A16 feature so it can share all the instruction encodings
with gfx7/8.

Differential Revision: https://reviews.llvm.org/D73956
2020-02-10 09:04:23 +01:00