9 Commits

Author SHA1 Message Date
Matt Arsenault
ce096b2207 AMDGPU: Convert some tests to opaque pointers
These required update_mir_test_checks.
2022-12-19 09:04:17 -05:00
Jay Foad
113aafbf23 [AMDGPU] Clean up SReg classes
Remove unused LO16 classes SReg_LO16_XM0_XEXEC, SReg_LO16_XEXEC_HI and
SReg_LO16_XM0.

Simplify the definition of SReg_32.

Add SReg_32_XEXEC and use it to improve SReg_1_XEXEC which previously
excluded M0 for no good reason.

Improve SReg_1 which previously excluded EXEC_HI for no good reason.

Differential Revision: https://reviews.llvm.org/D140012
2022-12-14 15:57:56 +00:00
Joe Nash
b982ba2a6e [AMDGPU][GFX11] Use VGPR_32_Lo128 for VOP1,2,C
Due to the encoding changes in GFX11, we had a hack in place that
    disables the use of VGPRs above 128. This patch removes the need for
    that hack.

    We introduce a new register class VGPR_32_Lo128 which is used for 16-bit
    operands of VOP1, VOP2, and VOPC instructions. This register class only has the
    low 128 VGPRs, but is otherwise identical to VGPR_32. Therefore, 16-bit VOP1,
    VOP2, and VOPC instructions are correctly limited to use the first 128
    VGPRs, while the other instructions can freely use all 256.

    We introduce new pseduo-instructions used on GFX11 which have the suffix
    t16 (True 16) to use the VGPR_32_Lo128 register class.

Reviewed By: foad, rampitec, #amdgpu

Differential Revision: https://reviews.llvm.org/D133723
2022-09-20 09:56:28 -04:00
Joe Nash
b92161f927 [AMDGPU] Autogenerate spill-vector-superclass. NFC
This test is already a subset of the autogenerated test lines, so truly
auto-generate it to make it easier to update.
2022-08-11 10:34:10 -04:00
Alexander Timofeev
a321d95b59 [AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs
In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130367
2022-08-02 18:37:57 +02:00
Alexander Timofeev
d7ae1a9097 Revert "[AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs"
This reverts commit 76d9ae924cc361578ecbb5688559f7cebc78ab87.
because it causes several VK CTS tests to fail
2022-07-29 14:19:07 +02:00
Alexander Timofeev
76d9ae924c [AMDGPU] avoid blind converting to VALU REG_SEQUENCE and PHIs
In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D130367
2022-07-28 14:30:29 +02:00
Christudasan Devadasan
cf58b9ce98 [AMDGPU] Add AV class spill pseudo instructions
While enabling vector superclasses with D109301,
the AV spills are converted into VGPR spills by
introducing appropriate copies. The whole thing
ended up adding two instructions per spill (a copy
+ vgpr spill pseudo) and caused an incorrect
liverange update during inline spiller.

This patch adds the pseudo instructions for all
AV spills from 32b to 1024b and handles them in
the way all other spills are lowered.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D115439
2021-12-10 03:10:34 -05:00
Christudasan Devadasan
5297cbf045 [AMDGPU] Enable copy between VGPR and AGPR classes during regalloc
Greedy register allocator prefers to move a constrained
live range into a larger allocatable class over spilling
them. This patch defines the necessary superclasses for
vector registers. For subtargets that support copy between
VGPRs and AGPRs, the vector register spills during regalloc
now become just copies.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D109301
2021-11-29 22:19:33 -05:00