llvm-project/llvm/test/CodeGen/AMDGPU/fold-reload-into-m0.mir
Christudasan Devadasan ac0f64f06d
[AMDGPU] Split vgpr regalloc pipeline (#93526)
Allocating wwm-registers and per-thread VGPR operands
together poses many challenges in the way registers
are reused during allocation. At times, regalloc
reuses the registers of regular VGPR operations for
wwm-operations within a small range, unintentionally
clobbering their inactive lanes and causing
correctness issues that are hard to trace.

This patch splits the VGPR allocation pipeline further
to allocate wwm-registers first and the regular VGPR
operands in a separate pipeline. The split ensures
that the physical registers used for wwm allocations
do not take part in the next allocation pipeline,
which avoids any such clobbering.
2024-09-30 19:55:42 +05:30


# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs -stress-regalloc=2 -start-before=greedy -stop-after=virtregmap -o - %s | FileCheck %s
# Test that a spill of a copy of m0 is not folded to be a spill of m0 directly.
---
name: merge_sgpr_spill_into_copy_from_m0
tracksRegLiveness: true
machineFunctionInfo:
  isEntryFunction: true
body: |
  bb.0:
    ; CHECK-LABEL: name: merge_sgpr_spill_into_copy_from_m0
    ; CHECK: S_NOP 0, implicit-def $m0
    ; CHECK-NEXT: $sgpr0 = S_MOV_B32 $m0
    ; CHECK-NEXT: $vgpr0 = IMPLICIT_DEF
    ; CHECK-NEXT: $vgpr0 = V_WRITELANE_B32 killed $sgpr0, 0, $vgpr0
    ; CHECK-NEXT: $sgpr0 = V_READLANE_B32 $vgpr0, 0
    ; CHECK-NEXT: S_NOP 0, implicit-def dead renamable $sgpr1, implicit-def dead renamable $sgpr0, implicit killed renamable $sgpr0
    ; CHECK-NEXT: $sgpr0 = V_READLANE_B32 $vgpr0, 0
    ; CHECK-NEXT: $m0 = S_MOV_B32 killed $sgpr0
    ; CHECK-NEXT: S_NOP 0
    ; CHECK-NEXT: S_SENDMSG 0, implicit $m0, implicit $exec
    S_NOP 0, implicit-def $m0
    %0:sreg_32 = COPY $m0
    S_NOP 0, implicit-def %1:sreg_32, implicit-def %2:sreg_32, implicit %0
    $m0 = COPY %0
    S_SENDMSG 0, implicit $m0, implicit $exec
...
# Test that a reload into a copy of m0 is not folded to be a reload of m0 directly.
---
name: reload_sgpr_spill_into_copy_to_m0
tracksRegLiveness: true
machineFunctionInfo:
  isEntryFunction: true
body: |
  bb.0:
    ; CHECK-LABEL: name: reload_sgpr_spill_into_copy_to_m0
    ; CHECK: $vgpr0 = IMPLICIT_DEF
    ; CHECK-NEXT: S_NOP 0, implicit-def renamable $sgpr0, implicit-def dead renamable $sgpr1, implicit-def $m0
    ; CHECK-NEXT: $vgpr0 = V_WRITELANE_B32 killed $sgpr0, 0, $vgpr0
    ; CHECK-NEXT: $sgpr0 = V_READLANE_B32 $vgpr0, 0
    ; CHECK-NEXT: S_NOP 0, implicit killed renamable $sgpr0, implicit-def dead renamable $sgpr1, implicit-def dead renamable $sgpr0
    ; CHECK-NEXT: $sgpr0 = V_READLANE_B32 $vgpr0, 0
    ; CHECK-NEXT: $m0 = S_MOV_B32 killed $sgpr0
    ; CHECK-NEXT: S_NOP 0
    ; CHECK-NEXT: S_SENDMSG 0, implicit $m0, implicit $exec
    S_NOP 0, implicit-def %0:sreg_32, implicit-def %1:sreg_32, implicit-def $m0
    S_NOP 0, implicit %0, implicit-def %3:sreg_32, implicit-def %4:sreg_32
    $m0 = COPY %0
    S_SENDMSG 0, implicit $m0, implicit $exec
...