llvm-project/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-extract.mir
Petar Avramovic 29f88b93fd [GlobalISel] Rework more/fewer elements for vectors
The artifact combiner cannot access individual elements after LCMTy-style
merge/unmerge, extract, and insert have been used to change a vector's
number of elements (padding with undef or splitting into sub-vector
instructions). Unmerge to individual elements instead, then merge the
elements into the requested types.
Change argument lowering for vectors and moreElementsVector to use
buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements.
FewerElementsVector had a few helpers with differing behavior;
introduce a new helper for most of the opcodes.
The FewerElementsVector helper is more flexible, since it can create a
leftover instruction smaller than the requested type (useful when the
target wants to avoid padding with undef and use fewer registers). If the
target does not want a leftover of a different type, it should call more
elements first.
Some helpers performed more elements first to get a split without a
leftover. Opcodes that used this helper use clampMaxNumElementsStrict
(which performs more elements first) in LegalizerInfo to avoid test changes.
Fixes failures caused by artifacts created during more/fewer elements
vector not being combined.

Differential Revision: https://reviews.llvm.org/D114198
2021-12-23 14:30:02 +01:00
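
The pad/delete/split behavior described in the commit message can be modeled on plain lists. The following is a minimal Python sketch under stated assumptions, not LLVM code: the function names are hypothetical stand-ins (they are not the signatures of buildPadVectorWithUndefElements, buildDeleteTrailingVectorElements, or clampMaxNumElementsStrict), and a vector is represented simply as a list of element names.

```python
UNDEF = "undef"  # stand-in for an undefined element; not an LLVM object


def pad_with_undef(elts, num_elts):
    """Grow a vector to num_elts by appending undef elements."""
    if num_elts < len(elts):
        raise ValueError("padding cannot shrink a vector")
    return list(elts) + [UNDEF] * (num_elts - len(elts))


def delete_trailing(elts, num_elts):
    """Shrink a vector to num_elts by dropping trailing elements."""
    if num_elts > len(elts):
        raise ValueError("deleting cannot grow a vector")
    return list(elts[:num_elts])


def fewer_elements(elts, narrow):
    """Split into pieces of `narrow` elements; the last piece may be a
    smaller leftover."""
    if narrow <= 0:
        raise ValueError("narrow must be positive")
    return [list(elts[i:i + narrow]) for i in range(0, len(elts), narrow)]


def fewer_elements_strict(elts, narrow):
    """Pad to a multiple of `narrow` first, so every piece has the same size
    (models "more elements first")."""
    if narrow <= 0:
        raise ValueError("narrow must be positive")
    padded_len = -(-len(elts) // narrow) * narrow  # round up to a multiple
    return fewer_elements(pad_with_undef(elts, padded_len), narrow)
```

In this model, splitting five elements by two leaves a one-element leftover, while the strict variant pads to six elements first and produces three equal pieces, the last ending in an undef element.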

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -march=amdgcn -run-pass=instruction-select -verify-machineinstrs -o - %s | FileCheck %s
---
name: extract512
legalized: true
regBankSelected: true
body: |
  bb.0:
    ; CHECK-LABEL: name: extract512
    ; CHECK: [[DEF:%[0-9]+]]:sgpr_512 = IMPLICIT_DEF
    ; CHECK: [[COPY:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub0
    ; CHECK: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub1
    ; CHECK: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub2
    ; CHECK: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub3
    ; CHECK: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub4
    ; CHECK: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub5
    ; CHECK: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub6
    ; CHECK: [[COPY7:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub7
    ; CHECK: [[COPY8:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub8
    ; CHECK: [[COPY9:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub9
    ; CHECK: [[COPY10:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub10
    ; CHECK: [[COPY11:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub11
    ; CHECK: [[COPY12:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub12
    ; CHECK: [[COPY13:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub13
    ; CHECK: [[COPY14:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub14
    ; CHECK: [[COPY15:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub15
    ; CHECK: $sgpr0 = COPY [[COPY]]
    ; CHECK: $sgpr1 = COPY [[COPY1]]
    ; CHECK: $sgpr2 = COPY [[COPY2]]
    ; CHECK: $sgpr3 = COPY [[COPY3]]
    ; CHECK: $sgpr4 = COPY [[COPY4]]
    ; CHECK: $sgpr5 = COPY [[COPY5]]
    ; CHECK: $sgpr6 = COPY [[COPY6]]
    ; CHECK: $sgpr7 = COPY [[COPY7]]
    ; CHECK: $sgpr8 = COPY [[COPY8]]
    ; CHECK: $sgpr9 = COPY [[COPY9]]
    ; CHECK: $sgpr10 = COPY [[COPY10]]
    ; CHECK: $sgpr11 = COPY [[COPY11]]
    ; CHECK: $sgpr12 = COPY [[COPY12]]
    ; CHECK: $sgpr13 = COPY [[COPY13]]
    ; CHECK: $sgpr14 = COPY [[COPY14]]
    ; CHECK: $sgpr15 = COPY [[COPY15]]
    ; CHECK: SI_RETURN_TO_EPILOG $sgpr0, $sgpr1, $sgpr2, $sgpr3, $sgpr4, $sgpr5, $sgpr6, $sgpr7, $sgpr8, $sgpr9, $sgpr10, $sgpr11, $sgpr12, $sgpr13, $sgpr14, $sgpr15
    %0:sgpr(s512) = G_IMPLICIT_DEF
    %1:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 0
    %2:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 32
    %3:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 64
    %4:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 96
    %5:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 128
    %6:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 160
    %7:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 192
    %8:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 224
    %9:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 256
    %10:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 288
    %11:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 320
    %12:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 352
    %13:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 384
    %14:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 416
    %15:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 448
    %16:sgpr(s32) = G_EXTRACT %0:sgpr(s512), 480
    $sgpr0 = COPY %1:sgpr(s32)
    $sgpr1 = COPY %2:sgpr(s32)
    $sgpr2 = COPY %3:sgpr(s32)
    $sgpr3 = COPY %4:sgpr(s32)
    $sgpr4 = COPY %5:sgpr(s32)
    $sgpr5 = COPY %6:sgpr(s32)
    $sgpr6 = COPY %7:sgpr(s32)
    $sgpr7 = COPY %8:sgpr(s32)
    $sgpr8 = COPY %9:sgpr(s32)
    $sgpr9 = COPY %10:sgpr(s32)
    $sgpr10 = COPY %11:sgpr(s32)
    $sgpr11 = COPY %12:sgpr(s32)
    $sgpr12 = COPY %13:sgpr(s32)
    $sgpr13 = COPY %14:sgpr(s32)
    $sgpr14 = COPY %15:sgpr(s32)
    $sgpr15 = COPY %16:sgpr(s32)
    SI_RETURN_TO_EPILOG $sgpr0, $sgpr1, $sgpr2, $sgpr3, $sgpr4, $sgpr5, $sgpr6, $sgpr7, $sgpr8, $sgpr9, $sgpr10, $sgpr11, $sgpr12, $sgpr13, $sgpr14, $sgpr15
...
---
name: extract_s_s32_s1024
legalized: true
regBankSelected: true
body: |
  bb.0:
    ; CHECK-LABEL: name: extract_s_s32_s1024
    ; CHECK: [[DEF:%[0-9]+]]:sgpr_1024 = IMPLICIT_DEF
    ; CHECK: [[COPY:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub0
    ; CHECK: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub1
    ; CHECK: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub2
    ; CHECK: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub3
    ; CHECK: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub4
    ; CHECK: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub5
    ; CHECK: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub6
    ; CHECK: [[COPY7:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub7
    ; CHECK: [[COPY8:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub8
    ; CHECK: [[COPY9:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub9
    ; CHECK: [[COPY10:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub10
    ; CHECK: [[COPY11:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub11
    ; CHECK: [[COPY12:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub12
    ; CHECK: [[COPY13:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub13
    ; CHECK: [[COPY14:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub14
    ; CHECK: [[COPY15:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub15
    ; CHECK: [[COPY16:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub16
    ; CHECK: [[COPY17:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub17
    ; CHECK: [[COPY18:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub18
    ; CHECK: [[COPY19:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub19
    ; CHECK: [[COPY20:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub20
    ; CHECK: [[COPY21:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub21
    ; CHECK: [[COPY22:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub22
    ; CHECK: [[COPY23:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub23
    ; CHECK: [[COPY24:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub24
    ; CHECK: [[COPY25:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub25
    ; CHECK: [[COPY26:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub26
    ; CHECK: [[COPY27:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub27
    ; CHECK: [[COPY28:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub28
    ; CHECK: [[COPY29:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub29
    ; CHECK: [[COPY30:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub30
    ; CHECK: [[COPY31:%[0-9]+]]:sreg_32 = COPY [[DEF]].sub31
    ; CHECK: S_ENDPGM 0, implicit [[DEF]], implicit [[COPY]], implicit [[COPY1]], implicit [[COPY2]], implicit [[COPY3]], implicit [[COPY4]], implicit [[COPY5]], implicit [[COPY6]], implicit [[COPY7]], implicit [[COPY8]], implicit [[COPY9]], implicit [[COPY10]], implicit [[COPY11]], implicit [[COPY12]], implicit [[COPY13]], implicit [[COPY14]], implicit [[COPY15]], implicit [[COPY16]], implicit [[COPY17]], implicit [[COPY18]], implicit [[COPY19]], implicit [[COPY20]], implicit [[COPY21]], implicit [[COPY22]], implicit [[COPY23]], implicit [[COPY24]], implicit [[COPY25]], implicit [[COPY26]], implicit [[COPY27]], implicit [[COPY28]], implicit [[COPY29]], implicit [[COPY30]], implicit [[COPY31]]
    %0:sgpr(s1024) = G_IMPLICIT_DEF
    %1:sgpr(s32) = G_EXTRACT %0:sgpr, 0
    %2:sgpr(s32) = G_EXTRACT %0:sgpr, 32
    %3:sgpr(s32) = G_EXTRACT %0:sgpr, 64
    %4:sgpr(s32) = G_EXTRACT %0:sgpr, 96
    %5:sgpr(s32) = G_EXTRACT %0:sgpr, 128
    %6:sgpr(s32) = G_EXTRACT %0:sgpr, 160
    %7:sgpr(s32) = G_EXTRACT %0:sgpr, 192
    %8:sgpr(s32) = G_EXTRACT %0:sgpr, 224
    %9:sgpr(s32) = G_EXTRACT %0:sgpr, 256
    %10:sgpr(s32) = G_EXTRACT %0:sgpr, 288
    %11:sgpr(s32) = G_EXTRACT %0:sgpr, 320
    %12:sgpr(s32) = G_EXTRACT %0:sgpr, 352
    %13:sgpr(s32) = G_EXTRACT %0:sgpr, 384
    %14:sgpr(s32) = G_EXTRACT %0:sgpr, 416
    %15:sgpr(s32) = G_EXTRACT %0:sgpr, 448
    %16:sgpr(s32) = G_EXTRACT %0:sgpr, 480
    %17:sgpr(s32) = G_EXTRACT %0:sgpr, 512
    %18:sgpr(s32) = G_EXTRACT %0:sgpr, 544
    %19:sgpr(s32) = G_EXTRACT %0:sgpr, 576
    %20:sgpr(s32) = G_EXTRACT %0:sgpr, 608
    %21:sgpr(s32) = G_EXTRACT %0:sgpr, 640
    %22:sgpr(s32) = G_EXTRACT %0:sgpr, 672
    %23:sgpr(s32) = G_EXTRACT %0:sgpr, 704
    %24:sgpr(s32) = G_EXTRACT %0:sgpr, 736
    %25:sgpr(s32) = G_EXTRACT %0:sgpr, 768
    %26:sgpr(s32) = G_EXTRACT %0:sgpr, 800
    %27:sgpr(s32) = G_EXTRACT %0:sgpr, 832
    %28:sgpr(s32) = G_EXTRACT %0:sgpr, 864
    %29:sgpr(s32) = G_EXTRACT %0:sgpr, 896
    %30:sgpr(s32) = G_EXTRACT %0:sgpr, 928
    %31:sgpr(s32) = G_EXTRACT %0:sgpr, 960
    %32:sgpr(s32) = G_EXTRACT %0:sgpr, 992
    S_ENDPGM 0, implicit %0, implicit %1, implicit %2, implicit %3, implicit %4, implicit %5, implicit %6, implicit %7, implicit %8, implicit %9, implicit %10, implicit %11, implicit %12, implicit %13, implicit %14, implicit %15, implicit %16, implicit %17, implicit %18, implicit %19, implicit %20, implicit %21, implicit %22, implicit %23, implicit %24, implicit %25, implicit %26, implicit %27, implicit %28, implicit %29, implicit %30, implicit %31, implicit %32
...
# TODO: Handle offset 32
---
name: extract_sgpr_s64_from_s128
legalized: true
regBankSelected: true
body: |
  bb.0:
    ; CHECK-LABEL: name: extract_sgpr_s64_from_s128
    ; CHECK: [[DEF:%[0-9]+]]:sgpr_128 = IMPLICIT_DEF
    ; CHECK: [[COPY:%[0-9]+]]:sreg_64 = COPY [[DEF]].sub0_sub1
    ; CHECK: [[COPY1:%[0-9]+]]:sreg_64 = COPY [[DEF]].sub2_sub3
    ; CHECK: S_ENDPGM 0, implicit [[COPY]], implicit [[COPY1]]
    %0:sgpr(s128) = G_IMPLICIT_DEF
    %1:sgpr(s64) = G_EXTRACT %0, 0
    %2:sgpr(s64) = G_EXTRACT %0, 64
    S_ENDPGM 0, implicit %1, implicit %2
...
---
name: extract_sgpr_s96_from_s128
legalized: true
regBankSelected: true
body: |
  bb.0:
    liveins: $sgpr0_sgpr1_sgpr2_sgpr3
    ; CHECK-LABEL: name: extract_sgpr_s96_from_s128
    ; CHECK: [[COPY:%[0-9]+]]:sgpr_128_with_sub1_sub2_sub3 = COPY $sgpr0_sgpr1_sgpr2_sgpr3
    ; CHECK: [[COPY1:%[0-9]+]]:sgpr_128_with_sub0_sub1_sub2 = COPY [[COPY]]
    ; CHECK: [[COPY2:%[0-9]+]]:sgpr_96 = COPY [[COPY1]].sub0_sub1_sub2
    ; CHECK: [[COPY3:%[0-9]+]]:sgpr_96 = COPY [[COPY]].sub1_sub2_sub3
    ; CHECK: S_ENDPGM 0, implicit [[COPY2]], implicit [[COPY3]]
    %0:sgpr(s128) = COPY $sgpr0_sgpr1_sgpr2_sgpr3
    %1:sgpr(s96) = G_EXTRACT %0, 0
    %2:sgpr(s96) = G_EXTRACT %0, 32
    S_ENDPGM 0, implicit %1, implicit %2
...
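
The CHECK lines in this file follow a visible pattern: a G_EXTRACT at a bit offset becomes a COPY from a subregister whose name lists the 32-bit lanes it covers. The Python sketch below reproduces only that naming pattern as read from the tests above. The function name `subreg_name` and the fixed 32-bit lane width are assumptions made for illustration; this is not how the AMDGPU instruction selector computes subregister indices.

```python
def subreg_name(offset_bits, size_bits, lane_bits=32):
    """Return the subregister name covering [offset_bits, offset_bits + size_bits).

    Hypothetical helper: mirrors the naming seen in the CHECK lines
    (sub0, sub2_sub3, sub1_sub2_sub3), assuming 32-bit lanes.
    """
    if offset_bits < 0 or size_bits <= 0:
        raise ValueError("offset must be non-negative and size positive")
    if offset_bits % lane_bits or size_bits % lane_bits:
        raise ValueError("offset and size must be multiples of the lane width")
    first = offset_bits // lane_bits   # index of the first lane covered
    count = size_bits // lane_bits     # number of consecutive lanes covered
    return "_".join(f"sub{i}" for i in range(first, first + count))
```

For example, `subreg_name(480, 32)` gives `sub15` (the last extract in extract512), `subreg_name(64, 64)` gives `sub2_sub3`, and `subreg_name(32, 96)` gives `sub1_sub2_sub3`, matching the subregisters named in the corresponding CHECK lines.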