llvm-project/llvm/test/CodeGen/AMDGPU/fold-zero-high-bits-skips-non-reg.mir
Matt Arsenault 80064b6e32
AMDGPU: Try constant fold after folding immediate (#141862)
This helps avoid some regressions in a future patch. The or 0
pattern appears in the division tests because the reduce 64-bit
bit operation to a 32-bit one with half identity value is only
implemented for constants. We could fix that by using computeKnownBits.
Additionally the pattern disappears if I optimize the IR division
expansion, so that IR should probably be emitted more optimally in
the first place.
2025-06-10 11:44:44 +09:00

18 lines
898 B
YAML

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
# RUN: llc -mtriple=amdgcn -mcpu=gfx1101 -run-pass si-fold-operands %s -o - | FileCheck %s
---
name: test_tryFoldZeroHighBits_skips_nonreg
tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_tryFoldZeroHighBits_skips_nonreg
; CHECK: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; CHECK-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_MOV_B32_e32_]], %subreg.sub0, [[V_MOV_B32_e32_]], %subreg.sub1
; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]]
%0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
%1:vreg_64 = REG_SEQUENCE %0, %subreg.sub0, %0, %subreg.sub1
%2:vgpr_32 = V_AND_B32_e64 65535, %1.sub0, implicit $exec
S_NOP 0, implicit %2
...