Oliver Stannard 40614e1c14 [ARM] Save and restore CPSR around tMOVimm32
When resolving a frame index with a large offset for v6M execute-only,
we emit a tMOVimm32 pseudo-instruction, which later gets lowered to a
sequence of instructions, all of which are flag-setting. However, a
frame index may be generated for a register spill or reload instruction,
which can be inserted at a point where CPSR is live. This patch inserts
MRS and MSR instructions around the tMOVimm32 to save and restore the
value of CPSR, if CPSR is live at that point.

This may need up to two virtual registers (one to build the immediate
value, one to save CPSR) during frame index lowering, which happens
after register allocation, so we need to ensure two spill slots are
available to the register scavenger so that it can free up enough
registers for this.

There is no test for the emission (or not) of the MRS/MSR pair, because
it requires a spill or reload to be inserted at a point where CPSR is
live, which requires a large, complex function and is fragile enough
that any optimisation changes will break the test. This bug was easily
found by csmith with -verify-machineinstrs, which I now run regularly on
v6M execute-only (and many other combinations).

Patch by John Brawn and myself.

Reviewed By: stuij

Differential Revision: https://reviews.llvm.org/D158404
2023-08-24 14:15:02 +01:00

; RUN: llc -mtriple=arm-eabi %s -o /dev/null
; RUN: llc -mtriple=thumbv6m-eabi -mattr=+execute-only %s -o - -filetype=obj | \
; RUN: llvm-objdump -d --no-leading-addr --no-show-raw-insn - | FileCheck %s
define void @test1() {
; CHECK-LABEL: <test1>:
;; Check that the prologue uses the correct immediate materialization
;; pattern for execute-only (64 x i32 = 256 bytes, so a single sub suffices).
; CHECK: sub sp, #0x100
  %tmp = alloca [64 x i32], align 4
  ret void
}
define void @test2() {
; CHECK-LABEL: <test2>:
;; Check that the prologue uses the correct immediate materialization
;; pattern for execute-only.
; CHECK: movs [[REG:r[0-9]+]], #0xff
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x8
; CHECK-NEXT: adds [[REG]], #0xff
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x8
; CHECK-NEXT: adds [[REG]], #0xef
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x8
; CHECK-NEXT: adds [[REG]], #0xb8
  %tmp = alloca [4168 x i8], align 4
  ret void
}
define i32 @test3() {
;; Check that the prologue uses the correct immediate materialization
;; pattern for execute-only.
; CHECK-LABEL: <test3>:
; CHECK: movs [[REG:r[0-9]+]], #0xcf
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x8
; CHECK-NEXT: adds [[REG]], #0xff
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x8
; CHECK-NEXT: adds [[REG]], #0xff
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x8
; CHECK-NEXT: adds [[REG]], #0xf4
  %retval = alloca i32, align 4
  %tmp = alloca i32, align 4
  %a = alloca [u0x30000001 x i8], align 16
  store i32 0, ptr %tmp
;; Check that the correct store (tSTRspi) pattern is chosen for execute-only.
; CHECK: movs [[REG:r[0-9]+]], #0x30
; CHECK-NEXT: lsls [[REG]], [[REG]], #0x18
; CHECK-NEXT: add [[REG]], sp
; CHECK-NEXT: str {{r[0-9]+}}, [[[REG]], #0x4]
  %tmp1 = load i32, ptr %tmp
  ret i32 %tmp1
}
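
The immediates in the CHECK lines above can be cross-checked by hand, since each sequence builds a 32-bit constant one byte at a time, most significant byte first. The Python sketch below is an illustration only and is not LLVM's lowering code. The helper names `materialize` and `evaluate` are hypothetical, and the choice to skip zero bytes and merge their shifts is an assumption made because it reproduces the sequences checked in test2 and test3.

```python
MASK32 = 0xFFFFFFFF


def materialize(imm, reg="r0"):
    """Build a 32-bit constant byte by byte using movs/lsls/adds.

    Hypothetical helper: zero bytes are skipped and their shifts merged,
    an assumption that happens to match the CHECK lines in this test.
    All three mnemonics are flag-setting, which is why the commit message
    talks about saving and restoring CPSR around the sequence.
    """
    imm &= MASK32
    insns = []
    pending_shift = 0
    started = False
    for shift in (24, 16, 8, 0):
        byte = (imm >> shift) & 0xFF
        if started:
            pending_shift += 8
        if byte == 0:
            continue
        if not started:
            insns.append(f"movs {reg}, #{byte:#x}")
            started = True
        else:
            insns.append(f"lsls {reg}, {reg}, #{pending_shift:#x}")
            insns.append(f"adds {reg}, #{byte:#x}")
            pending_shift = 0
    if not started:
        insns.append(f"movs {reg}, #0x0")
    elif pending_shift:
        insns.append(f"lsls {reg}, {reg}, #{pending_shift:#x}")
    return insns


def evaluate(insns):
    """Interpret a sequence from materialize() and return the 32-bit result."""
    value = 0
    for insn in insns:
        op, operands = insn.split(maxsplit=1)
        imm = int(operands.rsplit("#", 1)[1], 16)
        if op == "movs":
            value = imm
        elif op == "lsls":
            value = (value << imm) & MASK32
        elif op == "adds":
            value = (value + imm) & MASK32
    return value


# test1: 64 x i32 is 0x100 bytes, matching `sub sp, #0x100`.
assert 64 * 4 == 0x100

# test2: the checked sequence builds -4168 (0xffffefb8) as a 32-bit value.
for insn in materialize(-4168):
    print(insn)

# test3: the prologue constant is -0x3000000c; the store base is 0x30000000,
# with the remaining #0x4 carried by the str offset.
print(hex(evaluate(materialize(-0x3000000C))))
print(materialize(0x30000000))
```

Running `evaluate` over the output of `materialize` is a quick way to confirm that a hand-written CHECK sequence really encodes the intended constant before updating the test.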