llvm-project/llvm/test/Other/print-changed-machine.ll
Amara Emerson 1e2f87868f [AArch64][GlobalISel] Move the localizer to run before the legalizer, and always localize globals.
Our strategy for localizing globals in the entry block breaks down when we have
large functions with high register pressure, using lots of globals. When this
happens, our heuristics say that globals with many uses should not be localized,
leading us to cause excessive spills and stack usage. These situations are also
exacerbated by LTO which tends to generate large functions.

For now, moving to a strategy that's simpler and more akin to SelectionDAG
fixes these issues and makes our codegen more similar. This has an overall
neutral effect on size on CTMark, while showing slight improvements with -Os -flto
on benchmarks. For low-level firmware, though, we see big improvements.

The reason this is neutral, and not an improvement, is because we give up the
gains from CSE'ing globals in cases where we have low register pressure. I think
this can be addressed in future with some better heuristics.

Differential Revision: https://reviews.llvm.org/D147484
2023-04-03 20:41:54 -07:00


; REQUIRES: aarch64-registered-target
; RUN: llc -filetype=null -mtriple=aarch64 -O0 -print-changed %s 2>&1 | FileCheck %s --check-prefixes=VERBOSE,VERBOSE-BAR
; RUN: llc -filetype=null -mtriple=aarch64 -O0 -print-changed -filter-print-funcs=foo %s 2>&1 | FileCheck %s --check-prefixes=VERBOSE,NO-BAR
; VERBOSE: *** IR Dump After IRTranslator (irtranslator) on foo ***
; VERBOSE-NEXT: # Machine code for function foo: IsSSA, TracksLiveness{{$}}
; VERBOSE-NEXT: Function Live Ins: $w0
; VERBOSE-EMPTY:
; VERBOSE-NEXT: bb.1.entry:
; VERBOSE: *** IR Dump After Analysis for ComputingKnownBits (gisel-known-bits) on foo omitted because no change ***
; VERBOSE-NEXT: *** IR Dump After AArch64O0PreLegalizerCombiner (aarch64-O0-prelegalizer-combiner) on foo omitted because no change ***
; VERBOSE: *** IR Dump After Legalizer (legalizer) on foo ***
; VERBOSE-NEXT: # Machine code for function foo: IsSSA, TracksLiveness, Legalized
; VERBOSE-NEXT: Function Live Ins: $w0
; VERBOSE-EMPTY:
; VERBOSE-NEXT: bb.1.entry:
; VERBOSE-BAR: *** IR Dump After IRTranslator (irtranslator) on bar ***
; NO-BAR-NOT: on bar ***
; RUN: llc -filetype=null -mtriple=aarch64 -O0 -print-changed=quiet %s 2>&1 | FileCheck %s --check-prefix=QUIET
; QUIET: *** IR Dump After IRTranslator (irtranslator) on foo ***
; QUIET-NOT: ***
; QUIET: *** IR Dump After Localizer (localizer) on foo ***
; RUN: llc -filetype=null -mtriple=aarch64 -O0 -print-changed -filter-passes=irtranslator,legalizer %s 2>&1 | \
; RUN: FileCheck %s --check-prefixes=VERBOSE-FILTER
; RUN: llc -filetype=null -mtriple=aarch64 -O0 -print-changed=quiet -filter-passes=irtranslator %s 2>&1 | \
; RUN: FileCheck %s --check-prefixes=QUIET-FILTER --implicit-check-not='IR Dump'
; VERBOSE-FILTER: *** IR Dump After IRTranslator (irtranslator) on foo ***
; VERBOSE-FILTER: *** IR Dump After AArch64O0PreLegalizerCombiner (aarch64-O0-prelegalizer-combiner) on foo filtered out ***
; VERBOSE-FILTER: *** IR Dump After Legalizer (legalizer) on foo ***
; VERBOSE-FILTER-NOT: *** IR Dump After {{.*}} () on
; QUIET-FILTER: *** IR Dump After IRTranslator (irtranslator) on foo ***
; QUIET-FILTER: *** IR Dump After IRTranslator (irtranslator) on bar ***
;; dot-cfg/dot-cfg-quiet are unimplemented. Currently they behave like 'quiet'.
; RUN: llc -filetype=null -mtriple=aarch64 -O0 -print-changed=dot-cfg %s 2>&1 | FileCheck %s --check-prefix=QUIET
@var = global i32 0

define void @foo(i32 %a) {
entry:
  %b = add i32 %a, 1
  store i32 %b, ptr @var
  ret void
}

define void @bar(i32 %a) {
entry:
  %b = add i32 %a, 2
  store i32 %b, ptr @var
  ret void
}