Previously, `computeProcResourceMasks()` printed resource masks in debug mode from multiple call sites, creating noise in the debug output. This patch fixes that and also prints more information about the resources.

It splits the debug prints for resources into two types:

1. No simulation - mask only
2. Simulation - mask + other info

For type 2, printing happens in a single place, the `ResourceManager` constructor, which should cover all the other simulation cases indirectly:

1. `llvm/lib/MCA/HardwareUnits/ResourceManager` - covered
2. `llvm/lib/MCA/InstrBuilder.cpp` - should be covered indirectly - only used by `llvm-mca` before a simulation that constructs a `ResourceManager`
3. `llvm/tools/llvm-mca/Views/SummaryView.cpp` - used after a simulation that constructs a `ResourceManager`
4. `llvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp` - used after a simulation that constructs a `ResourceManager`

It also adds `BufferSize` to the output, which should be useful for debugging scheduling model + MCA integration.

For type 1, it inlines mask-only printing into 2 other callers:

1. `llvm/include/llvm/MCA/Stages/InstructionTables.h`
2. `llvm/tools/llvm-exegesis/lib/SchedClassResolution.cpp`

as they only use the masks there. I think this is a reasonable duplication across distinctly different users/tools. Now every pair of callers, even across the two groups (1 and 2), effectively prints in a mutually exclusive way.

The patch adds debug tests for the 3 new callers in the corresponding root test directories, to encourage placing logically target-independent tests that merely require some target at the root. I think this convention is more discoverable, and it is already widely used in the project.
# REQUIRES: asserts
# REQUIRES: aarch64-registered-target

# RUN: llvm-mca < %s -mtriple=aarch64 -mcpu=apple-m1 -debug 2>&1 | FileCheck %s

# LLVM-MCA-BEGIN foo
add x2, x0, x1
# LLVM-MCA-END

## Print detailed processor resources information on simulation
# CHECK-COUNT-1: Processor resources:
# CHECK-NEXT: [ 0] - 0x00000000000000 - InvalidUnit
# CHECK-NEXT: [ 1] - 0x00000000000001 - CyUnitB (BufferSize=24)
# CHECK-NEXT: [ 2] - 0x00000000000002 - CyUnitBR (BufferSize=-1)
# CHECK-NEXT: [ 3] - 0x00000000000004 - CyUnitFloatDiv (BufferSize=-1)
# CHECK-NEXT: [ 4] - 0x00000000000008 - CyUnitI (BufferSize=48)
# CHECK-NEXT: [ 5] - 0x00000000000010 - CyUnitID (BufferSize=16)
# CHECK-NEXT: [ 6] - 0x00000000000020 - CyUnitIM (BufferSize=32)
# CHECK-NEXT: [ 7] - 0x00000000000040 - CyUnitIS (BufferSize=24)
# CHECK-NEXT: [ 8] - 0x00000000000080 - CyUnitIntDiv (BufferSize=-1)
# CHECK-NEXT: [ 9] - 0x00000000000100 - CyUnitLS (BufferSize=28)
# CHECK-NEXT: [10] - 0x00000000000200 - CyUnitV (BufferSize=48)
# CHECK-NEXT: [11] - 0x00000000000400 - CyUnitVC (BufferSize=16)
# CHECK-NEXT: [12] - 0x00000000000800 - CyUnitVD (BufferSize=16)
# CHECK-NEXT: [13] - 0x00000000001000 - CyUnitVM (BufferSize=32)
# CHECK: [0] Code Region - foo

## Do not print mask-only information on simulation
# CHECK-NOT: Processor resource masks:
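
To make the two print types concrete, here is a minimal standalone C++ sketch that reproduces the line format checked by the test above. It is not the code from the patch: `Resource`, `computeMasks`, `formatLine`, and `formatTable` are hypothetical names invented for illustration, every resource is treated as a plain unit with a single mask bit (no resource groups), and the invalid entry at index 0 is printed without `BufferSize` only to mirror the CHECK line literally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical stand-in for a processor resource entry; not an LLVM type.
struct Resource {
  std::string Name;
  int BufferSize;
};

// Assumption: one bit per resource, in index order. Index 0 is the invalid
// unit and keeps mask 0. Resource groups are deliberately not modelled.
std::vector<uint64_t> computeMasks(std::size_t NumResources) {
  assert(NumResources <= 65 && "only 64 mask bits are available");
  std::vector<uint64_t> Masks(NumResources, 0);
  for (std::size_t I = 1; I < NumResources; ++I)
    Masks[I] = uint64_t{1} << (I - 1);
  return Masks;
}

// Format one table line. Detailed == false gives the mask-only form
// (no simulation); Detailed == true appends BufferSize (simulation).
std::string formatLine(std::size_t Index, uint64_t Mask, const Resource &R,
                       bool Detailed) {
  char Buf[256];
  if (Detailed && Index != 0)
    std::snprintf(Buf, sizeof(Buf), "[%2zu] - 0x%014llx - %s (BufferSize=%d)",
                  Index, static_cast<unsigned long long>(Mask),
                  R.Name.c_str(), R.BufferSize);
  else
    std::snprintf(Buf, sizeof(Buf), "[%2zu] - 0x%014llx - %s", Index,
                  static_cast<unsigned long long>(Mask), R.Name.c_str());
  return Buf;
}

// Build the whole table once, under the header that matches the print type.
std::string formatTable(const std::vector<Resource> &Resources, bool Detailed) {
  std::vector<uint64_t> Masks = computeMasks(Resources.size());
  std::string Out =
      Detailed ? "Processor resources:\n" : "Processor resource masks:\n";
  for (std::size_t I = 0; I < Resources.size(); ++I)
    Out += formatLine(I, Masks[I], Resources[I], Detailed) + "\n";
  return Out;
}
```

Calling `formatTable` from exactly one place per tool is the point of the sketch: a caller that only needs masks passes `Detailed = false`, while a simulation passes `Detailed = true`, so the two headers never appear in the same run.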