llvm-project/llvm/test/tools/llvm-profgen/recursion-compression-pseudoprobe.test
Hongtao Yu 3f97016857 [llvm-profgen] Decoding pseudo probe for profiled function only.
Complete pseudo probes decoding can result in large memory usage. In practice only a small porting of the decoded probes are used in profile generation. I'm changing the full decoding mode to be decoding for profiled functions only, though we still do a full scan of the .pseudoprobe section due to a missing table-of-content but we don't have to build the in-memory data structure for functions not sampled.

To build the in-memory data structure for profiled functions only, I'm rewriting the previous non-recursive probe decoding logic to be recursive. This is easy to read and maintain.

I also have to change the previous representation of unsymbolized context from probe-based stack to address-based stack since the profiled functions are unknown yet by the time of virtual unwinding. The address-based stack will be converted to probe-based stack after virtual unwinding and on-demand probe decoding.

I'm seeing 20GB memory is saved for one of our internal large service.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D121643
2022-03-23 14:15:11 -07:00

198 lines
8.3 KiB
Plaintext

; Firstly test uncompression(--compress-recursion=0)
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --compress-recursion=0 --profile-summary-hot-count=0 --csspgo-preinliner=0 --gen-cs-nested-profile=0
; RUN: FileCheck %s --input-file %t -check-prefix=CHECK-UNCOMPRESS
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-hot-count=0
; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --profile-summary-hot-count=0 --csspgo-preinliner=0 --gen-cs-nested-profile=0
; RUN: FileCheck %s --input-file %t
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe-nommap.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --skip-symbolization --profile-summary-hot-count=0
; RUN: FileCheck %s --input-file %t --check-prefix=CHECK-UNWINDER
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe-nommap.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --profile-summary-hot-count=0 --csspgo-preinliner=0 --gen-cs-nested-profile=0
; RUN: FileCheck %s --input-file %t
; RUN: llvm-profgen --format=text --perfscript=%S/Inputs/recursion-compression-pseudoprobe.perfscript --binary=%S/Inputs/recursion-compression-pseudoprobe.perfbin --output=%t --compress-recursion=0 --profile-summary-hot-count=0 --csprof-max-context-depth=0 --csspgo-preinliner=0 --gen-cs-nested-profile=0
; RUN: FileCheck %s --input-file %t -check-prefix=CHECK-MAX-CTX-DEPTH
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:6 @ fa]:4:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: 5: 1
; CHECK-UNCOMPRESS: 8: 1 fa:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563070469352221
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:6 @ fa:8 @ fa]:4:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: 4: 1
; CHECK-UNCOMPRESS: 7: 1 fb:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563070469352221
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb:6 @ fa]:4:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: 4: 1
; CHECK-UNCOMPRESS: 7: 1 fb:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563070469352221
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb]:3:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 2: 1
; CHECK-UNCOMPRESS: 5: 1 fb:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb]:3:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 2: 1
; CHECK-UNCOMPRESS: 5: 1 fb:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb]:3:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 2: 1
; CHECK-UNCOMPRESS: 5: 1 fb:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb]:3:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: 6: 1 fa:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb]:3:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: 6: 1 fa:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb:6 @ fa:7 @ fb]:3:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: 6: 1 fa:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb:6 @ fa:7 @ fb:6 @ fa]:2:1
; CHECK-UNCOMPRESS: 1: 1
; CHECK-UNCOMPRESS: 3: 1
; CHECK-UNCOMPRESS: !CFGChecksum: 563070469352221
; CHECK-UNCOMPRESS: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:5 @ fb:5 @ fb:5 @ fb]:1:0
; CHECK-UNCOMPRESS: 5: 1 fb:1
; CHECK-UNCOMPRESS: !CFGChecksum: 563022570642068
; CHECK-MAX-CTX-DEPTH: [fb]:19:6
; CHECK-MAX-CTX-DEPTH: 1: 6
; CHECK-MAX-CTX-DEPTH: 2: 3
; CHECK-MAX-CTX-DEPTH: 3: 3
; CHECK-MAX-CTX-DEPTH: 4: 0
; CHECK-MAX-CTX-DEPTH: 5: 4 fb:4
; CHECK-MAX-CTX-DEPTH: 6: 3 fa:3
; CHECK-MAX-CTX-DEPTH: !CFGChecksum: 563022570642068
; CHECK-MAX-CTX-DEPTH: [fa]:14:4
; CHECK-MAX-CTX-DEPTH: 1: 4
; CHECK-MAX-CTX-DEPTH: 3: 4
; CHECK-MAX-CTX-DEPTH: 4: 2
; CHECK-MAX-CTX-DEPTH: 5: 1
; CHECK-MAX-CTX-DEPTH: 6: 0
; CHECK-MAX-CTX-DEPTH: 7: 2 fb:2
; CHECK-MAX-CTX-DEPTH: 8: 1 fa:1
; CHECK-MAX-CTX-DEPTH: !CFGChecksum: 563070469352221
; CHECK: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb]:13:4
; CHECK: 1: 4
; CHECK: 2: 3
; CHECK: 3: 1
; CEHCK: 5: 4 fb:4
; CHECK: 6: 1 fa:1
; CHECK !CFGChecksum: 563022570642068
; CHECK: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb:6 @ fa]:6:2
; CHECK: 1: 2
; CHECK: 3: 2
; CHECK: 4: 1
; CHECK: 7: 1 fb:1
; CHECK: !CFGChecksum: 563070469352221
CHECK: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:6 @ fa]:4:1
; CHECK: 1: 1
; CHECK: 3: 1
; CHECK: 5: 1
; CHECK: 8: 1 fa:1
; CHECK: !CFGChecksum: 563070469352221
; CHECK: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:6 @ fa:8 @ fa]:4:1
; CHECK: 1: 1
; CHECK: 3: 1
; CHECK: 4: 1
; CHECK: 7: 1 fb:1
; CHECK: !CFGChecksum: 563070469352221
; CHECK: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb]:3:1
; CHECK: 1: 1
; CHECK: 3: 1
; CHECK: 6: 1 fa:1
; CHECK: !CFGChecksum: 563022570642068
; CHECK: [main:2 @ foo:5 @ fa:8 @ fa:7 @ fb:5 @ fb:6 @ fa:8 @ fa:7 @ fb:6 @ fa:7 @ fb]:3:1
; CHECK: 1: 1
; CHECK: 3: 1
; CHECK: 6: 1 fa:1
; CHECK: !CFGChecksum: 563022570642068
; CHECK-UNWINDER: [0x842 @ 0x7d4 @ 0x7e0 @ 0x7ab]
; CHECK-UNWINDER-NEXT: 3
; CHECK-UNWINDER-NEXT: 7a0-7a7:1
; CHECK-UNWINDER-NEXT: 7a0-7ab:3
; CHECK-UNWINDER-NEXT: 7b2-7b5:1
; CHECK-UNWINDER-NEXT: 3
; CHECK-UNWINDER-NEXT: 7a7->7b2:1
; CHECK-UNWINDER-NEXT: 7ab->7a0:4
; CHECK-UNWINDER-NEXT: 7b5->7c0:1
; CHECK-UNWINDER-NEXT: [0x842 @ 0x7d4 @ 0x7e0 @ 0x7ab @ 0x7b5]
; CHECK-UNWINDER-NEXT: 1
; CHECK-UNWINDER-NEXT: 7c0-7d4:1
; CHECK-UNWINDER-NEXT: 1
; CHECK-UNWINDER-NEXT: 7d4->7c0:1
; CHECK-UNWINDER-NEXT: [0x842 @ 0x7d4 @ 0x7e0 @ 0x7ab @ 0x7b5 @ 0x7d4]
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7c0-7cd:1
; CHECK-UNWINDER-NEXT: 7db-7e0:1
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7cd->7db:1
; CHECK-UNWINDER-NEXT: 7e0->7a0:1
; CHECK-UNWINDER-NEXT: [0x842 @ 0x7d4 @ 0x7e0 @ 0x7ab @ 0x7b5 @ 0x7d4 @ 0x7e0]
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7a0-7a7:1
; CHECK-UNWINDER-NEXT: 7b2-7b5:1
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7a7->7b2:1
; CHECK-UNWINDER-NEXT: 7b5->7c0:1
; CHECK-UNWINDER-NEXT: [0x842 @ 0x7d4 @ 0x7e0 @ 0x7ab @ 0x7b5 @ 0x7d4 @ 0x7e0 @ 0x7b5]
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7c0-7cd:2
; CHECK-UNWINDER-NEXT: 7db-7e0:1
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7cd->7db:2
; CHECK-UNWINDER-NEXT: 7e0->7a0:1
; CHECK-UNWINDER-NEXT: [0x842 @ 0x7d4 @ 0x7e0 @ 0x7ab @ 0x7b5 @ 0x7d4 @ 0x7e0 @ 0x7b5 @ 0x7e0]
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7a0-7a7:1
; CHECK-UNWINDER-NEXT: 7b2-7b5:1
; CHECK-UNWINDER-NEXT: 2
; CHECK-UNWINDER-NEXT: 7a7->7b2:1
; CHECK-UNWINDER-NEXT: 7b5->7c0:1
; clang -O3 -fexperimental-new-pass-manager -fuse-ld=lld -fpseudo-probe-for-profiling
; -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Xclang -mdisable-tail-calls
; -g test.c -o a.out
#include <stdio.h>
int fb(int n) {
if(n > 10) return fb(n / 2);
return fa(n - 1);
}
int fa(int n) {
if(n < 2) return n;
if(n % 2) return fb(n - 1);
return fa(n - 1);
}
void foo() {
int s, i = 0;
while (i++ < 10000)
s += fa(i);
printf("sum is %d\n", s);
}
int main() {
foo();
return 0;
}