9 Commits

Author SHA1 Message Date
Mingming Liu
51a3bc1217
[ThinLTO]Clean up 'import-assume-unique-local' flag. (#102424)
While manual compiles can specify full file paths, and build automation
tools use full, unique paths in practice, it's not clear whether
enforcing full paths (failing a build if relative paths are used) is
good general practice.

As long as full paths are indeed used, the `NumDefs == 1` condition [1]
should hold for many internal-linkage vtables. That preserves the
marginal performance gain from importing local-linkage vtables through
indirect references.
https://github.com/llvm/llvm-project/pull/100448#discussion_r1692068402
has more details.

[1]
https://github.com/llvm/llvm-project/pull/100448/files#diff-e7cb370fee46f0f773f2b5429dfab36b75126d3909ae98ee87ff3d0e3f75c6e9R215
2024-08-09 16:48:05 -07:00
Mingming Liu
ac1a1e5797
[ThinLTO][TypeProf] Import local-linkage global var for mod1:func_foo-> mod2:local-var edge (#100448)
VTable value profiling can create reference edges from `mod1:func_foo`
to `mod2:local-vtable`. Indirect call profiling can create reference
edges from `mod1:func_foo` to `mod2:local_func_bar`.

Given a ref chain `mod1:func_foo -> mod2:local-var`, `local-var` doesn't
get imported by default.

The compiler requires that the module of `local-var` be the same as the
module of the function that references it (`mod1:func_foo`). This
prevents miscompilation when both `mod1` and `mod2` have a `local-var` of
the same name and the cpp files are compiled without full paths.
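
A minimal sketch of why relative paths are risky, assuming the program-wide identifier of a local-linkage symbol is derived from the source file name as given to the compiler plus the symbol name. The helper `localIdentifier` and the `;` separator are illustrative assumptions, not LLVM's exact scheme.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the identifier used to tell local-linkage
// symbols from different modules apart. The ';' separator is an assumption.
std::string localIdentifier(const std::string &sourceFileName,
                            const std::string &symbolName) {
  assert(!symbolName.empty() && "a symbol needs a name");
  return sourceFileName + ";" + symbolName;
}
```

With this scheme, two modules that were each compiled as `impl.cpp` from their own directories produce the same identifier for `local_var`, while compiling with `/src/mod1/impl.cpp` and `/src/mod2/impl.cpp` keeps the two identifiers distinct.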

This patch allows the import when either of the following conditions
holds:
1) The new option `import-assume-unique-local` is set. Compiler users can
set it when they can guarantee that all files are compiled with full
paths.
2) There is exactly one instance of the value summary.
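
The decision described above can be sketched roughly as follows. This is not the code from the patch: the `LocalRefEdge` struct, its fields, and `shouldImportLocalVar` are hypothetical names chosen for illustration, and `assumeUniqueLocal` stands in for the new option.

```cpp
#include <cassert>
#include <string>

// Hypothetical view of a reference edge from a function to a
// local-linkage variable; all field names are illustrative.
struct LocalRefEdge {
  std::string referencingModule; // module of the function, e.g. "mod1"
  std::string definingModule;    // module of the variable, e.g. "mod2"
  unsigned numDefs;              // value summaries found for the variable
};

// Returns true when the local variable may be imported along this edge.
bool shouldImportLocalVar(const LocalRefEdge &edge, bool assumeUniqueLocal) {
  assert(edge.numDefs >= 1 && "an edge must resolve to at least one summary");
  // Pre-existing rule: the variable lives in the same module as the
  // function that references it.
  if (edge.referencingModule == edge.definingModule)
    return true;
  // Condition 1: the user guarantees that full paths are used.
  if (assumeUniqueLocal)
    return true;
  // Condition 2: only one summary exists, so there is nothing to confuse.
  return edge.numDefs == 1;
}
```

For example, an edge from `mod1` to a variable in `mod2` with two candidate summaries is rejected by default, and accepted once the option is set or once only one summary exists.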

Test:
* A/B testing this option alone gives a statistically consistent 0.16%
reduction in CPU cycles on one search workload (no throughput increase)
* Testing it together with the existing more-efficient ICP raises the
throughput gain by a small margin (0.05% to 0.1%)
* No regressions observed.
2024-07-24 18:23:14 -07:00
Mingming Liu
a634171896
[InstrPGO][TypeProf]Annotate vtable types when they are present in the profile (#99402)
Before this change, when `file.profdata` has vtable profiles but `--enable-vtable-value-profiling` is not on for the optimized build, warnings from this line [1] show up. They are benign for performance but confusing.

It's better to automatically annotate vtable profiles if `file.profdata` has them. This PR implements that in the profile-use pass.
* If `-icp-max-num-vtables` is zero (the default value is 6), vtable profiles won't be annotated.

[1] 464d321ee8/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1762-L1768)
2024-07-22 11:57:36 -07:00
Mingming Liu
8598bcb993
[compiler-rt][test]Use c-style headers in instrprof-vtable-value-prof.cpp (#97245)
Use C-style headers, just like other compiler-rt/profile tests do [1], to
fix the `'cstdio' file not found` error in
https://lab.llvm.org/buildbot/#/builders/122/builds/150

[1]
9b9405621b/compiler-rt/test/profile/instrprof-value-prof.c (L27-L30)
and
9b9405621b/compiler-rt/test/tsan/printf-1.c (L6-L16)
2024-06-30 19:20:10 -07:00
Mingming Liu
038bc1c18c
[Test][compiler-rt]Require lld for instrprof-vtable-value-prof.cpp (#97228)
Fix test failure on
https://lab.llvm.org/buildbot/#/builders/95/builds/672
* lld project is disabled on that build bot [1] and external lld
`/home/buildbots/llvm-external-buildbots/clang.16.0.1/bin/ld.lld` is
used to run the test [2]. It doesn't know new options like
`-enable-vtable-value-profiling` (which was introduced in
1351d17826)
* Update test to require `lld` and `lld-available`.

Tested:
1. By disabling lld in the project and using an old `lld` installed
previously [3], I can reproduce the failure.
2. With `requires: lld`, the test becomes unsupported.

[1]
https://lab.llvm.org/buildbot/#/builders/95/builds/672/steps/4/logs/stdio
[2]
https://lab.llvm.org/buildbot/#/builders/95/builds/672/steps/6/logs/FAIL__Profile-powerpc64le__instrprof-vtable-value-
[3] `cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug
-DLLVM_ENABLE_PROJECTS='clang;compiler-rt' -DLLVM_USE_SPLIT_DWARF=On
-DLLVM_USE_LINKER=lld -DLLVM_ENABLE_SPHINX=ON
-DLLVM_OPTIMIZED_TABLEGEN=TRUE -DLLVM_TARGETS_TO_BUILD=X86
-DLLVM_ENABLE_ZLIB=1 ../llvm`
2024-06-30 13:15:44 -07:00
Mingming Liu
1518b260ce
[TypeProf][InstrFDO]Implement more efficient comparison sequence for indirect-call-promotion with vtable profiles. (#81442)
Clang's `-fwhole-program-vtables` is required for this optimization to
take place. If `-fwhole-program-vtables` is not enabled, this change is
a no-op.
    
* Function-comparison (before):

```
%vtable = load ptr, ptr %obj
%vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
%func = load ptr, ptr %vfn
%cond = icmp eq ptr %func, @callee
br i1 %cond, label %bb1, label %bb2

bb1:
   call @callee

bb2:
   call %func
```

* VTable-comparison (after):

```
%vtable = load ptr, ptr %obj
%cond = icmp eq ptr %vtable, @vtable-address-point
br i1 %cond, label %bb1, label %bb2

bb1:
   call @callee

bb2:
  %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1
  %func = load ptr, ptr %vfn
  call %func
```
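
The same two sequences can be mimicked at source level. The sketch below assumes a hand-rolled `VTable` struct in place of a compiler-generated vtable; every name in it (`Object`, `hotCallee`, `hotVTable`, and so on) is hypothetical and only mirrors the shape of the IR above.

```cpp
#include <cassert>

struct Object;
using VirtualFn = int (*)(const Object *);

// Hand-rolled stand-in for a compiler-generated vtable (illustrative only).
struct VTable {
  VirtualFn slots[2];
};

struct Object {
  const VTable *vptr;
  int value;
};

int unusedSlot(const Object *) { return 0; }
int hotCallee(const Object *obj) { return obj->value + 1; }
int coldCallee(const Object *obj) { return obj->value - 1; }

const VTable hotVTable = {{unusedSlot, hotCallee}};
const VTable coldVTable = {{unusedSlot, coldCallee}};

// Function-comparison: both loads happen before the compare.
int callWithFunctionComparison(const Object *obj) {
  assert(obj && obj->vptr);
  const VTable *vtable = obj->vptr;  // load the vtable pointer
  VirtualFn func = vtable->slots[1]; // load the function pointer
  if (func == hotCallee)
    return hotCallee(obj); // promoted direct call
  return func(obj);        // indirect fallback
}

// VTable-comparison: the function-pointer load is sunk into the fallback.
int callWithVTableComparison(const Object *obj) {
  assert(obj && obj->vptr);
  const VTable *vtable = obj->vptr; // load the vtable pointer
  if (vtable == &hotVTable)
    return hotCallee(obj); // promoted direct call
  VirtualFn func = vtable->slots[1]; // load only on the fallback path
  return func(obj);
}
```

Both functions return the same result for any object. The second one reaches the direct call after a single load, which is the saving that the vtable-comparison sequence aims for.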
    
Key changes:
1. Find virtual calls and the vtables they come from.
- ICP relies on the type intrinsic `llvm.type.test` to find virtual calls
and their compatible vtables, and relies on type metadata to find the
address point for comparison.
2. The ICP pass does a cost-benefit analysis and compares vtables only
when the number of vtables for a function candidate is within an
option-specified threshold.
3. Sink the function addressing and vtable load instructions to the
indirect fallback.
- The sink helper functions are simplified versions of
`InstCombinerImpl::tryToSinkInstruction`. Currently debug intrinsics are
not handled. Ideally `InstCombinerImpl::tryToSinkInstructionDbgValues`
and `InstCombinerImpl::tryToSinkInstructionDbgVariableRecords` could be
moved into Transforms/Utils/Local.cpp (or another util cpp file) to
handle debug intrinsics when moving instructions across basic blocks.
4. Keep value profiles updated.
   1) Update vtable value profiles after inlining.
   2) For either function-based comparison or vtable-based comparison,
      update both vtable and indirect call value profiles.
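
The threshold check in key change 2 can be sketched as below. This is an assumption-laden illustration, not the pass's actual cost model: `PromotionCandidate` and `shouldCompareVTables` are hypothetical names, and the real analysis may weigh more than the vtable count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical promotion candidate: a callee plus the profiled vtables
// through which it was reached.
struct PromotionCandidate {
  std::string callee;
  std::vector<std::string> vtables;
};

// Chooses vtable comparison only when the candidate has at least one
// vtable and no more than the configured maximum; otherwise the pass
// would fall back to function comparison.
bool shouldCompareVTables(const PromotionCandidate &candidate,
                          std::size_t maxNumVTables) {
  assert(!candidate.callee.empty() && "a candidate needs a callee");
  const std::size_t count = candidate.vtables.size();
  return count > 0 && count <= maxNumVTables;
}
```

With a maximum of 2, a callee reached through two vtables qualifies for vtable comparison, while one reached through three does not. A maximum of zero disables vtable comparison entirely in this sketch.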
2024-06-29 23:21:33 -07:00
Mingming Liu
3f78d89a2e
[TypeProf][InstrFDO]Omit vtable symbols in indexed profiles by default (#96520)
- Indexed iFDO profiles contain compressed vtable names for `llvm-profdata show --show-vtables` debugging
   usage. An optimized build doesn't need them and doesn't decompress the blob now [1], since it has the
   source code and IR to find vtable symbols.
- The motivation is to avoid increasing profile size when it's not necessary.
- This doesn't change the indexed profile format and thereby doesn't need a version change.

[1] eac925fb81/llvm/include/llvm/ProfileData/InstrProfReader.h (L696-L699)
2024-06-26 11:38:20 -07:00
Mingming Liu
5bbc640f64
[nfc] Disable the a cpp compiler-rt test on ppc bigendian systems due to build errors (#87262)
`Linux/instrprof-vtable-value-prof.cpp` needs to be built for the test
to run. However, cpp compile & link failed with undefined-ABI error [1].
See original failure in
https://lab.llvm.org/buildbot/#/builders/18/builds/16429

[1] 
```
FAIL: Profile-powerpc64 :: Linux/instrprof-vtable-value-prof.cpp (2406 of 2414)
******************** TEST 'Profile-powerpc64 :: Linux/instrprof-vtable-value-prof.cpp' FAILED ********************
Exit Code: 1
Command Output (stderr):
--
RUN: at line 3: /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/./bin/clang  --driver-mode=g++  -m64  -ldl  -fprofile-generate -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/llvm-project/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp -o /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/runtimes/runtimes-bins/compiler-rt/test/profile/Profile-powerpc64/Linux/Output/instrprof-vtable-value-prof.cpp.tmp-test
+ /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/./bin/clang --driver-mode=g++ -m64 -ldl -fprofile-generate -fuse-ld=lld -O2 -g -fprofile-generate=. -mllvm -enable-vtable-value-profiling /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/llvm-project/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp -o /home/buildbots/llvm-external-buildbots/workers/ppc64be-sanitizer/sanitizer-ppc64be/build/build_debug/runtimes/runtimes-bins/compiler-rt/test/profile/Profile-powerpc64/Linux/Output/instrprof-vtable-value-prof.cpp.tmp-test
ld.lld: error: /lib/../lib64/Scrt1.o: ABI version 1 is not supported
clang: error: linker command failed with exit code 1 (use -v to see invocation)

```
2024-04-01 09:55:24 -07:00
Mingming Liu
1351d17826
[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825)
(The profile format change is split out into a standalone change: https://github.com/llvm/llvm-project/pull/81691)

* For InstrFDO value profiling, implement instrumentation and lowering for virtual table address.
* This is controlled by `-enable-vtable-value-profiling` and off by default.
* When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads.
 
* Implement profile reader and writer support 
  * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols.
  * Indexed profile reader is used by `llvm-profdata` and the compiler. When initialized, the reader stores a pointer to the beginning of the in-memory compressed vtable names and the length of the string. When used in `llvm-profdata`, the reader decompresses the string to show the symbols of a profiled site. When used in the compiler, string decompression doesn't happen, since IR is used to construct InstrProfSymtab.
  * Indexed profile writer collects the list of vtable names and stores it in indexed profiles.
  * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type.
* `llvm-profdata show -show-vtables <args> <profile>` is implemented.
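
The deferred handling of vtable names in the indexed reader can be sketched as follows. The class `LazyVTableNames` is hypothetical, and splitting on newlines is a stand-in for the real decompression step, which this sketch does not attempt to reproduce.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical reader-side holder: keeps only a pointer and a length
// until a tool actually asks for the names.
class LazyVTableNames {
public:
  LazyVTableNames(const char *blob, std::size_t length)
      : blob_(blob), length_(length) {
    assert(blob != nullptr || length == 0);
  }

  // Decodes on first use. A tool that prints symbols would call this;
  // a compiler that builds its symbol table from IR never would.
  const std::vector<std::string> &names() {
    if (!decoded_) {
      std::string current;
      for (std::size_t i = 0; i < length_; ++i) {
        if (blob_[i] == '\n') { // stand-in for real decompression
          names_.push_back(current);
          current.clear();
        } else {
          current.push_back(blob_[i]);
        }
      }
      if (!current.empty())
        names_.push_back(current);
      decoded_ = true;
    }
    return names_;
  }

  bool decoded() const { return decoded_; }

private:
  const char *blob_;
  std::size_t length_;
  bool decoded_ = false;
  std::vector<std::string> names_;
};
```

Constructing the holder costs nothing beyond storing two fields, so a consumer that never calls `names()` pays no decoding cost.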

RFC:
https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7
2024-04-01 08:52:35 -07:00