In this PR, static-data-splitter pass finds out the local-linkage global
variables in {`.rodata`, `.data.rel.ro`, `bss`, `.data`} sections by
analyzing machine instruction operands, and aggregates their accesses
from code across functions.
A follow-up item is to analyze global variable initializers and count
for access from data.
* This limitation is demonstrated by `bss2` and `data3` in
`llvm/test/CodeGen/X86/global-variable-partition.ll`.
Some stats of static-data-splitter with this patch:
**section**|**bss**|**rodata**|**data**
:-----:|:-----:|:-----:|:-----:
hot-prefixed section coverage|99.75%|97.71%|91.30%
unlikely-prefixed section size percentage|67.94%|39.37%|63.10%
1. The coverage is defined as `#perf-sample-in-hot-prefixed <data>
section / #perf-sample in <data.*> section` for each <data> section.
* The perf command samples
`MEM_INST_RETIRED.ALL_LOADS:u:pinned:precise=2` events at a high
frequency (`perf -c 2251`) for 30 seconds. The profiled binary is built
as non-PIE so `data.rel.ro` coverage data is not available.
2. The unlikely-prefixed `<data>` section size percentage is defined as
`unlikely <data> section size / the sum size of <data>.* sections` for
each `<data>` section
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.
This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
`RegisterClassInfo` was supposed to be kept alive between pass runs,
which wasn't being done leading to recomputations increasing the compile
time.
Now the Impl class is a member of the legacy and new passes so that it
is not reconstructed on every pass run.
---------
Co-authored-by: Christudasan Devadasan <christudasan.devadasan@amd.com>
https://discourse.llvm.org/t/rfc-profile-guided-static-data-partitioning/83744
proposes to partition static data sections.
This patch introduces a codegen pass. This patch produces jump table
hotness in the in-memory states (machine jump table info and entries).
Target-lowering and asm-printer consume the states and produce `.hot`
section suffix. The follow up PR
https://github.com/llvm/llvm-project/pull/122215 implements such
changes.
---------
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
The existing analysis was already a pimpl wrapper.
I have extracted legacy pass logic to a LDVImpl wrapper named
`LiveDebugVariables` which is the analysis::Result now. This controls
whether to activate the LDV (depending on `-live-debug-variables` and
DIsubprogram) itself.
The legacy and new analysis only construct the LiveDebugVariables.
VirtRegRewriter will test this.