6 Commits

Author SHA1 Message Date
Kyungwoo Lee
d23c5c2d65
[CGData] Global Merge Functions (#112671)
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stable hash to identify similar functions while ignoring
certain constant operands. These ignored constants are tracked and
encoded into a stable function summary. When merging, instead of
explicitly folding similar functions and their call sites, we form a
merging instance by supplying different parameters via thunks. The
actual size reduction occurs when identically created merging instances
are folded by the linker.

Currently, this pass is wired to a pre-codegen pass, enabled by the
`-enable-global-merge-func` flag.
In a local merging mode, the analysis and merging steps occur
sequentially within a module:
- `analyze`: Collects stable function hashes and tracks locations of
ignored constant operands.
- `finalize`: Identifies merge candidates with matching hashes and
computes the set of parameters that point to different constants.
- `merge`: Uses the stable function map to optimistically create a
merged function.
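
The three local steps can be illustrated with a toy model. This is only a sketch under stated assumptions, not the pass itself: `Inst`, `stableHash`, `differingParams`, and `thunkArgs` are hypothetical names, an instruction is reduced to an opcode plus at most one constant operand, and FNV-1a stands in for whatever hash the pass actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical toy instruction: an opcode and at most one constant operand.
struct Inst {
  std::string Opcode;
  std::string Constant; // empty when there is no constant operand
};
using Function = std::vector<Inst>;

// "analyze": hash the opcodes only, so functions that differ solely in
// their constant operands get the same hash. FNV-1a is an assumption.
std::uint64_t stableHash(const Function &F) {
  std::uint64_t H = 1469598103934665603ULL;
  for (const Inst &I : F) {
    for (unsigned char C : I.Opcode) {
      H ^= C;
      H *= 1099511628211ULL;
    }
    H ^= 0xff; // separator between instructions
    H *= 1099511628211ULL;
  }
  return H;
}

// "finalize": for candidates with equal hashes, return the instruction
// indices whose constants differ. Each index becomes one parameter.
std::vector<std::size_t>
differingParams(const std::vector<Function> &Candidates) {
  std::vector<std::size_t> Params;
  if (Candidates.size() < 2)
    return Params;
  const Function &First = Candidates.front();
  for (std::size_t Idx = 0; Idx < First.size(); ++Idx) {
    for (const Function &F : Candidates) {
      assert(F.size() == First.size() && "candidates must match structurally");
      if (F[Idx].Constant != First[Idx].Constant) {
        Params.push_back(Idx);
        break;
      }
    }
  }
  return Params;
}

// "merge": the arguments a thunk for F would pass to the merged function.
std::vector<std::string> thunkArgs(const Function &F,
                                   const std::vector<std::size_t> &Params) {
  std::vector<std::string> Args;
  for (std::size_t Idx : Params)
    Args.push_back(F[Idx].Constant);
  return Args;
}
```

In this model, two functions that load different globals but are otherwise identical share one hash, and each thunk passes its own global as the single differing parameter.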

We can enable a global merging mode similar to the global function outliner
(https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/),
which will perform the above steps separately.
- `-codegen-data-generate`: During the first round of code generation,
we analyze local merging instances and publish their summaries.
- Offline using `llvm-cgdata` or at link-time, we combine all these
merging summaries and finalize them to determine the parameters.
- `-codegen-data-use`: During the second round of code generation, we
optimistically create merging instances within each module, and finally,
the linker folds identically created merging instances.
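
To illustrate why the global mode separates the steps, the sketch below combines per-module summaries before finalizing, so that functions living in different modules can become merge candidates. The `Summary` layout and the `combine` and `finalize` helpers are hypothetical and chosen for illustration only; they do not reflect the actual cgdata encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical summary: stable hash -> one list of ignored constants per
// function with that hash.
using Constants = std::vector<std::string>;
using Summary = std::map<std::uint64_t, std::vector<Constants>>;

// Combine step: concatenate the entries published by every module.
Summary combine(const std::vector<Summary> &Modules) {
  Summary Combined;
  for (const Summary &M : Modules) {
    for (const auto &Entry : M) {
      std::vector<Constants> &Dst = Combined[Entry.first];
      Dst.insert(Dst.end(), Entry.second.begin(), Entry.second.end());
    }
  }
  return Combined;
}

// Finalize step: for each hash shared by at least two functions, record the
// constant positions that differ. These positions become the parameters.
// Assumes all constant lists under one hash have the same length; at()
// throws if that assumption is violated.
std::map<std::uint64_t, std::vector<std::size_t>>
finalize(const Summary &Combined) {
  std::map<std::uint64_t, std::vector<std::size_t>> Params;
  for (const auto &Entry : Combined) {
    const std::vector<Constants> &Funcs = Entry.second;
    if (Funcs.size() < 2)
      continue; // nothing to merge with
    std::vector<std::size_t> Diff;
    for (std::size_t Pos = 0; Pos < Funcs.front().size(); ++Pos) {
      for (const Constants &C : Funcs) {
        if (C.at(Pos) != Funcs.front().at(Pos)) {
          Diff.push_back(Pos);
          break;
        }
      }
    }
    Params[Entry.first] = Diff;
  }
  return Params;
}
```

A hash that appears once in each of two modules has no local candidate in either module, but it gets a parameter set once the summaries are combined.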

Depends on #112664
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-13 17:34:07 -08:00
Kyungwoo Lee
ffcf3c8688
[CGData][llvm-cgdata] Support for stable function map (#112664)
This introduces a new cgdata format for stable function maps. The raw
data is embedded in the __llvm_merge section at compile time. The
llvm-cgdata tool can read this data and merge it into an indexed
cgdata file. Consequently, the tool is now capable of handling either
outlined hash trees, stable function maps, or both, as they are
orthogonal.
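
Because the two kinds of data are orthogonal, one way to record which of them a file contains is a bit mask. The following is a minimal sketch of that idea; the enumerator names, their values, and the helper functions are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>
#include <string>

// Hypothetical bit flags for the kinds of data an indexed file may hold.
enum DataKind : std::uint32_t {
  OutlinedHashTree = 1u << 0,
  StableFunctionMap = 1u << 1,
};

// True when the mask contains the given kind.
bool hasKind(std::uint32_t Mask, DataKind K) { return (Mask & K) != 0; }

// Human-readable list of the kinds present in the mask.
std::string describe(std::uint32_t Mask) {
  std::string Out;
  if (hasKind(Mask, OutlinedHashTree))
    Out += "outlined hash tree";
  if (hasKind(Mask, StableFunctionMap)) {
    if (!Out.empty())
      Out += ", ";
    Out += "stable function map";
  }
  if (Out.empty())
    return "none";
  return Out;
}
```

With independent bits, a reader can test for each kind separately and skip the sections it does not need.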

Depends on #112662.
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-04 17:32:50 -08:00
Jie Fu
723a9b87e2 [llvm-cgdata] Fix -Wcovered-switch-default (NFC)
/llvm-project/llvm/tools/llvm-cgdata/llvm-cgdata.cpp:349:3:
error: default label in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]
  default:
  ^
1 error generated.
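
For context, Clang reports `-Wcovered-switch-default` when a `switch` over an enum lists every enumerator and still carries a `default:` label. A common way to address it is to drop the `default:` label and handle unexpected values after the `switch`. The sketch below shows that pattern; the enum and function are hypothetical and are not the code from llvm-cgdata.cpp.

```cpp
#include <string>

// Hypothetical enum; every enumerator is handled in the switch below.
enum class Action { Convert, Merge, Show };

std::string actionName(Action A) {
  switch (A) {
  case Action::Convert:
    return "convert";
  case Action::Merge:
    return "merge";
  case Action::Show:
    return "show";
    // No `default:` label here. All enumerators are covered, so adding one
    // would trigger -Wcovered-switch-default, and leaving it out lets
    // -Wswitch flag any enumerator added later.
  }
  return "unknown"; // reached only for out-of-range values
}
```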
2024-08-20 22:53:28 +08:00
Kyungwoo Lee
9bb555688c
Reland [CGData] llvm-cgdata #89884 (#101461)
Reland [CGData] llvm-cgdata #89884 using `Opt` instead of `cl`
- Action options are required: `--convert`, `--show`, `--merge`. These
are similar to the previously implemented sub-commands, but with a `--`
prefix.
- `--format` option is added, which specifies `text` or `binary`.
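
The option scheme can be sketched with a small hand-written parser. This is a toy illustration and not the `Opt`-based implementation: `Options` and `parseArgs` are hypothetical, and the sketch assumes that exactly one action must be given and that the format is spelled `--format=<value>`.

```cpp
#include <string>
#include <vector>

// Hypothetical parsed result.
struct Options {
  std::string Action; // "convert", "show", or "merge"
  std::string Format; // "text" or "binary"; empty when not given
};

// Returns true and fills Out when exactly one action option is present and
// any --format value is valid. Other arguments are ignored in this sketch.
bool parseArgs(const std::vector<std::string> &Args, Options &Out) {
  Options Parsed;
  const std::string FormatPrefix = "--format=";
  for (const std::string &Arg : Args) {
    if (Arg == "--convert" || Arg == "--show" || Arg == "--merge") {
      if (!Parsed.Action.empty())
        return false; // more than one action
      Parsed.Action = Arg.substr(2);
    } else if (Arg.compare(0, FormatPrefix.size(), FormatPrefix) == 0) {
      Parsed.Format = Arg.substr(FormatPrefix.size());
      if (Parsed.Format != "text" && Parsed.Format != "binary")
        return false; // unknown format
    }
  }
  if (Parsed.Action.empty())
    return false; // an action option is required
  Out = Parsed;
  return true;
}
```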

---------

Co-authored-by: Kyungwoo Lee <kyulee@fb.com>
2024-08-20 07:26:50 -07:00
Gulfem Savrun Yeniceri
73d78973fe Revert "[CGData] llvm-cgdata (#89884)"
This reverts commit d3fb41dddc11b0ebc338a3b9e6a5ab7288ff7d1d
and forward fix patches because of the issue explained in:
https://github.com/llvm/llvm-project/pull/89884#issuecomment-2244348117.

Revert "Fix tests for https://github.com/llvm/llvm-project/pull/89884 (#100061)"

This reverts commit 67937a3f969aaf97a745a45281a0d22273bff713.

Revert "Fix build break for https://github.com/llvm/llvm-project/pull/89884 (#100050)"

This reverts commit c33878c5787c128234d533ad19d672dc3eea19a8.

Revert "[CGData] Fix -Wpessimizing-move in CodeGenDataReader.cpp (NFC)"

This reverts commit 1f8b2b146141f3563085a1acb77deb50857a636d.
2024-07-23 11:40:20 +00:00
Kyungwoo Lee
d3fb41dddc
[CGData] llvm-cgdata (#89884)
The llvm-cgdata tool has been introduced to handle reading and writing
of codegen data. This data includes an optimistic codegen summary that
can be utilized to enhance subsequent codegen. Currently, the tool
supports saving and restoring the outlined hash tree, facilitating
machine function outlining across modules. Additional codegen summaries
can be incorporated into separate sections as required. This patch
primarily establishes basic support for the reader and writer, similar
to llvm-profdata.

The high-level operations of llvm-cgdata are as follows:
1. It reads local raw codegen data from a custom section (for example,
__llvm_outline) embedded in native binary files.
2. It merges local raw codegen data into indexed codegen data,
complete with a suitable header.
3. It handles reading and writing of the indexed codegen data into a
standalone file.
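
The merge and read/write steps can be sketched with a toy data model. Everything below is an assumption made for illustration: a raw record is modeled as a sequence of hashes, the indexed form as a count per sequence behind a one-field header, and the text layout is invented. None of it reflects the real cgdata format.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical raw record: a sequence of stable hashes from one object file.
using HashSequence = std::vector<std::uint64_t>;

// Hypothetical indexed form: a header field plus a count per sequence.
struct IndexedData {
  std::uint32_t Version = 1;
  std::map<HashSequence, std::uint64_t> Counts;
};

// Step 2: merge the raw records of one object file into the indexed data.
void mergeRaw(IndexedData &Indexed, const std::vector<HashSequence> &Raw) {
  for (const HashSequence &Seq : Raw)
    ++Indexed.Counts[Seq];
}

// Step 3 (write): a header line, then one "count hash hash ..." line each.
std::string writeText(const IndexedData &D) {
  std::ostringstream OS;
  OS << "version " << D.Version << "\n";
  for (const auto &Entry : D.Counts) {
    OS << Entry.second;
    for (std::uint64_t H : Entry.first)
      OS << ' ' << H;
    OS << "\n";
  }
  return OS.str();
}

// Step 3 (read): parse the layout produced by writeText.
IndexedData readText(const std::string &Text) {
  IndexedData D;
  std::istringstream IS(Text);
  std::string Line, Word;
  std::getline(IS, Line);
  std::istringstream Header(Line);
  Header >> Word >> D.Version;
  while (std::getline(IS, Line)) {
    std::istringstream LS(Line);
    std::uint64_t Count = 0, H = 0;
    if (!(LS >> Count))
      continue; // skip blank or malformed lines
    HashSequence Seq;
    while (LS >> H)
      Seq.push_back(H);
    D.Counts[Seq] = Count;
  }
  return D;
}
```

Merging two object files that both contain the same sequence yields a count of two for it, and a write followed by a read reproduces the same indexed data.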

This depends on https://github.com/llvm/llvm-project/pull/89792.
This is a patch for
https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.

---------

Co-authored-by: Kyungwoo Lee <kyulee@fb.com>
2024-07-23 10:25:51 +09:00