333 Commits

Author SHA1 Message Date
Kazu Hirata
0ef8ef66cc
[Transforms] Remove unused includes (NFC) (#141357)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-24 09:37:43 -07:00
bd1976bris
6520b21ce0
[DTLTO][LLVM] Integrated Distributed ThinLTO (DTLTO) (#127749)
This patch adds initial support for Integrated Distributed ThinLTO
(DTLTO) in LLVM, which manages distribution internally during the
traditional link step. This enables compatibility with any build
system that supports in-process ThinLTO. In contrast, existing
approaches to distributed ThinLTO, which split the thin-link
(--thinlto-index-only), backend compilation, and final link into
separate steps, require build system support, e.g. Bazel.

This patch implements the core DTLTO mechanism, which enables
delegation of ThinLTO backend jobs to an external process (the
distributor). The distributor can then manage job distribution through
systems like Incredibuild. A generic JSON interface is used to
communicate with the distributor, allowing for the creation of new
distributors (and thus integration with different distribution
systems) without modifying LLVM.

Please see llvm/docs/dtlto.rst for more details.

RFC: https://discourse.llvm.org/t/rfc-integrated-distributed-thinlto/69641
Design Review: https://github.com/llvm/llvm-project/pull/126654
2025-05-23 20:07:53 +01:00
Kazu Hirata
6ab7cb7899
[Transforms] Remove unused local variables (NFC) (#138442) 2025-05-04 00:35:22 -07:00
Owen Rodley
d3d856ad84
Clean up external users of GlobalValue::getGUID(StringRef) (#129644)
See https://discourse.llvm.org/t/rfc-keep-globalvalue-guids-stable/84801
for context.

This is a non-functional change which just changes the interface of
GlobalValue, in preparation for future functional changes. This part
touches a fair few users, so is split out for ease of review. Future
changes to the GlobalValue implementation can then be focused purely on
that class.

This does the following:

* Rename GlobalValue::getGUID(StringRef) to
  getGUIDAssumingExternalLinkage. This is simply making explicit at the
  callsite what is currently implicit.
* Where possible, migrate users to directly calling getGUID on a
  GlobalValue instance.
* Otherwise, where possible, have them call the newly renamed
  getGUIDAssumingExternalLinkage, to make the assumption explicit.


There are a few cases where neither of the above are possible, as the
caller saves and reconstructs the necessary information to compute the
GUID themselves. We want to migrate these callers eventually, but for
this first step we leave them be.
2025-04-28 11:09:43 +10:00
Shilei Tian
b0fede358f
[ThinLTO] Don't convert functions to declarations if force-import-all is enabled (#134541)
On one hand, we intend to force import all functions when the option is
enabled.
On the other hand, we currently drop definitions of some functions and
convert
them to declarations, which contradicts this intent.

With this PR, functions will no longer be converted to declarations when
`force-import-all` is enabled.
2025-04-12 18:22:22 -04:00
Mircea Trofin
4532512f6c
[ctxprof] Move MoveSymbolGUID to address dependency issues (#134334)
See PR #134192
2025-04-03 19:02:46 -07:00
Mircea Trofin
2146826169
[ctxprof] Support for "move" semantics for the contextual root (#134192)
This PR finishes what PR #133992 started.
2025-04-03 18:36:45 -07:00
Mircea Trofin
61768b3528
[ctxprof] Don't import roots elsewhere (#134012)
Block a context root from being imported by its callers. 

Suppose that happened. Its caller - usually a message pump - inlines its copy of the root. Then it (the root) and whatever it calls will be the non-contextually optimized callee versions.
2025-04-03 13:21:39 -07:00
Mircea Trofin
d59b2c4def
[ctxprof][nfc] Make computeImportForFunction a member of ModuleImportsManager (#134011) 2025-04-02 18:18:17 -07:00
Mircea Trofin
02467f9e21
[ctxprof] Option to move a whole tree to its own module (#133992)
Modules may contain a mix of functions that participate or don't participate in callgraphs covered by a contextual profile. We currently have been importing all the functions under a context root in the module defining that root, but if the other functions there are covered by flat profiles, the result is difficult to reason about.

This patch allows moving everything under a context root (and that root) in its own module. For now, we expect a module with a filename matching the GUID of the function be present in the set of modules known by the linker. This mechanism can be improved in a later patch.

Subsequent patches will handle implementing "move" instead of "import" semantics for the root function (because we want to make sure only one version of the root exists - so the optimizations we perform are actually the ones being observed at runtime).
2025-04-02 18:15:48 -07:00
Kazu Hirata
73dc2afd2c
[Transforms] Use *Set::insert_range (NFC) (#132652)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(E);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
2025-03-23 19:42:53 -07:00
Kazu Hirata
0dcc201ac4
[Transforms] Use *Set::insert_range (NFC) (#132056)
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range.  This patch replaces:

  Dest.insert(Src.begin(), Src.end());

with:

  Dest.insert_range(Src);

This patch does not touch custom begin like succ_begin for now.
2025-03-19 15:35:01 -07:00
Mircea Trofin
2068a18c86
[ctxprof][nfc] Prepare CtxProfAnalysis for flat profiles (#129623)
Mostly remove the equivalence "no contexts == no CtxProfAnalysis result", and instead check explicitly there are no contextual profiles.
2025-03-04 16:42:47 -08:00
Mircea Trofin
4312075efa
[nfc][thinlto] remove unnecessary return from renameModuleForThinLTO (#121851)
Same goes for `FunctionImportGlobalProcessing::run`.

The return value was used, but it was always `false`.
2025-01-06 15:19:09 -08:00
Mingming Liu
6faf17b762
[ThinLTO]Supports declaration import for global variables in distributed ThinLTO (#117616)
When `-import-declaration` option is enabled, declaration import is
supported for functions. https://github.com/llvm/llvm-project/pull/88024
has the context for this option.

This patch supports declaration import for global variables in
distributed ThinLTO. The motivating use case is to propagate `dso_local`
attribute of global variables across modules, to optimize global
variable access when a binary is built with
`-fno-direct-access-external-data`.
* With `-fdirect-access-external-data`, non thread-local global
variables will [have `dso_local`
attributes](fe3c23b439/clang/lib/CodeGen/CodeGenModule.cpp (L1730-L1746)).
This optimizes the global variable access as shown by
https://gcc.godbolt.org/z/vMzWcKdh3
2024-12-02 16:15:52 -08:00
Krzysztof Pszeniczny
991154d0fb
[LTO] Use .at instead of .lookup to avoid copies. (NFC) (#117888)
`DenseMap::lookup` returns by value (because it default-creates the
returned value if the key isn't present in the map), which means that we
do a lot of copying here. Since we assert that something is present in
the returned value two lines below this call, it's safe to use `.at`
here instead.

Copying and then destroying dense maps here is responsible for 60% of
the time spent in LTO indexing in a large internal build.
2024-11-27 18:41:29 +01:00
Nuri Amari
2edd897a42
Make WriteIndexesThinBackend multi threaded (#109847)
We've noticed that for large builds executing thin-link can take on the
order of 10s of minutes. We are only using a single thread to write the
sharded indices and import files for each input bitcode file. While we
need to ensure the index file produced lists modules in a deterministic
order, that doesn't prevent us from executing the rest of the work in
parallel.

In this change we use a thread pool to execute as much of the backend's
work as possible in parallel. In local testing on a machine with 80
cores, this change makes a thin-link for ~100,000 input files run in ~2
minutes. Without this change it takes upwards of 10 minutes.

---------

Co-authored-by: Nuri Amari <nuriamari@fb.com>
2024-10-07 08:16:46 -07:00
Mircea Trofin
885ac29910 [nfc][ctx_prof] Change some internal "set" types
- the set used for targets under a callsite is simpler to use if iterators
  are stable (it gets manipulated during updates)
- the set used to fetch the transitive closure of GUIDs under a node can
  be left as a choice to the user.
2024-09-12 10:34:53 -07:00
Kazu Hirata
3dad29b677
[LTO] Remove unused includes (NFC) (#108110)
clangd reports these as unused headers.  My manual inspection agrees
with the findings.
2024-09-10 19:36:04 -07:00
Kazu Hirata
5c0d61e318
[LTO] Reduce memory usage for import lists (#106772)
This patch reduces the memory usage for import lists by employing
memory-efficient data structures.

With this patch, an import list for a given destination module is
basically DenseSet<uint32_t> with each element indexing into the
deduplication table containing tuples of:

  {SourceModule, GUID, Definition/Declaration}

In one of our large applications, the peak memory usage goes down by
9.2% from 6.120GB to 5.555GB during the LTO indexing step.

This patch addresses several sources of space inefficiency associated
with std::unordered_map:

- std::unordered_map<GUID, ImportKind> takes up 16 bytes because of
  padding even though ImportKind only carries one bit of information.

- std::unordered_map uses pointers to elements, both in the hash table
  proper and for collision chains.

- We allocate an instance of std::unordered_map for each
  {Destination Module, Source Module} pair for which we have at least
  one import.  Most import lists have less than 10 imports, so the
  metadata like the size of std::unordered_map and the pointer to the
  hash table costs a lot relative to the actual contents.
2024-09-01 08:36:06 -07:00
Kazu Hirata
eb9c49c900
[LTO] Make getImportType a proper function (NFC) (#106450)
I'm planning to reduce the memory footprint of ThinLTO indexing by
changing ImportMapTy.  A look-up of the import type will involve data
private to ImportMapTy, so it must be done by a member function of
ImportMapTy.  This patch turns getImportType into a member function so
that a subsequent "real" change will just have to update the
implementation of the function in place.
2024-08-28 13:53:07 -07:00
Kazu Hirata
4f15039cf2
[LTO] Introduce new type alias ImportListsTy (NFC) (#106420)
The background is as follows.  I'm planning to reduce the memory
footprint of ThinLTO indexing by changing ImportMapTy, the data
structure used for an import list.  Once this patch lands, I'm
planning to change the type slightly.  The new type alias allows us to
update the type without touching many places.
2024-08-28 10:42:12 -07:00
Kazu Hirata
29bb523b7c
[LTO] Introduce a helper lambda in gatherImportedSummariesForModule (NFC) (#106251)
This patch forward ports the heterogeneous std::map::operator[]() from
C++26 so that we can look up the map without allocating an instance of
std::string when the key-value pair exists in the map.

The background is as follows.  I'm planning to reduce the memory
footprint of ThinLTO indexing by changing ImportMapTy, the data
structure used for an import list.  The new list will be a hash set of
tuples (SourceModule, GUID, ImportType) represented in a space
efficient manner.  That means that as we iterate over the hash set, we
encounter SourceModule as many times as GUID.  We don't want to create
a temporary instance of std::string every time we look up
ModuleToSummariesForIndex like:

auto &SummariesForIndex =
ModuleToSummariesForIndex[std::string(ILI.first)];

This patch removes the need to create the temporaries by enabling the
hetegeneous lookup with std::set<K, V, std::less<>> and forward
porting std::map::operator[]() from C++26.
2024-08-27 12:43:07 -07:00
Kazu Hirata
0359b9a230
[LTO] Introduce a helper function collectImportStatistics (NFC) (#106179)
This patch introduces a helper function collectImportStatistics.  The
new function computes statistics of imports for
ComputeCrossModuleImport and dumpImportListForModule with no
functional change.

The background is as follows.  I'm planning to reduce the memory
footprint of ThinLTO indexing by changing ImportMapTy, the data
structure used for an import list.  The new list will be a hash set of
tuples (SourceModule, GUID, ImportType) represented in a space
efficient manner.  That means that obtaining statistics like the
number of definitions per source module requires us to go through the
entire import list (for a given destination module).

Introducing a helper function now makes the callers more independent
of the underlying data structures used in ImportMapT.
2024-08-27 08:39:49 -07:00
Kazu Hirata
4e30cf7b2a
[LTO] Introduce getSourceModules (NFC) (#105955)
This patch introduces getSourceModules to compute the list of source
modules in the ascending alphabetical order.  The new function is
intended to hide implementation details of ImportMapTy while
simplifying FunctionImporter::importFunctions a little bit.
2024-08-26 11:02:05 -07:00
Kazu Hirata
dbd7ce0ccd
[IR] Inroduce ModuleToSummariesForIndexTy (NFC) (#105906)
This patch introduces type alias ModuleToSummariesForIndexTy.

I'm planning to change the type slightly to allow heterogeneous lookup
(that is, std::map<K, V, std::less<>>) in a subsequent patch.  The
problem is that changing the type affects many places.  Using a type
alias reduces the impact.
2024-08-23 17:32:52 -07:00
Kazu Hirata
3563907969
[LTO] Turn ImportMapTy into a proper class (NFC) (#105748)
This patch turns type alias ImportMapTy into a proper class to provide
a more intuitive interface like:

  ImportList.addDefinition(...)

as opposed to:

  FunctionImporter::addDefinition(ImportList, ...)

Also, this patch requires all non-const accesses to go through
addDefinition, maybeAddDeclaration, and addGUID while providing const
accesses via:

  const ImportMapTyImpl &getImportMap() const { return ImportMap; }

I realize ImportMapTy may not be the best name as a class (maybe OK as
a type alias).  I am not renaming ImportMapTy in this patch at least
because there are 47 mentions of ImportMapTy under llvm/.
2024-08-22 21:56:01 -07:00
Kazu Hirata
ca48b015a1
[LTO] Use a helper function to add a definition (NFC) (#105721)
I missed this one when I introduced helper functions in:

  commit 3082a381f57ef2885c270f41f2955e08c79634c5
  Author: Kazu Hirata <kazu@google.com>
  Date:   Thu Aug 22 12:06:47 2024 -0700
2024-08-22 16:01:36 -07:00
Kazu Hirata
3082a381f5
[LTO] Introduce helper functions to add GUIDs to ImportList (NFC) (#105555)
The new helper functions make the intent clearer while hiding
implementation details, including how we handle previously added
entries.  Note that:

- If we are adding a GUID as a GlobalValueSummary::Definition, then we
  override a previously added GlobalValueSummary::Declaration entry
  for the same GUID.

- If we are adding a GUID as a GlobalValueSummary::Declaration, then a
  previously added GlobalValueSummary::Definition entry for the same
  GUID takes precedence, and no change is made.
2024-08-22 12:06:47 -07:00
Kazu Hirata
fdbc4089e7
[LTO] Compare std::optional<ImportKind> directly with ImportKind (NFC) (#105561)
Note that:

  Opt == Val if and only (Opt && *Opt == Val)

where:

  std::optional<T> Opt;
  T Val;
2024-08-21 16:53:18 -07:00
Mircea Trofin
6807ca8e93
[nfc][ctx_prof] Use one flag for the "use" scenario (#103377)
No need to have two flags, one for the thinlink and one for compilation.
2024-08-13 11:00:51 -07:00
Mingming Liu
51a3bc1217
[ThinLTO]Clean up 'import-assume-unique-local' flag. (#102424)
While manual compiles can specify full file paths and build automation
tools use full, unique paths in practice, it's not clear whether it's a
general good practice to enforce full paths (fail a build if relative
paths are used).

`NumDefs == 1` condition [1] should hold true for many internal-linkage
vtables as long as full paths are indeed used to salvage the marginal
performance when local-linkage vtables are imported due to indirect
reference.
https://github.com/llvm/llvm-project/pull/100448#discussion_r1692068402
has more details.

[1]
https://github.com/llvm/llvm-project/pull/100448/files#diff-e7cb370fee46f0f773f2b5429dfab36b75126d3909ae98ee87ff3d0e3f75c6e9R215
2024-08-09 16:48:05 -07:00
Mircea Trofin
c99bd3ceff
[ctx_prof] Extend WorkloadImportsManager to use the contextual profile (#98682)
Keeping the json-based input as it's useful for diagnostics or for driving the import by other means than contextual composition.

The support for the contextual profile is just another modality for constructing the import list (`WorkloadImportsManager::Workloads`).

Everything else - i.e. the actual importing logic - is already independent from how that list was obtained.
2024-07-29 18:06:00 -04:00
Mingming Liu
ba8883c46e
Fix buildbot failure by fixing the base pointer type (#100508)
This should fix buildbot failures like
https://lab.llvm.org/buildbot/#/builders/169/builds/1448
2024-07-24 21:27:46 -07:00
Mingming Liu
ac1a1e5797
[ThinLTO][TypeProf] Import local-linkage global var for mod1:func_foo-> mod2:local-var edge (#100448)
VTable value profiling can create reference edges from `mod1:func_foo`
to `mod2:local-vtable`. Indirect call profiling can create reference
edges from `mod1:func_foo` to `mod2:local_func_bar`.

Given a ref chain `mod1:func_foo -> mod2:local-var`,`local-var` doesn't
get imported by default.

Compiler checks / requires the module of 'local-var' is the same as the
function that referenced it(`mod1:func_foo`). This is to prevent
mis-compilation when both `mod1` and `mod2` has `local-var` of the same
name, and cpp files are compiled without full path.

This patch allows the import when one of the following conditions
happen:
1) Introduce an option `import-assume-local-unique`. When the compiler
user can guarantee that all files are compiled with full paths, they can
set this option.
2) When there is one instance of value summary.

Test:
* A/B testing this option alone gives -0.16% statistically consistent
cpu cycle reduction on one search workload (no throughput increase)
* Testing it together with existing more-efficient ICP bumps the
throughput increase by a margin (0.05%~0.1%)
* No regressions observed.
2024-07-24 18:23:14 -07:00
Mingming Liu
50fea9943f
Reland "[ThinLTO][Bitcode] Generate import type in bitcode" (#97253)
https://github.com/llvm/llvm-project/pull/87600 was reverted in order to
revert
6262763341.
Now https://github.com/llvm/llvm-project/pull/95482 is fix forward for
6262763341.
This patch is a reland for
https://github.com/llvm/llvm-project/pull/87600

**Changes on top of original patch**
In `llvm/include/llvm/IR/ModuleSummaryIndex.h`, make the type of
`GVSummaryPtrSet` an `unordered_set` which is more memory efficient when
the number of elements is smaller than 128 [1]

**Original commit message**

For distributed ThinLTO, the LTO indexing step generates combined
summary for each module, and postlink pipeline reads the combined
summary which stores the information for link-time optimization.

This patch populates the 'import type' of a summary in bitcode, and
updates bitcode reader to parse the bit correctly.

[1]
393eff4e02/llvm/lib/Support/SmallPtrSet.cpp (L43)
2024-07-08 22:20:33 -07:00
Mingming Liu
af784a5c13
[ThinLTO] Use a set rather than a map to track exported ValueInfos. (#97360)
https://github.com/llvm/llvm-project/pull/95482 is a reland of
https://github.com/llvm/llvm-project/pull/88024.
https://github.com/llvm/llvm-project/pull/95482 keeps indexing memory
usage reasonable by using unordered_map and doesn't make other changes
to originally reviewed code.

While discussing possible ways to minimize indexing memory usage, Teresa
asked whether I need `ExportSetTy` as a map or a set is sufficient. This
PR implements the idea. It uses a set rather than a map to track exposed
ValueInfos.

Currently, `ExportLists` has two use cases, and neither needs to track a
ValueInfo's import/export status. So using a set is sufficient and
correct.
1) In both in-process and distributed ThinLTO, it's used to decide if a
function or global variable is visible [1] from another module after importing
creates additional cross-module references.
     * If a cross-module call edge is seen today, the callee must be visible
       to another module without keeping track of its export status already.
       For instance, this [2] is how callees of direct calls get exported.
2) For in-process ThinLTO [3], it's used to compute lto cache key.
     * The cache key computation already hashes [4] 'ImportList' , and 'ExportList' is
        determined by 'ImportList'. So it's fine to not track 'import type' for export list.

[1] 66cd8ec4c0/llvm/lib/LTO/LTO.cpp (L1815-L1819)
[2] 66cd8ec4c0/llvm/lib/LTO/LTO.cpp (L1783-L1794)
[3] 66cd8ec4c0/llvm/lib/LTO/LTO.cpp (L1494-L1496)
[4] b76100e220/llvm/lib/LTO/LTO.cpp (L194-L222)
2024-07-03 13:15:17 -07:00
Mingming Liu
8d9db947b7
Reland "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#95482)
Make `FunctionsToImportTy` an `unordered_map` rather than `DenseMap`.
Credit goes to jvoung@ for the 'DenseMap -> unordered_map' change. This
is a reland of https://github.com/llvm/llvm-project/pull/92718

* `DenseMap` allocates space for a large number of key/value pairs and
wastes space when the number of elements are small.
* While init bucket size is zero [1], it quickly allocates buckets for 64 elements [2]
when the number of elements is small (for example, 3 or 4 elements). The programmer
manual [3] also mentions it could waste space.
* Experiments show `FunctionsToImportTy.size()` is smaller than 4 for
multiple binaries with high indexing ram usage. `unordered_map` grows
factor is at most 2 in llvm libc [4] for insert operations.
 
With this change, `ComputeCrossModuleImport` ram increase is smaller
than 0.5G on a couple of binaries with high indexing ram usage. A wider
range of (pre-release) tests pass.

[1] ad79a14c9e/llvm/include/llvm/ADT/DenseMap.h (L431-L432) 
[2] ad79a14c9e/llvm/include/llvm/ADT/DenseMap.h (L849)
[3] https://llvm.org/docs/ProgrammersManual.html#llvm-adt-densemap-h
[4] ad79a14c9e/libcxx/include/__hash_table (L1525-L1526)

**Original commit message** 
The goal is to populate `declaration` import status if a new flag
`-import-declaration` is on.

* For in-process ThinLTO, the `declaration` status is visible to backend
`function-import` pass, so `FunctionImporter::importFunctions` should
read the import status and be no-op for declaration summaries.
Basically, the postlink pipeline is updated to keep its current behavior
(import definitions), but not updated to handle `declaration` summaries.
Two use cases ([better call-graph
sort](https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5)
or [cross-module
auto-init](https://github.com/llvm/llvm-project/pull/87597#discussion_r1556067195))
would use this bit differently.

* For distributed ThinLTO, the `declaration` status is not serialized to
bitcode. As discussed, https://github.com/llvm/llvm-project/pull/87600
will do this.
2024-06-20 10:50:31 -07:00
Abhina Sree
d3342e5b92
[SystemZ][z/OS] Continue marking text files with OF_Text (#95111)
Text files should be opened with OF_Text to have the correct encoding.
2024-06-12 09:22:21 -04:00
Mingming Liu
707f4de428
Revert "Reland "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#92718) (#94503)
This reverts commit e33db249b53fb70dce62db3ebd82d42239bd1d9d.

The change from *set to *map increases memory usage, and caused indexing
OOM in some applications. Need to profile offline to bring the memory
usage down.
2024-06-05 10:06:55 -07:00
Mingming Liu
53061eecdb
Revert "[ThinLTO][Bitcode] Generate import type in bitcode (#87600)" (#94502)
This reverts commit 6262763341fcd71a2b0708cf7485f9abd1d26ba8, to prepare
for the revert of https://github.com/llvm/llvm-project/pull/92718.


https://github.com/llvm/llvm-project/pull/92718 causes LTO indexing OOM
in some applications.
2024-06-05 09:59:46 -07:00
Mingming Liu
6262763341
[ThinLTO][Bitcode] Generate import type in bitcode (#87600)
For distributed ThinLTO, the LTO indexing step generates combined
summary for each module, and postlink pipeline reads the combined
summary which stores the information for link-time optimization.

This patch populates the 'import type' of a summary in bitcode, and
updates bitcode reader to parse the bit correctly.
2024-05-22 09:52:54 -07:00
Mingming Liu
e33db249b5
Reland "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#92718)
The original PR is reviewed in
https://github.com/llvm/llvm-project/pull/88024, and this PR adds one
line (b9f04d199d)
to fix test

Limit to one thread for in-process ThinLTO to test `LLVM_DEBUG` log.
- This should fix build bot failure like
https://lab.llvm.org/buildbot/#/builders/259/builds/4727 and
https://lab.llvm.org/buildbot/#/builders/9/builds/43876
- I could repro the failure and see interleaved log messages by using
`-thinlto-threads=all`

**Original Commit Message:**

The goal is to populate `declaration` import status if a new flag
`-import-declaration` is on.

* For in-process ThinLTO, the `declaration` status is visible to backend
`function-import` pass, so `FunctionImporter::importFunctions` should
read the import status and be no-op for declaration summaries.
Basically, the postlink pipeline is updated to keep its current behavior
(import definitions), but not updated to handle `declaration` summaries.
Two use cases ([better call-graph
sort](https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5)
or [cross-module
auto-init](https://github.com/llvm/llvm-project/pull/87597#discussion_r1556067195))
would use this bit differently.

* For distributed ThinLTO, the `declaration` status is not serialized to
bitcode. As discussed, https://github.com/llvm/llvm-project/pull/87600
will do this.
2024-05-20 08:55:31 -07:00
Mingming Liu
6b0733e3a3
Revert "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#92715)
Reverts llvm/llvm-project#88024

Build bot failures
(https://lab.llvm.org/buildbot/#/builders/259/builds/4727 and
https://lab.llvm.org/buildbot/#/builders/9/builds/43876)
2024-05-19 22:42:18 -07:00
Mingming Liu
8de7890572
[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option (#88024)
The goal is to populate `declaration` import status if a new flag`-import-declaration` is on.

* For in-process ThinLTO, the `declaration` status is visible to backend
`function-import` pass, so `FunctionImporter::importFunctions` should
read the import status and be no-op for declaration summaries.
Basically, the postlink pipeline is updated to keep its current behavior
(import definitions), but not updated to handle `declaration` summaries.
Two use cases (better call-graph sort and cross-module auto-init)
would use this bit differently.

* For distributed ThinLTO, the `declaration` status is not serialized to
bitcode. As discussed, https://github.com/llvm/llvm-project/pull/87600
will do this.

[1] https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5
[2] https://github.com/llvm/llvm-project/pull/87597#discussion_r1556067195
2024-05-19 22:22:47 -07:00
lifengxiang1025
e40cabfea4
[MemProf] Match function's summary and definition strictly (#83665)
Problem description:
https://github.com/llvm/llvm-project/pull/81008#issuecomment-1933468520
Solution:
https://github.com/llvm/llvm-project/pull/81008#issuecomment-1934192548
(choose plan2)
2024-03-12 11:00:02 +08:00
lifengxiang1025
daf3079222
[ThinLTO] Add metedata 'thinlto_src_module' and 'thinlto_src_file' (#83110)
Originally, when `EnableImportMetadata` enabled, `SourceFileName` will
be recorded as `thinlto_src_module`. Now `SourceFileName` will be
recorded as `thinlto_src_file` and `ModuleIdentifier` will be recorded
as `thinlto_src_module`.
2024-02-29 10:42:06 +08:00
Mircea Trofin
ed10fba1b2
[ThinLTO] Allow importing based on a workload definition (#74545)
An example of a "workload definition" would be "the transitive closure of functions actually called to satisfy a RPC request", i.e. a (typically significantly) smaller subset of the transitive closure (static + possible indirect call targets) of callees. This means this workload definition is a type of flat dynamic profile.

Producing one is not in scope - it can be produced offline from traces, or from sample-based profiles, etc.

This patch adds awareness to ThinLTO of such a concept. A workload is defined as a root and a list of functions. All function references are by-name (more readable than GUIDs). In the case of aliases, the expectation is the list contains all the alternative names.

The workload definitions are presented to the linker as a json file, containing a dictionary. The keys are the roots, the values are the list of functions.

The import list for a module defining a root will be the functions listed for it in the profile.

Using names this way assumes unique names for internal functions, i.e. clang's `-funique-internal-linkage-names`.

Note that the behavior affects the entire module where a root is defined (i.e. different workloads best be defined in different modules), and does not affect modules that don't define roots.
2023-12-14 15:10:48 -08:00
Nikita Popov
c4c0ac10f1 [IPO] Remove unnecessary bitcasts (NFC) 2023-11-06 16:49:45 +01:00
Kazu Hirata
6e8013a130 [llvm] Stop including llvm/ADT/StringMap.h (NFC)
These source files do not use StringMap.
2023-10-13 20:09:33 -07:00