24 Commits

Author SHA1 Message Date
Benjamin Kramer
6e55370b81 Hide some implementation details so they can't cause ODR conflicts. NFC. 2023-07-14 15:54:04 +02:00
Kazu Hirata
c963892a45 [llvm] Use DenseMapBase::lookup (NFC) 2023-06-10 09:02:25 -07:00
Kan Wu
b8d2f7177c [MemProf] Add hot allocation type
Add "Hot" AllocationType (in addition to existing cold, notcold).

Use lifetime access density as metric to identify hot allocations.
Treat hot as notcold for MemProfContextDisambiguation for now
before the disambiguation for "hot" is done.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D149932
2023-05-08 10:34:53 -07:00
Teresa Johnson
1768898680 [MemProf] Control availability of hot/cold operator new from LTO link
Adds an LTO option to indicate that whether we are linking with an
allocator that supports hot/cold operator new interfaces. If not,
at the start of the LTO backends any existing memprof hot/cold
attributes are removed from the IR, and we also remove memprof metadata
so that post-LTO inlining doesn't add any new attributes.

This is done via setting a new flag in the module summary index. It is
important to communicate via the index to the LTO backends so that
distributed ThinLTO handles this correctly, as they are invoked by
separate clang processes and the combined index is how we communicate
information from the LTO link. Specifically, for distributed ThinLTO the
LTO related processes look like:
```
   # Thin link:
   $ lld --thinlto-index-only obj1.o ... objN.o -llib ...
   # ThinLTO backends:
   $ clang -x ir obj1.o -fthinlto-index=obj1.o.thinlto.bc -c -O2
   ...
   $ clang -x ir objN.o -fthinlto-index=objN.o.thinlto.bc -c -O2
```

It is during the thin link (lld --thinlto-index-only) that we have
visibility into linker dependences and want to be able to pass the new
option via -Wl,-supports-hot-cold-new. This will be recorded in the
summary indexes created for the distributed backend processes
(*.thinlto.bc) and queried from there, so that we don't need to know
during those individual clang backends what allocation library was
linked. Since in-process ThinLTO and regular LTO also use a combined
index, for consistency we query the flag out of the index in all LTO
backends.

Additionally, when the LTO option is disabled, exit early from the
MemProfContextDisambiguation handling performed during LTO, as this is
unnecessary.

Depends on D149117 and D149192.

Differential Revision: https://reviews.llvm.org/D149215
2023-05-08 08:02:21 -07:00
NAKAMURA Takumi
a29a97d76b Fix a warning in D149117 [-Wunused-but-set-variable] 2023-05-06 11:30:04 +09:00
Teresa Johnson
a28261c711 [MemProf] Create single version of helper function (NFC)
Small clean up to keep a single version of getAllocTypeAttributeString
which was duplicated locally.
2023-05-05 18:31:35 -07:00
Teresa Johnson
cfad2d3a3d [MemProf] Context disambiguation cloning pass [patch 4/4]
Applies ThinLTO cloning decisions made during the thin link and
recorded in the summary index to the IR during the ThinLTO backend.

Depends on D141077.

Differential Revision: https://reviews.llvm.org/D149117
2023-05-05 16:26:32 -07:00
Teresa Johnson
04f3c5a71e Restore again "[MemProf] Context disambiguation cloning pass [patch 3/4]"
This reverts commit f09807ca9dda2f588298d8733e89a81105c88120, restoring
bfe7205975a63a605ff3faacd97fe4c1bf4c19b3 and follow on fix
e3e6bc699574550f2ed1de07f4e5bcdddaa65557, now that the nondeterminism
has been addressed by D149924.

Differential Revision: https://reviews.llvm.org/D141077
2023-05-05 13:27:33 -07:00
Teresa Johnson
1a3947df85 [MemProf] Use MapVector to avoid non-determinism
Multiple cases of instability in the cloning behavior occurred due to
iteration of maps indexed by pointers. Fix by changing the maps to
MapVector. This necessitated adding DenseMapInfo specializations for the
structure types used in the keys.

These were found while trying to commit patch 3 of the cloning
(bfe7205975a63a605ff3faacd97fe4c1bf4c19b3), but the second one turned
out to be in code committed in patch 2, but just exposed by a new test
added with patch 3. Specifically, the iteration in identifyClones().

Added the portion of the new test cases from patch 3 that only relied on
the already committed changes and exposed the issue.

Differential Revision: https://reviews.llvm.org/D149924
2023-05-05 07:06:41 -07:00
Teresa Johnson
f09807ca9d Revert "Restore "[MemProf] Context disambiguation cloning pass [patch 3/4]""
This reverts commit bfe7205975a63a605ff3faacd97fe4c1bf4c19b3, and follow
on fix e3e6bc699574550f2ed1de07f4e5bcdddaa65557, due to some remaining
instability exposed by the bot enabling expensive checks:
https://lab.llvm.org/buildbot/#/builders/42/builds/9842
2023-05-04 09:41:48 -07:00
Teresa Johnson
bfe7205975 Restore "[MemProf] Context disambiguation cloning pass [patch 3/4]"
This reverts commit 6fbf022908c104a380fd1854fb96eafc64509366, restoring
commit bf6ff4fd4b735afffc65f92a4a79f6610e7174c3 with a fix for a bot
failure due to a previously unstable iteration order.

Differential Revision: https://reviews.llvm.org/D141077
2023-05-04 06:31:44 -07:00
Teresa Johnson
6fbf022908 Revert "[MemProf] Context disambiguation cloning pass [patch 3/4]"
This reverts commit bf6ff4fd4b735afffc65f92a4a79f6610e7174c3.

There is a bot failure where we are getting the correct remarks output
but in a different order. I'll need to investigate to see where we are
having nondeterministic behavior.
2023-05-03 14:08:54 -07:00
Teresa Johnson
bf6ff4fd4b [MemProf] Context disambiguation cloning pass [patch 3/4]
Applies cloning decisions to the IR, cloning functions and updating
calls. For Regular LTO, the IR is updated directly during function
assignment, whereas for ThinLTO it is recorded in the summary index
(a subsequent patch will apply to the IR via the index during the
ThinLTO backend.

The function assignment and cloning proceeds greedily, and we create new
clones as needed when we find an incompatible assignment of function
clones to callsite clones (i.e. when different callers need to invoke
different combinations of callsite clones).

Depends on D140949.

Differential Revision: https://reviews.llvm.org/D141077
2023-05-03 13:34:00 -07:00
Teresa Johnson
969268831b [MemProf] Don't use constexpr via lambda capture due to MSVC issues (NFC)
Modifies a104e27030587507a711cef0e2b0ddb447fe68fe slightly to switch a
constexpr used via a lambda capture to a const, due to issues with MSVC.
See https://reviews.llvm.org/D140949#inline-1438809 for context.
2023-04-23 15:03:49 -07:00
Teresa Johnson
a104e27030 Restore "[MemProf] Context disambiguation cloning pass [patch 2/3]"
This restores d0649a6ad8be778abf7569f502148d577f8bc6f1 (reverted in
commit 03bf59d275a16815dc5a2e3f279815554f7cd0ca), with fixes for bot
failures. Confirmed that gcc, which reproduced both failures, now
builds it fine.

Differential Revision: https://reviews.llvm.org/D140949
2023-04-21 19:38:46 -07:00
Teresa Johnson
03bf59d275 Revert "[MemProf] Context disambiguation cloning pass [patch 2/3]"
This reverts commit d0649a6ad8be778abf7569f502148d577f8bc6f1.

Reverting due to a number of bot failures that need investigation.
2023-04-21 14:37:42 -07:00
Teresa Johnson
d0649a6ad8 [MemProf] Context disambiguation cloning pass [patch 2/3]
Performs cloning on the CallsiteContextGraph (not on the IR or summary
index), in order to uniquely identify the allocation behavior of an
allocation call given its context. In order to do this, the graph is
recursively traversed starting from the allocation nodes, until we
identify a point where the allocation behavior is unambiguous (the edges
have a single allocation type). Nodes are then cloned as we unwind the
recursion. We try to perform the minimal amount of cloning required to
disambiguate the contexts.

The follow-on patch will contain the support for applying the cloning to
the IR.

Depends on D140908 and D145836.

Differential Revision: https://reviews.llvm.org/D140949
2023-04-21 14:31:44 -07:00
Shraiysh Vaishay
7021182d6b [nfc][llvm] Replace pointer cast functions in PointerUnion by llvm casting functions.
This patch replaces the uses of PointerUnion.is function by llvm::isa,
PointerUnion.get function by llvm::cast, and PointerUnion.dyn_cast by
llvm::dyn_cast_if_present. This is according to the FIXME in
the definition of the class PointerUnion.

This patch does not remove them as they are being used in other
subprojects.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D148449
2023-04-17 13:40:51 -05:00
Kazu Hirata
c83c4b58d1 [Transforms] Apply fixes from performance-for-range-copy (NFC) 2023-04-16 08:25:28 -07:00
Teresa Johnson
dc2f2d2180 [MemProf] Use stable_sort to avoid non-determinism
Switch from std::sort to std::stable_sort when sorting callsites to
avoid non-determinism when the comparisons are equal. This showed up in
internal testing of fe27495be2040007c7b20844a9371b06156ab405.
2023-03-23 09:20:39 -07:00
Teresa Johnson
fe27495be2 [MemProf] Context disambiguation cloning pass [patch 1b/3]
Adds support for building the graph in ThinLTO from MemProf summaries.

Follow-on patches will contain the support for cloning on the graph and
in the IR.

Depends on D140908.

Differential Revision: https://reviews.llvm.org/D145836
2023-03-22 14:57:53 -07:00
Teresa Johnson
700cd99061 Restore "[MemProf] Context disambiguation cloning pass [patch 1a/3]"
This restores commit d6ad4f01c3dafcab335bca66dac6e36d9eac8421, which was
reverted in commit 883dbb9c86be87593a58ef10b070b3a0564c7fee, along with
a fix for gcc 12.2 build errors in the original commit.

Support for building, printing, and displaying CallsiteContextGraph
which represents the MemProf metadata contexts. Uses CRTP to enable
support for both IR (regular LTO) and summary (ThinLTO). This patch
includes the support for building it in regular LTO mode (from
memprof and callsite metadata), and the next patch will add the
handling for building it from ThinLTO summaries.

Also includes support for dumping the graph to text and to dot files.

Follow-on patches will contain the support for cloning on the graph and
in the IR.

The graph represents the call contexts in all memprof metadata on
allocation calls, with nodes for the allocations themselves, as well as
for the calls in each context. The graph is initially built from the
allocation memprof metadata (or summary) MIBs. It is then updated to
match calls with callsite metadata onto the nodes, updating it to
reflect any inlining performed on those calls.

Each MIB (representing an allocation's call context with allocation
behavior) is assigned a unique context id during the graph build. The
edges and nodes in the graph are decorated with the context ids they
carry. This is used to correctly update the graph when cloning is
performed so that we can uniquify the context for a single (possibly
cloned) allocation.

Differential Revision: https://reviews.llvm.org/D140908
2023-03-22 10:16:06 -07:00
Nikita Popov
883dbb9c86 Revert "[MemProf] Context disambiguation cloning pass [patch 1a/3]"
This reverts commit d6ad4f01c3dafcab335bca66dac6e36d9eac8421.

Fails to build on at least gcc 12.2:

/home/npopov/repos/llvm-project/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:482:1: error: no declaration matches ‘ContextNode<DerivedCCG, FuncTy, CallTy>* CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForInst(const CallInfo&)’
  482 | CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForInst(
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/npopov/repos/llvm-project/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:393:16: note: candidate is: ‘CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::ContextNode* CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForInst(const CallInfo&)’
  393 |   ContextNode *getNodeForInst(const CallInfo &C);
      |                ^~~~~~~~~~~~~~
/home/npopov/repos/llvm-project/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:99:7: note: ‘class CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>’ defined here
   99 | class CallsiteContextGraph {
      |       ^~~~~~~~~~~~~~~~~~~~
2023-03-22 15:43:46 +01:00
Teresa Johnson
d6ad4f01c3 [MemProf] Context disambiguation cloning pass [patch 1a/3]
Support for building, printing, and displaying CallsiteContextGraph
which represents the MemProf metadata contexts. Uses CRTP to enable
support for both IR (regular LTO) and summary (ThinLTO). This patch
includes the support for building it in regular LTO mode (from
memprof and callsite metadata), and the next patch will add the
handling for building it from ThinLTO summaries.

Also includes support for dumping the graph to text and to dot files.

Follow-on patches will contain the support for cloning on the graph and
in the IR.

The graph represents the call contexts in all memprof metadata on
allocation calls, with nodes for the allocations themselves, as well as
for the calls in each context. The graph is initially built from the
allocation memprof metadata (or summary) MIBs. It is then updated to
match calls with callsite metadata onto the nodes, updating it to
reflect any inlining performed on those calls.

Each MIB (representing an allocation's call context with allocation
behavior) is assigned a unique context id during the graph build. The
edges and nodes in the graph are decorated with the context ids they
carry. This is used to correctly update the graph when cloning is
performed so that we can uniquify the context for a single (possibly
cloned) allocation.

Depends on D140786.

Differential Revision: https://reviews.llvm.org/D140908
2023-03-22 07:05:27 -07:00