## Purpose
Add proper preprocessor guards for all `dump()` methods in the LLVM
support library. This change ensures these methods are not part of the
public ABI for release builds.
## Overview
* Annotates all `dump` methods in Support and ADT source with the
`LLVM_DUMP_METHOD` macro.
* Conditionally includes all `dump` method definitions in Support and
ADT source so they are only present on debug/assert builds and when
`LLVM_ENABLE_DUMP` is explicitly defined.
NOTE: For many of these `dump` methods, the implementation was already
properly guarded but the declaration in the header file was not.
## Background
This issue was raised in comments on #136629. I am addressing it as a
separate change since it is independent from the changes being made in
that PR.
According to [this
documentation](https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/Compiler.h#L637),
`dump` methods should be annotated with `LLVM_DUMP_METHOD` and
conditionally included as follows:
```
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD void dump() const;
#endif
```
## Validation
* Local release build succeeds.
* CI
Exposed by the test added in the reverted #120514
* Fix libstdc++/libc++ differences due to nth_element. https://github.com/llvm/llvm-project/pull/125450#issuecomment-2631404178
* Fix LLVM_ENABLE_REVERSE_ITERATION=1 differences
* Fix potential issue in `currentSize += D::getSize(*sections[*sectionIdxs.begin()])` where DenseSet was used, though not covered by a test
The base class llvm::ThreadPoolInterface will be renamed
llvm::ThreadPool in a subsequent commit.
This is a breaking change: clients who use to create a ThreadPool must
now create a DefaultThreadPool instead.
Adding weights to utility nodes in BP so that we can give more
importance to
certain utilities. This is useful when we optimize several objectives
jointly.
When LargestTraceSize is a power of two, createBPFunctionNodes does not
allocate a group ID for Trace[LargestTraceSize-1] (as N is off by 1).
Fix
this and change floor+log2 to Log2_64.
BalancedPartitioning::bisect can use unstable sort because `Nodes`
contains distinct `InputOrderIndex`s.
BalancedPartitioning::runIterations: use one DenseMap and simplify the
node renumbering code.
In https://reviews.llvm.org/D147812 I introduced the class
`BalancedPartitioning` and it seemed to trigger a warning in flang
```
C:\Users\buildbot-worker\minipc-ryzen-win\flang-x86_64-windows\llvm-project\llvm\include\llvm/Support/BalancedPartitioning.h(89): warning C4305: 'initializing': truncation from 'double' to 'float'
```
For good measure, I converted all double literals to floats. This should
be a NFC.
In [0] we described an algorithm called //BalancedPartitioning// (bp) to consume function traces [1] and compute a function order that reduces the number of page faults during startup.
This patch adds the `order` command to the `llvm-profdata` tool which uses bp to output a function order that can be passed to the linker via `--symbol-ordering-file=`.
Special thanks to Sergey Pupyrev and Julian Mestre for designing this balanced partitioning algorithm.
[0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068
[1] https://reviews.llvm.org/D147287
Reviewed By: spupyrev
Differential Revision: https://reviews.llvm.org/D147812