Fixes an attribute mismatch error in `AllocTokenPass` that occurs during
ThinLTO builds at OptimizationLevel::O0.
The `getTokenAllocFunction` in `AllocTokenPass` was incorrectly copying
attributes from the instrumented function (`Callee`) to an *existing*
`void()` alloc-token function retrieved by `Mod.getOrInsertFunction`.
This resulted in arg attributes being added to a function with no
parameters, causing `VerifyPass` to fail with "Attribute after last
parameter!".
The fix modifies `getTokenAllocFunction` to pass the `Callee`'s
attributes directly to the `Mod.getOrInsertFunction` overload. This
ensures attributes are only applied when the alloc-token function is
*newly inserted*, preventing unintended attribute modifications on
already existing function declarations.
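The difference between the two behaviours can be modelled in a small, self-contained sketch. `ToyFunction` and `ToyModule` are hypothetical stand-ins written for illustration, not the LLVM classes, and the sketch only assumes what the text above states: attributes passed to the lookup are applied when the declaration is newly created, and an existing declaration is returned untouched.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy stand-ins for illustration only; these are NOT the LLVM classes.
struct ToyFunction {
  unsigned NumParams = 0;
  std::set<unsigned> ParamAttrIndices; // parameter indices carrying attributes

  // Models the verifier rule quoted above: no attribute may refer to a
  // parameter index past the last parameter.
  bool verify() const {
    for (unsigned Idx : ParamAttrIndices)
      if (Idx >= NumParams)
        return false;
    return true;
  }
};

struct ToyModule {
  std::map<std::string, ToyFunction> Functions;

  // Attributes are applied only if this call creates the function.
  // An existing declaration is returned unchanged.
  ToyFunction &getOrInsert(const std::string &Name, unsigned NumParams,
                           const std::set<unsigned> &AttrsIfNew) {
    auto Res = Functions.insert(std::make_pair(Name, ToyFunction()));
    ToyFunction &F = Res.first->second;
    if (Res.second) {
      F.NumParams = NumParams;
      F.ParamAttrIndices = AttrsIfNew;
      assert(F.verify() && "attributes must match the new signature");
    }
    return F;
  }
};
```

In this model, a second lookup of an existing zero-parameter declaration with parameter attributes leaves it valid, which is the property the fix relies on.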
See https://g-issues.chromium.org/issues/474289092 for detailed
reproduction steps and analysis.
Co-authored-by: Ayumi Ono <ayumiohno@google.com>
Unconditionally add AllocTokenPass to the optimization pipelines, and
ensure that it runs last in LTO backend pipelines. The latter ensures
that AllocToken instrumentation can be moved later in the LTO pipeline
to avoid interference with other optimizations (e.g. PGHO) and enable
late heap-allocation optimizations.
In preparation for no longer having Clang add AllocTokenPass, add
support for AllocTokenPass to read configuration options from LLVM
module flags.
Because the pass now runs unconditionally, keep its overhead low by only
retrieving TargetLibraryInfo and OptimizationRemarkEmitter when necessary.
The option -falloc-token-max=0 is supposed to be usable to override
previous settings back to the target default max tokens (SIZE_MAX).
This did not work for the builtin:
```
| executed command: clang -cc1 [..] -nostdsysteminc -triple x86_64-linux-gnu -std=c++23 -fsyntax-only -verify clang/test/SemaCXX/alloc-token.cpp -falloc-token-max=0
| clang: llvm/lib/Support/AllocToken.cpp:38: std::optional<uint64_t> llvm::getAllocToken(AllocTokenMode, const AllocTokenMetadata &, uint64_t): Assertion `MaxTokens && "Must provide non-zero max tokens"' failed.
```
Fix it by also picking the default if "0" is passed.
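A minimal sketch of the intended behaviour, not the actual patch: `resolveMaxTokens` and `clampToken` are hypothetical names, and the default assumes a target with a 64-bit `size_t`, so that SIZE_MAX equals the maximum `uint64_t` value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: maps the option value to the effective maximum.
// A value of 0 means "use the target default" rather than "zero tokens".
// The default below assumes a 64-bit size_t target (SIZE_MAX == UINT64_MAX).
inline uint64_t
resolveMaxTokens(uint64_t Requested,
                 uint64_t TargetDefault = std::numeric_limits<uint64_t>::max()) {
  return Requested == 0 ? TargetDefault : Requested;
}

// Hypothetical consumer that, like the assertion in the log above, requires
// a non-zero maximum before reducing a hash into the token range.
inline uint64_t clampToken(uint64_t Hash, uint64_t MaxTokens) {
  assert(MaxTokens && "Must provide non-zero max tokens");
  return Hash % MaxTokens;
}
```

Resolving the option before it reaches the consumer means the non-zero precondition holds even when "0" is passed.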
Improve the documentation to clarify what the value "0" means.
Refactor the AllocToken pass to accept the mode via pass options rather
than LLVM cl::opt. This is both cleaner and required to make the
mode frontend-driven and avoid potential inconsistencies.
Refactor the stateless (hash-based) token calculation logic out of the
`AllocToken` pass and into `llvm/Support/AllocToken.h`.
This helps with making the token calculation logic available to other
parts of the codebase, which will be necessary for frontend
implementation of `__builtin_infer_alloc_token` to perform constexpr
evaluation.
The `AllocTokenMode` enum and a new `AllocTokenMetadata` struct are
moved into a shared header. The `getAllocTokenHash()` function now
provides the source of truth for calculating token IDs for `TypeHash`
and `TypeHashPointerSplit` modes.
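As a rough illustration of what a stateless, hash-based calculation can look like, the sketch below uses hypothetical names (`SketchMode`, `SketchMetadata`, `sketchTokenHash`) and FNV-1a as a placeholder hash. The pointer-split rule shown, reserving the upper half of the token range for types that contain pointers, is an assumption of this sketch and is not spelled out in the text above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative mirror of the two hash-based modes; not the LLVM enum.
enum class SketchMode { TypeHash, TypeHashPointerSplit };

// Hypothetical stand-in for the metadata struct.
struct SketchMetadata {
  std::string TypeName;
  bool ContainsPointer;
};

// FNV-1a, used here only as a deterministic placeholder hash.
inline uint64_t fnv1a(const std::string &S) {
  uint64_t H = 0xcbf29ce484222325ULL;
  for (unsigned char C : S) {
    H ^= C;
    H *= 0x100000001b3ULL;
  }
  return H;
}

// Stateless: the result depends only on the mode, metadata, and MaxTokens.
inline uint64_t sketchTokenHash(SketchMode Mode, const SketchMetadata &MD,
                                uint64_t MaxTokens) {
  assert(MaxTokens && "Must provide non-zero max tokens");
  const uint64_t H = fnv1a(MD.TypeName);
  if (Mode == SketchMode::TypeHash || MaxTokens == 1)
    return H % MaxTokens;
  // Assumed split: lower half for pointer-free types, upper half otherwise.
  const uint64_t Half = MaxTokens / 2;
  return (H % Half) + (MD.ContainsPointer ? Half : 0);
}
```

Because the function keeps no state, a frontend and the pass can call it independently and agree on the same ID for the same inputs.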
Introduce a new intrinsic, `llvm.alloc.token.id`, to allow compile-time
querying of allocation token IDs.
The `AllocToken` pass is taught to recognize and lower this intrinsic.
It extracts the `!alloc_token` metadata from the intrinsic's argument,
feeds it into the same token-generation logic used for instrumenting allocation
calls, and replaces the intrinsic with the resulting constant integer token ID.
This is a prerequisite for `__builtin_infer_alloc_token`. The pass now
runs on all functions to ensure intrinsics are lowered, but continues to
only instrument allocation calls in functions with the
`sanitize_alloc_token` attribute.
Introduce `AllocToken`, an instrumentation pass designed to provide
tokens to memory allocators enabling various heap organization
strategies, such as heap partitioning.
Initially, the pass instruments functions marked with a new attribute
`sanitize_alloc_token` by rewriting allocation calls to include a token
ID, appended as a function argument with the default ABI.
The design aims to provide a flexible framework for implementing
different token generation schemes. It currently supports the following
token modes:
- TypeHash (default): token IDs based on a hash of the allocated type
- Random: statically-assigned pseudo-random token IDs
- Increment: incrementing token IDs per TU
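To make the Increment mode concrete, here is a minimal sketch of a per-TU counter. `IncrementTokenSource` is a hypothetical name, and wrapping around once the maximum is reached is an assumption of this sketch rather than something the description states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-TU counter modelling the Increment mode.
class IncrementTokenSource {
public:
  explicit IncrementTokenSource(uint64_t MaxTokens) : MaxTokens(MaxTokens) {
    assert(MaxTokens && "Must provide non-zero max tokens");
  }

  // Returns the next token ID. Wrapping at MaxTokens is an assumption.
  uint64_t next() {
    const uint64_t Id = Counter % MaxTokens;
    ++Counter;
    return Id;
  }

private:
  uint64_t MaxTokens;
  uint64_t Counter = 0;
};
```

Unlike the hash-based mode, this scheme is stateful: the ID depends on how many allocation sites were seen earlier in the same TU.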
For the `TypeHash` mode, introduce support for `!alloc_token` metadata:
the metadata can be attached to allocation calls to provide richer
semantic information to be consumed by the AllocToken pass. Optimization
remarks can be enabled to show where no metadata was available.
An alternative "fast ABI" is provided, where instead of passing the
token ID as an argument (e.g., `__alloc_token_malloc(size, id)`), the
token ID is directly encoded into the name of the called function (e.g.,
`__alloc_token_0_malloc(size)`). Where the maximum number of tokens is
small, this offers more efficient instrumentation by avoiding the
overhead of passing an additional argument at each allocation site.
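The two naming schemes can be sketched as follows. `defaultAbiName` and `fastAbiName` are hypothetical helpers that only reproduce the pattern visible in the two examples above; how the pass actually builds the symbol names is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Default ABI: one entry point per allocation function; the token ID is
// passed as an extra trailing argument at the call site.
inline std::string defaultAbiName(const std::string &Callee) {
  assert(!Callee.empty() && "callee name must not be empty");
  return "__alloc_token_" + Callee;
}

// Fast ABI: the token ID is encoded in the symbol name, so the call keeps
// its original argument list.
inline std::string fastAbiName(const std::string &Callee, uint64_t TokenId) {
  assert(!Callee.empty() && "callee name must not be empty");
  return "__alloc_token_" + std::to_string(TokenId) + "_" + Callee;
}
```

The trade-off is that the fast ABI needs one entry point per token ID and allocation function, which is why it suits small token counts.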
Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [1]
---
This change is part of the following series:
1. https://github.com/llvm/llvm-project/pull/160131
2. https://github.com/llvm/llvm-project/pull/156838
3. https://github.com/llvm/llvm-project/pull/162098
4. https://github.com/llvm/llvm-project/pull/162099
5. https://github.com/llvm/llvm-project/pull/156839
6. https://github.com/llvm/llvm-project/pull/156840
7. https://github.com/llvm/llvm-project/pull/156841
8. https://github.com/llvm/llvm-project/pull/156842