…ty in instrumentation code, be consistent with runtime
In runtime code, the size of memory block mapped to a single shadow
location is called MEM_GRANULARITY.
In instrumentation code, the size of memory block mapped to a single
shadow location is called DefaultShadowGranularity.
Actually, the SHADOW_GRANULARITY is 8 (1 << SHADOW_SCALE), and the
MEM_GRANULARITY is 64.
The wording of DefaultShadowGranularity in instrumentation code is a bit
misleading, this patch renames DefaultShadowGranularity to
DefaultMemGranularity, be consistent with runtime.
Unlike ASan, MemProf uses the same memory access callback(inline
sequence) for different size memory access, remove unneeded TypeSize
stored in InterestingMemoryAccess.
Now MemProf can't do IR annotation right in the local linkage function
and global initial function __cxx_global_var_init. In llvm-profdata
which convert raw memory profile to memory profile, it uses function
name in dwarf to create GUID. But when llvm consumes memory profile, it
use `getIRPGOFuncName` or `getPGOFuncName` which returns local linkage
function as `FileName;FunctionName` or `FileName:FunctionName` to get
function name and create GUID. So profile creator's GUID is not same as
profile consumer.
So I think MemProf should be used with `unique-internal-linkage-names`
and don't use PGOFuncName.
__cxx_global_var_init is created later than where
UniqueInternalLinkageNames works. So I add uniq suffix to
__cxx_global_var_init additionally.
Co-authored-by: lifengxiang <lifengxiang.1025@bytedance.com>
Loosen up the matching so that a missing leaf debug frame in the profile
does not prevent matching an allocation context if we can match further
up the inlined call context. This relies on the pre-inliner, which was
already the default when performing normal PGO feedback along with the
MemProf feedback, but to ensure matching is not affected by the presence
of PGO, enable the pre-inliner for MemProf feedback as well.
Detect when we are matching a memprof profile with no column numbers,
and in that case treat all column numbers as 0 when matching. The
profiled binary might have been built with -gno-column-info, for
example.
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form. This patch replaces
llvm::support::endianness with llvm::endianness.
Prior to this diff, names in the `__llvm_prf_names` section had the format `[<filepath>:]<function-name>`, e.g., `main.cpp:foo`, `bar`. `<filepath>` is used to discriminate between possibly identical function names when linkage is local and `<function-name>` simply comes from `F.getName()`. This has two problems:
* `:` is commonly found in Objective-C functions so that names like `main.mm:-[C foo::]` and `-[C bar::]` are difficult to parse
* `<function-name>` might be different from the linkage name, so it cannot be used to pass a function order to the linker via `-symbol-ordering-file` or `-order_file` (see https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068)
Instead, this diff changes the format to `[<filepath>;]<linkage-name>`, e.g., `main.cpp;_foo`, `_bar`. The hope is that `;` won't realistically be found in either `<filepath>` or `<linkage-name>`.
To prevent invalidating all prior IRPGO profiles, we also lookup the prior name format when a record is not found (see `InstrProfSymtab::create()`, `readMemprof()`, and `getInstrProfRecord()`). It seems that Swift and Clang FE-PGO rely on the original `getPGOFuncName()`, so we cannot simply replace it.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D156569
This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted
in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.
Differential Revision: https://reviews.llvm.org/D154856
This restores commit 29252fdd530f68d0de38a0cd26ed428bb2f5c16a, reverted
in 3498cf52ba1c23cbf8acdf99d649d2fa25291eef because it was thought to
cause a bot failure, which ended up being unrelated to this patch set.
Differential Revision: https://reviews.llvm.org/D154872
Previously the MemProf profile was expected to be in the same profile
file as a normal PGO profile, passed via the usual -fprofile-use=
option, and was matched in the same pass. To simplify profile
preparation, since the raw MemProf profile requires the binary for
symbolization and may be simpler to index separately from the raw PGO
profile, and also to enable providing a MemProf profile for a SamplePGO
build, separate out the MemProf feedback option and matching pass.
This patch adds the -fmemory-profile-use=${file} option, and the
provided file is passed down to LLVM and ultimately used in a new
MemProfUsePass which performs the matching of just the memory profile
contents of that file.
Note that a single profile file containing both normal PGO and MemProf
profile data is still supported, and the relevant profile data is
matched by the appropriate matching pass(es) based on which option(s)
the profile is provided with (the same profile file can be supplied to
both feedback options).
Differential Revision: https://reviews.llvm.org/D154856
Split out of D154856, this prepares for the addition of a new dedicated
memory profile matching pass.
Differential Revision: https://reviews.llvm.org/D154872
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
We don't need to insert a load of the dynamic shadow address unless there
are interesting memory accesses to profile.
Split out of D124703.
Differential Revision: https://reviews.llvm.org/D124797
Suppress instrumentation of PGO counter accesses, which is unnecessary
and costly. Also suppress accesses to other compiler inserted variables
starting with "__llvm". This is a slightly expanded variant of what is
done for tsan in shouldInstrumentReadWriteFromAddress.
Differential Revision: https://reviews.llvm.org/D124703
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
Determine the masked load/store access type from the value type
of the intrinsics, rather than the pointer element type. For
cleanliness, include the access type in InterestingMemoryAccess.
Instead use either Type::getPointerElementType() or
Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
Skip stack accesses unless requested, as the memory profiler runtime
does not currently look at or report accesses for these addresses.
Differential Revision: https://reviews.llvm.org/D109868
We're immediately dereferencing the casted pointer, so use cast<> which will assert instead of dyn_cast<> which can return null.
Fixes static analyzer warning.
The x86-64 backend currently has a bug which uses a wrong register when for the GOTPCREL reference.
The program will crash without the dso_local specifier.
Similar to -fprofile-generate=, add -fmemory-profile= which takes a
directory path. This is passed down to LLVM via a new module flag
metadata. LLVM in turn provides this name to the runtime via the new
__memprof_profile_filename variable.
Additionally, always pass a default filename (in $cwd if a directory
name is not specified vi the = form of the option). This is also
consistent with the behavior of the PGO instrumentation. Since the
memory profiles will generally be fairly large, it doesn't make sense to
dump them to stderr. Also, importantly, the memory profiles will
eventually be dumped in a compact binary format, which is another reason
why it does not make sense to send these to stderr by default.
Change the existing memprof tests to specify log_path=stderr when that
was being relied on.
Depends on D89086.
Differential Revision: https://reviews.llvm.org/D89087
This is consistent with the clang option added in
7ed8124d46f94601d5f1364becee9cee8538265e, and the comments on the
runtime patch in D87120.
Differential Revision: https://reviews.llvm.org/D87622