SanitizerBinaryMetadata should only apply to to host code, and not GPU
code. Recently AMD GPU target code has experimental sanitizer support.
If we're compiling a mixed host/device source file, only add sanitizer
metadata to host code.
Differential Revision: https://reviews.llvm.org/D145519
For large projects it will be required to opt out entire subdirectories.
In the absence of fine-grained control over the flags passed via the
build system, introduce -fexperimental-sanitize-metadata-ignorelist=.
The format is identical to other sanitizer ignore lists, and its effect
will be to simply not instrument either functions or entire modules
based on the rules in the ignore list file.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D143664
By switching them to external with default visibility, DSOs may not call
their own constructor/destructor. This is incorrect, because they pass
different parameters.
Fix it by marking the ctors/dtors as external linkage but with hidden
visibility.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D143611
Emit all constant integers produced by SanitizerBinaryMetadata as
ULEB128 to further reduce binary space used. Increasing the version is
not necessary given this change depends on (and will land) along with
the bump to v2.
To support this, the !pcsections metadata format is extended to allow
for per-section options, encoded in the first MD operator which must
always be a string and contain the section: "<section>!<options>".
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D143484
Optimize the encoding of "covered" metadata by:
1. Reducing feature mask from 4 bytes to 1 byte (needs increase once we
reach more than 8 features).
2. Only emitting UAR stack args size if it is non-zero, saving 4 bytes
in the common case.
One caveat is that the emitted metadata for function PC (offset), size,
and UAR size (if enabled) are no longer aligned to 4 bytes.
SanitizerBinaryMetadata version base is increased to 2, since the change
is backwards incompatible.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D143482
For atomics metadata, we can make data race analysis more efficient by
entirely ignoring functions that include memory accesses but which only
access non-escaping (non-shared) and/or non-mutable memory. Such
functions will not be considered to be covered by "atomics" metadata,
resulting in the following benefits:
1. reduces "covered" metadata; and
2. allows data race analysis to skip such functions.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D143159
Profiling and GCOV generate known data-racy loads/stores. Pretend they
are atomic so that analysis using PC-keyed metadata for atomics do not
report them as data races (which would look like false positives).
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D142982
Declare callbacks extern weak (if no existing declaration exists), and
only call if the function address is non-null.
This allows to attach semantic metadata to binaries where no user of
that metadata exists, avoiding to have to link empty stub callbacks.
Once the binary is linked (statically or dynamically) against a tool
runtime that implements the callbacks, the respective callbacks will be
called. This vastly simplifies gradual deployment of tools using the
metadata, esp. avoiding having to recompile large codebases with
different compiler flags (which negatively impacts compiler caches).
Reviewed By: dvyukov, vitalybuka
Differential Revision: https://reviews.llvm.org/D142408
There are no intrinsic functions that leak arguments.
If the called function does not return, the current function
does not return as well, so no possibility of use-after-return.
Sanitizer function also don't leak or don't return.
It's safe to both pass pointers to local variables to them
and to tail-call them.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D142190
There are no intrinsic functions that leak arguments.
If the called function does not return, the current function
does not return as well, so no possibility of use-after-return.
Sanitizer function also don't leak or don't return.
It's safe to both pass pointers to local variables to them
and to tail-call them.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D142190
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).
This fixes clang.
Only mark functions that have address-taken locals
as requiring UAR checking.
On a large internal app this reduces number of marked functions
from 78441 to 66618. Mostly small, trivial getter/setter-type
functions are unmarked, but also some amount of larger
number-crunching-type functions are unmarked as well.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D139811
Currently per-function metadata consists of:
(start-pc, size, features)
This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
D130887 uses a dummy empty section `sanmd_covered` (with the SHF_GNU_RETAIN flag on
ELF) to prevent `undefined symbol: __start_sanmd_covered` if all `sanmd_covered`
are discarded by `ld --gc-sections` (in `-z start-stop-gc` mode).
The dummy `sanmd_covered` does not have the SHF_LINK_ORDER flag, so mixing it
with SHF_LINK_ORDER `sanmd_covered` causes an issue to GNU ld<2.36
(https://sourceware.org/bugzilla/show_bug.cgi?id=26256).
Similar to D98903 for SanitizerCoverage, let's make encapsulation symbols
undefined weak[1]. This additionally avoids size cost due to the dummy section and
symbol.
[1]: https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D139276
Currently per-function metadata consists of:
(start-pc, size, features)
This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
Currently per-function metadata consists of:
(start-pc, size, features)
This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
This reverts commit a1255dc467f7ce57a966efa76bbbb4ee91d9115a.
This patch results in:
llvm/lib/CodeGen/SanitizerBinaryMetadata.cpp:57:17: error: no member
named 'size' in 'llvm::MDTuple'
Currently per-function metadata consists of:
(start-pc, size, features)
This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D136078
Introduces the SanitizerBinaryMetadata instrumentation pass which uses
the new MD_pcsections metadata kinds to instrument certain types of
instructions and functions required for breakpoint-based sanitizers.
The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. GWP-TSan will require information about atomic
accesses; to unambiguously determine if an access is atomic or not, we
also require "covered" information which code has been compiled with
SanitizerBinaryMetadata instrumentation enabled.
[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D130887