28 Commits

Author SHA1 Message Date
Fangrui Song
107185fa85
[sanitizer] Remove unneeded pointer casts and migrate away from getInt8PtrTy. NFC (#72327)
After opaque pointer migration, getInt8PtrTy() is considered legacy.
Replace it with getPtrTy(), and while here, remove some unneeded pointer
casts.
2023-11-15 10:48:58 -08:00
Simon Pilgrim
3ca4fe80d4 [Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 16:50:18 +00:00
Marco Elver
5f605e254a [SanitizerBinaryMetadata] Respect no_sanitize("thread") function attribute
To avoid false positives, respect no_sanitize("thread") when atomics
metadata is enabled.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D148694
2023-04-19 14:49:56 +02:00
Bjorn Pettersson
a20f7efbc5 Remove several no longer needed includes. NFCI
Mostly removing includes of InitializePasses.h and Pass.h in
passes that no longer has support for the legacy PM.
2023-04-17 13:54:19 +02:00
Marco Elver
61ed64954b [SanitizerBinaryMetadata] Do not add to GPU code
SanitizerBinaryMetadata should only apply to to host code, and not GPU
code. Recently AMD GPU target code has experimental sanitizer support.

If we're compiling a mixed host/device source file, only add sanitizer
metadata to host code.

Differential Revision: https://reviews.llvm.org/D145519
2023-03-09 10:15:28 +01:00
Marco Elver
421215b919 [SanitizerBinaryMetadata] Support ignore list
For large projects it will be required to opt out entire subdirectories.
In the absence of fine-grained control over the flags passed via the
build system, introduce -fexperimental-sanitize-metadata-ignorelist=.

The format is identical to other sanitizer ignore lists, and its effect
will be to simply not instrument either functions or entire modules
based on the rules in the ignore list file.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143664
2023-02-10 10:25:48 +01:00
Marco Elver
8c469d1693 [SanitizerBinaryMetadata] Make constructors/destructors hidden
By switching them to external with default visibility, DSOs may not call
their own constructor/destructor. This is incorrect, because they pass
different parameters.

Fix it by marking the ctors/dtors as external linkage but with hidden
visibility.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D143611
2023-02-09 00:46:28 +01:00
Marco Elver
bf9814b705 [SanitizerBinaryMetadata] Emit constants as ULEB128
Emit all constant integers produced by SanitizerBinaryMetadata as
ULEB128 to further reduce binary space used. Increasing the version is
not necessary given this change depends on (and will land) along with
the bump to v2.

To support this, the !pcsections metadata format is extended to allow
for per-section options, encoded in the first MD operator which must
always be a string and contain the section: "<section>!<options>".

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143484
2023-02-08 13:12:34 +01:00
Marco Elver
3d53b52730 [SanitizerBinaryMetadata] Optimize used space for features and UAR stack args
Optimize the encoding of "covered" metadata by:

 1. Reducing feature mask from 4 bytes to 1 byte (needs increase once we
    reach more than 8 features).

 2. Only emitting UAR stack args size if it is non-zero, saving 4 bytes
    in the common case.

One caveat is that the emitted metadata for function PC (offset), size,
and UAR size (if enabled) are no longer aligned to 4 bytes.

SanitizerBinaryMetadata version base is increased to 2, since the change
is backwards incompatible.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143482
2023-02-08 13:12:33 +01:00
Fangrui Song
6ce8e716bf [SanitizerBinaryMetadata] Make module_[cd]tor external
If a COMDAT key has a local linkage, it behaves as `comdat nodeduplicate` and
llvm/lib/Linker/LinkModules.cpp does not deduplicate its members.
This is not intended. Switch to an external linkage to allow deduplication.

See also https://maskray.me/blog/2021-07-25-comdat-and-section-group#grp_comdat

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D143530
2023-02-08 00:58:39 -08:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Marco Elver
960b4c3b5d [SanitizerBinaryMetadata] Treat constant globals and non-escaping addresses specially
For atomics metadata, we can make data race analysis more efficient by
entirely ignoring functions that include memory accesses but which only
access non-escaping (non-shared) and/or non-mutable memory. Such
functions will not be considered to be covered by "atomics" metadata,
resulting in the following benefits:

  1. reduces "covered" metadata; and
  2. allows data race analysis to skip such functions.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D143159
2023-02-03 15:35:24 +01:00
Marco Elver
764c88a50a [SanitizerBinaryMetadata] Pretend compiler-generated loads/stores are atomic
Profiling and GCOV generate known data-racy loads/stores. Pretend they
are atomic so that analysis using PC-keyed metadata for atomics do not
report them as data races (which would look like false positives).

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D142982
2023-01-31 17:30:43 +01:00
Marco Elver
5265adc737 [SanitizerBinaryMetadata] Declare callbacks extern weak
Declare callbacks extern weak (if no existing declaration exists), and
only call if the function address is non-null.

This allows to attach semantic metadata to binaries where no user of
that metadata exists, avoiding to have to link empty stub callbacks.

Once the binary is linked (statically or dynamically) against a tool
runtime that implements the callbacks, the respective callbacks will be
called. This vastly simplifies gradual deployment of tools using the
metadata, esp. avoiding having to recompile large codebases with
different compiler flags (which negatively impacts compiler caches).

Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D142408
2023-01-24 12:54:20 +01:00
Dmitry Vyukov
f7f01599ec sanmd: refine selection of functions for UAR checking
There are no intrinsic functions that leak arguments.
If the called function does not return, the current function
does not return as well, so no possibility of use-after-return.
Sanitizer function also don't leak or don't return.
It's safe to both pass pointers to local variables to them
and to tail-call them.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D142190
2023-01-21 09:51:15 +01:00
Arthur Eubanks
2329a9266d Revert "sanmd: refine selection of functions for UAR checking"
This reverts commit 9d4f1a9eff27716069dc6a2d991baa228c197b85.

Breaks under -DCOMPILER_RT_BUILD_SANITIZERS=OFF
2023-01-20 13:40:50 -08:00
Dmitry Vyukov
9d4f1a9eff sanmd: refine selection of functions for UAR checking
There are no intrinsic functions that leak arguments.
If the called function does not return, the current function
does not return as well, so no possibility of use-after-return.
Sanitizer function also don't leak or don't return.
It's safe to both pass pointers to local variables to them
and to tail-call them.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D142190
2023-01-20 16:22:37 +01:00
Fangrui Song
21c4dc7997 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This fixes clang.
2022-12-17 00:42:05 +00:00
Dmitry Vyukov
5addb736a9 sanmd: improve precision of UAR analysis
Only mark functions that have address-taken locals
as requiring UAR checking.

On a large internal app this reduces number of marked functions
from 78441 to 66618. Mostly small, trivial getter/setter-type
functions are unmarked, but also some amount of larger
number-crunching-type functions are unmarked as well.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D139811
2022-12-12 11:41:59 +01:00
Dmitry Vyukov
dbe8c2c316 Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-12-05 14:40:31 +01:00
Fangrui Song
eecb22d8e1 [SanitizerBinaryMetadata] Use weak __start_/__stop_ instead of dummy empty section
D130887 uses a dummy empty section `sanmd_covered` (with the SHF_GNU_RETAIN flag on
ELF) to prevent `undefined symbol: __start_sanmd_covered` if all `sanmd_covered`
are discarded by `ld --gc-sections` (in `-z start-stop-gc` mode).

The dummy `sanmd_covered` does not have the SHF_LINK_ORDER flag, so mixing it
with SHF_LINK_ORDER `sanmd_covered` causes an issue to GNU ld<2.36
(https://sourceware.org/bugzilla/show_bug.cgi?id=26256).

Similar to D98903 for SanitizerCoverage, let's make encapsulation symbols
undefined weak[1]. This additionally avoids size cost due to the dummy section and
symbol.

[1]: https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D139276
2022-12-04 15:06:34 -08:00
Marco Elver
b95646fe70 Revert "Use-after-return sanitizer binary metadata"
This reverts commit d3c851d3fc8b69dda70bf5f999c5b39dc314dd73.

Some bots broke:

- https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8796062278266465473/overview
- https://lab.llvm.org/buildbot/#/builders/124/builds/5759/steps/7/logs/stdio
2022-11-30 23:35:50 +01:00
Dmitry Vyukov
d3c851d3fc Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-30 14:50:22 +01:00
Dmitry Vyukov
0aedf9d714 Revert "Use-after-return sanitizer binary metadata"
This reverts commit e6aea4a5db09c845276ece92737a6aac97794100.

Broke tests:
https://lab.llvm.org/buildbot/#/builders/16/builds/38992
2022-11-30 09:38:56 +01:00
Dmitry Vyukov
e6aea4a5db Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-30 09:14:19 +01:00
Kazu Hirata
dbb1130966 Revert "Use-after-return sanitizer binary metadata"
This reverts commit a1255dc467f7ce57a966efa76bbbb4ee91d9115a.

This patch results in:

  llvm/lib/CodeGen/SanitizerBinaryMetadata.cpp:57:17: error: no member
  named 'size' in 'llvm::MDTuple'
2022-11-29 09:04:00 -08:00
Dmitry Vyukov
a1255dc467 Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-11-29 17:37:36 +01:00
Marco Elver
97c2220565 [SanitizerBinaryMetadata] Introduce SanitizerBinaryMetadata instrumentation pass
Introduces the SanitizerBinaryMetadata instrumentation pass which uses
the new MD_pcsections metadata kinds to instrument certain types of
instructions and functions required for breakpoint-based sanitizers.

The first intended user of the binary metadata emitted will be a variant
of GWP-TSan [1]. GWP-TSan will require information about atomic
accesses; to unambiguously determine if an access is atomic or not, we
also require "covered" information which code has been compiled with
SanitizerBinaryMetadata instrumentation enabled.

[1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D130887
2022-09-07 21:25:40 +02:00