Most sanitizers don't support static linking. One primary reason is the
incompatibility with interceptors. `GetTlsSize` is another reason.
asan/memprof use `__interception::DoesNotSupportStaticLinking`
(`_DYNAMIC` reference) to reject -static at link time. Port this
detector to other sanitizers. dfsan actually supports -static for
certain cases. Don't touch dfsan.
s390x is one of the architectures where the "long double" type was changed
from a 64-bit IEEE to a 128-bit IEEE type back in the glibc 2.4 days.
This means that glibc still exports two versions of the long double functions
(those that already existed back then), and we have to intercept the correct
version. There is already an existing define SANITIZER_NLDBL_VERSION that
indicates this situation, we simply have to respect it when intercepting
strtold and wcstold.
In addition, on s390x a long double return value is passed in memory via
implicit reference. This means the interceptor for functions returning
long double has to unpoison that memory slot, or else we will get
false-positive uninitialized memory reference warnings when the caller
accesses that return value - similar to what is already done in the
mallinfo interceptor. Create a variant macro INTERCEPTOR_STRTO_SRET and
use it on s390x.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D159378
SANITIZER_GLIBC is always defined so should be tested with an if not an
ifdef.
Fixes: ad7e2501000d
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D159041
`strtol("0b1", 0, 0)` can be (pre-C23) 0 or (C23) 1.
`sscanf("0b10", "%i", &x)` is similar. glibc 2.38 introduced
`__isoc23_strtol` and `__isoc23_scanf` family functions for binary
compatibility.
When `_ISOC2X_SOURCE` is defined (implied by `_GNU_SOURCE`) or
`__STDC_VERSION__ > 201710L`, `__GLIBC_USE_ISOC2X` is defined to 1 and
these `__isoc23_*` symbols are used.
Add `__isoc23_` versions for the following interceptors:
* sanitizer_common_interceptors.inc implements strtoimax/strtoumax.
Remove incorrect FIXME about https://github.com/google/sanitizers/issues/321
* asan_interceptors.cpp implements just strtol and strtoll. The default
`replace_str` mode checks `nptr` is readable and `endptr` is writable.
atoi reuses the existing strtol interceptor.
* msan_interceptors.cpp implements strtol family functions and their
`_l` versions. Tested by lib/msan/tests/msan_test.cpp
* sanitizer_common_interceptors.inc implements scanf family functions.
The strtol family functions are spreaded, which is not great, but the
patch (intended for release/17.x) does not attempt to address the issue.
Add symbols to lib/sanitizer_common/symbolizer/scripts/global_symbols.txt to
support both glibc pre-2.38 and 2.38.
When build bots migrate to glibc 2.38+, we will lose test coverage for
non-isoc23 versions since the existing C++ unittests imply `_GNU_SOURCE`.
Add test/sanitizer_common/TestCases/{strtol.c,scanf.c}.
They catch msan false positive in the absence of the interceptors.
Fix https://github.com/llvm/llvm-project/issues/64388
Fix https://github.com/llvm/llvm-project/issues/64946
Link: https://lists.gnu.org/archive/html/info-gnu/2023-07/msg00010.html
("The GNU C Library version 2.38 is now available")
Reviewed By: #sanitizers, vitalybuka, mgorny
Differential Revision: https://reviews.llvm.org/D158943
Currently if a program calls sigaction very early (before non-lazy sanitizer
initialization, in particular if .preinit_array initialization is not enabled),
then sigaction will wrongly fail since the interceptor is not initialized yet.
In all other interceptors we do lazy runtime initialization for this reason,
but we don't do it in the signal interceptors.
Do lazy runtime initialization in signal interceptors as well.
Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D155188
Relanding with #if SANITIZER_GLIBC to avoid breaking FreeBSD.
Also incorporates Arthur's BUILD.gn fix (thanks!) from https://reviews.llvm.org/rGc1e283851772ba494113311405d48cfb883751d1
Original commit message:
This patch adds an msan interceptor for dladdr1 (with support for RTLD_DL_LINKMAP and RTLD_DL_SYMENT) and an accompanying test. It also adds a helper file, msan_dl.cpp, that contains UnpoisonDllAddrInfo (refactored out of the dladdr interceptor) and UnpoisonDllAddr1ExtraInfo.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D154272
Reland with -Wcast-qual issue fixed
Original commit message:
This patch adds an msan interceptor for dladdr1 (with support for RTLD_DL_LINKMAP and RTLD_DL_SYMENT) and an accompanying test. It also adds a helper file, msan_dl.cpp, that contains UnpoisonDllAddrInfo (refactored out of the dladdr interceptor) and UnpoisonDllAddr1ExtraInfo.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D154272
This patch adds an msan interceptor for dladdr1 (with support
for RTLD_DL_LINKMAP and RTLD_DL_SYMENT) and an accompanying
test. It also adds a helper file, msan_dl.cpp, that contains
UnpoisonDllAddrInfo (refactored out of the dladdr interceptor)
and UnpoisonDllAddr1ExtraInfo.
Differential Revision: https://reviews.llvm.org/D154272
D135716 introduced -ftrivial-auto-var-init=pattern where supported.
Unfortunately this introduces unwanted memset() for large stack arrays,
as shown by the new tests added for asan and msan (tsan already had this
test).
In general, the problem of compiler-inserted memintrinsic calls
(memset/memcpy/memmove) is not new to compiler-rt, and has been a
problem before.
To avoid introducing unwanted memintrinsic calls, we redefine
memintrinsics as __sanitizer_internal_mem* at the assembly level for
most source files automatically (where sanitizer_common_internal_defs.h
is included).
In few cases, redefining a symbol in this way causes issues for
interceptors, namely the memintrinsic interceptor themselves. For such
source files we have to selectively disable the redefinition.
Other alternatives have been considered, but simply do not work well in
the context of compiler-rt:
1. Linker --wrap: this does not work because --wrap only
applies to the final link, and would not apply when building
sanitizer static libraries.
2. Changing references to memset() via objcopy: this may work,
but due to the complexities of the build system, introducing
such a post-processing step for the right object files (in
particular object files defining memset cannot be touched)
seems infeasible.
The chosen solution works well (as shown by the tests). Other libraries
have chosen the same solution where nothing else works (see e.g. glibc's
"symbol-hacks.h").
v4:
- Add interface attribute to __sanitizer_internal_mem* declarations as
well, as otherwise some compilers (MSVC) will complain.
- Add SANITIZER_COMMON_NO_REDEFINE_BUILTINS to source files using
C++STL, since this could lead to ODR violations (see added comment).
v3:
- Don't use ALIAS() to alias internal_mem*() functions to
__sanitizer_internal_mem*() functions, but just define them as
ALWAYS_INLINE functions instead. This will work on darwin and windows.
v2:
- Fix ubsan_minimal build where compiler decides to insert
memset/memcpy: ubsan_minimal has work without RTSanitizerCommonLibc,
therefore do not redefine the builtins.
- Fix definition of internal_mem* functions with compilers that want the
aliased function to already be defined before.
- Fix definition of __sanitizer_internal_mem* functions with compilers
more pedantic about attribute placement around extern "C".
Reviewed By: vitalybuka, dvyukov
Differential Revision: https://reviews.llvm.org/D151152
This reverts commit fc011a72881cdddc95bfa61f3f38916c29b7e362.
This reverts commit 4ad6a0c9a409b19b950a6a2a90d5405cea2e9b89.
This reverts commit 4b1eb4cf0e8eff5f68410720167b4986da597010.
Still causes Windows build bots to fail.
D135716 introduced -ftrivial-auto-var-init=pattern where supported.
Unfortunately this introduces unwanted memset() for large stack arrays,
as shown by the new tests added for asan and msan (tsan already had this
test).
In general, the problem of compiler-inserted memintrinsic calls
(memset/memcpy/memmove) is not new to compiler-rt, and has been a
problem before.
To avoid introducing unwanted memintrinsic calls, we redefine
memintrinsics as __sanitizer_internal_mem* at the assembly level for
most source files automatically (where sanitizer_common_internal_defs.h
is included).
In few cases, redefining a symbol in this way causes issues for
interceptors, namely the memintrinsic interceptor themselves. For such
source files we have to selectively disable the redefinition.
Other alternatives have been considered, but simply do not work well in
the context of compiler-rt:
1. Linker --wrap: this does not work because --wrap only
applies to the final link, and would not apply when building
sanitizer static libraries.
2. Changing references to memset() via objcopy: this may work,
but due to the complexities of the build system, introducing
such a post-processing step for the right object files (in
particular object files defining memset cannot be touched)
seems infeasible.
The chosen solution works well (as shown by the tests). Other libraries
have chosen the same solution where nothing else works (see e.g. glibc's
"symbol-hacks.h").
v3:
- Don't use ALIAS() to alias internal_mem*() functions to
__sanitizer_internal_mem*() functions, but just define them as
ALWAYS_INLINE functions instead. This will work on darwin and windows.
v2:
- Fix ubsan_minimal build where compiler decides to insert
memset/memcpy: ubsan_minimal has work without RTSanitizerCommonLibc,
therefore do not redefine the builtins.
- Fix definition of internal_mem* functions with compilers that want the
aliased function to already be defined before.
- Fix definition of __sanitizer_internal_mem* functions with compilers
more pedantic about attribute placement around extern "C".
Reviewed By: vitalybuka, dvyukov
Differential Revision: https://reviews.llvm.org/D151152
This reverts commit 4369de7af46605522bf7dbe3bc31d00b0eb4bee6.
Fails on Mac OS with "sanitizer_libc.cpp:109:5: error: aliases are not
supported on darwin".
D135716 introduced -ftrivial-auto-var-init=pattern where supported.
Unfortunately this introduces unwanted memset() for large stack arrays,
as shown by the new tests added for asan and msan (tsan already had this
test).
In general, the problem of compiler-inserted memintrinsic calls
(memset/memcpy/memmove) is not new to compiler-rt, and has been a
problem before.
To avoid introducing unwanted memintrinsic calls, we redefine
memintrinsics as __sanitizer_internal_mem* at the assembly level for
most source files automatically (where sanitizer_common_internal_defs.h
is included).
In few cases, redefining a symbol in this way causes issues for
interceptors, namely the memintrinsic interceptor themselves. For such
source files we have to selectively disable the redefinition.
Other alternatives have been considered, but simply do not work well in
the context of compiler-rt:
1. Linker --wrap: this does not work because --wrap only
applies to the final link, and would not apply when building
sanitizer static libraries.
2. Changing references to memset() via objcopy: this may work,
but due to the complexities of the build system, introducing
such a post-processing step for the right object files (in
particular object files defining memset cannot be touched)
seems infeasible.
The chosen solution works well (as shown by the tests). Other libraries
have chosen the same solution where nothing else works (see e.g. glibc's
"symbol-hacks.h").
v2:
- Fix ubsan_minimal build where compiler decides to insert
memset/memcpy: ubsan_minimal has work without RTSanitizerCommonLibc,
therefore do not redefine the builtins.
- Fix definition of internal_mem* functions with compilers that want the
aliased function to already be defined before.
- Fix definition of __sanitizer_internal_mem* functions with compilers
more pedantic about attribute placement around extern "C".
Reviewed By: vitalybuka, dvyukov
Differential Revision: https://reviews.llvm.org/D151152
D135716 introduced -ftrivial-auto-var-init=pattern where supported.
Unfortunately this introduces unwanted memset() for large stack arrays,
as shown by the new tests added for asan and msan (tsan already had this
test).
In general, the problem of compiler-inserted memintrinsic calls
(memset/memcpy/memmove) is not new to compiler-rt, and has been a
problem before.
To avoid introducing unwanted memintrinsic calls, we redefine
memintrinsics as __sanitizer_internal_mem* at the assembly level for
most source files automatically (where sanitizer_common_internal_defs.h
is included).
In few cases, redefining a symbol in this way causes issues for
interceptors, namely the memintrinsic interceptor themselves. For such
source files we have to selectively disable the redefinition.
Other alternatives have been considered, but simply do not work well in
the context of compiler-rt:
1. Linker --wrap: this does not work because --wrap only
applies to the final link, and would not apply when building
sanitizer static libraries.
2. Changing references to memset() via objcopy: this may work,
but due to the complexities of the build system, introducing
such a post-processing step for the right object files (in
particular object files defining memset cannot be touched)
seems infeasible.
The chosen solution works well (as shown by the tests). Other libraries
have chosen the same solution where nothing else works (see e.g. glibc's
"symbol-hacks.h").
Reviewed By: vitalybuka, dvyukov
Differential Revision: https://reviews.llvm.org/D151152
This moves memintrinsic interceptors (memcpy/memmove/memset) into a new
file sanitizer_common_interceptors_memintrinsics.inc.
This is in preparation of redefining builtins, however, we must be
careful to not redefine builtins in TUs that define interceptors of the
same name.
In all cases except for MSan, memintrinsic interceptors were moved to a
new TU $tool_interceptors_memintrinsics.cpp. In the case of MSan, it
turns out this is not yet necessary (as shown by the later patch
introducing memcpy tests).
NFC.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D151552
Most uses of ALIAS() are in conjunction with WRAPPER_NAME().
Simplify the code and just make ALIAS() turn its argument into a string
(similar to Linux kernel's __alias macro). This in turn allows removing
WRAPPER_NAME().
NFC.
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D151216
These are aliases on musl and as of 1.2.4, aren't visible by default
(only with -D_LARGEFILE64_SOURCE), and will be removed entirely in a future
release.
This fixes a runtime failure with msan on musl-1.2.4:
```
$ echo 'int main(){}' | clang -x c - -fsanitize=memory -o /dev/null
/usr/bin/x86_64-gentoo-linux-musl-ld.bfd: /usr/lib/llvm/16/bin/../../../../lib/clang/16/lib/linux/libclang_rt.msan-x86_64.a(msan_interceptors.cpp.o): in function `__interceptor_getrlimit64':
[...]
```
Bug: https://bugs.gentoo.org/906603
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D150925
glibc >= 2.33 uses shared functions for stat family functions.
D111984 added support for non-64 bit variants but they
do not appear to be enough as we have been noticing msan
errors on 64-bit stat variants on Chrome OS.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D121652
In glibc before 2.33, include/sys/stat.h defines fstat/fstat64 to
`__fxstat/__fxstat64` and provides `__fxstat/__fxstat64` in libc_nonshared.a.
The symbols are glibc specific and not needed on other systems.
Reviewed By: vitalybuka, #sanitizers
Differential Revision: https://reviews.llvm.org/D118423
A signal handler can alter ucontext_t to affect execution after
the signal returns. Check that the contents are initialized.
Restoring unitialized values in registers can't be good.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D116209
ucontext_t can be larger than its static size if it contains
AVX state and YMM/ZMM registers.
Currently a signal handler that tries to access that state
can produce false positives with random origins on stack.
Account for the additional ucontext_t state.
Reviewed By: vitalybuka
Differential Revision: https://reviews.llvm.org/D116208
It should be NFC, as they already intercept pthread_create.
This will let us to fix BackgroundThread for these sanitizerts.
In in followup patches I will fix MaybeStartBackgroudThread for them
and corresponding tests.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D114935
Since glibc 2.34, dlsym does
1. malloc 1
2. malloc 2
3. free pointer from malloc 1
4. free pointer from malloc 2
These sequence was not handled by trivial dlsym hack.
This fixes https://bugs.llvm.org/show_bug.cgi?id=52278
Reviewed By: eugenis, morehouse
Differential Revision: https://reviews.llvm.org/D112588
If async signal handler called when we MsanThread::Init
signal handler may trigger false reports.
I failed to reproduce this locally for a test.
Differential Revision: https://reviews.llvm.org/D113328
Add following interceptors on Linux: stat, lstat, fstat, fstatat.
This fixes use-of-uninitialized value on platforms with GLIBC 2.33+.
In particular: Arch Linux, Ubuntu hirsute/impish.
The tests should have also been failing during the release on the mentioned platforms, but I cannot find any related discussion.
Most likely, the regression was introduced by glibc commit [[ 8ed005daf0 | 8ed005daf0ab03e14250032 ]]:
all stat-family functions are now exported as shared functions.
Before, some of them (namely stat, lstat, fstat, fstatat) were provided as a part of libc_noshared.a and called their __xstat dopplegangers. This is still true for Debian Sid and earlier Ubuntu's. stat interceptors may be safely provided for them, no problem with that.
Closes https://github.com/google/sanitizers/issues/1452.
See also https://jira.mariadb.org/browse/MDEV-24841
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D111984
llvm-libc is expected to be built with sanitizers and not use interceptors in
the long run. For now though, we have a hybrid process, where functions
implemented in llvm-libc are instrumented, and glibc fills and sanitizer
interceptors fill in the rest.
Current sanitizers have an invariant that the REAL(...) function called from
inside of an interceptor is uninstrumented. A lot of interceptors call strlen()
in order to figure out the size of the region to check/poison. Switch these
callsites over to the internal, unsanitized implementation.
Reviewed By: hctim, vitalybuka
Differential Revision: https://reviews.llvm.org/D108316