173 Commits

Author SHA1 Message Date
Yussur Mustafa Oraji
ded1f3ec96
[TSan] Add option to ignore capturing behavior when instrumenting (#148156)
While not needed for most applications, some tools such as
[MUST](https://www.i12.rwth-aachen.de/cms/i12/forschung/forschungsschwerpunkte/lehrstuhl-fuer-hochleistungsrechnen/~nrbe/must/)
depend on the instrumentation being present.
MUST uses the ThreadSanitizer annotation interface to detect data races
in MPI programs, where the capture tracking is detrimental as it has no
bearing on MPI data races, leading to missed races.
2025-08-06 15:47:33 +02:00
Kunqiu Chen
355725a25e
[TSan] Fix missing inst cleanup (#144067)
Commit 44e875ad5b2ce26826dd53f9e7d1a71436c86212 introduced a change that
replaces `ReplaceInstWithInst` with `Instruction::replaceAllUsesWith`,
without subsequent instruction cleanup.

This results in TSan leaving behind useless `load atomic` instructions
after 'replacing' them.

This commit adds cleanup back, consistent with the context.
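
A minimal sketch of why the cleanup matters, using a toy IR rather than LLVM's own classes. `Inst`, `Block`, and `replaceInst` are hypothetical names introduced only for illustration; the real fix operates on `llvm::Instruction`. Redirecting uses alone leaves the original instruction in the block, so it has to be erased explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR instruction: a name plus the values it reads.
struct Inst {
  std::string Name;
  std::vector<Inst *> Operands;
};

// Toy stand-in for a basic block that owns its instructions.
struct Block {
  std::vector<std::unique_ptr<Inst>> Insts;

  Inst *add(std::string Name, std::vector<Inst *> Ops = {}) {
    Insts.push_back(
        std::make_unique<Inst>(Inst{std::move(Name), std::move(Ops)}));
    return Insts.back().get();
  }

  // Redirect every use of Old to New. Old itself stays in the block.
  void replaceAllUsesWith(Inst *Old, Inst *New) {
    for (auto &I : Insts)
      std::replace(I->Operands.begin(), I->Operands.end(), Old, New);
  }

  // Remove I from the block; only valid once nothing uses it any more.
  void erase(Inst *I) {
    for (auto &Other : Insts)
      assert(std::count(Other->Operands.begin(), Other->Operands.end(), I) == 0);
    Insts.erase(std::remove_if(Insts.begin(), Insts.end(),
                               [I](const std::unique_ptr<Inst> &P) {
                                 return P.get() == I;
                               }),
                Insts.end());
  }
};

// Replace Old with New. Cleanup selects whether the dead original is erased.
// Returns the number of instructions left in the block.
inline std::size_t replaceInst(Block &B, Inst *Old, Inst *New, bool Cleanup) {
  B.replaceAllUsesWith(Old, New);
  if (Cleanup)
    B.erase(Old);
  return B.Insts.size();
}
```

With `Cleanup` set to `false` the replaced instruction is still counted in the block, which mirrors the leftover `load atomic` described above.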
2025-06-18 17:09:32 +08:00
Jeremy Morse
9eb0020555
[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389)
Seeing how we can't generate any debug intrinsics any more: delete a
variety of codepaths where they're handled. For the most part these are
plain deletions, in others I've tweaked comments to remain coherent, or
added a type to (what was) type-generic-lambdas.

This isn't all the DbgInfoIntrinsic call sites but it's most of the
simple scenarios.

Co-authored-by: Nikita Popov <github@npopov.com>
2025-06-17 15:55:14 +01:00
Camsyn
59b26abbbe
[TSan, SanitizerBinaryMetadata] Analyze the capture status for alloca rather than arbitrary Addr (#132756)
This PR is based on my last PR #132752 (the first commit of this PR),
but addressing a different issue.

This commit addresses the limitation in `PointerMayBeCaptured` analysis
when dealing with derived pointers (e.g. arr+1) as described in issue
#132739.

The current implementation of `PointerMayBeCaptured` may miss captures
of the underlying `alloca` when analyzing derived pointers, leading to
false negatives (FNs) in TSan, as in the following example:
```cpp
void *Thread(void *a) {
  ((int*)a)[1] = 43;
  return 0;
}

int main() {
  int Arr[2] = {41, 42};
  pthread_t t;
  pthread_create(&t, 0, Thread, &Arr[0]);
  // Missed instrumentation here due to the FN of PointerMayBeCaptured
  Arr[1] = 43;
  barrier_wait(&barrier);
  pthread_join(t, 0);
}
```
Refer to this [godbolt page](https://godbolt.org/z/n67GrxdcE) to get the
compilation result of TSan.

Even when `PointerMayBeCaptured` works correctly, it must first backtrack
to the original `alloca` during analysis, which duplicates the work of
the enclosing `findAllocaForValue` call.
```cpp
    const AllocaInst *AI = findAllocaForValue(Addr);
    // Instead of Addr, we should check whether its base pointer is captured.
    if (AI && !PointerMayBeCaptured(Addr, true)) ...
```

Key changes:
Directly analyze the capture status of the underlying `alloca` instead
of derived pointers to ensure accurate capture detection
```cpp
    const AllocaInst *AI = findAllocaForValue(Addr);
    // Instead of Addr, we should check whether its base pointer is captured.
    if (AI && !PointerMayBeCaptured(AI, true)) ...
```
2025-04-24 10:48:07 +02:00
Camsyn
bf6986f9f0
[TSan, SanitizerBinaryMetadata] Improve instrument for derived pointers via phis/selects (#132752)
ThreadSanitizer.cpp and SanitizerBinaryMetadata.cpp previously used
`getUnderlyingObject` to check if pointers originate from stack objects.

However, `getUnderlyingObject()` by default only looks through linear
chains, not selects/phis. In particular, this means that we miss cases
involving pointer induction variables.

For instance,
```llvm
%stkobj = alloca [2 x i32], align 8
; getUnderlyingObject(%derived) = %derived
%derived = getelementptr inbounds i32, ptr %stkobj, i64 1
```

This results in redundant TSan instrumentation and therefore higher
runtime cost, especially in loops; see this
[godbolt page](https://godbolt.org/z/eaT1fPjTW) for details.
```cpp
char loop(int x) {
    char buf[10];
    char *p = buf;
    for (int i = 0; i < x && i < 10; i++) {
      // Should not instrument, as its base object is a non-captured stack
      // variable.
      // However, currently, it is instrumented due to %p = %phi ...
      *p++ = i;
    }

    // Use buf to prevent it from being eliminated by optimization
    return buf[9];
}
```

There are TWO APIs `getUnderlyingObjectAggressive` and
`findAllocaForValue` that can backtrack the pointer via tree traversal,
supporting phis/selects.

This patch replaces `getUnderlyingObject` with `findAllocaForValue`
which:
1. Properly tracks through PHINodes and select operations
2. Directly identifies if a pointer comes from an `AllocaInst`

Performance impact:
- Compilation: Moderate cost increase due to wider value tracing, but...
- Runtime: Significant wins for code with pointer induction variables
derived from stack allocas, especially for loop-heavy code, as
instrumentation can now be safely omitted.
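
The difference between a linear walk and a tree traversal can be sketched on a toy value graph. This is not LLVM code: `Value`, `Kind`, `underlyingObjectLinear`, and `findUniqueAlloca` are hypothetical names, and the real `findAllocaForValue` applies its own rules that are not modelled here.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy pointer-producing value kinds.
enum class Kind { Alloca, Global, GEP, Phi, Select };

struct Value {
  Kind K;
  std::vector<const Value *> Ops; // pointer operands this value derives from
};

// Linear walk: follows GEP chains only and stops at phis/selects.
inline const Value *underlyingObjectLinear(const Value *V) {
  while (V->K == Kind::GEP)
    V = V->Ops[0];
  return V;
}

// Tree walk: also looks through phis/selects. Returns the single alloca that
// every path reaches, or nullptr if a path ends elsewhere or allocas differ.
inline const Value *findUniqueAlloca(const Value *V) {
  std::set<const Value *> Visited;
  std::vector<const Value *> Worklist{V};
  const Value *Result = nullptr;
  while (!Worklist.empty()) {
    const Value *Cur = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(Cur).second)
      continue; // already seen, e.g. via a loop-carried phi
    switch (Cur->K) {
    case Kind::Alloca:
      if (Result && Result != Cur)
        return nullptr; // two different stack objects
      Result = Cur;
      break;
    case Kind::Global:
      return nullptr; // not a stack object
    case Kind::GEP:
    case Kind::Phi:
      // fall through to the shared operand handling below
    case Kind::Select:
      for (const Value *Op : Cur->Ops)
        Worklist.push_back(Op);
      break;
    }
  }
  return Result;
}
```

For a pointer induction variable (`p = phi(buf, next)`, `next = gep p`), the linear walk stops at the phi, while the tree walk reaches `buf`.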
2025-04-17 10:09:07 +02:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
Nikita Popov
9cbdcfcafd [CaptureTracking] Remove StoreCaptures parameter (NFC)
The implementation doesn't use it, and is unlikely to use it in
the future.

The places that do set StoreCaptures=false, do so incorrectly and
would be broken if the parameter actually did anything.
2025-02-24 12:00:57 +01:00
Jeremy Morse
6292a808b3
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; however, to ensure some type safety, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.

This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).

---------

Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2025-01-24 13:27:56 +00:00
Kazu Hirata
4d12a14357
[Instrumentation] Remove unused includes (NFC) (#115117)
Identified with misc-include-cleaner.
2024-11-06 08:36:34 -08:00
Jay Foad
d9c95efb6c
[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546)
Convert almost every instance of:
  CreateCall(Intrinsic::getOrInsertDeclaration(...), ...)
to the equivalent CreateIntrinsic call.
2024-10-16 15:43:30 +01:00
Rahul Joshi
fa789dffb1
[NFC] Rename Intrinsic::getDeclaration to getOrInsertDeclaration (#111752)
Rename the function to reflect its correct behavior and to be consistent
with `Module::getOrInsertFunction`. This is also in preparation of
adding a new `Intrinsic::getDeclaration` that will have behavior similar
to `Module::getFunction` (i.e, just lookup, no creation).
2024-10-11 05:26:03 -07:00
Antonio Frighetto
2ae968a0d9
[Instrumentation] Move out to Utils (NFC) (#108532)
Utility functions have been moved out to Utils. Minor opportunity to
drop the header where not needed.
2024-09-15 21:07:40 -07:00
Chaitanya
62ced8116b
[Sanitizer] Make sanitizer passes idempotent (#99439)
This PR changes the sanitizer passes to be idempotent.
When a sanitizer pass is run after it has already been run,
double instrumentation is seen in the resulting IR. This happens because
the pass has no check to verify whether the IR has already been
instrumented.

This PR checks whether a "nosanitize_*" module flag is already present
and, if so, returns early without running the pass again.
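
The early-return check can be sketched with a toy module that just records flags. `Module`, `runToySanitizerPass`, and the concrete flag string used in the usage note are assumptions for illustration; only the "nosanitize_*" naming pattern comes from the description above.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy module: a set of module flags plus a counter of instrumentation runs.
struct Module {
  std::set<std::string> Flags;
  int InstrumentationRuns = 0;
};

// Returns true if the module was changed, false if the pass bailed out early.
inline bool runToySanitizerPass(Module &M, const std::string &FlagName) {
  assert(!FlagName.empty());
  if (M.Flags.count(FlagName))
    return false; // already instrumented: do nothing
  ++M.InstrumentationRuns; // stands in for inserting the instrumentation
  M.Flags.insert(FlagName); // mark the module so a second run is a no-op
  return true;
}
```

Calling `runToySanitizerPass(M, "nosanitize_thread")` twice changes the module only on the first call.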
2024-08-12 11:16:44 +05:30
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Vitaly Buka
a441645f80
[tsan] Don't crash on vscale (#91018)
Co-authored-by: Heejin Ahn <aheejin@gmail.com>
2024-05-03 16:29:26 -07:00
Nikita Popov
8a237ab7d9 [TSan] Avoid use of ReplaceInstWithInst()
This is mainly for consistency across code paths, but also makes
sure that all calls use IRInstrumentationBuilder and its special
debuginfo handling.

The two remaining uses don't actually need RAUW, they just have
to erase the original instruction.
2024-03-15 09:21:27 +01:00
Nikita Popov
ff2fb2a1d7
[TSan] Fix atomicrmw xchg with pointer and floats (#85228)
atomicrmw xchg also accepts pointer and floating-point values. To handle
those, insert necessary casts to and from integer. This is what we do
for cmpxchg as well.

Fixes https://github.com/llvm/llvm-project/issues/85226.
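
The cast-to-integer idea can be sketched in plain C++, outside of any LLVM API: the exchange runs on integer storage and the float is converted to and from its bit pattern at the boundary. `floatToBits`, `bitsToFloat`, and `exchangeFloat` are hypothetical helper names, and the sketch assumes a 32-bit `float`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterpret the bits of a float as a 32-bit integer.
inline std::uint32_t floatToBits(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof Bits);
  return Bits;
}

// Reinterpret a 32-bit integer as the float with the same bit pattern.
inline float bitsToFloat(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof F);
  return F;
}

// Atomic exchange on integer storage, with float values at the interface.
inline float exchangeFloat(std::atomic<std::uint32_t> &Slot, float NewValue) {
  std::uint32_t Old =
      Slot.exchange(floatToBits(NewValue), std::memory_order_seq_cst);
  return bitsToFloat(Old);
}
```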
2024-03-15 09:02:10 +01:00
Simon Pilgrim
3ca4fe80d4 [Transforms] Use StringRef::starts_with/ends_with instead of startswith/endswith. NFC.
startswith/endswith wrap starts_with/ends_with and will eventually go away (to more closely match string_view)
2023-11-06 16:50:18 +00:00
Fangrui Song
5624e86ae0 [tsan] Respect !nosanitize metadata and remove gcov special case
Certain instrumentations set the !nosanitize metadata for inserted
instructions, which are generally not of interest to sanitizers. Skip
tsan instrumentation like we do for asan (D126294)/msan/hwasan.

-fprofile-arcs instrumentation has a data race unless
-fprofile-update=atomic is specified. Let's remove the `__llvm_gcov`
special case from commit 0222adbcd25779a156399bcc16fde9f6d083a809 (2016)
as the racy instructions have the !nosanitize metadata.
(-fprofile-arcs instrumentation does not use `__llvm_gcda` as global variables.)

```
#include <atomic>
#include <thread>

std::atomic<int> c;
void foo() { c++; }
int main() {
  std::thread th(foo);
  c++;
  th.join();
}
```
Tested that `clang++ --coverage -fsanitize=thread a.cc && ./a.out` does
not report spurious tsan errors.

Also remove the default CC1 option -fprofile-update=atomic for
-fsanitize=thread to make options more orthogonal.

Reviewed By: Enna1

Differential Revision: https://reviews.llvm.org/D158385
2023-08-24 22:31:11 -07:00
Bjorn Pettersson
e6e9a87534 Drop some typed pointer handling
Differential Revision: https://reviews.llvm.org/D156739
2023-08-02 12:08:37 +02:00
Nikita Popov
9cf5254878 [llvm] Remove some uses of isOpaqueOrPointeeTypeEquals() (NFC) 2023-07-18 11:18:31 +02:00
Marco Elver
4eef2e30d6 [ThreadSanitizer] Add fallback DebugLocation for memintrinsic calls
When building with debug info enabled, some load/store instructions do
not have a DebugLocation attached. When using the default IRBuilder, it
attempts to copy the DebugLocation from the insertion-point instruction.
When there's no DebugLocation, no attempt is made to add one.

Add a fallback DebugLocation with the help of InstrumentationIRBuilder for
memintrinsics. In particular, the compiler may optimize load/store without
debug info into memintrinsics, which then are missing debug info as well.
2023-07-17 17:52:16 +02:00
Kazu Hirata
55e2cd1609 Use llvm::count{lr}_{zero,one} (NFC) 2023-01-28 12:41:20 -08:00
Jonas Paulsson
dc3875e468 Add parameter extension attributes in various instrumentation passes.
For targets whose ABI requires arguments and return values to be
extended to the full register bitwidth, it is important that newly
built calls also take care of this detail.

With this patch, the OMPIRBuilder, AddressSanitizer, GCOVProfiling,
MemorySanitizer and ThreadSanitizer passes should now do this properly.

Reviewed By: Eli Friedman, Ulrich Weigand, Johannes Doerfert

Differential Revision: https://reviews.llvm.org/D133949
2023-01-18 18:29:12 -06:00
Fangrui Song
21c4dc7997 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This fixes clang.
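
A small standalone illustration of the pattern, assuming C++17: check the optional explicitly, then use `operator*` or `operator->`, which do not go through the `bad_optional_access` path that `value()` does. `lengthOrZero` and `valueOr` are hypothetical helpers, not code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Length of the contained string, or 0 when the optional is empty.
inline std::size_t lengthOrZero(const std::optional<std::string> &Opt) {
  if (!Opt)
    return 0;          // explicit emptiness check
  return Opt->size();  // operator->: no exception-checking path
}

// The contained int, or Default when the optional is empty.
inline int valueOr(const std::optional<int> &Opt, int Default) {
  return Opt ? *Opt : Default; // operator* after the check
}
```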
2022-12-17 00:42:05 +00:00
Manuel Brito
45a892d012 Use poison instead of undef where its used as a placeholder [NFC]
Differential Revision: https://reviews.llvm.org/D139789
2022-12-11 17:18:00 +00:00
Fangrui Song
a996cc217c Remove unused #include "llvm/ADT/Optional.h" 2022-12-05 06:31:11 +00:00
Vitaly Buka
b4257d3bf5 [tsan] Replace mem intrinsics with calls to interceptors
After https://reviews.llvm.org/rG463aa814182a23 tsan replaces llvm
intrinsics with calls to glibc functions. However, this approach is
fragile, as slight changes in the pipeline can bring the llvm intrinsics
back. In particular, InstCombine can do that.

Msan/Asan already declare their own versions of these memory
functions for a similar purpose.

KCSAN, or anything that uses something other than compiler-rt, needs to
implement these callbacks.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D133268
2022-09-06 13:09:31 -07:00
Vitaly Buka
c51a12d598 Revert "[tsan] Replace mem intrinsics with calls to interceptors"
Breaks
http://45.33.8.238/macm1/43944/step_4.txt
https://lab.llvm.org/buildbot/#/builders/70/builds/26926

This reverts commit 77654a65a373da9c4829de821e7b393ea811ee40.
2022-09-06 09:47:33 -07:00
Vitaly Buka
77654a65a3 [tsan] Replace mem intrinsics with calls to interceptors
After https://reviews.llvm.org/rG463aa814182a23 tsan replaces llvm
intrinsics with calls to glibc functions. However, this approach is
fragile, as slight changes in the pipeline can bring the llvm intrinsics
back. In particular, InstCombine can do that.

Msan/Asan already declare their own versions of these memory
functions for a similar purpose.

KCSAN, or anything that uses something other than compiler-rt, needs to
implement these callbacks.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D133268
2022-09-06 08:25:32 -07:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Kazu Hirata
e20d210eef [llvm] Qualify auto (NFC)
Identified with readability-qualified-auto.
2022-08-07 23:55:27 -07:00
Kazu Hirata
0e37ef0186 [Transforms] Fix comment typos (NFC) 2022-08-07 23:55:24 -07:00
Kazu Hirata
611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Kazu Hirata
a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Guillaume Chatelet
a6c2ab0c3f [NFC][Alignment] Use proper type in instrumentLoadOrStore 2022-06-13 12:59:38 +00:00
Marco Elver
9ae87b5973 [Instrumentation] Share InstrumentationIRBuilder between TSan and SanCov
Factor out InstrumentationIRBuilder and share it between ThreadSanitizer
and SanitizerCoverage. Simplify its usage at the same time (use function
of passed Instruction or BasicBlock).

This class may be used in other instrumentation passes in future.

NFCI.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D125038
2022-05-06 09:15:17 +02:00
Marco Elver
47bdea3f7e [ThreadSanitizer] Add fallback DebugLocation for instrumentation calls
When building with debug info enabled, some load/store instructions do
not have a DebugLocation attached. When using the default IRBuilder, it
attempts to copy the DebugLocation from the insertion-point instruction.
When there's no DebugLocation, no attempt is made to add one.

This is problematic for inserted calls, where the enclosing function has
debug info but the call ends up without a DebugLocation in e.g. LTO
builds that verify that both the enclosing function and calls to
inlinable functions have debug info attached.

This issue was noticed in Linux kernel KCSAN builds with LTO and debug
info enabled:

  | ...
  | inlinable function call in a function with debug info must have a !dbg location
  |   call void @__tsan_read8(i8* %432)
  | ...

To fix, ensure that all calls to the runtime have a DebugLocation
attached, where the possibility exists that the insertion-point might
not have any DebugLocation attached to it.
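
The fallback can be sketched with toy types, assuming C++17. `DebugLoc`, `ToyFunction`, and `locForInsertedCall` are hypothetical, and the choice of line 0 for the artificial location is an assumption of this sketch rather than something stated above.

```cpp
#include <cassert>
#include <optional>

// Toy debug location: just a source line.
struct DebugLoc {
  unsigned Line;
};

// Toy function: only records whether it carries debug info.
struct ToyFunction {
  bool HasDebugInfo;
};

// Pick the location for an inserted runtime call.
inline std::optional<DebugLoc>
locForInsertedCall(const ToyFunction &F, std::optional<DebugLoc> InsertPtLoc) {
  if (InsertPtLoc)
    return InsertPtLoc; // default behaviour: copy from the insertion point
  if (F.HasDebugInfo)
    return DebugLoc{0}; // fallback: artificial location (line 0 is assumed)
  return std::nullopt;  // no debug info anywhere: nothing to attach
}
```

In a function with debug info, the inserted call therefore always ends up with some location, which is the property the LTO verifier check quoted above requires.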

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D124937
2022-05-05 15:21:35 +02:00
serge-sans-paille
7030654296 [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since fa5a4e1b95c8f37796 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D124847
2022-05-04 08:32:38 +02:00
Fangrui Song
c74a706893 [LegacyPM] Remove ThreadSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove ThreadSanitizerLegacyPass.

Reviewed By: #sanitizers, vitalybuka

Differential Revision: https://reviews.llvm.org/D124209
2022-04-27 16:25:41 -07:00
Marco Elver
cbe1e67ead [Instruction] Introduce getAtomicSyncScopeID()
An analysis may just be interested in checking if an instruction is
atomic but system scoped or single-thread scoped, like ThreadSanitizer's
isAtomic(). Unfortunately, Instruction::isAtomic() can only answer the
"atomic" part of the question, and also checking the scope becomes rather
verbose.

To simplify and reduce redundancy, introduce a common helper
getAtomicSyncScopeID() which returns the scope of an atomic operation.
Start using it in ThreadSanitizer.

NFCI.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D121910
2022-03-17 14:59:37 +01:00
serge-sans-paille
ed98c1b376 Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
2022-03-12 17:26:40 +01:00
Zarko Todorovski
0d3add216f [llvm][NFC] Inclusive language: Reword replace uses of sanity in llvm/lib/Transform comments and asserts
Reworded some comments and asserts to avoid usage of `sanity check/test`

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D114372
2021-11-23 13:22:55 -05:00
Dmitry Vyukov
a7c57c4ec8 tsan: don't consider debug calls as calls
Tsan pass does 2 optimizations based on presence of calls:
1. Don't emit function entry/exit callbacks if there are no calls
and no memory accesses.
2. Combine read/write of the same variable if there are no
intervening calls.
However, all debug info is represented as CallInst as well
and thus effectively disables these optimizations.
Don't consider debug info calls as calls.
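
The first optimization can be sketched on a toy instruction list. `Op` and `needsEntryExitCallbacks` are hypothetical names, and `Op::DebugCall` stands in for a debug-info call as it was represented at the time of this commit.

```cpp
#include <cassert>
#include <vector>

// Toy opcode set; DebugCall stands in for a debug-info intrinsic call.
enum class Op { Load, Store, Call, DebugCall };

// Entry/exit callbacks are needed only if the body has real calls or
// memory accesses; debug-info calls are deliberately not counted.
inline bool needsEntryExitCallbacks(const std::vector<Op> &Body) {
  bool HasCalls = false;
  bool HasAccesses = false;
  for (Op O : Body) {
    switch (O) {
    case Op::Load:
    case Op::Store:
      HasAccesses = true;
      break;
    case Op::Call:
      HasCalls = true;
      break;
    case Op::DebugCall:
      break; // ignored: does not count as a call
    }
  }
  return HasCalls || HasAccesses;
}
```

A body consisting only of debug-info calls no longer forces entry/exit callbacks, whereas counting `DebugCall` as `Call` would.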

Reviewed By: glider, melver

Differential Revision: https://reviews.llvm.org/D114079
2021-11-17 14:42:16 +01:00
Bjorn Pettersson
8f8616655c [NewPM] Use a separate struct for ModuleThreadSanitizerPass
Split ThreadSanitizerPass into ThreadSanitizerPass (as a function
pass) and ModuleThreadSanitizerPass (as a module pass).
Main reason is to make sure that we have a unique mapping from
ClassName to PassName in the new passmanager framework, making it
possible to correctly identify the passes when dealing with options
such as -print-after and -print-pipeline-passes.

This is a follow-up to D105006 and D105007.
2021-09-16 14:58:42 +02:00
Alexander Potapenko
8300d52e8c [tsan] Add support for disable_sanitizer_instrumentation attribute
Unlike __attribute__((no_sanitize("thread"))), this one will cause TSan
to skip the entire function during instrumentation.

Depends on https://reviews.llvm.org/D108029

Differential Revision: https://reviews.llvm.org/D108202
2021-08-23 12:38:33 +02:00
Arthur Eubanks
de0ae9e89e [NFC] Cleanup more AttributeList::addAttribute() 2021-08-17 21:05:41 -07:00