2105 Commits

Author SHA1 Message Date
Dmitry Vyukov
1d4d2cceda [TSan] Add a runtime flag to print full thread creation stacks up to the main thread
Currently, we only print how threads involved in data race are created from their parent threads.
Add a runtime flag 'print_full_thread_history' to print thread creation stacks for the threads involved in the data race and their ancestors up to the main thread.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D122131
2022-03-24 17:30:27 +01:00
Dmitry Vyukov
9e66e5872c tsan: print signal num in errno spoiling reports
For errno spoiling reports we only print the stack
where the signal handler is invoked. And the top
frame is the signal handler function, which is supposed
to give the info for debugging.
But in same cases the top frame can be some common thunk,
which does not give much info. E.g. for Go/cgo it's always
runtime.cgoSigtramp.

Print the signal number.
This is what we can easily gather and it may give at least
some hints regarding the issue.

Reviewed By: melver, vitalybuka

Differential Revision: https://reviews.llvm.org/D121979
2022-03-18 16:12:11 +01:00
Dmitry Vyukov
66298e1c54 tsan: fix another false positive related to open/close
The false positive fixed by commit f831d6fc80
("tsan: fix false positive during fd close") still happens episodically
on the added more stressful test which does just open/close.

I don't have a coherent explanation as to what exactly happens
but the fix fixes the false positive on this test as well.
The issue may be related to lost writes during asynchronous MADV_DONTNEED.
I've debugged similar unexplainable false positive related to freed and
reused memory and at the time the only possible explanation I found is that
an asynchronous MADV_DONTNEED may lead to lost writes. That's why commit
302ec7b9bc ("tsan: add memory_limit_mb flag") added StopTheWorld around
the memory flush, but unfortunately the commit does not capture these findings.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D121363
2022-03-10 17:02:51 +01:00
Dmitry Vyukov
f831d6fc80 tsan: fix false positive during fd close
FdClose is a subjet to the same atomicity problem as MemoryRangeFreed
(memory state is not "monotoic" wrt race detection).
So we need to lock the thread slot in FdClose the same way we do
in MemoryRangeFreed.
This fixes the modified stress.cpp test.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D121143
2022-03-08 10:40:56 +01:00
Stella Laurenzo
38151a08c2 Reapply "[cmake] Prefix gtest and gtest_main with "llvm_"."
This reverts commit 7cdda6b8ce49ae3c90c068cff4dc355bba5d77f2.

Differential Revision: https://reviews.llvm.org/D121020
2022-03-04 13:45:43 -08:00
Stella Laurenzo
7cdda6b8ce Revert "[cmake] Prefix gtest and gtest_main with "llvm_"."
lldb buildbot failure. will investigate and roll forward.

This reverts commit 9f37775472b45986b0ecce5243bd6ce119e5bd69.
2022-03-02 11:13:46 -08:00
Stella Laurenzo
9f37775472 [cmake] Prefix gtest and gtest_main with "llvm_".
The upstream project ships CMake rules for building vanilla gtest/gmock which conflict with the names chosen by LLVM. Since LLVM's build rules here are quite specific to LLVM, prefixing them to avoid collision is the right thing (i.e. there does not appear to be a path to letting someone *replace* LLVM's googletest with one they bring, so co-existence should be the goal).

This allows LLVM to be included with testing enabled within projects that themselves have a dependency on an official gtest release.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D120789
2022-03-02 10:53:32 -08:00
Xu Mingjie
f19f672328 [TSan][NFC] fixup for comment of Shadow
There should be 1-bit unused field between tid field and is_atomic field of Shadow.

Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D119417
2022-02-23 11:24:24 -08:00
Vitaly Buka
475c43339b Revert "[TSan][NFC] fixup for comment of Shadow"
Wrong author.

This reverts commit 6bff092e3ed4ae1f21b290f88cf7152cb331aa48.
2022-02-23 11:24:24 -08:00
Vitaly Buka
6bff092e3e [TSan][NFC] fixup for comment of Shadow
There should be 1-bit unused field between tid field and is_atomic field of Shadow.

Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D119417
2022-02-23 11:16:25 -08:00
Fangrui Song
da2a16f702 [tsan] Make __fxstat code path glibc only
This fixes Linux musl build after D118423.
2022-02-11 15:23:18 -08:00
Dmitry Vyukov
595d340dce sanitizer_common: make internal/external headers compatible
This is a follow up to 4f3f4d672254
("sanitizer_common: fix __sanitizer_get_module_and_offset_for_pc signature mismatch")
which fixes a similar problem for msan build.

I am getting the following error compiling a unit test for code that
uses sanitizer_common headers and googletest transitively includes
sanitizer interface headers:

In file included from third_party/gwp_sanitizers/singlestep_test.cpp:3:
In file included from sanitizer_common/sanitizer_common.h:19:
sanitizer_interface_internal.h:41:5: error: typedef redefinition with different types
('struct __sanitizer_sandbox_arguments' vs 'struct __sanitizer_sandbox_arguments')
  } __sanitizer_sandbox_arguments;
common_interface_defs.h:39:3: note: previous definition is here
} __sanitizer_sandbox_arguments;

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119546
2022-02-11 19:39:44 +01:00
Dimitry Andric
28fb22c90f [TSan] Handle FreeBSD specific indirection of libpthread functions
Similar to 60cc1d3218fc for NetBSD, add aliases and interceptors for the
following pthread related functions:

- pthread_cond_init(3)
- pthread_cond_destroy(3)
- pthread_cond_signal(3)
- pthread_cond_broadcast(3)
- pthread_cond_wait(3)
- pthread_mutex_init(3)
- pthread_mutex_destroy(3)
- pthread_mutex_lock(3)
- pthread_mutex_trylock(3)
- pthread_mutex_unlock(3)
- pthread_rwlock_init(3)
- pthread_rwlock_destroy(3)
- pthread_rwlock_rdlock(3)
- pthread_rwlock_tryrdlock(3)
- pthread_rwlock_wrlock(3)
- pthread_rwlock_trywrlock(3)
- pthread_rwlock_unlock(3)
- pthread_once(3)
- pthread_sigmask(3)

In FreeBSD's libc, a number of internal aliases of the pthread functions
are invoked, typically with an additional prefixed underscore, e.g.
_pthread_cond_init() and so on.

ThreadSanitizer needs to intercept these aliases too, otherwise some
false positive reports about data races might be produced.

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D119034
2022-02-07 11:01:37 +01:00
Fangrui Song
c80d349859 [msan][tsan] Refine __fxstat{,at}{,64} condition
In glibc before 2.33, include/sys/stat.h defines fstat/fstat64 to
`__fxstat/__fxstat64` and provides `__fxstat/__fxstat64` in libc_nonshared.a.
The symbols are glibc specific and not needed on other systems.

Reviewed By: vitalybuka, #sanitizers

Differential Revision: https://reviews.llvm.org/D118423
2022-01-28 09:15:39 -08:00
Petr Hosek
48a38954c9 [CMake] Use generator expression to get in-tree libc++ path
When using the in-tree libc++, we should be using the full path to
ensure that we're using the right library and not accidentally pick up
the system library.

Differential Revision: https://reviews.llvm.org/D118200
2022-01-26 14:12:48 -08:00
Julian Lettner
db07e082ab [TSan] Omit vfork interceptor iOS simulator runtime
`_vfork` moved from libsystem_kernel.dylib to libsystem_c.dylib as part
of the below changes.  The iOS simulator does not actually have
libsystem_kernel.dylib of its own, it only has the host Mac's.  The
umbrella-nature of Libsystem makes this movement transparent to
everyone; except the simulator! So when we "back deploy", i.e., use the
current version of TSan with an older simulator runtime then this symbol
is now missing, when we run on the latest OS (but an older simulator
runtime).

Note we use `SANITIZER_IOS` because usage of vfork is forbidden on iOS
and the API is completely unavailable on watchOS and tvOS, even if this
problem is specific to the iOS simulator.

Caused by:
rdar://74818691 (Shim vfork() to fork syscall on iOS)
rdar://76762076 (Shim vfork() to fork syscall on macOS)

Radar-Id: rdar://8634734
2022-01-21 17:36:12 -08:00
Julian Lettner
7acb68b80b [NFC] Fixup for comment 2022-01-11 15:35:15 -08:00
Julian Lettner
ff11cd9550 [TSan][Darwin] Enable Trace/TraceAlloc unit tests
These tests are now green:
```
  Trace.MultiPart
  Trace.RestoreAccess
  Trace.RestoreMutexLock
  TraceAlloc.FinishedThreadReuse
  TraceAlloc.FinishedThreadReuse2
  TraceAlloc.SingleThread
```

rdar://82107856
2022-01-11 15:33:29 -08:00
Dmitry Vyukov
765921de5b sanitizer_common: prefix thread-safety macros with SANITIZER_
Currently we use very common names for macros like ACQUIRE/RELEASE,
which cause conflicts with system headers.
Prefix all macros with SANITIZER_ to avoid conflicts.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D116652
2022-01-07 15:11:00 +01:00
Nico Weber
085f078307 Revert "Revert D109159 "[amdgpu] Enable selection of s_cselect_b64.""
This reverts commit 859ebca744e634dcc89a2294ffa41574f947bd62.
The change contained many unrelated changes and e.g. restored
unit test failes for the old lld port.
2022-01-05 13:10:25 -05:00
David Salinas
859ebca744 Revert D109159 "[amdgpu] Enable selection of s_cselect_b64."
This reverts commit 640beb38e7710b939b3cfb3f4c54accc694b1d30.

That commit caused performance degradtion in Quicksilver test QS:sGPU and a functional test failure in (rocPRIM rocprim.device_segmented_radix_sort).
Reverting until we have a better solution to s_cselect_b64 codegen cleanup

Change-Id: Ibf8e397df94001f248fba609f072088a46abae08

Reviewed By: kzhuravl

Differential Revision: https://reviews.llvm.org/D115960

Change-Id: Id169459ce4dfffa857d5645a0af50b0063ce1105
2022-01-05 17:57:32 +00:00
Dmitry Vyukov
f78d49e068 tsan: remove old vector clocks
They are unused in the new tsan runtime.

Depends on D112604.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D112605
2021-12-21 19:54:27 +01:00
Dmitry Vyukov
22a251c3d0 tsan: remove hacky call
It's unused in the new tsan runtime.

Depends on D112603.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D112604
2021-12-21 19:53:49 +01:00
Dmitry Vyukov
9789e74a90 tsan: reduce shadow ranges
The new tsan runtime has 2x more compact shadow.
Adjust shadow ranges accordingly.

Depends on D112603.

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D113751
2021-12-21 19:53:19 +01:00
Dmitry Vyukov
53fc462513 tsan: remove unused variable
Depends on D113983.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113984
2021-12-21 19:52:34 +01:00
Dmitry Vyukov
c82bd4c5ba tsan: use VReport instead of VPrintf in background thread
If there are multiple processes, it's hard to understand
what output comes from what process.
VReport prepends pid to the output. Use it.

Depends on D113982.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113983
2021-12-21 19:51:48 +01:00
Dmitry Vyukov
05ca57a054 tsan: better maintain current time in the background thread
Update now after long operations so that we don't use
stale value in subsequent computations.

Depends on D113981.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113982
2021-12-21 19:51:39 +01:00
Dmitry Vyukov
d95baa98f3 tsan: fix failures after multi-threaded fork
Creating threads after a multi-threaded fork is semi-supported,
we don't give particular guarantees, but we try to not fail
on simple cases and we have die_after_fork=0 flag that enables
not dying on creation of threads after a multi-threaded fork.
This flag is used in the wild:
23c052e3e3/SConstruct (L3599)

fork_multithreaded.cpp test started hanging in debug mode
after the recent "tsan: fix deadlock during race reporting" commit,
which added proactive ThreadRegistryLock check in SlotLock.

But the test broke earlier after "tsan: remove quadratic behavior in pthread_join"
commit which made tracking of alive threads based on pthread_t stricter
(CHECK-fail on 2 threads with the same pthread_t, or joining a non-existent thread).
When we start a thread after a multi-threaded fork, the new pthread_t
can actually match one of existing values (for threads that don't exist anymore).
Thread creation started CHECK-failing on this, but the test simply
ignored this CHECK failure in the child thread and "passed".
But after "tsan: fix deadlock during race reporting" the test started hanging dead,
because CHECK failures recursively lock thread registry.

Fix this purging all alive threads from thread registry on fork.

Also the thread registry mutex somehow lost the internal deadlock detector id
and was excluded from deadlock detection. If it would have the id, the CHECK
wouldn't hang because of the nested CHECK failure due to the deadlock.
But then again the test would have silently ignore this error as well
and the bugs wouldn't have been noticed.
Add the deadlock detector id to the thread registry mutex.

Also extend the test to check more cases and detect more bugs.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D116091
2021-12-21 16:54:00 +01:00
Dmitry Vyukov
d4d86fede8 tsan: always handle closing of file descriptors
If we miss both close of a file descriptor and a subsequent open
if the same file descriptor number, we report false positives
between operations on the old and on the new descriptors.

There are lots of ways to create new file descriptors, but for closing
there is mostly close call. So we try to handle at least it.
However, if the close happens in an ignored library, we miss it
and start reporting false positives.

Handle closing of file descriptors always, even in ignored libraries
(as we do for malloc/free and other critical functions).
But don't imitate memory accesses on close for ignored libraries.

FdClose checks validity of the fd (fd >= 0) itself,
so remove the excessive checks in the callers.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D116095
2021-12-21 13:35:34 +01:00
Dmitry Vyukov
52a4a4a53c tsan: remove unused ReportMutex::destroyed
Depends on D113980.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113981
2021-12-21 11:37:01 +01:00
Dmitry Vyukov
69807fe161 tsan: change ReportMutex::id type to int
We used to use u64 as mutex id because it was some
tricky identifier built from address and reuse count.
Now it's just the mutex index in the report (0, 1, 2...),
so use int to represent it.

Depends on D112603.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D113980
2021-12-21 11:36:49 +01:00
Dmitry Vyukov
abb825725e tsan: optimize __tsan_read/write16
These callbacks are used for SSE vector accesses.
In some computational programs these accesses dominate.
Currently we do 2 uninlined 8-byte accesses to handle them.
Inline and optimize them similarly to unaligned accesses.
This reduces the vector access benchmark time from 8 to 3 seconds.

Depends on D112603.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114594
2021-12-21 11:33:28 +01:00
Vitaly Buka
700d16b6d6 [tsan] Fix Darwin crash after D115759
Remove global constructor which may or may not be needed for Android,
at it breaks Darwin now.
2021-12-20 17:05:41 -08:00
Dmitry Vyukov
4c5476b066 tsan: fix NULL deref in TraceSwitchPart
There is a small chance that the slot may be not queued in TraceSwitchPart.
This can happen if the slot has kEpochLast epoch and another thread
in FindSlotAndLock discovered that it's exhausted and removed it from
the slot queue. kEpochLast can happen in 2 cases: (1) if TraceSwitchPart
was called with the slot locked and epoch already at kEpochLast,
or (2) if we've acquired a new slot in SlotLock in the beginning
of the function and the slot was at kEpochLast - 1, so after increment
in SlotAttachAndLock it become kEpochLast.

If this happens we crash on ctx->slot_queue.Remove(thr->slot).
Skip the requeueing if the slot is not queued.
The slot is exhausted, so it must not be ctx->slot_queue.

The existing stress test triggers this with very small probability.
I am not sure how to make this condition more likely to be triggered,
it evaded lots of testing.

Depends on D116040.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D116041
2021-12-20 18:55:51 +01:00
Dmitry Vyukov
2eb3e20461 tsan: fix deadlock during race reporting
SlotPairLocker calls SlotLock under ctx->multi_slot_mtx.
SlotLock can invoke global reset DoReset if we are out of slots/epochs.
But DoReset locks ctx->multi_slot_mtx as well, which leads to deadlock.

Resolve the deadlock by removing SlotPairLocker/multi_slot_mtx
and only lock one slot for which we will do RestoreStack.
We need to lock that slot because RestoreStack accesses the slot journal.
But it's unclear why we need to lock the current slot.
Initially I did it just to be on the safer side (but at that time
we dit not lock the second slot, so it was easy just to lock the current slot).

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D116040
2021-12-20 18:52:48 +01:00
Julian Lettner
64f4041725 [TSan][Darwin] Fix shadow mapping for iOS simulator on Apple Silicon
With the introduction of Apple Silicon `defined(__aarch64__)` is not a
reliable way to check for the platform anymore.

We want to use the "normal" `Mapping48AddressSpace` mapping everywhere
except devices, including the iOS simulators on AS.

Relevant revisions:
https://reviews.llvm.org/D35147
https://reviews.llvm.org/D86377
https://reviews.llvm.org/D107743
https://reviews.llvm.org/D107888

Differential Revision: https://reviews.llvm.org/D115843
2021-12-17 15:59:43 -08:00
Julian Lettner
4399f3b6b0 [TSan][Darwin] Make malloc_size interceptor more robust
Previously we would crash in the TSan runtime if the user program passes
a pointer to `malloc_size()` that doesn't point into app memory.

In these cases, `malloc_size()` should return 0.

For ASan, we fixed a similar issue here:
https://reviews.llvm.org/D15008

Radar-Id: rdar://problem/86213149

Differential Revision: https://reviews.llvm.org/D115947
2021-12-17 15:38:08 -08:00
Matt Kulukundis
406b538dea Add a flag to force tsan's background thread
Reviewed By: dvyukov, vitalybuka

Differential Revision: https://reviews.llvm.org/D115759
2021-12-16 11:47:33 -08:00
Julian Lettner
8f1ea2e85c [TSan][Darwin] Fix CheckAndProtect() for MappingAppleAarch64
In the new TSan runtime refactoring this line was changed:
```
ProtectRange(MetaShadowEnd(), TraceMemBeg());
-->
ProtectRange(MetaShadowEnd(), HeapMemBeg());
```

But for `MappingAppleAarch64` the app heap comes before the shadow,
resulting in:
```
CHECK failed: tsan_platform_posix.cpp:83 "((beg)) <= ((end))" (0xe00000000, 0x200000000)
```

rdar://86521924

Differential Revision: https://reviews.llvm.org/D115834
2021-12-15 18:03:58 -08:00
Dmitry Vyukov
9fb8058a80 tsan: enable the new runtime
This enables the new runtime (D112603) by default.

Depends on D112603.

Differential Revision: https://reviews.llvm.org/D115624
2021-12-13 12:50:13 +01:00
Dmitry Vyukov
b332134921 tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
 - 2x smaller shadow memory (2x of app memory)
 - faster fully vectorized race detection
 - small fixed-size vector clocks (512b)
 - fast vectorized vector clock operations
 - unlimited number of alive threads/goroutimes

Depends on D112602.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D112603
2021-12-13 12:48:34 +01:00
Jonas Devlieghere
396113c19f Revert "tsan: new runtime (v3)"
This reverts commit 5a33e412815b8847610425a2a3b86d2c7c313b71 becuase it
breaks LLDB.

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/39208/
2021-12-09 09:18:10 -08:00
Dmitry Vyukov
5a33e41281 tsan: new runtime (v3)
This change switches tsan to the new runtime which features:
 - 2x smaller shadow memory (2x of app memory)
 - faster fully vectorized race detection
 - small fixed-size vector clocks (512b)
 - fast vectorized vector clock operations
 - unlimited number of alive threads/goroutimes

Depends on D112602.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D112603
2021-12-09 09:09:52 +01:00
Dmitry Vyukov
8e93d4c996 tsan: fork runtime
Fork the current version of tsan runtime before commiting
rewrite of the runtime (D112603). The old runtime can be
enabled with TSAN_USE_OLD_RUNTIME option.
This is a temporal measure for emergencies and is required
for Chromium rollout (for context see http://crbug.com/1275581).
The old runtime is supposed to be deleted soon.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D115223
2021-12-09 07:28:26 +01:00
Nico Weber
63d518f31a [tsan] Move tsan/rtl build rules into tsan/rtl/CMakeLists.txt
That way, the build rules are closer to the source files they describe.

No intended behavior change.

Differential Revision: https://reviews.llvm.org/D115155
2021-12-06 19:58:30 -05:00
Vitaly Buka
6318001209 [sanitizer] Support IsRssLimitExceeded in all sanitizers
Reviewed By: kstoimenov

Differential Revision: https://reviews.llvm.org/D115000
2021-12-03 12:45:44 -08:00
Dmitry Vyukov
1b576585eb tsan: tolerate munmap with invalid arguments
We call UnmapShadow before the actual munmap, at that point we don't yet
know if the provided address/size are sane. We can't call UnmapShadow
after the actual munmap becuase at that point the memory range can
already be reused for something else, so we can't rely on the munmap
return value to understand is the values are sane.
While calling munmap with insane values (non-canonical address, negative
size, etc) is an error, the kernel won't crash. We must also try to not
crash as the failure mode is very confusing (paging fault inside of the
runtime on some derived shadow address).

Such invalid arguments are observed on Chromium tests:
https://bugs.chromium.org/p/chromium/issues/detail?id=1275581

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114944
2021-12-02 17:50:51 +01:00
Dmitry Vyukov
97b4e63117 tsan: fix false positives in dynamic libs with static tls
The added test demonstrates  loading a dynamic library with static TLS.
Such static TLS is a hack that allows a dynamic library to have faster TLS,
but it can be loaded only iff all threads happened to allocate some excess
of static TLS space for whatever reason. If it's not the case loading fails with:

dlopen: cannot load any more object with static TLS

We used to produce a false positive because dlopen will write into TLS
of all existing threads to initialize/zero TLS region for the loaded library.
And this appears to be racing with initialization of TLS in the thread
since we model a write into the whole static TLS region (we don't what part
of it is currently unused):

WARNING: ThreadSanitizer: data race (pid=2317365)
  Write of size 1 at 0x7f1fa9bfcdd7 by main thread:
    0 memset
    1 init_one_static_tls
    2 __pthread_init_static_tls
    [[ this is where main calls dlopen ]]
    3 main
  Previous write of size 8 at 0x7f1fa9bfcdd0 by thread T1:
    0 __tsan_tls_initialization

Fix this by ignoring accesses during dlopen.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D114953
2021-12-02 17:47:05 +01:00
Dmitry Vyukov
09859113ed Revert "tsan: new runtime (v3)"
This reverts commit 66d4ce7e26a5ab00f7e4946b6e1bac8f805010fa.

Chromium tests started failing:
https://bugs.chromium.org/p/chromium/issues/detail?id=1275581
2021-12-01 18:00:46 +01:00
Julian Lettner
858eb8fc11 [TSan][Darwin] Avoid crashes due to interpreting non-zero shadow content as a pointer
We would like to use TLS to store the ThreadState object (or at least a
reference ot it), but on Darwin accessing TLS via __thread or manually
by using pthread_key_* is problematic, because there are several places
where interceptors are called when TLS is not accessible (early process
startup, thread cleanup, ...).

Previously, we used a "poor man's TLS" implementation, where we use the
shadow memory of the pointer returned by pthread_self() to store a
pointer to the ThreadState object.

The problem with that was that certain operations can populate shadow
bytes unbeknownst to TSan, and we later interpret these non-zero bytes
as the pointer to our ThreadState object and crash on when dereferencing
the pointer.

This patch changes the storage location of our reference to the
ThreadState object to "real" TLS.  We make this work by artificially
keeping this reference alive in the pthread_key destructor by resetting
the key value with pthread_setspecific().

This change also fixes the issue were the ThreadState object is
re-allocated after DestroyThreadState() because intercepted functions
can still get called on the terminating thread after the
THREAD_TERMINATE event.

Radar-Id: rdar://problem/72010355

Reviewed By: dvyukov

Differential Revision: https://reviews.llvm.org/D110236
2021-11-30 14:49:23 -08:00