At the moment, we use the os internal functions `__ulock_wait`. This
patch updates the code on macOS to use the public API
`os_sync_wait_on_address`.
Fixes#146142
Fixes#109290
See the GH issue for the details. In conclusion, we have two issues in
the `atomic<T>::wait` when `T` does not match our `contention_t`:
- We don't have a total single order which can leads to missed
notification based on the Herd7 simulation on relaxed architectural like
PowerPC
- We assumed the platform wait (`futex_wait`/`__ulock_wait`) has seq_cst
barrier internally but there is no such guarantee
This reverts commit c861fe8a71e64f3d2108c58147e7375cd9314521.
Unfortunately, this use of hidden visibility attributes causes
user-defined specializations of standard-library types to also be marked
hidden by default, which is incorrect. See discussion thread on #131156.
...and also reverts the follow-up commits:
Revert "[libc++] Add explicit ABI annotations to functions from the block runtime declared in <__functional/function.h> (#140592)"
This reverts commit 3e4c9dc299c35155934688184319d391b298fff7.
Revert "[libc++] Make ABI annotations explicit for windows-specific code (#140507)"
This reverts commit f73287e623a6c2e4a3485832bc3e10860cd26eb5.
Revert "[libc++][NFC] Replace a few "namespace std" with the correct macro (#140510)"
This reverts commit 1d411f27c769a32cb22ce50b9dc4421e34fd40dd.
This patch introduces `_LIBCPP_{BEGIN,END}_EXPLICIT_ABI_ANNOTATIONS`,
which allow us to have implicit annotations for most functions, and just
where it's not "hide_from_abi everything" we add explicit annotations.
This allows us to drop the `_LIBCPP_HIDE_FROM_ABI` macro from most
functions in libc++.
in `atomic::wait`, when we call the platform wait ulock_wait , we are
using UL_COMPARE_AND_WAIT. But we should use UL_COMPARE_AND_WAIT64
instead as the address we are waiting for is a 64 bit integer.
fixes https://github.com/llvm/llvm-project/issues/85107
It is rather hard to test directly because in `atomic::wait`, before
calling into the platform wait, our c++ code has some poll logic which
checks the value not changing. Thus in this patch, the test is using the
internal function.
- No indirect syscalls on OpenBSD. Instead there is a `futex` function
which issues a direct syscall.
- Monotonic clock is available despite the full POSIX suite of timers
not being available in its entirety.
See https://lists.boost.org/boost-bugs/2015/07/41690.php and
c98b1f459a
for a description of an analogous problem and fix for Boost.
This is a follow-up PR to
<https://github.com/llvm/llvm-project/pull/79265>. It aims to be a
gentle refactoring of the `__cxx_atomic_wait` function that takes a
predicate.
The key idea here is that this function's signature is changed to look
like this (`std::function` used just for clarity):
```c++
__cxx_atomic_wait_fn(Atp*, std::function<bool(Tp &)> poll, memory_order __order);
```
...where `Tp` is the corresponding `value_type` to the atomic variable
type `Atp`. The function's semantics are similar to `atomic`s `.wait()`,
but instead of having a hardcoded predicate (is the loaded value unequal
to `old`?) the predicate is specified explicitly.
The `poll` function may change its argument, and it is very important
that if it returns `false`, it leaves its current understanding of the
atomic's value in the argument. Internally, `__cxx_atomic_wait_fn`
dispatches to two waiting mechanisms, depending on the type of the
atomic variable:
1. If the atomic variable can be waited on directly (for example,
Linux's futex mechanism only supports waiting on 32 bit long variables),
the value of the atomic variable (which `poll` made its decision on) is
then given to the underlying system wait function (e.g. futex).
2. If the atomic variable can not be waited on directly, there is a
global pool of atomics that are used for this task. The ["eventcount"
pattern](<https://gist.github.com/mratsim/04a29bdd98d6295acda4d0677c4d0041>)
is employed to make this possible.
The eventcount pattern needs a "monitor" variable which is read before
the condition is checked another time. libcxx has the
`__libcpp_atomic_monitor` function for this. However, this function only
has to be called in case "2", i.e. when the eventcount is actually used.
In case "1", the futex is used directly, so the monitor must be the
value of the atomic variable that the `poll` function made its decision
on to continue blocking. Previously, `__libcpp_atomic_monitor` was
_also_ used in case "1". This was the source of the ABA style bug that
PR#79265 fixed.
However, the solution in PR#79265 has some disadvantages:
- It exposes internals such as `cxx_contention_t` or the fact that
`__libcpp_thread_poll_with_backoff` needs two functions to higher level
constructs such as `semaphore`.
- It doesn't prevent consumers calling `__cxx_atomic_wait` in an error
prone way, i.e. by providing to it a predicate that doesn't take an
argument. This makes ABA style issues more likely to appear.
Now, `__cxx_atomic_wait_fn` takes just _one_ function, which is then
transformed into the `poll` and `backoff` callables needed by
`__libcpp_thread_poll_with_backoff`.
Aside from the `__cxx_atomic_wait` changes, the only other change is the
weakening of the initial atomic load of `semaphore`'s `try_acquire` into
`memory_order_relaxed` and the CAS inside the loop is changed from
`strong` to `weak`. Both weakenings should be fine, since the CAS is
called in a loop, and the "acquire" semantics of `try_acquire` come from
the CAS, not from the initial load.
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.
This patch was generated with:
find libcxx/include libcxx/src -type f \
| grep -v 'module.modulemap.in' \
| grep -v 'CMakeLists.txt' \
| grep -v 'README.txt' \
| grep -v 'libcxx.imp' \
| grep -v '__config_site.in' \
| xargs clang-format -i
A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.
[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
Source files in libc++ are added to the CMake targets only if they are
required by the configuration. We do this pretty consistently for all
configurations like no-filesystem, no-random-device, etc. but we didn't
do it for no-threads. This patch makes this consistent for no-threads,
which is helpful in reducing the amount of work required to port libc++
to some platforms without threads.
Indeed, with the previous approach, several threads-related source files
would end up including headers that might fail to compile properly on
some platforms. This issue is sidestepped entirely by making the
approach for no-threads consistent with the other configurations.
Using __SIZEOF_LONG__ == 8 rather than __LP64__ is needed so we use umtx
on CHERI. I accidentally landed an older diff.
Fixes: 17ecbb3ea6ff0ae716dd524c0e2bf75a4815c95b
Only 64bit architectures can be supported this way, because libcxx
defines __cxx_contention_t to be int64_t for FreeBSD, and 32bit
arches do not have a kind of UMTX_OP_WAIT_INT64_PRIVATE operation.
Fixes: 83387dbc18e7998f87aa4a2d35320bcb2ed5c392
Reviewed by: arichardson, ldionne, emaste, Mordante
Differential Revision: https://reviews.llvm.org/D142422
This change is the basis for a further refactoring where I'm going to
split up the various implementations we have in __threading_support to
make that code easier to understand.
Note that I had to make __convert_to_timespec a template to break
circular dependencies. Concretely, we never seem to use it with anything
other than ::timespec, but I am wary of hardcoding that assumption as
part of this change, since I suspect there's a reason for going through
these hoops in the first place.
Differential Revision: https://reviews.llvm.org/D116944
We've stopped doing it in libc++ for a while now because these names
would end up rotting as we move things around and copy/paste stuff.
This cleans up all the existing files so as to stop the spreading
as people copy-paste headers around.