The overloaded `operator&` caused non-conforming behavior when
- using `operator==` to compare "addresses" of proxy reference objects,
and
- relying on the exact type of `&ref`.
No deprecation warning is added, becaue it should be portable to write
`&ref` where `ref` is a proxy reference variable, and this patch just
corrects the behavior.
`__bit_const_reference::operator&` is kept, because when one defines
`_LIBCPP_ABI_BITSET_VECTOR_BOOL_CONST_SUBSCRIPT_RETURN_BOOL` to make the
libc++ implementation strategy conforming, the `operator&` will never be
exposed to users.
Resolves#186801
Removed the non-standard member type `iterator_type` from `__wrap_iter`.
This member exposed the underlying iterator type, and its removal
prevents users from relying on the implementation detail.
When the allocators use fancy pointers, the internal map of `deque`
stores `FancyPtr<T>` objects, and the previous strategy accessed these
objects via `const FancyPtr<const T>` lvalues, which usually caused core
language undefined behavior. Now `const_iterator` stores `FancyPtr<const
FancyPtr<T>>` instead of `FancyPtr<const FancyPtr<const T>>`, and ABI
break can happen when such two types have incompatible layouts.
This is necessary for reducing undefined behavior and `constexpr`
support for `deque` in C++26, and I currently don't want to provide any
way to opt-out of that behavior.
For `iterator`, the current strategy makes it store
`FancyPtr<FancyPtr<T>>`. But it would make more sense to also store
`FancyPtr<const FancyPtr<T>>` because we never modify the map via
`iterator`.
For some pathological combinations of allocators and fancy pointers, the
rebinding trick doesn't work. These cases are rejected by
`static_assert`.
The existing test coverage seems to be sufficient.
There is no reason to double-wrap iterators, since we can already
achieve anything we want within `__bounded_iter`itself.
This is technically ABI breaking, but people using bounded iterators
shouldn't require ABI stability currently.
Fixes#178521
Resolves#126442
- Converts all the relevant functions that used `.base()` into friends
- Fixed usage in `<regex>`
---------
Co-authored-by: A. Jiang <de34@live.cn>
In #155245, we implemented an optimization to std::map and std::set
search operations. That optimization took advantage of something that is
guaranteed by the Standard, namely that the comparator provided to the
associative container is a valid strict weak ordering.
Sadly, some code in the wild did not satisfy this requirement, such as
Boost.ICL: boostorg/icl#51
Since this can have extremely tricky runtime consequences, this patch
introduces a temporary escape hatch for the LLVM 22 release that allows
reverting to the previous behavior. It also explicitly calls out the
change in the release notes, adds some regression tests and adds debug
mode support for catching some of these invalid predicates.
Fixes#183189
`std::launch::any` was a draft C++11 feature that was removed before the
final standard but it has remained in libc++ as an extension. This patch
marks it as deprecated and suggests using `std::launch::async |
std::launch::deferred` instead.
- Used `_LIBCPP_DEPRECATED_` to mark `std::launch::any` as deprecated
with an associated warning message recommending `std::launch::async |
std::launch::deferred` instead.
- Added a `.verify.cpp` test to validate the deprecation warning.
- Updated existing tests to avoid using the deprecated extension.
- Added note about deprecation in docs.
Fixes#173219
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
P2321R2 has been implemented in various PRs. Based on the discussion
in #105169, the last bit in iterator.concept.winc doesn't require
any changes, so we can actually mark this as done.
Fixes#105169
This performs most of the post-branch tasks to start working on the LLVM
23 release. Things that are left to do:
- Update the unicode version
- Update CI versions and supported compiler versions
We haven't yet decided what we want the `optional::iterator` type to be
in the end, so let's make it experimental for now so that we don't
commit to an ABI yet.
Currently, `deque` and `vector`'s `append_range` is implemented in terms
of `insert_range`. The problem with that is that `insert_range` has more
preconditions, resulting in us rejecting valid code.
This also significantly improves performance for `deque` in some cases.
[P1789R3](https://isocpp.org/files/papers/P1789R3.pdf) was accepted for
C++26 through LWG motion 14 at the 2025 Kona meeting. This patch
implements it, along with tests and documentation changes.
Closes#167268
---------
Co-authored-by: Tsche <che@pydong.org>
This implements all of [P2404R3](https://wg21.link/p2404r3)'s concept
changes.
---------
Co-authored-by: A. Jiang <de34@live.cn>
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
`std::align` is heavily used in memory allocators. When we attempted to
switch from libstdc++ to libc++, we observed a **50%** performance
regression in a database query bench: the issue is that `std::align` in
libc++ is not an inline function, which prevents the compiler from
performing inlining optimizations.
make `std::align` an inline function will run about 2x faster. See
[benchmark
result](https://quick-bench.com/q/wPTnt9JCGn2S-3bu5gY9YrEf6KU).
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
This takes an ABI break unconditionally, since it's small enough that
nobody should be affected. This both simplifies `bitset` a bit and makes
us more conforming.
This is technically ABI breaking, since `is_trivial` and
`is_trivially_default_constructible` now return different results.
However, I don't think that's a significant issue, since `allocator` is
almost always used in classes which own memory, making them non-trivial
anyways.
This is to address #146145
The issue before was that, for `std::atomic::wait/notify`, we only
support `uint64_t` to go through the native `ulock_wait` directly. Any
other types will go through the global contention table's `atomic`,
increasing the chances of spurious wakeup. This PR tries to allow any
types that are of size 4 or 8 to directly go to the `ulock_wait`.
This PR is just proof of concept. If we like this idea, I can go further
to update the Linux/FreeBSD branch and add ABI macros so the existing
behaviours are reserved under the stable ABI
Here are some benchmark results
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------
BM_stop_token_single_thread_reg_unreg_callback/1024 -0.1113 -0.1165 51519 45785 51397 45408
BM_stop_token_single_thread_reg_unreg_callback/4096 -0.2727 -0.1447 249685 181608 211865 181203
BM_stop_token_single_thread_reg_unreg_callback/65536 -0.1241 -0.1237 3308930 2898396 3300986 2892608
BM_stop_token_single_thread_reg_unreg_callback/262144 +0.0335 -0.1920 13237682 13681632 13208849 10673254
OVERALL_GEOMEAN -0.1254 -0.1447 0 0 0 0
```
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>/65536 -0.3344 -0.2424 5960741 3967212 5232250 3964085
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>/131072 -0.1474 -0.1475 9144356 7796745 9137547 7790193
BM_1_atomic_1_waiter_1_notifier<KeepNotifying, NumHighPrioTasks<0>>/262144 -0.1336 -0.1340 18333441 15883805 18323711 15868500
OVERALL_GEOMEAN -0.2107 -0.1761 0 0 0 0
```
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<2>, NumHighPrioTasks<0>>/16384 +0.2321 -0.0081 836618 1030772 833197 826476
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<2>, NumHighPrioTasks<0>>/32768 -0.3034 -0.1329 2182721 1520569 1747211 1515028
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<2>, NumHighPrioTasks<0>>/65536 -0.0924 -0.0924 3389098 3075897 3378486 3066448
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<8>, NumHighPrioTasks<0>>/4096 +0.0464 +0.0474 664233 695080 657736 688892
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<8>, NumHighPrioTasks<0>>/8192 -0.0279 -0.0267 1336041 1298794 1324270 1288953
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<8>, NumHighPrioTasks<0>>/16384 +0.0270 +0.0304 2543004 2611786 2517471 2593975
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<32>, NumHighPrioTasks<0>>/1024 +0.0423 +0.0941 473621 493657 325604 356245
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<32>, NumHighPrioTasks<0>>/2048 +0.0420 +0.0675 906266 944349 636253 679169
BM_1_atomic_multi_waiter_1_notifier<KeepNotifying, NumWaitingThreads<32>, NumHighPrioTasks<0>>/4096 +0.0359 +0.0378 1761584 1824783 1015092 1053439
OVERALL_GEOMEAN -0.0097 -0.0007 0 0 0 0
```
```
Benchmark Time CPU Time Old Time New CPU Old CPU New
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>/4096 -0.0990 -0.1001 371100 334370 369984 332955
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>/8192 -0.0305 -0.0314 698228 676908 696418 674585
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<2>, NumHighPrioTasks<0>>/16384 -0.0258 -0.0268 1383530 1347894 1380665 1343680
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<8>, NumHighPrioTasks<0>>/1024 +0.0465 +0.4702 937821 981388 472087 694082
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<8>, NumHighPrioTasks<0>>/2048 +0.1596 +0.9140 1704819 1976899 616419 1179852
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<8>, NumHighPrioTasks<0>>/4096 -0.1018 -0.2316 3793976 3407609 1912209 1469331
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<32>, NumHighPrioTasks<0>>/256 +0.0395 +0.5818 30102662 31292982 174650 276270
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<32>, NumHighPrioTasks<0>>/512 -0.0065 +1.2860 33079634 32863968 162150 370680
BM_N_atomics_N_waiter_N_notifier<KeepNotifying, NumberOfAtomics<32>, NumHighPrioTasks<0>>/1024 -0.0325 +0.4683 36581740 35392385 282320 414520
OVERALL_GEOMEAN -0.0084 +0.2878 0 0 0 0
```
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
```
---------------------------------------------------
Benchmark old new
---------------------------------------------------
BM_num_get<bool> 86.5 ns 32.3 ns
BM_num_get<long> 82.1 ns 30.3 ns
BM_num_get<long long> 85.2 ns 33.4 ns
BM_num_get<unsigned short> 85.3 ns 31.2 ns
BM_num_get<unsigned int> 84.2 ns 31.1 ns
BM_num_get<unsigned long> 83.6 ns 31.9 ns
BM_num_get<unsigned long long> 87.7 ns 31.5 ns
BM_num_get<float> 116 ns 114 ns
BM_num_get<double> 114 ns 114 ns
BM_num_get<long double> 113 ns 114 ns
BM_num_get<void*> 151 ns 144 ns
```
This patch applies multiple optimizations:
- Stages two and three of do_get are merged and a custom integer parser
has been implemented
This avoids allocations, removes the need for strto{,u}ll and avoids
__stage2_int_loop (avoiding extra writes to memory)
- std::find has been replaced with __atoms_offset, which uses vector
instructions to look for a character
Fixes#158100Fixes#158102
Resolves#148131
- Unlock `std::optional<T&>` implementation
- Allow instantiations of `optional<T(&)(...)>` and `optional<T(&)[]>`
but disables `value_or()` and `optional::iterator` + all `iterator`
related functions
- Update documentation
- Update tests
This reapplication fixes the use after free caused by not properly
updating the bucket list in one case.
Original commit message:
Instead of just calling the single element `erase` on every element of
the range, we can combine some of the operations in a custom
implementation. Specifically, we don't need to search for the previous
node or re-link the list every iteration. Removing this unnecessary work
results in some nice performance improvements:
```
-----------------------------------------------------------------------------------------------------------------------
Benchmark old new
-----------------------------------------------------------------------------------------------------------------------
std::unordered_set<int>::erase(iterator, iterator) (erase half the container)/0 457 ns 459 ns
std::unordered_set<int>::erase(iterator, iterator) (erase half the container)/32 995 ns 626 ns
std::unordered_set<int>::erase(iterator, iterator) (erase half the container)/1024 18196 ns 7995 ns
std::unordered_set<int>::erase(iterator, iterator) (erase half the container)/8192 124722 ns 70125 ns
std::unordered_set<std::string>::erase(iterator, iterator) (erase half the container)/0 456 ns 461 ns
std::unordered_set<std::string>::erase(iterator, iterator) (erase half the container)/32 1183 ns 769 ns
std::unordered_set<std::string>::erase(iterator, iterator) (erase half the container)/1024 27827 ns 18614 ns
std::unordered_set<std::string>::erase(iterator, iterator) (erase half the container)/8192 266681 ns 226107 ns
std::unordered_map<int, int>::erase(iterator, iterator) (erase half the container)/0 455 ns 462 ns
std::unordered_map<int, int>::erase(iterator, iterator) (erase half the container)/32 996 ns 659 ns
std::unordered_map<int, int>::erase(iterator, iterator) (erase half the container)/1024 15963 ns 8108 ns
std::unordered_map<int, int>::erase(iterator, iterator) (erase half the container)/8192 136493 ns 71848 ns
std::unordered_multiset<int>::erase(iterator, iterator) (erase half the container)/0 454 ns 455 ns
std::unordered_multiset<int>::erase(iterator, iterator) (erase half the container)/32 985 ns 703 ns
std::unordered_multiset<int>::erase(iterator, iterator) (erase half the container)/1024 16277 ns 9085 ns
std::unordered_multiset<int>::erase(iterator, iterator) (erase half the container)/8192 125736 ns 82710 ns
std::unordered_multimap<int, int>::erase(iterator, iterator) (erase half the container)/0 457 ns 454 ns
std::unordered_multimap<int, int>::erase(iterator, iterator) (erase half the container)/32 1091 ns 646 ns
std::unordered_multimap<int, int>::erase(iterator, iterator) (erase half the container)/1024 17784 ns 7664 ns
std::unordered_multimap<int, int>::erase(iterator, iterator) (erase half the container)/8192 127098 ns 72806 ns
```
This reverts commit acc3a6234a91369b818fdd6482ded0ac32d8ffa6.
Part of #102817.
This patch attempts to optimize the performance of `std::generate` for
segmented iterators. Below are the benchmark numbers from
`libcxx\test\benchmarks\algorithms\modifying\generate.bench.cpp`. Test
cases that use segmented iterators have also been added.
- before
```
std::generate(deque<int>)/32 194 ns 193 ns 3733333
std::generate(deque<int>)/50 276 ns 276 ns 2488889
std::generate(deque<int>)/1024 5096 ns 5022 ns 112000
std::generate(deque<int>)/8192 40806 ns 40806 ns 17231
```
- after
```
std::generate(deque<int>)/32 106 ns 105 ns 6400000
std::generate(deque<int>)/50 139 ns 138 ns 4977778
std::generate(deque<int>)/1024 2713 ns 2699 ns 248889
std::generate(deque<int>)/8192 18983 ns 19252 ns 37333
```
---------
Co-authored-by: A. Jiang <de34@live.cn>
The completion of P2944R3 (4a509f853fa4821ecdb0f6bc3b90ddd48794cc8c)
just missed LLVM 21 release, and it seems controversial that whether
such feature completion should be backported.
I'm aware of following-up cleanup and bugfix about `<tuple>`. Perhaps it
will become more and more unwise to backport the changes. So let's
retarget P2944R3 to LLVM 22 in documentations.
Drive-by: Also fixes the formatting of the entry of P3379R0.