This PR optimizes the performance of `std::ranges::swap_ranges` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to **611x** for
aligned range swap and **78x** for unaligned range swap comparison.
Additionally, comprehensive tests covering up to 4 storage words (256
bytes) with odd and even bit sizes are provided, which validate the
proposed optimizations in this patch.
This PR optimizes the performance of `std::ranges::equal` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to 188x for
aligned equality comparison and 82x for unaligned equality
comparison. Moreover, comprehensive tests covering up to 4 storage words
(256 bytes) with odd and even bit sizes are provided, which validate the
proposed optimizations in this patch.
Completes:
- LWG3359 <chrono> leap second support should allow for negative leap
seconds
- P1361R2 Integration of chrono with text formatting
Implements parts of:
- P0355 Extending <chrono> to Calendars and Time Zones
Fixes: #100432Fixes: #100014
As a follow-up to #121013 (which optimized `ranges::copy`) and #121026
(which optimized `ranges::copy_backward`), this PR enhances the
performance of `std::ranges::{move, move_backward}` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations bring performance improvements analogous to those
achieved for the `{copy, copy_backward}` algorithms: up to 2000x for
aligned moves and 60x for unaligned moves. Moreover, comprehensive
tests covering up to 4 storage words (256 bytes) with odd and even bit
sizes are provided, which validate the proposed optimizations in this
patch.
As a follow-up to #121013 (which focused on `std::ranges::copy`), this
PR optimizes the performance of `std::ranges::copy_backward` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to 2000x for
aligned copies and 60x for unaligned copies.
As a follow-up to #113852, this PR optimizes the performance of the
`insert(const_iterator pos, InputIt first, InputIt last)` function for
`input_iterator`-pair inputs in `std::vector` for cases where
reallocation occurs during insertion. Additionally, this optimization
enhances exception safety by replacing the traditional `try-catch`
mechanism with a modern exception guard for the `insert` function.
The optimization targets cases where insertion trigger reallocation. In
scenarios without reallocation, the implementation remains unchanged.
Previous implementation
-----------------------
The previous implementation of `insert` is inefficient in reallocation
scenarios because it performs the following steps separately:
- `reserve()`: This leads to the first round of relocating old
elements to new memory;
- `rotate()`: This leads to the second round of reorganizing the
existing elements;
- Move-forward: Moves the elements after the insertion position to
their final positions.
- Insert: performs the actual insertion.
This approach results in a lot of redundant operations, requiring the
elements to undergo three rounds of relocations/reorganizations to be
placed in their final positions.
Proposed implementation
-----------------------
The proposed implementation jointly optimize the above 4 steps in the
previous implementation such that each element is placed in its final
position in just one round of relocation. Specifically, this
optimization reduces the total cost from 2 relocations + 1 std::rotate
call to just 1 relocation, without needing to call `std::rotate`,
thereby significantly improving overall performance.
According to
[[template.bitset.general]](https://eel.is/c++draft/template.bitset.general),
`std::bitset` is supposed to have only
one (public) member typedef, `reference`. However, libc++'s
implementation of `std::bitset` offers more that that. Specifically, it
offers a public typedef `const_reference` and two private typedefs
`size_type` and `difference_type`. These non-standard member typedefs,
despite being private, can cause potential ambiguities in name lookup in
user-defined classes, as demonstrated in issue #121618.
Fixing the public member typedef `const_reference` is straightforward:
we can simply replace it with an `__ugly_name` such as
`__const_reference`. However, fixing the private member typedefs
`size_type` and `difference_type` is not so straightforward as they are
required by the `__bit_iterator` class and the corresponding algorithms
optimized for `__bit_iterator`s (e.g., `ranges::fill`).
This PR fixes the member typedef `const_reference` by using uglified
name for it. Further work will be undertaken to address `size_type` and
`difference_type`.
Follows up #80706, #111127, and #112843,
The `std::error_code`/`std::error_category` functionality is designed to
support multiple error domains. On Unix, both system calls and libc
functions return the same error codes, and thus, libc++ today treats
`generic_category()` and `system_category()` as being equivalent.
However, on Windows, libc functions return `errno.h` error codes in the
`errno` global, but system calls return the very different `winerror.h`
error codes via `GetLastError()`.
As such, there is a need to map the winerror.h error codes into generic
errno codes. In libc++, however, the system_error facility does not
implement this mapping; instead the mapping is hidden inside libc++,
used directly by the std::filesystem implementation.
That has a few problems:
1. For std::filesystem APIs, the concrete windows error number is lost,
before users can see it. The intent of the distinction between
std::error_code and std::error_condition is that the error_code return
has the original (potentially more detailed) error code.
2. User-written code which calls Windows system APIs requires this same
mapping, so it also can also return error_code objects that other
(cross-platform) code can understand.
After this commit, an `error_code` with `generic_category()` is used to
report an error from `errno`, and, on Windows only, an `error_code` with
`system_category()` is used to report an error from `GetLastError()`. On
Unix, system_category remains identity-mapped to generic_category, but
is never used by libc++ itself.
The windows error code mapping is moved into system_error, so that
conversion of an `error_code` to `error_condition` correctly translates
the `system_category()` code into a `generic_category()` code, when
appropriate.
This allows code like:
`error_code(GetLastError(), system_category()) == errc::invalid_argument`
to work as expected -- as it does with MSVC STL.
(Continued from old phabricator review [D151493](https://reviews.llvm.org/D151493))
This patch reimplements the locale base support for Windows flavors in a
way that is more modules-friendly and without defining non-internal
names.
Since this changes the name of some types and entry points in the built
library, this is effectively an ABI break on Windows (which is
acceptable after checking with the Windows/libc++ maintainers).
This PR optimizes the input iterator overload of `assign(_InputIterator,
_InputIterator)` in `std::vector<_Tp, _Allocator>` by directly assigning
to already initialized memory, rather than first destroying existing
elements and then constructing new ones. By eliminating unnecessary
destruction and construction, the proposed algorithm enhances the
performance by up to 2x for trivial element types (e.g.,
`std::vector<int>`), up to 2.6x for non-trivial element types like
`std::vector<std::string>`, and up to 3.4x for more complex non-trivial
types (e.g., `std::vector<std::vector<int>>`).
### Google Benchmarks
Benchmark tests (`libcxx/test/benchmarks/vector_operations.bench.cpp`)
were conducted for the `assign()` implementations before and after this
patch. The tests focused on trivial element types like
`std::vector<int>`, and non-trivial element types such as
`std::vector<std::string>` and `std::vector<std::vector<int>>`.
#### Before
```
-------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------------------------------
BM_AssignInputIterIter/vector_int/1024/1024 1157 ns 1169 ns 608188
BM_AssignInputIterIter<32>/vector_string/1024/1024 14559 ns 14710 ns 47277
BM_AssignInputIterIter<32>/vector_vector_int/1024/1024 26846 ns 27129 ns 25925
```
#### After
```
-------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-------------------------------------------------------------------------------------------------
BM_AssignInputIterIter/vector_int/1024/1024 561 ns 566 ns 1242251
BM_AssignInputIterIter<32>/vector_string/1024/1024 5604 ns 5664 ns 128365
BM_AssignInputIterIter<32>/vector_vector_int/1024/1024 7927 ns 8012 ns 88579
```
The pointer safety functions were never implemented by anyone in a
non-trivial way, which lead us to removing them entirely from the
headers when they were removed in C++23. This also means that just about
nobody ever called these functions, especially not in production code.
Because of that, we expect there to be no references in the wild to the
functions in the dylib. Because of that, this patch removes the symbols
from the dylib.
This patch introduces a new kind of bounded iterator that knows the size
of its valid range at compile-time, as in std::array. This allows computing
the end of the range from the start of the range and the size, which requires
storing only the start of the range in the iterator instead of both the start
and the size (or start and end). The iterator wrapper is otherwise identical
in design to the existing __bounded_iter.
Since this requires changing the type of the iterators returned by
std::array, this new bounded iterator is controlled by an ABI flag.
As a drive-by, centralize the tests for std::array::operator[] and add
missing tests for OOB operator[] on non-empty arrays.
Fixes#70864
This PR deprecates `<ccomplex>`, `<cstdbool>`, `<ctgmath>`, and
`<ciso646>` in C++17 and "removes" them in C++20 by special deprecation
warnings.
`<cstdalign>` is previously missing. This PR also tries to add them, and
then deprecates and "removes" `<cstdalign>`.
Papers:
- https://wg21.link/P0063R3
- https://wg21.link/P0619R4Closes#99985.
---------
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
Around half of the tests are based on the tests Arthur O'Dwyer's
original implementation of std::flat_map, with modifications and
removals.
partially implement #105190
Currently, libc++'s `bitset`, `forward_list`, and `list` have
non-conforming member typedef name `base`. The typedef is private, but
can cause ambiguity in name lookup.
Some other classes in libc++ that are either implementation details or
not precisely specified by the standard also have member typdef `base`.
I think this can still be conforming.
Follows up #80706 and #111127.
Make __libcpp_verbose_abort() noexcept (it is already noreturn), to
match std::terminate(). Clang's function effect analysis can use this to
ignore such functions as being beyond its scope. (See
https://github.com/llvm/llvm-project/pull/99656).
[template.bitset.general] indicates that `bitset` shouldn't have member
typedef-names `iterator` and `const_iterator`. Currently libc++'s
typedef-names are causing ambiguity in name lookup, which isn't
conforming.
As these iterator types are themselves useful, I think we should just
use __uglified member typedef-names for them.
Fixes#111125
This ABI break only affects fancy pointer which have a different value
representation when pointing to a base of T instead of T itself. This
seems like a rather small set of fancy pointers, which themselves
already represent a very small niche. This patch swaps a pointer to T
with a pointer to base of T in a few library-internal types.
Works towards P0619R4 / #99985.
The use of `std::get_temporary_buffer` and `std::return_temporary_buffer`
are replaced with `unique_ptr`-based RAII buffer holder.
Escape hatches:
- `_LIBCPP_ENABLE_CXX20_REMOVED_TEMPORARY_BUFFER` restores
`std::get_temporary_buffer` and `std::return_temporary_buffer`.
Drive-by changes:
- In `<syncstream>`, states that `get_temporary_buffer` is now removed,
because `<syncstream>` is added in C++20.
This significantly simplifies the code, improves compile times and
improves the object layout of types using `__compressed_pair` in the
unstable ABI. The only downside is that this is extremely ABI sensitive
and pedantically breaks the ABI for empty final types, since the address
of the subobject may change. The ABI of the whole object should not be
affected.
Fixes#91266Fixes#93069
There are a few lines in the release notes which are much wider than the
120 columns we usually use. This reflows the text to keep it below the
threshold.
We waited before supporting std::jthread fully because we wanted to
investigate other implementation strategies (in particular one involving
std::mutex). Since then, we did some benchmarking and decided that we
wouldn't be moving forward with std::mutex. Hence, there is no real
reason to punt on making std::jthread & friends non-experimental.
This patch implements https://wg21.link/P2747R2.
The library changes affect direct `operator new` and `operator new[]`
calls even when the core language changes are absent.
The changes are not available for MS ABI because the `operator new` and
`operator new[]` are from VCRuntime's `<vcruntime_new.h>`. A feature
request was submitted for that [1].
As a drive-by change, the patch reformatted the whole `new.pass.cpp` and
`new_array.pass.cpp` tests.
Closes#105427
[1]: https://developercommunity.visualstudio.com/t/constexpr-for-placement-operator-newope/10730304.
This patch implements https://wg21.link/p2609r3.
The test code was originally authored by JMazurkiewicz.
Notes:
- P2609R3 is not officially a Defect Report, but MSVC STL
implements it in C++20 mode.
Moreover, P2609R3 and P2997R1 touch exactly the same set of
concepts, and MSVC STL and libc++ have already treated P2997R1
as a DR.
- This patch also adjusted feature-test macros.
+ In C++20 mode, the value of __cpp_lib_ranges should be `202110L` because
- `202202L` covers `range_adaptor_closure` (P2387R3), and
- `202207L` covers move-only types in range adaptors (P2494R2).
And all of these changes are only available since C++23 mode.
+ In C++23 mode, the value should be `202406L` because
- `202211L` covers removing poison overloads (P2602R2),
- `202302L` covers relaxing projected value types (P2609R3), and
- `202406L` covers removing requirements on `iter_common_reference_t` (P2997R1).
And all of these changes are already or being implemented.
Fixes#105253.
Co-authored-by: Jakub Mazurkiewicz <mazkuba3@gmail.com>
Works towards P0619R4/#99985.
- std::uncaught_exception was not previously deprecated. This patch
deprecates it since C++17 as per N4259. std::uncaught_exceptions is
used instead as libc++ unconditionally provides this function.
- _LIBCPP_ENABLE_CXX20_REMOVED_UNCAUGHT_EXCEPTION restores
std::uncaught_exception.
- As a drive-by, this patch updates the C++20 status page to
explain that D.11 is already done, since it was done in
578d09c1b195d859ca7e62840ff6bb83421a77b5.
This trait is implemented in C++26 conditionally on the compiler
supporting the __builtin_is_virtual_base_of intrinsic. I believe only
tip-of-trunk Clang currently implements that builtin.
Closes#105432
Instead of tracking those using our static CSV files, I created lists of
subtasks in their respective issues (#99939 and #105169) to track the
work that is still left.
This patch removes obsolete status pages for projects that were
completed: LLVM 18 release, C++20 Ranges and Spaceship support.
Co-authored-by: Hristo Hristov <zingam@outlook.com>
In LLVM 19 removed the extension with an opt-in macro. This finally
removes that option too and removes a few `const_cast`s where I know
that they exist only to support this extension.
When we initially implemented the C++20 synchronization library, we
reluctantly accepted for the implementation to be backported to C++03
upon request from the person who provided the patch. This was when we
were only starting to have experience with the issues this can create,
so we flinched. Nowadays, we have a much stricter stance about not
backporting features to previous standards.
We have recently started fixing several bugs (and near bugs) in our
implementation of the synchronization library. A recurring theme during
these reviews has been how difficult to understand the current code is,
and upon inspection it becomes clear that being able to use a few recent
C++ features (in particular lambdas) would help a great deal. The code
would still be pretty intricate, but it would be a lot easier to reason
about the flow of callbacks through things like
__thread_poll_with_backoff.
As a result, this patch drops support for the synchronization library
before C++20. This makes us more strictly conforming and opens the door
to major simplifications, in particular around atomic_wait which was
supported all the way to C++03.
This change will probably have some impact on downstream users, however
since the C++20 synchronization library was added only in LLVM 10 (~3
years ago) and it's quite a niche feature, the set of people trying to
use this part of the library before C++20 should be reasonably small.
~~NB: This PR depends on #78876. Ignore the first commit when reviewing,
and don't merge it until #78876 is resolved. When/if #78876 lands, I'll
clean this up.~~
This partially restores parity with the old, since removed debug build.
We now can re-enable a bunch of the disabled tests. Some things of note:
- `bounded_iter`'s converting constructor has never worked. It needs a
friend declaration to access the other `bound_iter` instantiation's
private fields.
- The old debug iterators also checked that callers did not try to
compare iterators from different objects. `bounded_iter` does not
currently do this, so I've left those disabled. However, I think we
probably should add those. See
https://github.com/llvm/llvm-project/issues/78771#issuecomment-1902999181
- The `std::vector` iterators are bounded up to capacity, not size. This
makes for a weaker safety check. This is because the STL promises not to
invalidate iterators when appending up to the capacity. Since we cannot
retroactively update all the iterators on `push_back()`, I've instead
sized it to the capacity. This is not as good, but at least will stop
the iterator from going off the end of the buffer.
There was also no test for this, so I've added one in the `std`
directory.
- `std::string` has two ambiguities to deal with. First, I opted not to
size it against the capacity. https://eel.is/c++draft/string.require#4
says iterators are invalidated on an non-const operation. Second,
whether the iterator can reach the NUL terminator. The previous debug
tests and the special-case in https://eel.is/c++draft/string.access#2
suggest no. If either of these causes widespread problems, I figure we
can revisit.
- `resize_and_overwrite.pass.cpp` assumed `std::string`'s iterator
supported `s.begin().base()`, but I see no promise of this in the
standard. GCC also doesn't support this. I fixed the test to use
`std::to_address`.
- `alignof.compile.pass.cpp`'s pointer isn't enough of a real pointer.
(It needs to satisfy `NullablePointer`, `LegacyRandomAccessIterator`,
and `LegacyContiguousIterator`.) `__bounded_iter` seems to instantiate
enough to notice. I've added a few more bits to satisfy it.
Fixes#78805