As a follow-up to #121013 (which optimized `ranges::copy`) and #121026
(which optimized `ranges::copy_backward`), this PR enhances the
performance of `std::ranges::{move, move_backward}` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations bring performance improvements analogous to those
achieved for the `{copy, copy_backward}` algorithms: up to 2000x for
aligned moves and 60x for unaligned moves. Moreover, comprehensive
tests covering up to 4 storage words (256 bytes) with odd and even bit
sizes are provided, which validate the proposed optimizations in this
patch.
These member types were deprecated in C++17 by P0174R2 and removed in
C++20 by P0619R4, but the changes in `<variant>` seem missing.
Drive-by: Replace one `_NOEXCEPT` with `noexcept` as the `hash`
specialization is C++17-and-later only.
All non-existing local times in a contiguous range should map to the
same time point. This fixes a bug, were the times inside the range were
mapped to the wrong time.
Fixes: #113654
Asan reports it after #124103.
It's know case of false positive for Asan.
https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco#false-positives
It's can be avoided with `constexpr` constructors.
In general order global constructors in different
modules is undefined. If global constructor uses
external global, they can be not constructed yet.
However, implementation may contain workaround for
that, or the state of non-constructed global can
be still valid.
Asan will still falsely report such cases, as it
has no machinery to detect correctness of such
cases.
We need to fix/workaround the issue in libc++, as
it will affect many libc++ with Asan users.
The operators did not have a _Compare template arguement. The fix
updates the generic container test to use allocators for all types used.
No other issues were found.
Fixes: #127095
We have some benchmarks that were benchmarking very specific
functionality, namely the optimizations in vector<bool>::iterator. Call
this out in the benchmarks by renaming them appropriately. In the future
we will also increase the coverage of these benchmarks to test other
containers.
On older MacOS versions where `std::to_chars` for floating-point types
is not available the format library can't be used. Due to some issue
with the availability macro used to disable format on MacOS the issue
triggers regardless of the type being formatted.
The print library has the same issue.
Fixes: #125353
We've been improving these the tests for vector quite a bit and we are
probably not done improving our container tests. Formatting everything
at once will make subsequent reviews easier.
Currently `std::hash<Emplaceable>::operator()` relies implicit
conversion from `int` to `size_t`, which makes MSVC compelling. This PR
switches to use `static_cast`.
In `flat.map/flat.map.access/at_transparent.pass.cpp`, there's one
value-discarding use of `at` that wasn't marked `TEST_IGNORE_NODISCARD`.
This PR adds the missing `TEST_IGNORE_NODISCARD`.
Summary:
Currently we conditionally enable NVPTX lowering depending on the
language (C/C++/OpenMP). Unfortunately this causes problems because this
option is only present if the backend was enabled, which causes this to
error if you try to make LLVM-IR.
This patch instead makes it the only accepted lowering. The reason we
had it as opt-in before is because it is not handled by CUDA. So, this
pach also introduces diagnostics to prevent *all* creation of
device-side global constructors and destructors. We already did this for
variables, now we do it for attributes as well.
This inverts the responsibility of blocking this from the backend to the
langauage like it should be given that support for this is language
dependent.
This changes the code to use dataclasses instead of dict entries. It
also adds type aliases to use in the typing information and updates the
typing information.
`MinSequenceContainer::size` can be narrowing on 64-bit platforms, and
MSVC complains about such implicit conversion. This PR changes the
implicit conversion to explicit `static_cast`.
`min_allocator::allocate` and `min_allocator::deallocate` have
`ptrdiff_t` as the parameter type, which seems weird, because the
underlying `std::allocator`'s member functions take `size_t`. It seems
better to use `size_t` consistently.
This patch does not significantly change how the sequence container
benchmarks are done, but it adopts the same style as the associative
container benchmarks.
This commit does adjust how we were benchmarking push_back, where we
never really measured the overhead of the slow path of push_back (when
we need to reallocate).
This patch implements generic associative container benchmarks for
containers with unique keys. In doing so, it replaces the existing
std::map benchmarks which were based on the cartesian product
infrastructure and were too slow to execute.
These new benchmarks aim to strike a balance between exhaustive coverage
of all operations in the most interesting case, while executing fairly
rapidly (~40s on my machine).
This bumps the requirement for the map benchmarks from C++17 to C++20
because the common header that provides associative container benchmarks
requires support for C++20 concepts.
Implements parts of:
- P0355 Extending <chrono> to Calendars and Time Zones
- P1361 Integration of chrono with text formatting
- LWG3359 <chrono> leap second support should allow for negative leap
seconds
The capacity is now passed correctly and a test for this path is added.
Since we changed the implementation of `reserve(size_type)` to only ever
extend,
it doesn't make a ton of sense anymore to have `__shrink_or_extend`,
since the code
paths of `reserve` and `shrink_to_fit` are now almost completely
separate.
This patch splits up `__shrink_or_extend` so that the individual parts
are in `reserve`
and `shrink_to_fit` depending on where they are needed.
This reverts commit 59f57be94f38758616b1339b293b43af845571af.
This PR slightly simplifies the implementation of `vector<bool>::max_size`
and adds extensive tests for the `max_size()` function for both `vector<bool>`
and `vector<T>`. The main purposes of the new tests include:
- Verify correctness of `max_size()` under various `size_type` and
`difference_type` definitions: check that `max_size()` works properly
with allocators that have custom `size_type` and `difference_type`. This
is particularly useful for `vector<bool>`, as different `size_type` lead
to different `__storage_type` of different word lengths, resulting in
varying `max_size()` values for `vector<bool>`. Additionally, different
`difference_type` also sets different upper limit of `max_size()` for
both `vector<bool>` and `std::vector`. These tests were previously
missing.
- Eliminate incorrect implementations: Special tests are added to identify and
reject incorrect implementations of `vector<bool>::max_size` that unconditionally
return `std::min<size_type>(size-max, __internal_cap_to_external(allocator-max-size))`.
This can cause overflow in the `__internal_cap_to_external()` call and lead
to incorrect results. The new tests ensure that such incorrect
implementations are identified.
The __is_trivially_relocatable builtin has semantics that do not
correspond to any current or future notion of trivial relocation.
Furthermore, it currently leads to incorrect optimizations for some
types on supported compilers:
- Clang on Windows where types with non-trivial destructors get
incorrectly optimized
- AppleClang where types with non-trivial move constructors get
incorrectly optimized
Until there is an agreed upon and bugfree implementation of what it
means to be trivially relocatable, it is safer to simply use trivially
copyable instead. This doesn't leave a lot of types behind and is
definitely correct.
This PR addresses an undefined behavior that arises when using the
`std::fill` and `std::fill_n` algorithms, as well as their ranges
counterparts `ranges::fill` and `ranges::fill_n`, with `vector<bool, Alloc>`
that utilizes a custom-sized allocator with small integral types.
The previous tested TZDB did not contain %z for the rule letters. The
usage of %z in TZDB 2024b revealed a bug in the implementation. The
patch fixes it and has been locally tested with TZDB 2024b.
Fixes#108957
Rewrite the sequence container benchmarks to only rely on the actual
operations specified in SequenceContainer requirements and add
benchmarks for std::list, which is also considered a sequence container.
One of the major goals of this refactoring is also to make these
container benchmarks run faster so that they can be run more frequently.
The existing benchmarks have the significant problem that they take so
long to run that they must basically be run overnight. This patch
reduces the size of inputs such that the rewritten benchmarks each take
at most a minute to run.
This patch doesn't touch the string benchmarks, which were not using the
generic container benchmark functions previously.
As a follow-up to #121013 (which focused on `std::ranges::copy`), this
PR optimizes the performance of `std::ranges::copy_backward` for
`vector<bool>::iterator`, addressing a subtask outlined in issue #64038.
The optimizations yield performance improvements of up to 2000x for
aligned copies and 60x for unaligned copies.
This changes the __output_buffer to a new structure. This improves the
performace of std::format, std::format_to, std::format_to_n, and
std::formatted_size.
While implementing this feature and its associated LWG issues it turns
out
- LWG3316 Correctly define epoch for utc_clock / utc_timepoint only
added non-normative wording to the standard.
Implements parts of:
- P0355 Extending <chrono> to Calendars and Time Zones
- P1361 Integration of chrono with text formatting
- LWG3359 <chrono> leap second support should allow for negative leap
seconds
This is a continuation of what's been started in #89178.
As a drive-by, this also changes the PSTL macro to say `EXPERIMENTAL`
instead of `INCOMPLETE`.
This patch relands the following PRs:
* #111711
* #107350
* #111457
All of these patches were reverted due to an issue reported in
https://github.com/llvm/llvm-project/pull/111711#issuecomment-2406491485,
due to interdependencies.
---
[clang] Finish implementation of P0522
This finishes the clang implementation of P0522, getting rid
of the fallback to the old, pre-P0522 rules.
Before this patch, when partial ordering template template parameters,
we would perform, in order:
* If the old rules would match, we would accept it. Otherwise, don't
generate diagnostics yet.
* If the new rules would match, just accept it. Otherwise, don't
generate any diagnostics yet again.
* Apply the old rules again, this time with diagnostics.
This situation was far from ideal, as we would sometimes:
* Accept some things we shouldn't.
* Reject some things we shouldn't.
* Only diagnose rejection in terms of the old rules.
With this patch, we apply the P0522 rules throughout.
This needed to extend template argument deduction in order
to accept the historial rule for TTP matching pack parameter to non-pack
arguments.
This change also makes us accept some combinations of historical and P0522
allowances we wouldn't before.
It also fixes a bunch of bugs that were documented in the test suite,
which I am not sure there are issues already created for them.
This causes a lot of changes to the way these failures are diagnosed,
with related test suite churn.
The problem here is that the old rules were very simple and
non-recursive, making it easy to provide customized diagnostics,
and to keep them consistent with each other.
The new rules are a lot more complex and rely on template argument
deduction, substitutions, and they are recursive.
The approach taken here is to mostly rely on existing diagnostics,
and create a new instantiation context that keeps track of this context.
So for example when a substitution failure occurs, we use the error
produced there unmodified, and just attach notes to it explaining
that it occurred in the context of partial ordering this template
argument against that template parameter.
This diverges from the old diagnostics, which would lead with an
error pointing to the template argument, explain the problem
in subsequent notes, and produce a final note pointing to the parameter.
---
[clang] CWG2398: improve overload resolution backwards compat
With this change, we discriminate if the primary template and which partial
specializations would have participated in overload resolution prior to
P0522 changes.
We collect those in an initial set. If this set is not empty, or the
primary template would have matched, we proceed with this set as the
candidates for overload resolution.
Otherwise, we build a new overload set with everything else, and proceed
as usual.
---
[clang] Implement TTP 'reversed' pack matching for deduced function template calls.
Clang previously missed implementing P0522 pack matching
for deduced function template calls.
Some templates in the standard library are illegal to specialize for users
(even if the specialization contains user-defined types). The [[clang::no_specializations]]
attribute allows marking such base templates so that the compiler will
diagnose if users try adding a specialization.
Changes:
- Carve out sized but input-only ranges for C++23.
- Call `std::move` for related functions when the iterator is possibly input-only.
Fixes#115727
By calling `std::move` for related functions when the iterator is
possibly input-only. Also slightly changes the conditions of branch for
contiguous iterators to avoid error.
Fixes#116502
This PR addresses an issue where the `shrink_to_fit` function in
`vector<bool>` is effectively a no-op, meaning it will never shrink the
capacity.
Fixes#122502