We were only checking that the comparator was rvalue callable,
when in reality the algorithms always call comparators as lvalues.
This patch also refactors the tests for callable requirements and
expands it to a few missing algorithms.
Fixes#69554
We originally put implementation-detail function objects into individual
namespaces for `std::ranges` without a good reason for doing so. This
practice was continued, presumably because there was prior art. Since
there's no reason to keep these namespaces, this commit removes them,
which will slightly impact binary size.
This commit does not apply to CPOs, some of which need additional work.
Those two `__find_end` functions are no longer used after 101d1e9b3c86.
After that commit, `std::find_end` started dispatching to `__find_end_classic`,
and `ranges::find_end` to `__find_end_impl`, which means that the two `__find_end`
functions were no longer necessary.
Fixes#100569
Implements [P1223R5][] completely.
Includes an implementation of `find_last`, `find_last_if`, and
`find_last_if_not`.
[P1223R5]: https://wg21.link/p1223r5
One-sided binary search, aka meta binary search, has been in the public
domain for decades, and has the general advantage of being constant time
in the best case, with the downside of executing at most 2*log(N)
comparisons vs classic binary search's exact log(N). There are two
scenarios in which it really shines: the first one is when operating
over non-random-access iterators, because the classic algorithm requires
knowing the container's size upfront, which adds N iterator increments
to the complexity. The second one is when traversing the container in
order, trying to fast-forward to the next value: in that case the
classic algorithm requires at least O(N*log(N)) comparisons and, for
non-random-access iterators, O(N^2) iterator increments, whereas the
one-sided version will yield O(N) operations on both counts, with a
best-case of O(log(N)) comparisons which is very common in practice.
As time went by, a few files have become mis-formatted w.r.t.
clang-format. This was made worse by the fact that formatting was not
being enforced in extensionless headers. This commit simply brings all
of libcxx/include in-line with clang-format again.
We might have to do this from time to time as we update our clang-format
version, but frankly this is really low effort now that we've formatted
everything once.
We want the PSTL implementation details to be available regardless of
the Standard mode or whether the experimental PSTL is enabled. This
patch guards the inclusion of the PSTL to the top-level headers that
define the public API in `__numeric` and `__algorithm`.
The experimental PSTL's current dispatching mechanism was designed with
flexibility in mind. However, while reviewing the in-progress OpenMP
backend, I realized that the dispatching mechanism based on ADL and
default definitions in the frontend had several downsides. To name a
few:
1. The dispatching of an algorithm to the back-end and its default
implementation is bundled together via `_LIBCPP_PSTL_CUSTOMIZATION_POINT`.
This makes the dispatching really confusing and leads to annoyances
such as variable shadowing and weird lambda captures in the front-end.
2. The distinction between back-end functions and front-end algorithms
is not as clear as it could be, which led us to call one where we meant
the other in a few cases. This is bad due to the exception requirements
of the PSTL: calling a front-end algorithm inside the implementation of
a back-end is incorrect for exception-safety.
3. There are two levels of back-end dispatching in the PSTL, which treat
CPU backends as a special case. This was confusing and not as flexible
as we'd like. For example, there was no straightforward way to dispatch
all uses of `unseq` to a specific back-end from the OpenMP backend,
or for CPU backends to fall back on each other.
This patch rewrites the backend dispatching mechanism to solve these
problems, but doesn't touch any of the actual implementation of
algorithms. Specifically, this rewrite has the following
characteristics:
- There is a single level of backend dispatching, however partial backends can
be stacked to provide a full implementation of the PSTL. The two-level dispatching
that was used for CPU-based backends is handled by providing CPU-based basis
operations as simple helpers that can easily be reused when defining any PSTL
backend.
- The default definitions for algorithms are separated from their dispatching logic.
- The front-end is thus simplified a whole lot and made very consistent
for all algorithms, which makes it easier to audit the front-end for
things like exception-correctness, appropriate forwarding, etc.
Fixes#70718
This is an intermediate step towards the PSTL dispatching mechanism
rework. It will make it a lot easier to track the upcoming front-end
changes. After the rework, there are basically no implementation details
in the front-end, so the definition of each algorithm will become much
simpler. Otherwise, it wouldn't make sense to define all the algorithms
in the same header.
There were various places where we incorrectly handled exceptions in the
PSTL. Typical issues were missing `noexcept` and taking iterators by
value instead of by reference.
This patch fixes those inconsistent and incorrect instances, and adds
proper tests for all of those. Note that the previous tests were often
incorrectly turned into no-ops by the compiler due to copy ellision,
which doesn't happen with these new tests.
Fix `std/ranges/range.adaptors/range.lazy.split/general.pass.cpp` which
started failing on z/OS after this
[commit](https://github.com/llvm/llvm-project/commit/985c1a44f8d49e0af).
This test case is passing on other platforms such as AIX. This is
because the `__ALTIVEC__` macro is defined and `__mismatch` under
`_LIBCPP_VECTORIZE_ALGORITHMS` guard is compiled out. However, on z/OS
`_LIBCPP_VECTORIZE_ALGORITHMS` is defined. Analyzing the algorithm of
`__mismatch` shows that the culprit is the definition of
`__native_vector_size` which was defined wrongly as 1. This PR corrects
the definition of `__native_vector_size` and fixes the affected test.
This patch removes the two-level backend dispatching mechanism we had in
the PSTL. Instead of selecting both a PSTL backend and a PSTL CPU
backend, we now only select a top-level PSTL backend. This greatly
simplifies the PSTL configuration layer.
While this patch technically removes some flexibility from the PSTL
configuration mechanism because CPU backends are not considered
separately, it opens the door to a much more powerful configuration
mechanism based on chained backends in a follow-up patch.
This is a step towards overhauling the PSTL dispatching mechanism.
Functions inside __cpu_traits were needlessly prefixed with __parallel,
which doesn't serve a real purpose anymore now that they are inside a
traits class.
Currently, CPU backends in the PSTL are created by defining functions
in the __par_backend namespace. Then, the PSTL includes the CPU backend
that gets configured via CMake and gets those definitions.
This prevents CPU backends from easily co-existing and is a bit
confusing.
To solve this problem, this patch introduces the notion of __cpu_traits,
which is a cheap encapsulation of the basis operations required to
implement a CPU-based PSTL. Different backends can now define their own
tag and coexist, and the CPU-based PSTL will simply use __cpu_traits to
dispatch to the right implementation of e.g. __for_each.
Note that this patch doesn't change the actual implementation of the
backends in any way, it only modifies how that implementation is
accessed
to implement PSTL algorithms.
This patch is a step towards #88131.
Both `std::distance` or `ranges::distance` are inefficient for
non-sized ranges. Also, calculating the range using `int` type is
seriously problematic.
This patch avoids using `distance` and calculation of the length of
non-sized ranges.
Fixes#86833.
The exposition-only type trait `pair-like` includes `ranges::subrange`,
but in every single case excludes `ranges::subrange` from the list. This
patch introduces two new traits `__tuple_like_no_subrange` and
`__pair_like_no_subrange`, which exclude `ranges::subrange` from the
possible matches. `__pair_like` is no longer required, and thus removed.
`__tuple_like` is implemented as `__tuple_like_no_subrange` or a
`ranges::subrange` specialization.
We've introduced `__constexpr_memmove` a while ago, which simplified the
implementation of the copy and move lowering a bit. This allows us to
remove some of the boilerplate.
## Abstract
This pull request converts the `operator()` of all CPOs and niebloids
related to C++23 ranges to `static`.
## Motivation
In `libc++`, CPOs and niebloids are implemented as function objects.
Currently, the `operator()` for such a function object is a
`const`-qualified member function. This means that even if the function
object is has no data members, an extra register is used to pass in the
`this` pointer when calling `operator()`, unless the compiler can inline
the function call. Declaraing `operator()` as `static` would optimize
away the unnecessary `this` pointer passing for stateless function
objects, since there is no object instance state that needs to be
accessed.
## Reference
- [P1169R4: static `operator()`](https://wg21.link/P1169R4)
These headers have become very small by using compiler builtins, often
containing only two declarations. This merges these headers, since
there doesn't seem to be much of a benefit keeping them separate.
Specifically, `is_{,_nothrow,_trivially}{assignable,constructible}` are
kept and the `copy`, `move` and `default` versions of these type traits
are moved in to the respective headers.
We recently noticed that the unwrap_iter.h file was pushing macros, but
it was pushing them again instead of popping them at the end of the
file. This led to libc++ basically swallowing any custom definition of
these macros in user code:
#define min HELLO
#include <algorithm>
// min is not HELLO anymore, it's not defined
While investigating this issue, I noticed that our push/pop pragmas were
actually entirely wrong too. Indeed, instead of pushing macros like
`move`, we'd push `move(int, int)` in the pragma, which is not a valid
macro name. As a result, we would not actually push macros like `move`
-- instead we'd simply undefine them. This led to the following code not
working:
#define move HELLO
#include <algorithm>
// move is not HELLO anymore
Fixing the pragma push/pop incantations led to a cascade of issues
because we use identifiers like `move` in a large number of places, and
all of these headers would now need to do the push/pop dance.
This patch fixes all these issues. First, it adds a check that we don't
swallow important names like min, max, move or refresh as explained
above. This is done by augmenting the existing
system_reserved_names.gen.py test to also check that the macros are what
we expect after including each header.
Second, it fixes the push/pop pragmas to work properly and adds missing
pragmas to all the files I could detect a failure in via the newly added
test.
rdar://121365472
If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general) to
a sorting algorithm, the algorithm can produce an incorrect result or
even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see [the related
RFC](https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217)).
As a result, we have three independent sets of checks:
- assertions that catch out-of-bounds accesses within the algorithms'
implementation. These are relatively cheap; however, they cannot catch
the underlying cause and cannot prevent the case where an invalid
comparator would result in an incorrectly-sorted sequence without
actually triggering an OOB access;
- debug comparators that wrap a given comparator and on each comparison
check that if `(a < b)`, then `!(b < a)`, where `<` stands for the
user-provided comparator. This performs up to 2x number of comparisons
but doesn't affect the algorithmic complexity. While this approach can
find more issues, it is still a heuristic;
- a comprehensive check of the comparator that validates up to 100
elements in the resulting sorted sequence (see the RFC above for
details). The check is expensive but the 100 element limit can somewhat
compensate for that, especially for large values of `N`.
The first set of checks is enabled in the fast hardening mode while the
other two are only enabled in the debug mode.
This patch also removes the
`_LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK` macro that
previously was used to selectively enable the 100-element check.
Now this check is enabled unconditionally in the debug mode.
Also, introduce a new category
`_LIBCPP_ASSERT_SEMANTIC_REQUIREMENT`. This category is
intended for checking the semantic requirements from the Standard.
Typically, these are hard or impossible to completely validate, so
these checks are expected to be heuristic in nature and potentially
quite expensive.
See https://reviews.llvm.org/D150264 for additional background.
Fixes#71496
Introduce a new `argument-within-domain` category that covers cases
where the given arguments make it impossible to produce a correct result
(or create a valid object in case of constructors). While the incorrect
result doesn't create an immediate problem within the library (like e.g.
a null pointer dereference would), it always indicates a logic error in
user code and is highly likely to lead to a bug in the program once the
value is used.
Also introduce `_LIBCPP_ASSERT_PEDANTIC` for assertions violating which
results in a no-op or other benign behavior, but which may nevertheless
indicate a bug in the invoking code.
Notable things in this commit:
* refactors `__indirect_binary_left_foldable`, making it slightly
different (but equivalent) to _`indirect-binary-left-foldable`_, which
improves readability (a [patch to the Working Paper][patch] was made)
* omits `__cpo` namespace, since it is not required for implementing
niebloids (a cleanup should happen in 2024)
* puts tests ensuring invocable robustness and dangling correctness
inside the correctness testing to ensure that the algorithms' results
are still correct
[patch]: https://github.com/cplusplus/draft/pull/6734
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.
This patch was generated with:
find libcxx/include libcxx/src -type f \
| grep -v 'module.modulemap.in' \
| grep -v 'CMakeLists.txt' \
| grep -v 'README.txt' \
| grep -v 'libcxx.imp' \
| grep -v '__config_site.in' \
| xargs clang-format -i
A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.
[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all