344 Commits

Author SHA1 Message Date
Louis Dionne
d6832a611a
[libc++][modules] Modularize <cstddef> (#107254)
Many headers include `<cstddef>` just for size_t, and pulling in
additional content (e.g. the traits used for std::byte) is unnecessary.
To solve this problem, this patch splits up `<cstddef>` into
subcomponents so that headers can include only the parts that they
actually require.

This has the added benefit of making the modules build a lot stricter
with respect to IWYU, and also providing a canonical location where we
define `std::size_t` and friends (which were previously defined in
multiple headers like `<cstddef>` and `<ctime>`).

After this patch, there's still many places in the codebase where we
include `<cstddef>` when `<__cstddef/size_t.h>` would be sufficient.
This patch focuses on removing `<cstddef>` includes from __type_traits
to make these headers non-circular with `<cstddef>`. Additional
refactorings can be tackled separately.
2024-09-05 08:28:33 -04:00
Louis Dionne
0df78123fd [libc++] Add missing include to three_way_comp_ref_type.h
We were using a `_LIBCPP_ASSERT_FOO` macro without including `<__assert>`.

rdar://134425695
2024-08-27 14:22:49 -04:00
Nikolas Klauser
d4ffccfce1
[libc++] Simplify the implementation of std::sort a bit (#104902)
This does a few things to canonicalize the library a bit. Specifically
- use `__desugars_to_v` instead of the custom `__is_simple_comparator`
- make `__use_branchless_sort` an inline variable
- remove the `_maybe_branchless` versions of the `__sortN` functions and
overload based on whether we can do branchless sorting instead.
2024-08-27 16:54:05 +02:00
Louis Dionne
f73050e722
[libc++] Fix several double-moves in the code base (#104616)
This patch hardens the "test iterators" we use to test algorithms by
ensuring that they don't get double-moved. As a result of this
hardening, the tests started reporting multiple failures where we would
double-move iterators, which are being fixed in this patch.

In particular:
- Fixed a double-move in pstl.partition
- Add coverage for begin()/end() in subrange tests
- Fix tests for ranges::ends_with and ranges::contains, which were
  incorrectly calling begin() twice on the same subrange containing
  non-copyable input iterators.

Fixes #100709
2024-08-20 14:36:11 -04:00
Louis Dionne
257831582c
[libc++] Check correctly ref-qualified __is_callable in algorithms (#101553)
We were only checking that the comparator was rvalue callable,
when in reality the algorithms always call comparators as lvalues.
This patch also refactors the tests for callable requirements and
expands it to a few missing algorithms.

This is take 2 of #73451, which was reverted because it broke some
CI bots. The issue was that we checked __is_callable with arguments
in the wrong order inside std::upper_bound. This has now been fixed
and a test was added.

Fixes #69554
2024-08-05 11:23:06 -04:00
Nikolas Klauser
d07fdf9779
[libc++] Optimize lexicographical_compare (#65279)
If the comparison operation is equivalent to < and that is a total
order, we know that we can use equality comparison on that type instead
to extract some information. Furthermore, if equality comparison on that
type is trivial, the user can't observe that we're calling it. So
instead of using the user-provided total order, we use std::mismatch,
which uses equality comparison (and is vertorized). Additionally, if the
type is trivially lexicographically comparable, we can go one step
further and use std::memcmp directly instead of calling std::mismatch.

Benchmarks:
```
-------------------------------------------------------------------------------------
Benchmark                                                         old             new
-------------------------------------------------------------------------------------
bm_lexicographical_compare<unsigned char>/1                   1.17 ns         2.34 ns
bm_lexicographical_compare<unsigned char>/2                   1.64 ns         2.57 ns
bm_lexicographical_compare<unsigned char>/3                   2.23 ns         2.58 ns
bm_lexicographical_compare<unsigned char>/4                   2.82 ns         2.57 ns
bm_lexicographical_compare<unsigned char>/5                   3.34 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/6                   3.94 ns         2.21 ns
bm_lexicographical_compare<unsigned char>/7                   4.56 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/8                   5.25 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/16                  9.88 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/64                  38.9 ns         2.36 ns
bm_lexicographical_compare<unsigned char>/512                  317 ns         6.54 ns
bm_lexicographical_compare<unsigned char>/4096                2517 ns         41.4 ns
bm_lexicographical_compare<unsigned char>/32768              20052 ns          488 ns
bm_lexicographical_compare<unsigned char>/262144            159579 ns         4409 ns
bm_lexicographical_compare<unsigned char>/1048576           640456 ns        20342 ns
bm_lexicographical_compare<signed char>/1                     1.18 ns         2.37 ns
bm_lexicographical_compare<signed char>/2                     1.65 ns         2.60 ns
bm_lexicographical_compare<signed char>/3                     2.23 ns         2.83 ns
bm_lexicographical_compare<signed char>/4                     2.81 ns         3.06 ns
bm_lexicographical_compare<signed char>/5                     3.35 ns         3.30 ns
bm_lexicographical_compare<signed char>/6                     3.90 ns         3.99 ns
bm_lexicographical_compare<signed char>/7                     4.56 ns         3.78 ns
bm_lexicographical_compare<signed char>/8                     5.20 ns         4.02 ns
bm_lexicographical_compare<signed char>/16                    9.80 ns         6.21 ns
bm_lexicographical_compare<signed char>/64                    39.0 ns         3.16 ns
bm_lexicographical_compare<signed char>/512                    318 ns         7.58 ns
bm_lexicographical_compare<signed char>/4096                  2514 ns         47.4 ns
bm_lexicographical_compare<signed char>/32768                20096 ns          504 ns
bm_lexicographical_compare<signed char>/262144              156617 ns         4146 ns
bm_lexicographical_compare<signed char>/1048576             624265 ns        19810 ns
bm_lexicographical_compare<int>/1                             1.15 ns         2.12 ns
bm_lexicographical_compare<int>/2                             1.60 ns         2.36 ns
bm_lexicographical_compare<int>/3                             2.21 ns         2.59 ns
bm_lexicographical_compare<int>/4                             2.74 ns         2.83 ns
bm_lexicographical_compare<int>/5                             3.26 ns         3.06 ns
bm_lexicographical_compare<int>/6                             3.81 ns         4.53 ns
bm_lexicographical_compare<int>/7                             4.41 ns         4.72 ns
bm_lexicographical_compare<int>/8                             5.08 ns         2.36 ns
bm_lexicographical_compare<int>/16                            9.54 ns         3.08 ns
bm_lexicographical_compare<int>/64                            37.8 ns         4.71 ns
bm_lexicographical_compare<int>/512                            309 ns         24.6 ns
bm_lexicographical_compare<int>/4096                          2422 ns          204 ns
bm_lexicographical_compare<int>/32768                        19362 ns         1947 ns
bm_lexicographical_compare<int>/262144                      155727 ns        19793 ns
bm_lexicographical_compare<int>/1048576                     623614 ns        80180 ns
bm_ranges_lexicographical_compare<unsigned char>/1            1.07 ns         2.35 ns
bm_ranges_lexicographical_compare<unsigned char>/2            1.72 ns         2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/3            2.46 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/4            3.17 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/5            3.86 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/6            4.55 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/7            5.25 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/8            5.95 ns         2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/16           11.7 ns         2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/64           45.5 ns         2.36 ns
bm_ranges_lexicographical_compare<unsigned char>/512           366 ns         6.35 ns
bm_ranges_lexicographical_compare<unsigned char>/4096         2886 ns         40.9 ns
bm_ranges_lexicographical_compare<unsigned char>/32768       23054 ns          489 ns
bm_ranges_lexicographical_compare<unsigned char>/262144     185302 ns         4339 ns
bm_ranges_lexicographical_compare<unsigned char>/1048576    741576 ns        19430 ns
bm_ranges_lexicographical_compare<signed char>/1              1.10 ns         2.12 ns
bm_ranges_lexicographical_compare<signed char>/2              1.66 ns         2.35 ns
bm_ranges_lexicographical_compare<signed char>/3              2.23 ns         2.58 ns
bm_ranges_lexicographical_compare<signed char>/4              2.82 ns         2.82 ns
bm_ranges_lexicographical_compare<signed char>/5              3.34 ns         3.06 ns
bm_ranges_lexicographical_compare<signed char>/6              3.92 ns         3.99 ns
bm_ranges_lexicographical_compare<signed char>/7              4.64 ns         4.10 ns
bm_ranges_lexicographical_compare<signed char>/8              5.21 ns         4.61 ns
bm_ranges_lexicographical_compare<signed char>/16             9.79 ns         7.42 ns
bm_ranges_lexicographical_compare<signed char>/64             38.9 ns         2.93 ns
bm_ranges_lexicographical_compare<signed char>/512             317 ns         7.31 ns
bm_ranges_lexicographical_compare<signed char>/4096           2500 ns         47.5 ns
bm_ranges_lexicographical_compare<signed char>/32768         19940 ns          496 ns
bm_ranges_lexicographical_compare<signed char>/262144       159166 ns         4393 ns
bm_ranges_lexicographical_compare<signed char>/1048576      638206 ns        19786 ns
bm_ranges_lexicographical_compare<int>/1                      1.10 ns         2.12 ns
bm_ranges_lexicographical_compare<int>/2                      1.64 ns         3.04 ns
bm_ranges_lexicographical_compare<int>/3                      2.23 ns         2.58 ns
bm_ranges_lexicographical_compare<int>/4                      2.81 ns         2.81 ns
bm_ranges_lexicographical_compare<int>/5                      3.35 ns         3.05 ns
bm_ranges_lexicographical_compare<int>/6                      3.94 ns         4.60 ns
bm_ranges_lexicographical_compare<int>/7                      4.60 ns         4.81 ns
bm_ranges_lexicographical_compare<int>/8                      5.19 ns         2.35 ns
bm_ranges_lexicographical_compare<int>/16                     9.85 ns         2.87 ns
bm_ranges_lexicographical_compare<int>/64                     38.9 ns         4.70 ns
bm_ranges_lexicographical_compare<int>/512                     318 ns         24.5 ns
bm_ranges_lexicographical_compare<int>/4096                   2494 ns          202 ns
bm_ranges_lexicographical_compare<int>/32768                 20000 ns         1939 ns
bm_ranges_lexicographical_compare<int>/262144               160433 ns        19730 ns
bm_ranges_lexicographical_compare<int>/1048576              642636 ns        80760 ns
```
2024-08-04 10:02:43 +02:00
Louis Dionne
451bba6fbf [libc++] Revert "Check correctly ref-qualified __is_callable in algorithms (#73451)"
This reverts commit 8d151f804ff43aaed1edf810bb2a07607b8bba14, which
broke some build bots. I think that is caused by an invalid argument
order when checking __is_comparable in upper_bound.
2024-08-01 15:56:06 -04:00
Nhat Nguyen
8d151f804f
[libc++] Check correctly ref-qualified __is_callable in algorithms (#73451)
We were only checking that the comparator was rvalue callable,
when in reality the algorithms always call comparators as lvalues.
This patch also refactors the tests for callable requirements and
expands it to a few missing algorithms.

Fixes #69554
2024-08-01 14:08:21 -04:00
Christopher Di Bella
d10dc5a06f
[libc++] Remove dedicated namespaces for ranges functions (#76543)
We originally put implementation-detail function objects into individual
namespaces for `std::ranges` without a good reason for doing so. This
practice was continued, presumably because there was prior art. Since
there's no reason to keep these namespaces, this commit removes them,
which will slightly impact binary size.

This commit does not apply to CPOs, some of which need additional work.
2024-08-01 08:54:06 -04:00
Hewill Kang
5b6b48800e
[libc++][NFC] Remove two unused implementation details __find_end (#100685)
Those two `__find_end` functions are no longer used after 101d1e9b3c86.
After that commit, `std::find_end` started dispatching to `__find_end_classic`,
and `ranges::find_end` to `__find_end_impl`, which means that the two `__find_end`
functions were no longer necessary.

Fixes #100569
2024-07-31 10:34:19 -04:00
nicole mazzuca
04760bfadb
[libc++][ranges] P1223R5: find_last (#99312)
Implements [P1223R5][] completely.

Includes an implementation of `find_last`, `find_last_if`, and
`find_last_if_not`.

[P1223R5]: https://wg21.link/p1223r5
2024-07-19 09:42:16 -07:00
Iuri Chaer
a0662176a9
[libc++] Speed up set_intersection() by fast-forwarding over ranges of non-matching elements with one-sided binary search. (#75230)
One-sided binary search, aka meta binary search, has been in the public
domain for decades, and has the general advantage of being constant time
in the best case, with the downside of executing at most 2*log(N)
comparisons vs classic binary search's exact log(N). There are two
scenarios in which it really shines: the first one is when operating
over non-random-access iterators, because the classic algorithm requires
knowing the container's size upfront, which adds N iterator increments
to the complexity. The second one is when traversing the container in
order, trying to fast-forward to the next value: in that case the
classic algorithm requires at least O(N*log(N)) comparisons and, for
non-random-access iterators, O(N^2) iterator increments, whereas the
one-sided version will yield O(N) operations on both counts, with a
best-case of O(log(N)) comparisons which is very common in practice.
2024-07-18 16:11:24 -04:00
Louis Dionne
e2c2ffbe7a
[libc++][NFC] Run clang-format on libcxx/include again (#95874)
As time went by, a few files have become mis-formatted w.r.t.
clang-format. This was made worse by the fact that formatting was not
being enforced in extensionless headers. This commit simply brings all
of libcxx/include in-line with clang-format again.

We might have to do this from time to time as we update our clang-format
version, but frankly this is really low effort now that we've formatted
everything once.
2024-06-18 09:13:45 -04:00
Louis Dionne
acb896a344
[libc++] Remove unnecessary #ifdef guards around PSTL implementation details (#95268)
We want the PSTL implementation details to be available regardless of
the Standard mode or whether the experimental PSTL is enabled. This
patch guards the inclusion of the PSTL to the top-level headers that
define the public API in `__numeric` and `__algorithm`.
2024-06-12 17:25:43 -04:00
Louis Dionne
fe4cd104a8 [libc++][NFC] Fix typo in concept PSTL concept check 2024-06-12 12:31:05 -04:00
Louis Dionne
9540950a45
[libc++] Overhaul the PSTL dispatching mechanism (#88131)
The experimental PSTL's current dispatching mechanism was designed with
flexibility in mind. However, while reviewing the in-progress OpenMP
backend, I realized that the dispatching mechanism based on ADL and
default definitions in the frontend had several downsides. To name a
few:

1. The dispatching of an algorithm to the back-end and its default
   implementation is bundled together via `_LIBCPP_PSTL_CUSTOMIZATION_POINT`.
   This makes the dispatching really confusing and leads to annoyances
   such as variable shadowing and weird lambda captures in the front-end.
2. The distinction between back-end functions and front-end algorithms
   is not as clear as it could be, which led us to call one where we meant
   the other in a few cases. This is bad due to the exception requirements
   of the PSTL: calling a front-end algorithm inside the implementation of
   a back-end is incorrect for exception-safety.
3. There are two levels of back-end dispatching in the PSTL, which treat
   CPU backends as a special case. This was confusing and not as flexible
   as we'd like. For example, there was no straightforward way to dispatch
   all uses of `unseq` to a specific back-end from the OpenMP backend,
   or for CPU backends to fall back on each other.

This patch rewrites the backend dispatching mechanism to solve these
problems, but doesn't touch any of the actual implementation of
algorithms. Specifically, this rewrite has the following
characteristics:

- There is a single level of backend dispatching, however partial backends can
  be stacked to provide a full implementation of the PSTL. The two-level dispatching
  that was used for CPU-based backends is handled by providing CPU-based basis 
  operations as simple helpers that can easily be reused when defining any PSTL 
  backend.

- The default definitions for algorithms are separated from their dispatching logic.

- The front-end is thus simplified a whole lot and made very consistent
  for all algorithms, which makes it easier to audit the front-end for
  things like exception-correctness, appropriate forwarding, etc.

Fixes #70718
2024-06-12 12:24:34 -04:00
Zibi Sarbinowski
ffc3a6b286
[libc++] Fix endianness for algorithm mismatch (#93082)
This PR is required to fix
`std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp` test for
big endian platrofrms such as z/OS.
2024-06-11 08:29:12 -04:00
Louis Dionne
e406d5ed9c
[libc++][pstl] Merge all frontend functions for the PSTL (#89219)
This is an intermediate step towards the PSTL dispatching mechanism
rework. It will make it a lot easier to track the upcoming front-end
changes. After the rework, there are basically no implementation details
in the front-end, so the definition of each algorithm will become much
simpler. Otherwise, it wouldn't make sense to define all the algorithms
in the same header.
2024-05-27 17:51:12 -04:00
Louis Dionne
72417920d3
[libc++] Remove a few unused includes of trivially_copyable.h (#93200) 2024-05-23 15:58:51 -04:00
Louis Dionne
bd3f5a4bd3
[libc++][pstl] Improve exception handling (#88998)
There were various places where we incorrectly handled exceptions in the
PSTL. Typical issues were missing `noexcept` and taking iterators by
value instead of by reference.

This patch fixes those inconsistent and incorrect instances, and adds
proper tests for all of those. Note that the previous tests were often
incorrectly turned into no-ops by the compiler due to copy ellision,
which doesn't happen with these new tests.
2024-05-22 12:39:21 -07:00
zibi2
af57ad6536
[libc++][z/OS] Correct a definition of __native_vector_size (#91995)
Fix `std/ranges/range.adaptors/range.lazy.split/general.pass.cpp` which
started failing on z/OS after this
[commit](https://github.com/llvm/llvm-project/commit/985c1a44f8d49e0af).

This test case is passing on other platforms such as AIX. This is
because the `__ALTIVEC__` macro is defined and `__mismatch` under
`_LIBCPP_VECTORIZE_ALGORITHMS` guard is compiled out. However, on z/OS
`_LIBCPP_VECTORIZE_ALGORITHMS` is defined. Analyzing the algorithm of
`__mismatch` shows that the culprit is the definition of
`__native_vector_size` which was defined wrongly as 1. This PR corrects
the definition of `__native_vector_size` and fixes the affected test.
2024-05-16 08:58:44 -04:00
Nikolas Klauser
05cc2d5fe1
[libc++] Vectorize std::mismatch with trivially equality comparable types (#87716) 2024-05-11 23:32:48 +02:00
Nikolas Klauser
840032419d
[libc++][NFC] Rename __find_impl to __find (#90163)
For most algorithms we've just added underscores to the detail function.
This changes `std::find` to match that pattern.
2024-04-27 09:51:59 +02:00
Nikolas Klauser
83bc7b5771
[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests (#87094)
This also adds a few tests that were missing.
2024-04-22 22:13:58 +02:00
Louis Dionne
0e08bce142
[libc++][pstl] Move the CPU algorithm implementations to __pstl (#89109)
This colocates the CPU algorithms closer to the rest of the PSTL
implementation details.
2024-04-18 07:50:57 -04:00
Louis Dionne
d423d80e56
[libc++][pstl] Promote CPU backends to top-level backends (#88968)
This patch removes the two-level backend dispatching mechanism we had in
the PSTL. Instead of selecting both a PSTL backend and a PSTL CPU
backend, we now only select a top-level PSTL backend. This greatly
simplifies the PSTL configuration layer.

While this patch technically removes some flexibility from the PSTL
configuration mechanism because CPU backends are not considered
separately, it opens the door to a much more powerful configuration
mechanism based on chained backends in a follow-up patch.

This is a step towards overhauling the PSTL dispatching mechanism.
2024-04-17 13:36:53 -04:00
Louis Dionne
d57907d0b4
[libc++] Add missing iterator requirement checks in the PSTL (#88127)
Also add tests for those, and add a few missing requirements to testing
iterators in the test suite.
2024-04-17 08:21:48 -04:00
Louis Dionne
5b811562a5
[libc++] Rename __cpu_traits functions (#88741)
Functions inside __cpu_traits were needlessly prefixed with __parallel,
which doesn't serve a real purpose anymore now that they are inside a
traits class.
2024-04-16 10:33:39 +02:00
Louis Dionne
a3ce29f7bb
[libc++][PSTL] Introduce cpu traits (#88134)
Currently, CPU backends in the PSTL are created by defining functions
in the __par_backend namespace. Then, the PSTL includes the CPU backend
that gets configured via CMake and gets those definitions.

This prevents CPU backends from easily co-existing and is a bit
confusing.
To solve this problem, this patch introduces the notion of __cpu_traits,
which is a cheap encapsulation of the basis operations required to
implement a CPU-based PSTL. Different backends can now define their own
tag and coexist, and the CPU-based PSTL will simply use __cpu_traits to
dispatch to the right implementation of e.g. __for_each.

Note that this patch doesn't change the actual implementation of the
backends in any way, it only modifies how that implementation is
accessed
to implement PSTL algorithms.

This patch is a step towards #88131.
2024-04-15 10:30:00 -04:00
Mark de Wever
1a895bd95e
[libc++] Marks a variable const. (#88562)
This removes a TODO from the code base.
2024-04-13 13:45:59 +02:00
Mark de Wever
0ad663ead1
[libc++] Removes Clang-16 support. (#87810)
With the release of Clang-18 we no longer officially support Clang-16.
2024-04-10 17:51:02 +02:00
Nikolas Klauser
935e699173
[libc++] Optimize ranges::minmax (#87335)
This allows Clang to vectorize the loop.
```
---------------------------------------------------------------------
Benchmark                                         old             new
---------------------------------------------------------------------
BM_std_minmax<char>/1                        0.659 ns         1.41 ns
BM_std_minmax<char>/2                         1.08 ns         2.16 ns
BM_std_minmax<char>/3                         2.16 ns         2.96 ns
BM_std_minmax<char>/4                         2.82 ns         3.81 ns
BM_std_minmax<char>/5                         3.43 ns         4.69 ns
BM_std_minmax<char>/6                         4.08 ns         5.63 ns
BM_std_minmax<char>/7                         4.75 ns         6.51 ns
BM_std_minmax<char>/8                         5.42 ns         7.41 ns
BM_std_minmax<char>/9                         6.05 ns         8.34 ns
BM_std_minmax<char>/10                        6.68 ns         9.29 ns
BM_std_minmax<char>/11                        7.47 ns         10.6 ns
BM_std_minmax<char>/12                        7.95 ns         11.4 ns
BM_std_minmax<char>/13                        8.64 ns         12.4 ns
BM_std_minmax<char>/14                        9.35 ns         13.4 ns
BM_std_minmax<char>/15                        10.1 ns         14.4 ns
BM_std_minmax<char>/16                        10.6 ns         2.25 ns
BM_std_minmax<char>/17                        11.3 ns         2.82 ns
BM_std_minmax<char>/18                        11.8 ns         3.71 ns
BM_std_minmax<char>/19                        12.6 ns         4.52 ns
BM_std_minmax<char>/20                        13.2 ns         5.47 ns
BM_std_minmax<char>/21                        14.1 ns         6.67 ns
BM_std_minmax<char>/22                        14.5 ns         7.78 ns
BM_std_minmax<char>/23                        15.1 ns         8.67 ns
BM_std_minmax<char>/24                        15.7 ns         9.68 ns
BM_std_minmax<char>/25                        16.4 ns         10.7 ns
BM_std_minmax<char>/26                        17.1 ns         11.7 ns
BM_std_minmax<char>/27                        17.8 ns         12.8 ns
BM_std_minmax<char>/28                        18.4 ns         14.1 ns
BM_std_minmax<char>/29                        19.0 ns         15.0 ns
BM_std_minmax<char>/30                        19.6 ns         16.0 ns
BM_std_minmax<char>/31                        20.2 ns         17.0 ns
BM_std_minmax<char>/32                        20.8 ns         2.46 ns
BM_std_minmax<char>/64                        41.5 ns         2.97 ns
BM_std_minmax<char>/512                        340 ns         6.05 ns
BM_std_minmax<char>/1024                       667 ns         8.83 ns
BM_std_minmax<char>/4000                      2571 ns         28.6 ns
BM_std_minmax<char>/4096                      2632 ns         25.8 ns
BM_std_minmax<char>/5500                      3554 ns         51.1 ns
BM_std_minmax<char>/64000                    41175 ns          480 ns
BM_std_minmax<char>/65536                    42039 ns          490 ns
BM_std_minmax<char>/70000                    44931 ns          528 ns
BM_std_minmax<short>/1                       0.708 ns         1.20 ns
BM_std_minmax<short>/2                        1.18 ns         1.78 ns
BM_std_minmax<short>/3                        1.98 ns         2.42 ns
BM_std_minmax<short>/4                        2.47 ns         3.05 ns
BM_std_minmax<short>/5                        3.09 ns         3.72 ns
BM_std_minmax<short>/6                        3.49 ns         4.37 ns
BM_std_minmax<short>/7                        4.24 ns         5.03 ns
BM_std_minmax<short>/8                        4.65 ns         2.12 ns
BM_std_minmax<short>/9                        5.34 ns         2.51 ns
BM_std_minmax<short>/10                       5.82 ns         3.18 ns
BM_std_minmax<short>/11                       6.36 ns         3.97 ns
BM_std_minmax<short>/12                       6.73 ns         4.68 ns
BM_std_minmax<short>/13                       7.59 ns         5.49 ns
BM_std_minmax<short>/14                       7.77 ns         6.45 ns
BM_std_minmax<short>/15                       8.54 ns         7.55 ns
BM_std_minmax<short>/16                       8.74 ns         2.38 ns
BM_std_minmax<short>/17                       9.59 ns         2.76 ns
BM_std_minmax<short>/18                       9.88 ns         3.37 ns
BM_std_minmax<short>/19                       10.7 ns         4.17 ns
BM_std_minmax<short>/20                       10.9 ns         4.88 ns
BM_std_minmax<short>/21                       12.1 ns         5.70 ns
BM_std_minmax<short>/22                       12.6 ns         6.64 ns
BM_std_minmax<short>/23                       13.5 ns         7.72 ns
BM_std_minmax<short>/24                       13.2 ns         2.87 ns
BM_std_minmax<short>/25                       14.2 ns         3.10 ns
BM_std_minmax<short>/26                       14.2 ns         3.59 ns
BM_std_minmax<short>/27                       15.4 ns         4.35 ns
BM_std_minmax<short>/28                       15.3 ns         5.10 ns
BM_std_minmax<short>/29                       16.2 ns         5.87 ns
BM_std_minmax<short>/30                       16.2 ns         6.88 ns
BM_std_minmax<short>/31                       17.0 ns         7.78 ns
BM_std_minmax<short>/32                       17.2 ns         3.45 ns
BM_std_minmax<short>/64                       34.1 ns         3.35 ns
BM_std_minmax<short>/512                       279 ns         8.37 ns
BM_std_minmax<short>/1024                      549 ns         14.2 ns
BM_std_minmax<short>/4000                     2111 ns         50.1 ns
BM_std_minmax<short>/4096                     2167 ns         47.9 ns
BM_std_minmax<short>/5500                     2895 ns         69.7 ns
BM_std_minmax<short>/64000                   33454 ns          953 ns
BM_std_minmax<short>/65536                   34474 ns          970 ns
BM_std_minmax<short>/70000                   36691 ns         1037 ns
BM_std_minmax<int>/1                         0.664 ns         1.17 ns
BM_std_minmax<int>/2                          1.11 ns         1.69 ns
BM_std_minmax<int>/3                          2.36 ns         2.29 ns
BM_std_minmax<int>/4                          2.53 ns         2.91 ns
BM_std_minmax<int>/5                          3.23 ns         3.56 ns
BM_std_minmax<int>/6                          3.56 ns         4.23 ns
BM_std_minmax<int>/7                          4.28 ns         4.91 ns
BM_std_minmax<int>/8                          4.60 ns         5.60 ns
BM_std_minmax<int>/9                          5.38 ns         6.31 ns
BM_std_minmax<int>/10                         5.69 ns         7.03 ns
BM_std_minmax<int>/11                         6.41 ns         7.70 ns
BM_std_minmax<int>/12                         6.73 ns         8.39 ns
BM_std_minmax<int>/13                         7.38 ns         9.07 ns
BM_std_minmax<int>/14                         7.74 ns         9.79 ns
BM_std_minmax<int>/15                         8.53 ns         10.5 ns
BM_std_minmax<int>/16                         8.79 ns         11.2 ns
BM_std_minmax<int>/17                         9.63 ns         12.0 ns
BM_std_minmax<int>/18                         9.84 ns         12.7 ns
BM_std_minmax<int>/19                         10.6 ns         13.5 ns
BM_std_minmax<int>/20                         11.0 ns         14.3 ns
BM_std_minmax<int>/21                         11.7 ns         15.0 ns
BM_std_minmax<int>/22                         12.0 ns         15.7 ns
BM_std_minmax<int>/23                         13.1 ns         16.5 ns
BM_std_minmax<int>/24                         13.0 ns         17.3 ns
BM_std_minmax<int>/25                         13.7 ns         17.9 ns
BM_std_minmax<int>/26                         14.0 ns         18.6 ns
BM_std_minmax<int>/27                         14.8 ns         19.4 ns
BM_std_minmax<int>/28                         15.1 ns         20.3 ns
BM_std_minmax<int>/29                         15.8 ns         20.9 ns
BM_std_minmax<int>/30                         16.1 ns         21.7 ns
BM_std_minmax<int>/31                         16.9 ns         22.5 ns
BM_std_minmax<int>/32                         17.2 ns         3.40 ns
BM_std_minmax<int>/64                         33.9 ns         4.04 ns
BM_std_minmax<int>/512                         275 ns         14.6 ns
BM_std_minmax<int>/1024                        541 ns         27.5 ns
BM_std_minmax<int>/4000                       2093 ns         96.3 ns
BM_std_minmax<int>/4096                       2146 ns         98.3 ns
BM_std_minmax<int>/5500                       2866 ns          157 ns
BM_std_minmax<int>/64000                     33619 ns         1954 ns
BM_std_minmax<int>/65536                     34252 ns         2009 ns
BM_std_minmax<int>/70000                     36618 ns         2125 ns
BM_std_minmax<long long>/1                   0.709 ns         1.19 ns
BM_std_minmax<long long>/2                    1.01 ns         1.65 ns
BM_std_minmax<long long>/3                    2.14 ns         2.21 ns
BM_std_minmax<long long>/4                    2.45 ns         2.83 ns
BM_std_minmax<long long>/5                    3.09 ns         3.47 ns
BM_std_minmax<long long>/6                    3.44 ns         4.11 ns
BM_std_minmax<long long>/7                    4.16 ns         4.79 ns
BM_std_minmax<long long>/8                    4.54 ns         5.47 ns
BM_std_minmax<long long>/9                    5.37 ns         6.20 ns
BM_std_minmax<long long>/10                   5.71 ns         6.93 ns
BM_std_minmax<long long>/11                   6.00 ns         7.60 ns
BM_std_minmax<long long>/12                   6.43 ns         8.27 ns
BM_std_minmax<long long>/13                   7.01 ns         8.94 ns
BM_std_minmax<long long>/14                   7.45 ns         9.65 ns
BM_std_minmax<long long>/15                   8.16 ns         10.4 ns
BM_std_minmax<long long>/16                   8.46 ns         5.22 ns
BM_std_minmax<long long>/17                   9.16 ns         5.22 ns
BM_std_minmax<long long>/18                   9.53 ns         5.52 ns
BM_std_minmax<long long>/19                   10.2 ns         6.02 ns
BM_std_minmax<long long>/20                   10.5 ns         6.89 ns
BM_std_minmax<long long>/21                   11.3 ns         7.83 ns
BM_std_minmax<long long>/22                   11.6 ns         8.59 ns
BM_std_minmax<long long>/23                   12.3 ns         9.91 ns
BM_std_minmax<long long>/24                   12.6 ns         10.1 ns
BM_std_minmax<long long>/25                   13.2 ns         12.0 ns
BM_std_minmax<long long>/26                   13.6 ns         13.5 ns
BM_std_minmax<long long>/27                   14.2 ns         14.8 ns
BM_std_minmax<long long>/28                   14.7 ns         15.9 ns
BM_std_minmax<long long>/29                   15.3 ns         16.6 ns
BM_std_minmax<long long>/30                   15.8 ns         17.3 ns
BM_std_minmax<long long>/31                   16.3 ns         18.2 ns
BM_std_minmax<long long>/32                   16.7 ns         7.18 ns
BM_std_minmax<long long>/64                   33.1 ns         11.5 ns
BM_std_minmax<long long>/512                   268 ns         71.0 ns
BM_std_minmax<long long>/1024                  532 ns          138 ns
BM_std_minmax<long long>/4000                 2056 ns          533 ns
BM_std_minmax<long long>/4096                 2112 ns          539 ns
BM_std_minmax<long long>/5500                 2823 ns          749 ns
BM_std_minmax<long long>/64000               32956 ns         8590 ns
BM_std_minmax<long long>/65536               33795 ns         8791 ns
BM_std_minmax<long long>/70000               36084 ns         9442 ns
BM_std_minmax<unsigned char>/1               0.714 ns         1.41 ns
BM_std_minmax<unsigned char>/2               0.955 ns         1.96 ns
BM_std_minmax<unsigned char>/3                1.90 ns         2.63 ns
BM_std_minmax<unsigned char>/4                2.40 ns         3.34 ns
BM_std_minmax<unsigned char>/5                2.87 ns         4.10 ns
BM_std_minmax<unsigned char>/6                3.47 ns         4.88 ns
BM_std_minmax<unsigned char>/7                4.04 ns         5.66 ns
BM_std_minmax<unsigned char>/8                4.65 ns         6.45 ns
BM_std_minmax<unsigned char>/9                5.18 ns         7.24 ns
BM_std_minmax<unsigned char>/10               5.80 ns         8.05 ns
BM_std_minmax<unsigned char>/11               6.24 ns         8.86 ns
BM_std_minmax<unsigned char>/12               6.78 ns         9.70 ns
BM_std_minmax<unsigned char>/13               7.30 ns         10.6 ns
BM_std_minmax<unsigned char>/14               7.86 ns         11.4 ns
BM_std_minmax<unsigned char>/15               8.46 ns         12.3 ns
BM_std_minmax<unsigned char>/16               9.00 ns         2.12 ns
BM_std_minmax<unsigned char>/17               9.58 ns         2.83 ns
BM_std_minmax<unsigned char>/18               10.1 ns         3.37 ns
BM_std_minmax<unsigned char>/19               10.7 ns         4.11 ns
BM_std_minmax<unsigned char>/20               11.2 ns         4.85 ns
BM_std_minmax<unsigned char>/21               11.9 ns         5.69 ns
BM_std_minmax<unsigned char>/22               12.3 ns         6.77 ns
BM_std_minmax<unsigned char>/23               13.1 ns         7.56 ns
BM_std_minmax<unsigned char>/24               13.5 ns         8.40 ns
BM_std_minmax<unsigned char>/25               14.2 ns         9.30 ns
BM_std_minmax<unsigned char>/26               14.4 ns         10.1 ns
BM_std_minmax<unsigned char>/27               15.0 ns         11.1 ns
BM_std_minmax<unsigned char>/28               15.3 ns         11.9 ns
BM_std_minmax<unsigned char>/29               16.2 ns         12.9 ns
BM_std_minmax<unsigned char>/30               16.5 ns         13.9 ns
BM_std_minmax<unsigned char>/31               17.2 ns         14.8 ns
BM_std_minmax<unsigned char>/32               17.6 ns         2.36 ns
BM_std_minmax<unsigned char>/64               35.6 ns         3.21 ns
BM_std_minmax<unsigned char>/512               288 ns         6.00 ns
BM_std_minmax<unsigned char>/1024              573 ns         8.80 ns
BM_std_minmax<unsigned char>/4000             2222 ns         28.6 ns
BM_std_minmax<unsigned char>/4096             2265 ns         25.9 ns
BM_std_minmax<unsigned char>/5500             3047 ns         48.8 ns
BM_std_minmax<unsigned char>/64000           35059 ns          480 ns
BM_std_minmax<unsigned char>/65536           35941 ns          491 ns
BM_std_minmax<unsigned char>/70000           38922 ns          525 ns
BM_std_minmax<unsigned short>/1              0.711 ns         1.18 ns
BM_std_minmax<unsigned short>/2              0.957 ns         1.65 ns
BM_std_minmax<unsigned short>/3               2.13 ns         2.21 ns
BM_std_minmax<unsigned short>/4               2.14 ns         2.78 ns
BM_std_minmax<unsigned short>/5               3.06 ns         3.29 ns
BM_std_minmax<unsigned short>/6               2.89 ns         3.87 ns
BM_std_minmax<unsigned short>/7               3.80 ns         4.55 ns
BM_std_minmax<unsigned short>/8               3.68 ns         2.02 ns
BM_std_minmax<unsigned short>/9               4.53 ns         2.40 ns
BM_std_minmax<unsigned short>/10              4.60 ns         2.94 ns
BM_std_minmax<unsigned short>/11              5.67 ns         3.67 ns
BM_std_minmax<unsigned short>/12              5.39 ns         4.22 ns
BM_std_minmax<unsigned short>/13              6.58 ns         4.78 ns
BM_std_minmax<unsigned short>/14              6.33 ns         5.54 ns
BM_std_minmax<unsigned short>/15              7.34 ns         6.30 ns
BM_std_minmax<unsigned short>/16              7.17 ns         2.25 ns
BM_std_minmax<unsigned short>/17              8.19 ns         2.61 ns
BM_std_minmax<unsigned short>/18              8.02 ns         3.19 ns
BM_std_minmax<unsigned short>/19              9.03 ns         3.72 ns
BM_std_minmax<unsigned short>/20              8.89 ns         4.36 ns
BM_std_minmax<unsigned short>/21              9.77 ns         5.10 ns
BM_std_minmax<unsigned short>/22              9.70 ns         5.55 ns
BM_std_minmax<unsigned short>/23              10.8 ns         6.29 ns
BM_std_minmax<unsigned short>/24              10.6 ns         2.41 ns
BM_std_minmax<unsigned short>/25              11.6 ns         2.75 ns
BM_std_minmax<unsigned short>/26              11.4 ns         3.26 ns
BM_std_minmax<unsigned short>/27              12.4 ns         3.86 ns
BM_std_minmax<unsigned short>/28              12.3 ns         4.45 ns
BM_std_minmax<unsigned short>/29              13.2 ns         5.07 ns
BM_std_minmax<unsigned short>/30              13.1 ns         5.77 ns
BM_std_minmax<unsigned short>/31              13.9 ns         6.65 ns
BM_std_minmax<unsigned short>/32              13.9 ns         2.72 ns
BM_std_minmax<unsigned short>/64              27.8 ns         3.25 ns
BM_std_minmax<unsigned short>/512              220 ns         8.30 ns
BM_std_minmax<unsigned short>/1024             435 ns         14.1 ns
BM_std_minmax<unsigned short>/4000            1703 ns         49.8 ns
BM_std_minmax<unsigned short>/4096            1746 ns         47.9 ns
BM_std_minmax<unsigned short>/5500            2350 ns         69.9 ns
BM_std_minmax<unsigned short>/64000          27388 ns          953 ns
BM_std_minmax<unsigned short>/65536          28040 ns          975 ns
BM_std_minmax<unsigned short>/70000          29967 ns         1040 ns
BM_std_minmax<unsigned int>/1                0.712 ns         1.18 ns
BM_std_minmax<unsigned int>/2                0.965 ns         1.65 ns
BM_std_minmax<unsigned int>/3                 2.13 ns         2.14 ns
BM_std_minmax<unsigned int>/4                 2.09 ns         2.64 ns
BM_std_minmax<unsigned int>/5                 3.02 ns         3.21 ns
BM_std_minmax<unsigned int>/6                 2.94 ns         3.81 ns
BM_std_minmax<unsigned int>/7                 3.91 ns         4.38 ns
BM_std_minmax<unsigned int>/8                 3.75 ns         4.93 ns
BM_std_minmax<unsigned int>/9                 4.71 ns         5.60 ns
BM_std_minmax<unsigned int>/10                4.59 ns         6.26 ns
BM_std_minmax<unsigned int>/11                5.57 ns         6.80 ns
BM_std_minmax<unsigned int>/12                5.43 ns         7.47 ns
BM_std_minmax<unsigned int>/13                6.45 ns         8.10 ns
BM_std_minmax<unsigned int>/14                6.32 ns         8.69 ns
BM_std_minmax<unsigned int>/15                7.29 ns         9.37 ns
BM_std_minmax<unsigned int>/16                7.12 ns         9.99 ns
BM_std_minmax<unsigned int>/17                8.24 ns         10.6 ns
BM_std_minmax<unsigned int>/18                8.00 ns         11.2 ns
BM_std_minmax<unsigned int>/19                8.94 ns         12.0 ns
BM_std_minmax<unsigned int>/20                8.91 ns         12.6 ns
BM_std_minmax<unsigned int>/21                9.73 ns         17.2 ns
BM_std_minmax<unsigned int>/22                9.75 ns         13.8 ns
BM_std_minmax<unsigned int>/23                10.6 ns         14.5 ns
BM_std_minmax<unsigned int>/24                10.6 ns         15.1 ns
BM_std_minmax<unsigned int>/25                11.5 ns         15.7 ns
BM_std_minmax<unsigned int>/26                11.4 ns         16.3 ns
BM_std_minmax<unsigned int>/27                12.3 ns         17.0 ns
BM_std_minmax<unsigned int>/28                12.3 ns         17.6 ns
BM_std_minmax<unsigned int>/29                13.2 ns         18.3 ns
BM_std_minmax<unsigned int>/30                13.2 ns         19.0 ns
BM_std_minmax<unsigned int>/31                14.0 ns         19.6 ns
BM_std_minmax<unsigned int>/32                14.0 ns         3.39 ns
BM_std_minmax<unsigned int>/64                27.6 ns         4.05 ns
BM_std_minmax<unsigned int>/512                221 ns         14.2 ns
BM_std_minmax<unsigned int>/1024               439 ns         25.5 ns
BM_std_minmax<unsigned int>/4000              1720 ns         96.3 ns
BM_std_minmax<unsigned int>/4096              1762 ns         97.8 ns
BM_std_minmax<unsigned int>/5500              2364 ns          146 ns
BM_std_minmax<unsigned int>/64000            27874 ns         1905 ns
BM_std_minmax<unsigned int>/65536            28012 ns         1961 ns
BM_std_minmax<unsigned int>/70000            29899 ns         2087 ns
BM_std_minmax<unsigned long long>/1          0.707 ns         1.18 ns
BM_std_minmax<unsigned long long>/2          0.909 ns         1.65 ns
BM_std_minmax<unsigned long long>/3           1.65 ns         2.70 ns
BM_std_minmax<unsigned long long>/4           1.93 ns         2.69 ns
BM_std_minmax<unsigned long long>/5           2.45 ns         3.34 ns
BM_std_minmax<unsigned long long>/6           2.78 ns         3.81 ns
BM_std_minmax<unsigned long long>/7           3.28 ns         4.43 ns
BM_std_minmax<unsigned long long>/8           3.70 ns         4.92 ns
BM_std_minmax<unsigned long long>/9           4.12 ns         5.64 ns
BM_std_minmax<unsigned long long>/10          4.44 ns         6.15 ns
BM_std_minmax<unsigned long long>/11          4.91 ns         6.81 ns
BM_std_minmax<unsigned long long>/12          5.31 ns         7.41 ns
BM_std_minmax<unsigned long long>/13          5.72 ns         7.96 ns
BM_std_minmax<unsigned long long>/14          6.05 ns         8.66 ns
BM_std_minmax<unsigned long long>/15          6.55 ns         9.37 ns
BM_std_minmax<unsigned long long>/16          6.89 ns         7.98 ns
BM_std_minmax<unsigned long long>/17          7.34 ns         8.13 ns
BM_std_minmax<unsigned long long>/18          7.73 ns         8.42 ns
BM_std_minmax<unsigned long long>/19          8.26 ns         8.63 ns
BM_std_minmax<unsigned long long>/20          8.54 ns         8.96 ns
BM_std_minmax<unsigned long long>/21          9.14 ns         9.37 ns
BM_std_minmax<unsigned long long>/22          9.39 ns         9.67 ns
BM_std_minmax<unsigned long long>/23          10.1 ns         10.1 ns
BM_std_minmax<unsigned long long>/24          10.4 ns         10.6 ns
BM_std_minmax<unsigned long long>/25          11.0 ns         11.3 ns
BM_std_minmax<unsigned long long>/26          11.3 ns         12.1 ns
BM_std_minmax<unsigned long long>/27          11.8 ns         14.2 ns
BM_std_minmax<unsigned long long>/28          12.1 ns         15.8 ns
BM_std_minmax<unsigned long long>/29          12.6 ns         17.4 ns
BM_std_minmax<unsigned long long>/30          13.1 ns         18.1 ns
BM_std_minmax<unsigned long long>/31          13.4 ns         18.8 ns
BM_std_minmax<unsigned long long>/32          13.8 ns         10.4 ns
BM_std_minmax<unsigned long long>/64          27.3 ns         15.5 ns
BM_std_minmax<unsigned long long>/512          222 ns         80.6 ns
BM_std_minmax<unsigned long long>/1024         443 ns          156 ns
BM_std_minmax<unsigned long long>/4000        1731 ns          591 ns
BM_std_minmax<unsigned long long>/4096        1752 ns          609 ns
BM_std_minmax<unsigned long long>/5500        2340 ns          819 ns
BM_std_minmax<unsigned long long>/64000      27166 ns         9652 ns
BM_std_minmax<unsigned long long>/65536      27869 ns         9876 ns
BM_std_minmax<unsigned long long>/70000      29920 ns        10680 ns
```
2024-04-06 17:22:07 +02:00
Nikolas Klauser
f5960c168d
[libc++][NFC] Make __desugars_to a variable template and rename the header to desugars_to.h (#87337)
This improves compile times and memory usage slightly and removes some
boilerplate.
2024-04-04 23:02:19 +02:00
A. Jiang
04dbf7ad44
[libc++][ranges] Avoid using distance in ranges::contains_subrange (#87155)
Both `std::distance` or `ranges::distance` are inefficient for
non-sized ranges. Also, calculating the range using `int` type is
seriously problematic.

This patch avoids using `distance` and calculation of the length of
non-sized ranges.

Fixes #86833.
2024-04-02 17:21:15 -07:00
Nikolas Klauser
985c1a44f8
[libc++] Optimize the two range overload of mismatch (#86853)
```
-----------------------------------------------------------------------------
Benchmark                                                 old             new
-----------------------------------------------------------------------------
bm_mismatch_two_range_overload<char>/1               0.941 ns         1.88 ns
bm_mismatch_two_range_overload<char>/2                1.43 ns         2.15 ns
bm_mismatch_two_range_overload<char>/3                1.95 ns         2.55 ns
bm_mismatch_two_range_overload<char>/4                2.58 ns         2.90 ns
bm_mismatch_two_range_overload<char>/5                3.75 ns         3.31 ns
bm_mismatch_two_range_overload<char>/6                5.00 ns         3.83 ns
bm_mismatch_two_range_overload<char>/7                5.59 ns         4.35 ns
bm_mismatch_two_range_overload<char>/8                6.37 ns         4.84 ns
bm_mismatch_two_range_overload<char>/16               11.8 ns         6.72 ns
bm_mismatch_two_range_overload<char>/64               45.5 ns         2.59 ns
bm_mismatch_two_range_overload<char>/512               366 ns         12.6 ns
bm_mismatch_two_range_overload<char>/4096             2890 ns         91.6 ns
bm_mismatch_two_range_overload<char>/32768           23038 ns          758 ns
bm_mismatch_two_range_overload<char>/262144         142813 ns         6573 ns
bm_mismatch_two_range_overload<char>/1048576        366679 ns        26710 ns
bm_mismatch_two_range_overload<short>/1              0.934 ns         1.88 ns
bm_mismatch_two_range_overload<short>/2               1.30 ns         2.58 ns
bm_mismatch_two_range_overload<short>/3               1.76 ns         3.28 ns
bm_mismatch_two_range_overload<short>/4               2.24 ns         3.98 ns
bm_mismatch_two_range_overload<short>/5               2.80 ns         4.92 ns
bm_mismatch_two_range_overload<short>/6               3.58 ns         6.01 ns
bm_mismatch_two_range_overload<short>/7               4.29 ns         7.03 ns
bm_mismatch_two_range_overload<short>/8               4.67 ns         7.39 ns
bm_mismatch_two_range_overload<short>/16              9.86 ns         13.1 ns
bm_mismatch_two_range_overload<short>/64              38.9 ns         4.55 ns
bm_mismatch_two_range_overload<short>/512              348 ns         27.7 ns
bm_mismatch_two_range_overload<short>/4096            2881 ns          225 ns
bm_mismatch_two_range_overload<short>/32768          23111 ns         1715 ns
bm_mismatch_two_range_overload<short>/262144        184846 ns        14416 ns
bm_mismatch_two_range_overload<short>/1048576       742885 ns        57264 ns
bm_mismatch_two_range_overload<int>/1                0.838 ns         1.19 ns
bm_mismatch_two_range_overload<int>/2                 1.19 ns         1.65 ns
bm_mismatch_two_range_overload<int>/3                 1.83 ns         2.06 ns
bm_mismatch_two_range_overload<int>/4                 2.38 ns         2.42 ns
bm_mismatch_two_range_overload<int>/5                 3.60 ns         2.47 ns
bm_mismatch_two_range_overload<int>/6                 3.68 ns         3.05 ns
bm_mismatch_two_range_overload<int>/7                 4.32 ns         3.36 ns
bm_mismatch_two_range_overload<int>/8                 5.18 ns         3.58 ns
bm_mismatch_two_range_overload<int>/16                10.6 ns         2.84 ns
bm_mismatch_two_range_overload<int>/64                39.0 ns         7.78 ns
bm_mismatch_two_range_overload<int>/512                247 ns         53.9 ns
bm_mismatch_two_range_overload<int>/4096              1927 ns          429 ns
bm_mismatch_two_range_overload<int>/32768            15569 ns         3393 ns
bm_mismatch_two_range_overload<int>/262144          125413 ns        28504 ns
bm_mismatch_two_range_overload<int>/1048576         504549 ns       112729 ns
```
2024-04-01 18:21:51 +02:00
Nikolas Klauser
1679b27959
[libc++] Refactor __tuple_like and __pair_like (#85206)
The exposition-only type trait `pair-like` includes `ranges::subrange`,
but in every single case excludes `ranges::subrange` from the list. This
patch introduces two new traits `__tuple_like_no_subrange` and
`__pair_like_no_subrange`, which exclude `ranges::subrange` from the
possible matches. `__pair_like` is no longer required, and thus removed.
`__tuple_like` is implemented as `__tuple_like_no_subrange` or a
`ranges::subrange` specialization.
2024-04-01 08:46:57 +02:00
Nikolas Klauser
beaff78528
[libc++] Optimize the std::mismatch tail (#83440)
This adds vectorization to the last 0-3 vectors and, if the range is
large enough, the remaining elements that don't fill a vector
completely.
```
-----------------------------------------------------------------------
Benchmark                           old    full vectors  partial vector
-----------------------------------------------------------------------
bm_mismatch<char>/1             1.40 ns         1.62 ns         2.09 ns
bm_mismatch<char>/2             1.88 ns         2.10 ns         2.33 ns
bm_mismatch<char>/3             2.67 ns         2.56 ns         2.72 ns
bm_mismatch<char>/4             3.01 ns         3.20 ns         3.70 ns
bm_mismatch<char>/5             3.51 ns         3.73 ns         3.64 ns
bm_mismatch<char>/6             4.71 ns         4.85 ns         4.37 ns
bm_mismatch<char>/7             5.12 ns         5.33 ns         4.37 ns
bm_mismatch<char>/8             5.79 ns         6.02 ns         4.75 ns
bm_mismatch<char>/15            9.20 ns         10.5 ns         7.23 ns
bm_mismatch<char>/16            10.2 ns         10.1 ns         7.46 ns
bm_mismatch<char>/17            10.2 ns         10.8 ns         7.57 ns
bm_mismatch<char>/31            17.6 ns         17.1 ns         10.8 ns
bm_mismatch<char>/32            17.4 ns         1.64 ns         1.64 ns
bm_mismatch<char>/33            23.3 ns         2.10 ns         2.33 ns
bm_mismatch<char>/63            31.8 ns         16.9 ns         2.33 ns
bm_mismatch<char>/64            32.6 ns         2.10 ns         2.10 ns
bm_mismatch<char>/65            33.6 ns         2.57 ns         2.80 ns
bm_mismatch<char>/127           67.3 ns         18.1 ns         3.27 ns
bm_mismatch<char>/128           2.17 ns         2.14 ns         2.57 ns
bm_mismatch<char>/129           2.36 ns         2.80 ns         3.27 ns
bm_mismatch<char>/255           67.5 ns         19.6 ns         4.68 ns
bm_mismatch<char>/256           3.76 ns         3.71 ns         3.97 ns
bm_mismatch<char>/257           3.77 ns         4.04 ns         4.43 ns
bm_mismatch<char>/511           70.8 ns         22.1 ns         7.47 ns
bm_mismatch<char>/512           7.27 ns         7.30 ns         6.95 ns
bm_mismatch<char>/513           7.11 ns         7.05 ns         6.96 ns
bm_mismatch<char>/1023          75.9 ns         27.4 ns         13.3 ns
bm_mismatch<char>/1024          13.9 ns         13.8 ns         12.4 ns
bm_mismatch<char>/1025          13.6 ns         13.6 ns         12.8 ns
bm_mismatch<char>/2047          87.3 ns         37.5 ns         25.4 ns
bm_mismatch<char>/2048          26.8 ns         27.4 ns         24.0 ns
bm_mismatch<char>/2049          26.7 ns         27.3 ns         25.5 ns
bm_mismatch<char>/4095           112 ns         64.7 ns         48.7 ns
bm_mismatch<char>/4096          53.0 ns         54.2 ns         46.8 ns
bm_mismatch<char>/4097          52.7 ns         54.2 ns         48.4 ns
bm_mismatch<char>/8191           160 ns          118 ns         98.4 ns
bm_mismatch<char>/8192           107 ns          108 ns         96.0 ns
bm_mismatch<char>/8193           106 ns          108 ns         97.2 ns
bm_mismatch<char>/16383          283 ns          234 ns          215 ns
bm_mismatch<char>/16384          227 ns          223 ns          217 ns
bm_mismatch<char>/16385          221 ns          221 ns          215 ns
bm_mismatch<char>/32767          547 ns          499 ns          488 ns
bm_mismatch<char>/32768          495 ns          492 ns          492 ns
bm_mismatch<char>/32769          491 ns          489 ns          488 ns
bm_mismatch<char>/65535         1028 ns          979 ns          971 ns
bm_mismatch<char>/65536          976 ns          970 ns          974 ns
bm_mismatch<char>/65537          970 ns          965 ns          971 ns
bm_mismatch<char>/131071        2031 ns         1948 ns         2005 ns
bm_mismatch<char>/131072        1973 ns         1955 ns         1974 ns
bm_mismatch<char>/131073        1989 ns         1932 ns         2001 ns
bm_mismatch<char>/262143        4469 ns         4244 ns         4223 ns
bm_mismatch<char>/262144        4443 ns         4183 ns         4243 ns
bm_mismatch<char>/262145        4400 ns         4232 ns         4246 ns
bm_mismatch<char>/524287       10169 ns         9733 ns         9592 ns
bm_mismatch<char>/524288       10154 ns         9664 ns         9843 ns
bm_mismatch<char>/524289       10113 ns         9641 ns        10003 ns
bm_mismatch<short>/1            1.86 ns         2.53 ns         2.32 ns
bm_mismatch<short>/2            2.57 ns         2.77 ns         2.55 ns
bm_mismatch<short>/3            3.26 ns         3.00 ns         2.79 ns
bm_mismatch<short>/4            3.95 ns         3.39 ns         3.15 ns
bm_mismatch<short>/5            4.83 ns         3.97 ns         3.72 ns
bm_mismatch<short>/6            5.43 ns         4.34 ns         4.03 ns
bm_mismatch<short>/7            6.11 ns         4.73 ns         4.44 ns
bm_mismatch<short>/8            6.84 ns         5.02 ns         4.79 ns
bm_mismatch<short>/15           11.5 ns         7.12 ns         6.50 ns
bm_mismatch<short>/16           13.9 ns         1.87 ns         2.11 ns
bm_mismatch<short>/17           14.0 ns         3.00 ns         2.47 ns
bm_mismatch<short>/31           23.1 ns         7.87 ns         2.47 ns
bm_mismatch<short>/32           23.8 ns         2.57 ns         2.81 ns
bm_mismatch<short>/33           24.5 ns         3.70 ns         2.94 ns
bm_mismatch<short>/63           44.8 ns         9.37 ns         3.46 ns
bm_mismatch<short>/64           2.32 ns         2.57 ns         2.64 ns
bm_mismatch<short>/65           2.52 ns         3.02 ns         3.51 ns
bm_mismatch<short>/127          45.6 ns         9.97 ns         5.18 ns
bm_mismatch<short>/128          3.85 ns         3.93 ns         3.94 ns
bm_mismatch<short>/129          3.82 ns         4.20 ns         4.70 ns
bm_mismatch<short>/255          50.4 ns         12.6 ns         8.07 ns
bm_mismatch<short>/256          7.23 ns         6.91 ns         6.98 ns
bm_mismatch<short>/257          7.24 ns         7.19 ns         7.55 ns
bm_mismatch<short>/511          52.3 ns         17.8 ns         14.0 ns
bm_mismatch<short>/512          13.6 ns         13.7 ns         13.6 ns
bm_mismatch<short>/513          13.9 ns         13.8 ns         18.5 ns
bm_mismatch<short>/1023         60.9 ns         30.9 ns         26.3 ns
bm_mismatch<short>/1024         26.7 ns         27.7 ns         25.7 ns
bm_mismatch<short>/1025         27.7 ns         27.6 ns         25.3 ns
bm_mismatch<short>/2047         88.4 ns         58.0 ns         51.6 ns
bm_mismatch<short>/2048         52.8 ns         55.3 ns         50.6 ns
bm_mismatch<short>/2049         55.2 ns         54.8 ns         48.7 ns
bm_mismatch<short>/4095          153 ns          113 ns          102 ns
bm_mismatch<short>/4096          105 ns          110 ns          101 ns
bm_mismatch<short>/4097          110 ns          110 ns         99.1 ns
bm_mismatch<short>/8191          277 ns          219 ns          206 ns
bm_mismatch<short>/8192          226 ns          214 ns          250 ns
bm_mismatch<short>/8193          226 ns          207 ns          208 ns
bm_mismatch<short>/16383         519 ns          492 ns          488 ns
bm_mismatch<short>/16384         494 ns          492 ns          492 ns
bm_mismatch<short>/16385         492 ns          488 ns          489 ns
bm_mismatch<short>/32767        1007 ns          968 ns          964 ns
bm_mismatch<short>/32768         977 ns          972 ns          970 ns
bm_mismatch<short>/32769         972 ns          962 ns          967 ns
bm_mismatch<short>/65535        1978 ns         1918 ns         1956 ns
bm_mismatch<short>/65536        1940 ns         1927 ns         1970 ns
bm_mismatch<short>/65537        1937 ns         1922 ns         1959 ns
bm_mismatch<short>/131071       4524 ns         4193 ns         4304 ns
bm_mismatch<short>/131072       4445 ns         4196 ns         4306 ns
bm_mismatch<short>/131073       4452 ns         4278 ns         4311 ns
bm_mismatch<short>/262143       9801 ns        10188 ns         9634 ns
bm_mismatch<short>/262144       9738 ns        10151 ns         9651 ns
bm_mismatch<short>/262145       9716 ns        10171 ns         9715 ns
bm_mismatch<short>/524287      19944 ns        20718 ns        20044 ns
bm_mismatch<short>/524288      21139 ns        20647 ns        20008 ns
bm_mismatch<short>/524289      21162 ns        19512 ns        20068 ns
bm_mismatch<int>/1              1.40 ns         1.84 ns         1.87 ns
bm_mismatch<int>/2              1.87 ns         2.08 ns         2.09 ns
bm_mismatch<int>/3              2.36 ns         2.31 ns         2.87 ns
bm_mismatch<int>/4              3.06 ns         2.72 ns         2.95 ns
bm_mismatch<int>/5              3.66 ns         3.37 ns         3.42 ns
bm_mismatch<int>/6              4.55 ns         3.65 ns         3.73 ns
bm_mismatch<int>/7              5.03 ns         3.93 ns         3.94 ns
bm_mismatch<int>/8              5.67 ns         1.86 ns         1.87 ns
bm_mismatch<int>/15             9.89 ns         4.41 ns         2.34 ns
bm_mismatch<int>/16             10.1 ns         2.33 ns         2.34 ns
bm_mismatch<int>/17             10.2 ns         3.34 ns         2.86 ns
bm_mismatch<int>/31             17.2 ns         5.54 ns         3.28 ns
bm_mismatch<int>/32             2.16 ns         2.15 ns         2.58 ns
bm_mismatch<int>/33             2.36 ns         3.01 ns         3.28 ns
bm_mismatch<int>/63             17.7 ns         6.50 ns         4.93 ns
bm_mismatch<int>/64             3.81 ns         3.58 ns         3.90 ns
bm_mismatch<int>/65             3.74 ns         4.36 ns         4.45 ns
bm_mismatch<int>/127            19.5 ns         9.56 ns         7.74 ns
bm_mismatch<int>/128            7.30 ns         6.41 ns         6.85 ns
bm_mismatch<int>/129            7.09 ns         7.04 ns         7.06 ns
bm_mismatch<int>/255            24.7 ns         14.8 ns         13.3 ns
bm_mismatch<int>/256            14.0 ns         12.1 ns         12.3 ns
bm_mismatch<int>/257            13.8 ns         12.7 ns         12.8 ns
bm_mismatch<int>/511            34.3 ns         26.3 ns         24.8 ns
bm_mismatch<int>/512            27.6 ns         23.6 ns         23.9 ns
bm_mismatch<int>/513            27.3 ns         24.4 ns         25.1 ns
bm_mismatch<int>/1023           62.5 ns         50.9 ns         48.3 ns
bm_mismatch<int>/1024           54.4 ns         46.1 ns         46.6 ns
bm_mismatch<int>/1025           54.2 ns         48.4 ns         47.5 ns
bm_mismatch<int>/2047            116 ns         97.8 ns         94.1 ns
bm_mismatch<int>/2048            108 ns         92.6 ns         92.4 ns
bm_mismatch<int>/2049            108 ns          104 ns         94.0 ns
bm_mismatch<int>/4095            233 ns          222 ns          205 ns
bm_mismatch<int>/4096            226 ns          223 ns          225 ns
bm_mismatch<int>/4097            221 ns          219 ns          210 ns
bm_mismatch<int>/8191            499 ns          485 ns          488 ns
bm_mismatch<int>/8192            496 ns          490 ns          495 ns
bm_mismatch<int>/8193            491 ns          485 ns          488 ns
bm_mismatch<int>/16383           982 ns          962 ns          964 ns
bm_mismatch<int>/16384           974 ns          971 ns          971 ns
bm_mismatch<int>/16385           971 ns          961 ns          968 ns
bm_mismatch<int>/32767          2003 ns         1959 ns         1920 ns
bm_mismatch<int>/32768          1996 ns         1947 ns         1928 ns
bm_mismatch<int>/32769          1990 ns         1945 ns         1926 ns
bm_mismatch<int>/65535          4434 ns         4275 ns         4312 ns
bm_mismatch<int>/65536          4437 ns         4267 ns         4321 ns
bm_mismatch<int>/65537          4442 ns         4261 ns         4321 ns
bm_mismatch<int>/131071         9673 ns         9648 ns         9465 ns
bm_mismatch<int>/131072         9667 ns         9671 ns         9465 ns
bm_mismatch<int>/131073         9661 ns         9653 ns         9464 ns
bm_mismatch<int>/262143        20595 ns        19605 ns        19064 ns
bm_mismatch<int>/262144        19894 ns        19572 ns        19009 ns
bm_mismatch<int>/262145        19851 ns        19656 ns        18999 ns
bm_mismatch<int>/524287        39556 ns        39364 ns        38131 ns
bm_mismatch<int>/524288        39678 ns        39573 ns        38183 ns
bm_mismatch<int>/524289        40168 ns        39301 ns        38121 ns
```
2024-03-29 19:29:54 +01:00
Nikolas Klauser
c388690a8b
[libc++][NFC] Simplify copy and move lowering to memmove a bit (#83574)
We've introduced `__constexpr_memmove` a while ago, which simplified the
implementation of the copy and move lowering a bit. This allows us to
remove some of the boilerplate.
2024-03-27 16:54:50 +01:00
Nikolas Klauser
b68e2eba0b
[libc++] Vectorize mismatch (#73255)
```
---------------------------------------------------
Benchmark                           old         new
---------------------------------------------------
bm_mismatch<char>/1           0.835 ns      2.37 ns
bm_mismatch<char>/2            1.44 ns      2.60 ns
bm_mismatch<char>/3            2.06 ns      2.83 ns
bm_mismatch<char>/4            2.60 ns      3.29 ns
bm_mismatch<char>/5            3.15 ns      3.77 ns
bm_mismatch<char>/6            3.82 ns      4.17 ns
bm_mismatch<char>/7            4.29 ns      4.52 ns
bm_mismatch<char>/8            4.78 ns      4.86 ns
bm_mismatch<char>/16           9.06 ns      7.54 ns
bm_mismatch<char>/64           31.7 ns      19.1 ns
bm_mismatch<char>/512           249 ns      8.16 ns
bm_mismatch<char>/4096         1956 ns      44.2 ns
bm_mismatch<char>/32768       15498 ns       501 ns
bm_mismatch<char>/262144     123965 ns      4479 ns
bm_mismatch<char>/1048576    495668 ns     21306 ns
bm_mismatch<short>/1          0.710 ns      2.12 ns
bm_mismatch<short>/2           1.03 ns      2.66 ns
bm_mismatch<short>/3           1.29 ns      3.56 ns
bm_mismatch<short>/4           1.68 ns      4.29 ns
bm_mismatch<short>/5           1.96 ns      5.18 ns
bm_mismatch<short>/6           2.59 ns      5.91 ns
bm_mismatch<short>/7           2.86 ns      6.63 ns
bm_mismatch<short>/8           3.19 ns      7.33 ns
bm_mismatch<short>/16          5.48 ns      13.0 ns
bm_mismatch<short>/64          16.6 ns      4.06 ns
bm_mismatch<short>/512          130 ns      13.8 ns
bm_mismatch<short>/4096         985 ns      93.8 ns
bm_mismatch<short>/32768       7846 ns      1002 ns
bm_mismatch<short>/262144     63217 ns     10637 ns
bm_mismatch<short>/1048576   251782 ns     42471 ns
bm_mismatch<int>/1            0.716 ns      1.91 ns
bm_mismatch<int>/2             1.21 ns      2.49 ns
bm_mismatch<int>/3             1.38 ns      3.46 ns
bm_mismatch<int>/4             1.71 ns      4.04 ns
bm_mismatch<int>/5             2.00 ns      4.98 ns
bm_mismatch<int>/6             2.43 ns      5.67 ns
bm_mismatch<int>/7             3.05 ns      6.38 ns
bm_mismatch<int>/8             3.22 ns      7.09 ns
bm_mismatch<int>/16            5.18 ns      12.8 ns
bm_mismatch<int>/64            16.6 ns      5.28 ns
bm_mismatch<int>/512            129 ns      25.2 ns
bm_mismatch<int>/4096          1009 ns       201 ns
bm_mismatch<int>/32768         7776 ns      2144 ns
bm_mismatch<int>/262144       62371 ns     20551 ns
bm_mismatch<int>/1048576     254750 ns     90097 ns
```
2024-03-23 15:28:22 +01:00
Xiaoyang Liu
c3747883a0
[libc++][ranges] use static operator() for C++23 ranges (#86052)
## Abstract

This pull request converts the `operator()` of all CPOs and niebloids
related to C++23 ranges to `static`.

## Motivation

In `libc++`, CPOs and niebloids are implemented as function objects.
Currently, the `operator()` for such a function object is a
`const`-qualified member function. This means that even if the function
object is has no data members, an extra register is used to pass in the
`this` pointer when calling `operator()`, unless the compiler can inline
the function call. Declaraing `operator()` as `static` would optimize
away the unnecessary `this` pointer passing for stateless function
objects, since there is no object instance state that needs to be
accessed.

## Reference

- [P1169R4: static `operator()`](https://wg21.link/P1169R4)
2024-03-23 00:32:02 +01:00
Nikolas Klauser
4ea850b52f
[libc++] Remove __unconstrained_reverse_iterator (#85582)
`__unconstrained_reverse_iterator` has outlived its usefullness, since
the standard and subsequently the compilers have been fixed.
2024-03-18 14:19:51 +01:00
Nikolas Klauser
a2fe410581
[libc++][NFC] Simplify the implementation of equal a bit (#84754)
We can simplify the implementation of the two range overload of `equal`
a bit since we can now use `if constexpr`.
2024-03-18 08:32:09 +01:00
Nikolas Klauser
580f60484e
[libc++][NFC] Merge is{,_nothrow,_trivially}{,_copy,_move,_default}{_assignable,_constructible} (#85308)
These headers have become very small by using compiler builtins, often
containing only two declarations. This merges these headers, since
there doesn't seem to be much of a benefit keeping them separate.

Specifically, `is_{,_nothrow,_trivially}{assignable,constructible}` are
kept and the `copy`, `move` and `default` versions of these type traits
are moved in to the respective headers.
2024-03-18 08:29:44 +01:00
Nikolas Klauser
07b18c5e1b
[libc++] Optimize ranges::fill{,_n} for vector<bool>::iterator (#84642)
```
------------------------------------------------------
Benchmark                          old             new
------------------------------------------------------
bm_ranges_fill_n/1             1.64 ns         3.06 ns
bm_ranges_fill_n/2             3.45 ns         3.06 ns
bm_ranges_fill_n/3             4.88 ns         3.06 ns
bm_ranges_fill_n/4             6.46 ns         3.06 ns
bm_ranges_fill_n/5             8.03 ns         3.06 ns
bm_ranges_fill_n/6             9.65 ns         3.07 ns
bm_ranges_fill_n/7             11.5 ns         3.06 ns
bm_ranges_fill_n/8             13.0 ns         3.06 ns
bm_ranges_fill_n/16            25.9 ns         3.06 ns
bm_ranges_fill_n/64             103 ns         4.62 ns
bm_ranges_fill_n/512            711 ns         4.40 ns
bm_ranges_fill_n/4096          5642 ns         9.86 ns
bm_ranges_fill_n/32768        45135 ns         33.6 ns
bm_ranges_fill_n/262144      360818 ns          243 ns
bm_ranges_fill_n/1048576    1442828 ns          982 ns
bm_ranges_fill/1               1.63 ns         3.17 ns
bm_ranges_fill/2               3.43 ns         3.28 ns
bm_ranges_fill/3               4.97 ns         3.31 ns
bm_ranges_fill/4               6.53 ns         3.27 ns
bm_ranges_fill/5               8.12 ns         3.33 ns
bm_ranges_fill/6               9.76 ns         3.32 ns
bm_ranges_fill/7               11.6 ns         3.29 ns
bm_ranges_fill/8               13.2 ns         3.26 ns
bm_ranges_fill/16              26.3 ns         3.26 ns
bm_ranges_fill/64               104 ns         4.92 ns
bm_ranges_fill/512              716 ns         4.47 ns
bm_ranges_fill/4096            5772 ns         8.21 ns
bm_ranges_fill/32768          45778 ns         33.1 ns
bm_ranges_fill/262144        351422 ns          241 ns
bm_ranges_fill/1048576      1404710 ns          965 ns
```
2024-03-17 20:00:54 +01:00
Nikolas Klauser
76a2472715
[libc++] Refactor more __enable_ifs to the canonical style (#81457)
This brings the code base closer to having only a single style of
`enable_if`s.
2024-02-20 01:47:38 +01:00
ZijunZhaoCCK
a6b846ae1e
[libc++][ranges] Implement ranges::contains_subrange (#66963) 2024-02-13 15:42:37 -08:00
Louis Dionne
7b4622514d
[libc++] Fix missing and incorrect push/pop macros (#79204)
We recently noticed that the unwrap_iter.h file was pushing macros, but
it was pushing them again instead of popping them at the end of the
file. This led to libc++ basically swallowing any custom definition of
these macros in user code:

    #define min HELLO
    #include <algorithm>
    // min is not HELLO anymore, it's not defined

While investigating this issue, I noticed that our push/pop pragmas were
actually entirely wrong too. Indeed, instead of pushing macros like
`move`, we'd push `move(int, int)` in the pragma, which is not a valid
macro name. As a result, we would not actually push macros like `move`
-- instead we'd simply undefine them. This led to the following code not
working:

    #define move HELLO
    #include <algorithm>
    // move is not HELLO anymore

Fixing the pragma push/pop incantations led to a cascade of issues
because we use identifiers like `move` in a large number of places, and
all of these headers would now need to do the push/pop dance.

This patch fixes all these issues. First, it adds a check that we don't
swallow important names like min, max, move or refresh as explained
above. This is done by augmenting the existing
system_reserved_names.gen.py test to also check that the macros are what
we expect after including each header.

Second, it fixes the push/pop pragmas to work properly and adds missing
pragmas to all the files I could detect a failure in via the newly added
test.

rdar://121365472
2024-01-25 15:48:46 -05:00
Louis Dionne
03a9f07e18 [libc++][NFC] Fix leftover && in comment 2024-01-24 09:41:02 -05:00
Konstantin Varlamov
8938bc0ad0
[libc++][hardening] Categorize assertions related to strict weak ordering (#77405)
If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general) to
a sorting algorithm, the algorithm can produce an incorrect result or
even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see [the related
RFC](https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217)).
As a result, we have three independent sets of checks:

- assertions that catch out-of-bounds accesses within the algorithms'
  implementation. These are relatively cheap; however, they cannot catch
  the underlying cause and cannot prevent the case where an invalid
  comparator would result in an incorrectly-sorted sequence without
  actually triggering an OOB access;

- debug comparators that wrap a given comparator and on each comparison
  check that if `(a < b)`, then `!(b < a)`, where `<` stands for the
  user-provided comparator. This performs up to 2x number of comparisons
  but doesn't affect the algorithmic complexity. While this approach can
  find more issues, it is still a heuristic;

- a comprehensive check of the comparator that validates up to 100
  elements in the resulting sorted sequence (see the RFC above for
details). The check is expensive but the 100 element limit can somewhat
  compensate for that, especially for large values of `N`.

The first set of checks is enabled in the fast hardening mode while the
other two are only enabled in the debug mode.

This patch also removes the
`_LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK` macro that
previously was used to selectively enable the 100-element check.
Now this check is enabled unconditionally in the debug mode.

Also, introduce a new category
`_LIBCPP_ASSERT_SEMANTIC_REQUIREMENT`. This category is
intended for checking the semantic requirements from the Standard.
Typically, these are hard or impossible to completely validate, so
these checks are expected to be heuristic in nature and potentially
quite expensive.

See https://reviews.llvm.org/D150264 for additional background.
Fixes #71496
2024-01-22 23:31:58 -08:00
Konstantin Varlamov
dc57752031
[libc++][hardening] Categorize assertions that produce incorrect results (#77183)
Introduce a new `argument-within-domain` category that covers cases
where the given arguments make it impossible to produce a correct result
(or create a valid object in case of constructors). While the incorrect
result doesn't create an immediate problem within the library (like e.g.
a null pointer dereference would), it always indicates a logic error in
user code and is highly likely to lead to a bug in the program once the
value is used.
2024-01-20 23:38:02 -08:00