Nikolas Klauser
0fb76bae6b
Reapply "[libc++] Simplify the implementation of std::sort a bit ( #104902 )" ( #114023 )
...
This reverts commit ef44e4659878f2. The patch was originally reverted
because it was
deemed to introduce a performance regression for small inputs, however
it also fixed
a previous performance regression for larger inputs. So overall, this
patch is desirable.
2024-10-30 11:51:55 +01:00
Louis Dionne
ef44e46598
Revert "[libc++] Simplify the implementation of std::sort a bit ( #104902 )"
...
This reverts commit d4ffccfce103b01401b8a9222e373f2d404f8439, which
caused a performance regression that needs to be investigated further.
2024-09-20 09:48:58 -04:00
Louis Dionne
d6832a611a
[libc++][modules] Modularize <cstddef> ( #107254 )
...
Many headers include `<cstddef>` just for size_t, and pulling in
additional content (e.g. the traits used for std::byte) is unnecessary.
To solve this problem, this patch splits up `<cstddef>` into
subcomponents so that headers can include only the parts that they
actually require.
This has the added benefit of making the modules build a lot stricter
with respect to IWYU, and also providing a canonical location where we
define `std::size_t` and friends (which were previously defined in
multiple headers like `<cstddef>` and `<ctime>`).
After this patch, there's still many places in the codebase where we
include `<cstddef>` when `<__cstddef/size_t.h>` would be sufficient.
This patch focuses on removing `<cstddef>` includes from __type_traits
to make these headers non-circular with `<cstddef>`. Additional
refactorings can be tackled separately.
2024-09-05 08:28:33 -04:00
Nikolas Klauser
d4ffccfce1
[libc++] Simplify the implementation of std::sort a bit ( #104902 )
...
This does a few things to canonicalize the library a bit. Specifically
- use `__desugars_to_v` instead of the custom `__is_simple_comparator`
- make `__use_branchless_sort` an inline variable
- remove the `_maybe_branchless` versions of the `__sortN` functions and
overload based on whether we can do branchless sorting instead.
2024-08-27 16:54:05 +02:00
Nikolas Klauser
d07fdf9779
[libc++] Optimize lexicographical_compare ( #65279 )
...
If the comparison operation is equivalent to < and that is a total
order, we know that we can use equality comparison on that type instead
to extract some information. Furthermore, if equality comparison on that
type is trivial, the user can't observe that we're calling it. So
instead of using the user-provided total order, we use std::mismatch,
which uses equality comparison (and is vertorized). Additionally, if the
type is trivially lexicographically comparable, we can go one step
further and use std::memcmp directly instead of calling std::mismatch.
Benchmarks:
```
-------------------------------------------------------------------------------------
Benchmark old new
-------------------------------------------------------------------------------------
bm_lexicographical_compare<unsigned char>/1 1.17 ns 2.34 ns
bm_lexicographical_compare<unsigned char>/2 1.64 ns 2.57 ns
bm_lexicographical_compare<unsigned char>/3 2.23 ns 2.58 ns
bm_lexicographical_compare<unsigned char>/4 2.82 ns 2.57 ns
bm_lexicographical_compare<unsigned char>/5 3.34 ns 2.11 ns
bm_lexicographical_compare<unsigned char>/6 3.94 ns 2.21 ns
bm_lexicographical_compare<unsigned char>/7 4.56 ns 2.11 ns
bm_lexicographical_compare<unsigned char>/8 5.25 ns 2.11 ns
bm_lexicographical_compare<unsigned char>/16 9.88 ns 2.11 ns
bm_lexicographical_compare<unsigned char>/64 38.9 ns 2.36 ns
bm_lexicographical_compare<unsigned char>/512 317 ns 6.54 ns
bm_lexicographical_compare<unsigned char>/4096 2517 ns 41.4 ns
bm_lexicographical_compare<unsigned char>/32768 20052 ns 488 ns
bm_lexicographical_compare<unsigned char>/262144 159579 ns 4409 ns
bm_lexicographical_compare<unsigned char>/1048576 640456 ns 20342 ns
bm_lexicographical_compare<signed char>/1 1.18 ns 2.37 ns
bm_lexicographical_compare<signed char>/2 1.65 ns 2.60 ns
bm_lexicographical_compare<signed char>/3 2.23 ns 2.83 ns
bm_lexicographical_compare<signed char>/4 2.81 ns 3.06 ns
bm_lexicographical_compare<signed char>/5 3.35 ns 3.30 ns
bm_lexicographical_compare<signed char>/6 3.90 ns 3.99 ns
bm_lexicographical_compare<signed char>/7 4.56 ns 3.78 ns
bm_lexicographical_compare<signed char>/8 5.20 ns 4.02 ns
bm_lexicographical_compare<signed char>/16 9.80 ns 6.21 ns
bm_lexicographical_compare<signed char>/64 39.0 ns 3.16 ns
bm_lexicographical_compare<signed char>/512 318 ns 7.58 ns
bm_lexicographical_compare<signed char>/4096 2514 ns 47.4 ns
bm_lexicographical_compare<signed char>/32768 20096 ns 504 ns
bm_lexicographical_compare<signed char>/262144 156617 ns 4146 ns
bm_lexicographical_compare<signed char>/1048576 624265 ns 19810 ns
bm_lexicographical_compare<int>/1 1.15 ns 2.12 ns
bm_lexicographical_compare<int>/2 1.60 ns 2.36 ns
bm_lexicographical_compare<int>/3 2.21 ns 2.59 ns
bm_lexicographical_compare<int>/4 2.74 ns 2.83 ns
bm_lexicographical_compare<int>/5 3.26 ns 3.06 ns
bm_lexicographical_compare<int>/6 3.81 ns 4.53 ns
bm_lexicographical_compare<int>/7 4.41 ns 4.72 ns
bm_lexicographical_compare<int>/8 5.08 ns 2.36 ns
bm_lexicographical_compare<int>/16 9.54 ns 3.08 ns
bm_lexicographical_compare<int>/64 37.8 ns 4.71 ns
bm_lexicographical_compare<int>/512 309 ns 24.6 ns
bm_lexicographical_compare<int>/4096 2422 ns 204 ns
bm_lexicographical_compare<int>/32768 19362 ns 1947 ns
bm_lexicographical_compare<int>/262144 155727 ns 19793 ns
bm_lexicographical_compare<int>/1048576 623614 ns 80180 ns
bm_ranges_lexicographical_compare<unsigned char>/1 1.07 ns 2.35 ns
bm_ranges_lexicographical_compare<unsigned char>/2 1.72 ns 2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/3 2.46 ns 2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/4 3.17 ns 2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/5 3.86 ns 2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/6 4.55 ns 2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/7 5.25 ns 2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/8 5.95 ns 2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/16 11.7 ns 2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/64 45.5 ns 2.36 ns
bm_ranges_lexicographical_compare<unsigned char>/512 366 ns 6.35 ns
bm_ranges_lexicographical_compare<unsigned char>/4096 2886 ns 40.9 ns
bm_ranges_lexicographical_compare<unsigned char>/32768 23054 ns 489 ns
bm_ranges_lexicographical_compare<unsigned char>/262144 185302 ns 4339 ns
bm_ranges_lexicographical_compare<unsigned char>/1048576 741576 ns 19430 ns
bm_ranges_lexicographical_compare<signed char>/1 1.10 ns 2.12 ns
bm_ranges_lexicographical_compare<signed char>/2 1.66 ns 2.35 ns
bm_ranges_lexicographical_compare<signed char>/3 2.23 ns 2.58 ns
bm_ranges_lexicographical_compare<signed char>/4 2.82 ns 2.82 ns
bm_ranges_lexicographical_compare<signed char>/5 3.34 ns 3.06 ns
bm_ranges_lexicographical_compare<signed char>/6 3.92 ns 3.99 ns
bm_ranges_lexicographical_compare<signed char>/7 4.64 ns 4.10 ns
bm_ranges_lexicographical_compare<signed char>/8 5.21 ns 4.61 ns
bm_ranges_lexicographical_compare<signed char>/16 9.79 ns 7.42 ns
bm_ranges_lexicographical_compare<signed char>/64 38.9 ns 2.93 ns
bm_ranges_lexicographical_compare<signed char>/512 317 ns 7.31 ns
bm_ranges_lexicographical_compare<signed char>/4096 2500 ns 47.5 ns
bm_ranges_lexicographical_compare<signed char>/32768 19940 ns 496 ns
bm_ranges_lexicographical_compare<signed char>/262144 159166 ns 4393 ns
bm_ranges_lexicographical_compare<signed char>/1048576 638206 ns 19786 ns
bm_ranges_lexicographical_compare<int>/1 1.10 ns 2.12 ns
bm_ranges_lexicographical_compare<int>/2 1.64 ns 3.04 ns
bm_ranges_lexicographical_compare<int>/3 2.23 ns 2.58 ns
bm_ranges_lexicographical_compare<int>/4 2.81 ns 2.81 ns
bm_ranges_lexicographical_compare<int>/5 3.35 ns 3.05 ns
bm_ranges_lexicographical_compare<int>/6 3.94 ns 4.60 ns
bm_ranges_lexicographical_compare<int>/7 4.60 ns 4.81 ns
bm_ranges_lexicographical_compare<int>/8 5.19 ns 2.35 ns
bm_ranges_lexicographical_compare<int>/16 9.85 ns 2.87 ns
bm_ranges_lexicographical_compare<int>/64 38.9 ns 4.70 ns
bm_ranges_lexicographical_compare<int>/512 318 ns 24.5 ns
bm_ranges_lexicographical_compare<int>/4096 2494 ns 202 ns
bm_ranges_lexicographical_compare<int>/32768 20000 ns 1939 ns
bm_ranges_lexicographical_compare<int>/262144 160433 ns 19730 ns
bm_ranges_lexicographical_compare<int>/1048576 642636 ns 80760 ns
```
2024-08-04 10:02:43 +02:00
Christopher Di Bella
d10dc5a06f
[libc++] Remove dedicated namespaces for ranges functions ( #76543 )
...
We originally put implementation-detail function objects into individual
namespaces for `std::ranges` without a good reason for doing so. This
practice was continued, presumably because there was prior art. Since
there's no reason to keep these namespaces, this commit removes them,
which will slightly impact binary size.
This commit does not apply to CPOs, some of which need additional work.
2024-08-01 08:54:06 -04:00
Nikolas Klauser
83bc7b5771
[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests ( #87094 )
...
This also adds a few tests that were missing.
2024-04-22 22:13:58 +02:00
Nikolas Klauser
935e699173
[libc++] Optimize ranges::minmax ( #87335 )
...
This allows Clang to vectorize the loop.
```
---------------------------------------------------------------------
Benchmark old new
---------------------------------------------------------------------
BM_std_minmax<char>/1 0.659 ns 1.41 ns
BM_std_minmax<char>/2 1.08 ns 2.16 ns
BM_std_minmax<char>/3 2.16 ns 2.96 ns
BM_std_minmax<char>/4 2.82 ns 3.81 ns
BM_std_minmax<char>/5 3.43 ns 4.69 ns
BM_std_minmax<char>/6 4.08 ns 5.63 ns
BM_std_minmax<char>/7 4.75 ns 6.51 ns
BM_std_minmax<char>/8 5.42 ns 7.41 ns
BM_std_minmax<char>/9 6.05 ns 8.34 ns
BM_std_minmax<char>/10 6.68 ns 9.29 ns
BM_std_minmax<char>/11 7.47 ns 10.6 ns
BM_std_minmax<char>/12 7.95 ns 11.4 ns
BM_std_minmax<char>/13 8.64 ns 12.4 ns
BM_std_minmax<char>/14 9.35 ns 13.4 ns
BM_std_minmax<char>/15 10.1 ns 14.4 ns
BM_std_minmax<char>/16 10.6 ns 2.25 ns
BM_std_minmax<char>/17 11.3 ns 2.82 ns
BM_std_minmax<char>/18 11.8 ns 3.71 ns
BM_std_minmax<char>/19 12.6 ns 4.52 ns
BM_std_minmax<char>/20 13.2 ns 5.47 ns
BM_std_minmax<char>/21 14.1 ns 6.67 ns
BM_std_minmax<char>/22 14.5 ns 7.78 ns
BM_std_minmax<char>/23 15.1 ns 8.67 ns
BM_std_minmax<char>/24 15.7 ns 9.68 ns
BM_std_minmax<char>/25 16.4 ns 10.7 ns
BM_std_minmax<char>/26 17.1 ns 11.7 ns
BM_std_minmax<char>/27 17.8 ns 12.8 ns
BM_std_minmax<char>/28 18.4 ns 14.1 ns
BM_std_minmax<char>/29 19.0 ns 15.0 ns
BM_std_minmax<char>/30 19.6 ns 16.0 ns
BM_std_minmax<char>/31 20.2 ns 17.0 ns
BM_std_minmax<char>/32 20.8 ns 2.46 ns
BM_std_minmax<char>/64 41.5 ns 2.97 ns
BM_std_minmax<char>/512 340 ns 6.05 ns
BM_std_minmax<char>/1024 667 ns 8.83 ns
BM_std_minmax<char>/4000 2571 ns 28.6 ns
BM_std_minmax<char>/4096 2632 ns 25.8 ns
BM_std_minmax<char>/5500 3554 ns 51.1 ns
BM_std_minmax<char>/64000 41175 ns 480 ns
BM_std_minmax<char>/65536 42039 ns 490 ns
BM_std_minmax<char>/70000 44931 ns 528 ns
BM_std_minmax<short>/1 0.708 ns 1.20 ns
BM_std_minmax<short>/2 1.18 ns 1.78 ns
BM_std_minmax<short>/3 1.98 ns 2.42 ns
BM_std_minmax<short>/4 2.47 ns 3.05 ns
BM_std_minmax<short>/5 3.09 ns 3.72 ns
BM_std_minmax<short>/6 3.49 ns 4.37 ns
BM_std_minmax<short>/7 4.24 ns 5.03 ns
BM_std_minmax<short>/8 4.65 ns 2.12 ns
BM_std_minmax<short>/9 5.34 ns 2.51 ns
BM_std_minmax<short>/10 5.82 ns 3.18 ns
BM_std_minmax<short>/11 6.36 ns 3.97 ns
BM_std_minmax<short>/12 6.73 ns 4.68 ns
BM_std_minmax<short>/13 7.59 ns 5.49 ns
BM_std_minmax<short>/14 7.77 ns 6.45 ns
BM_std_minmax<short>/15 8.54 ns 7.55 ns
BM_std_minmax<short>/16 8.74 ns 2.38 ns
BM_std_minmax<short>/17 9.59 ns 2.76 ns
BM_std_minmax<short>/18 9.88 ns 3.37 ns
BM_std_minmax<short>/19 10.7 ns 4.17 ns
BM_std_minmax<short>/20 10.9 ns 4.88 ns
BM_std_minmax<short>/21 12.1 ns 5.70 ns
BM_std_minmax<short>/22 12.6 ns 6.64 ns
BM_std_minmax<short>/23 13.5 ns 7.72 ns
BM_std_minmax<short>/24 13.2 ns 2.87 ns
BM_std_minmax<short>/25 14.2 ns 3.10 ns
BM_std_minmax<short>/26 14.2 ns 3.59 ns
BM_std_minmax<short>/27 15.4 ns 4.35 ns
BM_std_minmax<short>/28 15.3 ns 5.10 ns
BM_std_minmax<short>/29 16.2 ns 5.87 ns
BM_std_minmax<short>/30 16.2 ns 6.88 ns
BM_std_minmax<short>/31 17.0 ns 7.78 ns
BM_std_minmax<short>/32 17.2 ns 3.45 ns
BM_std_minmax<short>/64 34.1 ns 3.35 ns
BM_std_minmax<short>/512 279 ns 8.37 ns
BM_std_minmax<short>/1024 549 ns 14.2 ns
BM_std_minmax<short>/4000 2111 ns 50.1 ns
BM_std_minmax<short>/4096 2167 ns 47.9 ns
BM_std_minmax<short>/5500 2895 ns 69.7 ns
BM_std_minmax<short>/64000 33454 ns 953 ns
BM_std_minmax<short>/65536 34474 ns 970 ns
BM_std_minmax<short>/70000 36691 ns 1037 ns
BM_std_minmax<int>/1 0.664 ns 1.17 ns
BM_std_minmax<int>/2 1.11 ns 1.69 ns
BM_std_minmax<int>/3 2.36 ns 2.29 ns
BM_std_minmax<int>/4 2.53 ns 2.91 ns
BM_std_minmax<int>/5 3.23 ns 3.56 ns
BM_std_minmax<int>/6 3.56 ns 4.23 ns
BM_std_minmax<int>/7 4.28 ns 4.91 ns
BM_std_minmax<int>/8 4.60 ns 5.60 ns
BM_std_minmax<int>/9 5.38 ns 6.31 ns
BM_std_minmax<int>/10 5.69 ns 7.03 ns
BM_std_minmax<int>/11 6.41 ns 7.70 ns
BM_std_minmax<int>/12 6.73 ns 8.39 ns
BM_std_minmax<int>/13 7.38 ns 9.07 ns
BM_std_minmax<int>/14 7.74 ns 9.79 ns
BM_std_minmax<int>/15 8.53 ns 10.5 ns
BM_std_minmax<int>/16 8.79 ns 11.2 ns
BM_std_minmax<int>/17 9.63 ns 12.0 ns
BM_std_minmax<int>/18 9.84 ns 12.7 ns
BM_std_minmax<int>/19 10.6 ns 13.5 ns
BM_std_minmax<int>/20 11.0 ns 14.3 ns
BM_std_minmax<int>/21 11.7 ns 15.0 ns
BM_std_minmax<int>/22 12.0 ns 15.7 ns
BM_std_minmax<int>/23 13.1 ns 16.5 ns
BM_std_minmax<int>/24 13.0 ns 17.3 ns
BM_std_minmax<int>/25 13.7 ns 17.9 ns
BM_std_minmax<int>/26 14.0 ns 18.6 ns
BM_std_minmax<int>/27 14.8 ns 19.4 ns
BM_std_minmax<int>/28 15.1 ns 20.3 ns
BM_std_minmax<int>/29 15.8 ns 20.9 ns
BM_std_minmax<int>/30 16.1 ns 21.7 ns
BM_std_minmax<int>/31 16.9 ns 22.5 ns
BM_std_minmax<int>/32 17.2 ns 3.40 ns
BM_std_minmax<int>/64 33.9 ns 4.04 ns
BM_std_minmax<int>/512 275 ns 14.6 ns
BM_std_minmax<int>/1024 541 ns 27.5 ns
BM_std_minmax<int>/4000 2093 ns 96.3 ns
BM_std_minmax<int>/4096 2146 ns 98.3 ns
BM_std_minmax<int>/5500 2866 ns 157 ns
BM_std_minmax<int>/64000 33619 ns 1954 ns
BM_std_minmax<int>/65536 34252 ns 2009 ns
BM_std_minmax<int>/70000 36618 ns 2125 ns
BM_std_minmax<long long>/1 0.709 ns 1.19 ns
BM_std_minmax<long long>/2 1.01 ns 1.65 ns
BM_std_minmax<long long>/3 2.14 ns 2.21 ns
BM_std_minmax<long long>/4 2.45 ns 2.83 ns
BM_std_minmax<long long>/5 3.09 ns 3.47 ns
BM_std_minmax<long long>/6 3.44 ns 4.11 ns
BM_std_minmax<long long>/7 4.16 ns 4.79 ns
BM_std_minmax<long long>/8 4.54 ns 5.47 ns
BM_std_minmax<long long>/9 5.37 ns 6.20 ns
BM_std_minmax<long long>/10 5.71 ns 6.93 ns
BM_std_minmax<long long>/11 6.00 ns 7.60 ns
BM_std_minmax<long long>/12 6.43 ns 8.27 ns
BM_std_minmax<long long>/13 7.01 ns 8.94 ns
BM_std_minmax<long long>/14 7.45 ns 9.65 ns
BM_std_minmax<long long>/15 8.16 ns 10.4 ns
BM_std_minmax<long long>/16 8.46 ns 5.22 ns
BM_std_minmax<long long>/17 9.16 ns 5.22 ns
BM_std_minmax<long long>/18 9.53 ns 5.52 ns
BM_std_minmax<long long>/19 10.2 ns 6.02 ns
BM_std_minmax<long long>/20 10.5 ns 6.89 ns
BM_std_minmax<long long>/21 11.3 ns 7.83 ns
BM_std_minmax<long long>/22 11.6 ns 8.59 ns
BM_std_minmax<long long>/23 12.3 ns 9.91 ns
BM_std_minmax<long long>/24 12.6 ns 10.1 ns
BM_std_minmax<long long>/25 13.2 ns 12.0 ns
BM_std_minmax<long long>/26 13.6 ns 13.5 ns
BM_std_minmax<long long>/27 14.2 ns 14.8 ns
BM_std_minmax<long long>/28 14.7 ns 15.9 ns
BM_std_minmax<long long>/29 15.3 ns 16.6 ns
BM_std_minmax<long long>/30 15.8 ns 17.3 ns
BM_std_minmax<long long>/31 16.3 ns 18.2 ns
BM_std_minmax<long long>/32 16.7 ns 7.18 ns
BM_std_minmax<long long>/64 33.1 ns 11.5 ns
BM_std_minmax<long long>/512 268 ns 71.0 ns
BM_std_minmax<long long>/1024 532 ns 138 ns
BM_std_minmax<long long>/4000 2056 ns 533 ns
BM_std_minmax<long long>/4096 2112 ns 539 ns
BM_std_minmax<long long>/5500 2823 ns 749 ns
BM_std_minmax<long long>/64000 32956 ns 8590 ns
BM_std_minmax<long long>/65536 33795 ns 8791 ns
BM_std_minmax<long long>/70000 36084 ns 9442 ns
BM_std_minmax<unsigned char>/1 0.714 ns 1.41 ns
BM_std_minmax<unsigned char>/2 0.955 ns 1.96 ns
BM_std_minmax<unsigned char>/3 1.90 ns 2.63 ns
BM_std_minmax<unsigned char>/4 2.40 ns 3.34 ns
BM_std_minmax<unsigned char>/5 2.87 ns 4.10 ns
BM_std_minmax<unsigned char>/6 3.47 ns 4.88 ns
BM_std_minmax<unsigned char>/7 4.04 ns 5.66 ns
BM_std_minmax<unsigned char>/8 4.65 ns 6.45 ns
BM_std_minmax<unsigned char>/9 5.18 ns 7.24 ns
BM_std_minmax<unsigned char>/10 5.80 ns 8.05 ns
BM_std_minmax<unsigned char>/11 6.24 ns 8.86 ns
BM_std_minmax<unsigned char>/12 6.78 ns 9.70 ns
BM_std_minmax<unsigned char>/13 7.30 ns 10.6 ns
BM_std_minmax<unsigned char>/14 7.86 ns 11.4 ns
BM_std_minmax<unsigned char>/15 8.46 ns 12.3 ns
BM_std_minmax<unsigned char>/16 9.00 ns 2.12 ns
BM_std_minmax<unsigned char>/17 9.58 ns 2.83 ns
BM_std_minmax<unsigned char>/18 10.1 ns 3.37 ns
BM_std_minmax<unsigned char>/19 10.7 ns 4.11 ns
BM_std_minmax<unsigned char>/20 11.2 ns 4.85 ns
BM_std_minmax<unsigned char>/21 11.9 ns 5.69 ns
BM_std_minmax<unsigned char>/22 12.3 ns 6.77 ns
BM_std_minmax<unsigned char>/23 13.1 ns 7.56 ns
BM_std_minmax<unsigned char>/24 13.5 ns 8.40 ns
BM_std_minmax<unsigned char>/25 14.2 ns 9.30 ns
BM_std_minmax<unsigned char>/26 14.4 ns 10.1 ns
BM_std_minmax<unsigned char>/27 15.0 ns 11.1 ns
BM_std_minmax<unsigned char>/28 15.3 ns 11.9 ns
BM_std_minmax<unsigned char>/29 16.2 ns 12.9 ns
BM_std_minmax<unsigned char>/30 16.5 ns 13.9 ns
BM_std_minmax<unsigned char>/31 17.2 ns 14.8 ns
BM_std_minmax<unsigned char>/32 17.6 ns 2.36 ns
BM_std_minmax<unsigned char>/64 35.6 ns 3.21 ns
BM_std_minmax<unsigned char>/512 288 ns 6.00 ns
BM_std_minmax<unsigned char>/1024 573 ns 8.80 ns
BM_std_minmax<unsigned char>/4000 2222 ns 28.6 ns
BM_std_minmax<unsigned char>/4096 2265 ns 25.9 ns
BM_std_minmax<unsigned char>/5500 3047 ns 48.8 ns
BM_std_minmax<unsigned char>/64000 35059 ns 480 ns
BM_std_minmax<unsigned char>/65536 35941 ns 491 ns
BM_std_minmax<unsigned char>/70000 38922 ns 525 ns
BM_std_minmax<unsigned short>/1 0.711 ns 1.18 ns
BM_std_minmax<unsigned short>/2 0.957 ns 1.65 ns
BM_std_minmax<unsigned short>/3 2.13 ns 2.21 ns
BM_std_minmax<unsigned short>/4 2.14 ns 2.78 ns
BM_std_minmax<unsigned short>/5 3.06 ns 3.29 ns
BM_std_minmax<unsigned short>/6 2.89 ns 3.87 ns
BM_std_minmax<unsigned short>/7 3.80 ns 4.55 ns
BM_std_minmax<unsigned short>/8 3.68 ns 2.02 ns
BM_std_minmax<unsigned short>/9 4.53 ns 2.40 ns
BM_std_minmax<unsigned short>/10 4.60 ns 2.94 ns
BM_std_minmax<unsigned short>/11 5.67 ns 3.67 ns
BM_std_minmax<unsigned short>/12 5.39 ns 4.22 ns
BM_std_minmax<unsigned short>/13 6.58 ns 4.78 ns
BM_std_minmax<unsigned short>/14 6.33 ns 5.54 ns
BM_std_minmax<unsigned short>/15 7.34 ns 6.30 ns
BM_std_minmax<unsigned short>/16 7.17 ns 2.25 ns
BM_std_minmax<unsigned short>/17 8.19 ns 2.61 ns
BM_std_minmax<unsigned short>/18 8.02 ns 3.19 ns
BM_std_minmax<unsigned short>/19 9.03 ns 3.72 ns
BM_std_minmax<unsigned short>/20 8.89 ns 4.36 ns
BM_std_minmax<unsigned short>/21 9.77 ns 5.10 ns
BM_std_minmax<unsigned short>/22 9.70 ns 5.55 ns
BM_std_minmax<unsigned short>/23 10.8 ns 6.29 ns
BM_std_minmax<unsigned short>/24 10.6 ns 2.41 ns
BM_std_minmax<unsigned short>/25 11.6 ns 2.75 ns
BM_std_minmax<unsigned short>/26 11.4 ns 3.26 ns
BM_std_minmax<unsigned short>/27 12.4 ns 3.86 ns
BM_std_minmax<unsigned short>/28 12.3 ns 4.45 ns
BM_std_minmax<unsigned short>/29 13.2 ns 5.07 ns
BM_std_minmax<unsigned short>/30 13.1 ns 5.77 ns
BM_std_minmax<unsigned short>/31 13.9 ns 6.65 ns
BM_std_minmax<unsigned short>/32 13.9 ns 2.72 ns
BM_std_minmax<unsigned short>/64 27.8 ns 3.25 ns
BM_std_minmax<unsigned short>/512 220 ns 8.30 ns
BM_std_minmax<unsigned short>/1024 435 ns 14.1 ns
BM_std_minmax<unsigned short>/4000 1703 ns 49.8 ns
BM_std_minmax<unsigned short>/4096 1746 ns 47.9 ns
BM_std_minmax<unsigned short>/5500 2350 ns 69.9 ns
BM_std_minmax<unsigned short>/64000 27388 ns 953 ns
BM_std_minmax<unsigned short>/65536 28040 ns 975 ns
BM_std_minmax<unsigned short>/70000 29967 ns 1040 ns
BM_std_minmax<unsigned int>/1 0.712 ns 1.18 ns
BM_std_minmax<unsigned int>/2 0.965 ns 1.65 ns
BM_std_minmax<unsigned int>/3 2.13 ns 2.14 ns
BM_std_minmax<unsigned int>/4 2.09 ns 2.64 ns
BM_std_minmax<unsigned int>/5 3.02 ns 3.21 ns
BM_std_minmax<unsigned int>/6 2.94 ns 3.81 ns
BM_std_minmax<unsigned int>/7 3.91 ns 4.38 ns
BM_std_minmax<unsigned int>/8 3.75 ns 4.93 ns
BM_std_minmax<unsigned int>/9 4.71 ns 5.60 ns
BM_std_minmax<unsigned int>/10 4.59 ns 6.26 ns
BM_std_minmax<unsigned int>/11 5.57 ns 6.80 ns
BM_std_minmax<unsigned int>/12 5.43 ns 7.47 ns
BM_std_minmax<unsigned int>/13 6.45 ns 8.10 ns
BM_std_minmax<unsigned int>/14 6.32 ns 8.69 ns
BM_std_minmax<unsigned int>/15 7.29 ns 9.37 ns
BM_std_minmax<unsigned int>/16 7.12 ns 9.99 ns
BM_std_minmax<unsigned int>/17 8.24 ns 10.6 ns
BM_std_minmax<unsigned int>/18 8.00 ns 11.2 ns
BM_std_minmax<unsigned int>/19 8.94 ns 12.0 ns
BM_std_minmax<unsigned int>/20 8.91 ns 12.6 ns
BM_std_minmax<unsigned int>/21 9.73 ns 17.2 ns
BM_std_minmax<unsigned int>/22 9.75 ns 13.8 ns
BM_std_minmax<unsigned int>/23 10.6 ns 14.5 ns
BM_std_minmax<unsigned int>/24 10.6 ns 15.1 ns
BM_std_minmax<unsigned int>/25 11.5 ns 15.7 ns
BM_std_minmax<unsigned int>/26 11.4 ns 16.3 ns
BM_std_minmax<unsigned int>/27 12.3 ns 17.0 ns
BM_std_minmax<unsigned int>/28 12.3 ns 17.6 ns
BM_std_minmax<unsigned int>/29 13.2 ns 18.3 ns
BM_std_minmax<unsigned int>/30 13.2 ns 19.0 ns
BM_std_minmax<unsigned int>/31 14.0 ns 19.6 ns
BM_std_minmax<unsigned int>/32 14.0 ns 3.39 ns
BM_std_minmax<unsigned int>/64 27.6 ns 4.05 ns
BM_std_minmax<unsigned int>/512 221 ns 14.2 ns
BM_std_minmax<unsigned int>/1024 439 ns 25.5 ns
BM_std_minmax<unsigned int>/4000 1720 ns 96.3 ns
BM_std_minmax<unsigned int>/4096 1762 ns 97.8 ns
BM_std_minmax<unsigned int>/5500 2364 ns 146 ns
BM_std_minmax<unsigned int>/64000 27874 ns 1905 ns
BM_std_minmax<unsigned int>/65536 28012 ns 1961 ns
BM_std_minmax<unsigned int>/70000 29899 ns 2087 ns
BM_std_minmax<unsigned long long>/1 0.707 ns 1.18 ns
BM_std_minmax<unsigned long long>/2 0.909 ns 1.65 ns
BM_std_minmax<unsigned long long>/3 1.65 ns 2.70 ns
BM_std_minmax<unsigned long long>/4 1.93 ns 2.69 ns
BM_std_minmax<unsigned long long>/5 2.45 ns 3.34 ns
BM_std_minmax<unsigned long long>/6 2.78 ns 3.81 ns
BM_std_minmax<unsigned long long>/7 3.28 ns 4.43 ns
BM_std_minmax<unsigned long long>/8 3.70 ns 4.92 ns
BM_std_minmax<unsigned long long>/9 4.12 ns 5.64 ns
BM_std_minmax<unsigned long long>/10 4.44 ns 6.15 ns
BM_std_minmax<unsigned long long>/11 4.91 ns 6.81 ns
BM_std_minmax<unsigned long long>/12 5.31 ns 7.41 ns
BM_std_minmax<unsigned long long>/13 5.72 ns 7.96 ns
BM_std_minmax<unsigned long long>/14 6.05 ns 8.66 ns
BM_std_minmax<unsigned long long>/15 6.55 ns 9.37 ns
BM_std_minmax<unsigned long long>/16 6.89 ns 7.98 ns
BM_std_minmax<unsigned long long>/17 7.34 ns 8.13 ns
BM_std_minmax<unsigned long long>/18 7.73 ns 8.42 ns
BM_std_minmax<unsigned long long>/19 8.26 ns 8.63 ns
BM_std_minmax<unsigned long long>/20 8.54 ns 8.96 ns
BM_std_minmax<unsigned long long>/21 9.14 ns 9.37 ns
BM_std_minmax<unsigned long long>/22 9.39 ns 9.67 ns
BM_std_minmax<unsigned long long>/23 10.1 ns 10.1 ns
BM_std_minmax<unsigned long long>/24 10.4 ns 10.6 ns
BM_std_minmax<unsigned long long>/25 11.0 ns 11.3 ns
BM_std_minmax<unsigned long long>/26 11.3 ns 12.1 ns
BM_std_minmax<unsigned long long>/27 11.8 ns 14.2 ns
BM_std_minmax<unsigned long long>/28 12.1 ns 15.8 ns
BM_std_minmax<unsigned long long>/29 12.6 ns 17.4 ns
BM_std_minmax<unsigned long long>/30 13.1 ns 18.1 ns
BM_std_minmax<unsigned long long>/31 13.4 ns 18.8 ns
BM_std_minmax<unsigned long long>/32 13.8 ns 10.4 ns
BM_std_minmax<unsigned long long>/64 27.3 ns 15.5 ns
BM_std_minmax<unsigned long long>/512 222 ns 80.6 ns
BM_std_minmax<unsigned long long>/1024 443 ns 156 ns
BM_std_minmax<unsigned long long>/4000 1731 ns 591 ns
BM_std_minmax<unsigned long long>/4096 1752 ns 609 ns
BM_std_minmax<unsigned long long>/5500 2340 ns 819 ns
BM_std_minmax<unsigned long long>/64000 27166 ns 9652 ns
BM_std_minmax<unsigned long long>/65536 27869 ns 9876 ns
BM_std_minmax<unsigned long long>/70000 29920 ns 10680 ns
```
2024-04-06 17:22:07 +02:00
Konstantin Varlamov
1638657dce
[libc++][hardening] Categorize more 'valid-element-access' checks. ( #71620 )
2023-12-20 17:24:48 -08:00
varconst
cd0ad4216c
[libc++][hardening][NFC] Introduce _LIBCPP_ASSERT_UNCATEGORIZED
.
...
Replace most uses of `_LIBCPP_ASSERT` with
`_LIBCPP_ASSERT_UNCATEGORIZED`.
This is done as a prerequisite to introducing hardened mode to libc++.
The idea is to make enabling assertions an opt-in with (somewhat)
fine-grained controls over which categories of assertions are enabled.
The vast majority of assertions are currently uncategorized; the new
macro will allow turning on `_LIBCPP_ASSERT` (the underlying mechanism
for all kinds of assertions) without enabling all the uncategorized
assertions (in the future; this patch preserves the current behavior).
Differential Revision: https://reviews.llvm.org/D153816
2023-06-28 15:10:31 -07:00
Louis Dionne
5aa03b648b
[libc++][NFC] Apply clang-format on large parts of the code base
...
This commit does a pass of clang-format over files in libc++ that
don't require major changes to conform to our style guide, or for
which we're not overly concerned about conflicting with in-flight
patches or hindering the git blame.
This roughly covers:
- benchmarks
- range algorithms
- concepts
- type traits
I did a manual verification of all the changes, and in particular I
applied clang-format on/off annotations in a few places where the
result was less readable after than before. This was not necessary
in a lot of places, however I did find that clang-format had pretty
bad taste when it comes to formatting concepts.
Differential Revision: https://reviews.llvm.org/D153140
2023-06-19 11:19:51 -04:00
Ian Anderson
79702f7f59
[libc++][Modules] Add missing includes and exports
...
Several headers are missing includes for things they use.
type_traits.is_enum needs to export type_traits.integral_constant so that clients can access its `value` member without explicitly including __type_traits/integral_constant.h themselves.
Make `subrange_fwd` a peer submodule to `subrange` rather than a submodule of it, and have `subrange` export `subrange_fwd`. That will make it easier to programmatically generate modules for the private detail headers, and it will accomplish the same effect that __ranges/subrange.h will make subrange_kind visible.
Reviewed By: Mordante, #libc
Differential Revision: https://reviews.llvm.org/D150055
2023-05-07 19:54:49 -05:00
Nikolas Klauser
4f15267d3d
[libc++][NFC] Replace _LIBCPP_STD_VER > x with _LIBCPP_STD_VER >= x
...
This change is almost fully mechanical. The only interesting change is in `generate_feature_test_macro_components.py` to generate `_LIBCPP_STD_VER >=` instead. To avoid churn in the git-blame this commit should be added to the `.git-blame-ignore-revs` once committed.
Reviewed By: ldionne, var-const, #libc
Spies: jloser, libcxx-commits, arichardson, arphaman, wenlei
Differential Revision: https://reviews.llvm.org/D143962
2023-02-15 16:52:25 +01:00
Igor Zhukov
0a0d58f546
[libc++] <algorithm>
: ranges::minmax
should dereference iterators only once
...
Reviewed By: philnik, #libc
Differential Revision: https://reviews.llvm.org/D142864
2023-02-15 06:02:12 +07:00
Nikolas Klauser
934650b24f
[libc++] Add [[clang::lifetimebound]] to min/max algorithms
...
Reviewed By: Mordante, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D142608
2023-01-30 18:01:46 +01:00
Nikolas Klauser
660b243120
[libc++] Add [[nodiscard]] extensions to ranges algorithms
...
This mirrors what we have done in the classic algorithms
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D137186
2022-11-05 16:38:46 +01:00
Nikolas Klauser
d5e26775d0
[libc++] Granularize the rest of memory
...
Reviewed By: ldionne, #libc
Spies: vitalybuka, paulkirth, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D132790
2022-09-05 12:36:41 +02:00
Vitaly Buka
bc8fd9c633
Revert "[libc++] Granularize the rest of memory"
...
Breaks buildbots.
This reverts commit 30adaa730c4768b5eb06719c808b2884fcf53cf3.
2022-09-02 19:42:49 -07:00
Nikolas Klauser
30adaa730c
[libc++] Granularize the rest of memory
...
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D132790
2022-09-02 21:42:41 +02:00
Louis Dionne
b8cb1dc9ea
[libc++] Make <ranges> non-experimental
...
When we ship LLVM 16, <ranges> won't be considered experimental anymore.
We might as well do this sooner rather than later.
Differential Revision: https://reviews.llvm.org/D132151
2022-08-18 16:59:58 -04:00
Louis Dionne
a2c1603206
[libc++] Add a few missing min/max macro push/pop
...
Also, improve the test for nasty macros to define min and max, so this
will be caught in the future.
Differential Revision: https://reviews.llvm.org/D128655
2022-06-27 12:57:39 -04:00
Nikolas Klauser
58d9ab70ae
[libc++][ranges] Implement ranges::minmax and ranges::minmax_element
...
Reviewed By: var-const, #libc, ldionne
Spies: sstefan1, ldionne, BRevzin, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D120637
2022-04-14 15:37:22 +02:00