22 Commits

Author SHA1 Message Date
Nikolas Klauser
0fb76bae6b
Reapply "[libc++] Simplify the implementation of std::sort a bit (#104902)" (#114023)
This reverts commit ef44e4659878f2. The patch was originally reverted
because it was
deemed to introduce a performance regression for small inputs, however
it also fixed
a previous performance regression for larger inputs. So overall, this
patch is desirable.
2024-10-30 11:51:55 +01:00
Louis Dionne
ef44e46598 Revert "[libc++] Simplify the implementation of std::sort a bit (#104902)"
This reverts commit d4ffccfce103b01401b8a9222e373f2d404f8439, which
caused a performance regression that needs to be investigated further.
2024-09-20 09:48:58 -04:00
Louis Dionne
d6832a611a
[libc++][modules] Modularize <cstddef> (#107254)
Many headers include `<cstddef>` just for size_t, and pulling in
additional content (e.g. the traits used for std::byte) is unnecessary.
To solve this problem, this patch splits up `<cstddef>` into
subcomponents so that headers can include only the parts that they
actually require.

This has the added benefit of making the modules build a lot stricter
with respect to IWYU, and also providing a canonical location where we
define `std::size_t` and friends (which were previously defined in
multiple headers like `<cstddef>` and `<ctime>`).

After this patch, there's still many places in the codebase where we
include `<cstddef>` when `<__cstddef/size_t.h>` would be sufficient.
This patch focuses on removing `<cstddef>` includes from __type_traits
to make these headers non-circular with `<cstddef>`. Additional
refactorings can be tackled separately.
2024-09-05 08:28:33 -04:00
Nikolas Klauser
d4ffccfce1
[libc++] Simplify the implementation of std::sort a bit (#104902)
This does a few things to canonicalize the library a bit. Specifically
- use `__desugars_to_v` instead of the custom `__is_simple_comparator`
- make `__use_branchless_sort` an inline variable
- remove the `_maybe_branchless` versions of the `__sortN` functions and
overload based on whether we can do branchless sorting instead.
2024-08-27 16:54:05 +02:00
Nikolas Klauser
d07fdf9779
[libc++] Optimize lexicographical_compare (#65279)
If the comparison operation is equivalent to < and that is a total
order, we know that we can use equality comparison on that type instead
to extract some information. Furthermore, if equality comparison on that
type is trivial, the user can't observe that we're calling it. So
instead of using the user-provided total order, we use std::mismatch,
which uses equality comparison (and is vertorized). Additionally, if the
type is trivially lexicographically comparable, we can go one step
further and use std::memcmp directly instead of calling std::mismatch.

Benchmarks:
```
-------------------------------------------------------------------------------------
Benchmark                                                         old             new
-------------------------------------------------------------------------------------
bm_lexicographical_compare<unsigned char>/1                   1.17 ns         2.34 ns
bm_lexicographical_compare<unsigned char>/2                   1.64 ns         2.57 ns
bm_lexicographical_compare<unsigned char>/3                   2.23 ns         2.58 ns
bm_lexicographical_compare<unsigned char>/4                   2.82 ns         2.57 ns
bm_lexicographical_compare<unsigned char>/5                   3.34 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/6                   3.94 ns         2.21 ns
bm_lexicographical_compare<unsigned char>/7                   4.56 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/8                   5.25 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/16                  9.88 ns         2.11 ns
bm_lexicographical_compare<unsigned char>/64                  38.9 ns         2.36 ns
bm_lexicographical_compare<unsigned char>/512                  317 ns         6.54 ns
bm_lexicographical_compare<unsigned char>/4096                2517 ns         41.4 ns
bm_lexicographical_compare<unsigned char>/32768              20052 ns          488 ns
bm_lexicographical_compare<unsigned char>/262144            159579 ns         4409 ns
bm_lexicographical_compare<unsigned char>/1048576           640456 ns        20342 ns
bm_lexicographical_compare<signed char>/1                     1.18 ns         2.37 ns
bm_lexicographical_compare<signed char>/2                     1.65 ns         2.60 ns
bm_lexicographical_compare<signed char>/3                     2.23 ns         2.83 ns
bm_lexicographical_compare<signed char>/4                     2.81 ns         3.06 ns
bm_lexicographical_compare<signed char>/5                     3.35 ns         3.30 ns
bm_lexicographical_compare<signed char>/6                     3.90 ns         3.99 ns
bm_lexicographical_compare<signed char>/7                     4.56 ns         3.78 ns
bm_lexicographical_compare<signed char>/8                     5.20 ns         4.02 ns
bm_lexicographical_compare<signed char>/16                    9.80 ns         6.21 ns
bm_lexicographical_compare<signed char>/64                    39.0 ns         3.16 ns
bm_lexicographical_compare<signed char>/512                    318 ns         7.58 ns
bm_lexicographical_compare<signed char>/4096                  2514 ns         47.4 ns
bm_lexicographical_compare<signed char>/32768                20096 ns          504 ns
bm_lexicographical_compare<signed char>/262144              156617 ns         4146 ns
bm_lexicographical_compare<signed char>/1048576             624265 ns        19810 ns
bm_lexicographical_compare<int>/1                             1.15 ns         2.12 ns
bm_lexicographical_compare<int>/2                             1.60 ns         2.36 ns
bm_lexicographical_compare<int>/3                             2.21 ns         2.59 ns
bm_lexicographical_compare<int>/4                             2.74 ns         2.83 ns
bm_lexicographical_compare<int>/5                             3.26 ns         3.06 ns
bm_lexicographical_compare<int>/6                             3.81 ns         4.53 ns
bm_lexicographical_compare<int>/7                             4.41 ns         4.72 ns
bm_lexicographical_compare<int>/8                             5.08 ns         2.36 ns
bm_lexicographical_compare<int>/16                            9.54 ns         3.08 ns
bm_lexicographical_compare<int>/64                            37.8 ns         4.71 ns
bm_lexicographical_compare<int>/512                            309 ns         24.6 ns
bm_lexicographical_compare<int>/4096                          2422 ns          204 ns
bm_lexicographical_compare<int>/32768                        19362 ns         1947 ns
bm_lexicographical_compare<int>/262144                      155727 ns        19793 ns
bm_lexicographical_compare<int>/1048576                     623614 ns        80180 ns
bm_ranges_lexicographical_compare<unsigned char>/1            1.07 ns         2.35 ns
bm_ranges_lexicographical_compare<unsigned char>/2            1.72 ns         2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/3            2.46 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/4            3.17 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/5            3.86 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/6            4.55 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/7            5.25 ns         2.12 ns
bm_ranges_lexicographical_compare<unsigned char>/8            5.95 ns         2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/16           11.7 ns         2.13 ns
bm_ranges_lexicographical_compare<unsigned char>/64           45.5 ns         2.36 ns
bm_ranges_lexicographical_compare<unsigned char>/512           366 ns         6.35 ns
bm_ranges_lexicographical_compare<unsigned char>/4096         2886 ns         40.9 ns
bm_ranges_lexicographical_compare<unsigned char>/32768       23054 ns          489 ns
bm_ranges_lexicographical_compare<unsigned char>/262144     185302 ns         4339 ns
bm_ranges_lexicographical_compare<unsigned char>/1048576    741576 ns        19430 ns
bm_ranges_lexicographical_compare<signed char>/1              1.10 ns         2.12 ns
bm_ranges_lexicographical_compare<signed char>/2              1.66 ns         2.35 ns
bm_ranges_lexicographical_compare<signed char>/3              2.23 ns         2.58 ns
bm_ranges_lexicographical_compare<signed char>/4              2.82 ns         2.82 ns
bm_ranges_lexicographical_compare<signed char>/5              3.34 ns         3.06 ns
bm_ranges_lexicographical_compare<signed char>/6              3.92 ns         3.99 ns
bm_ranges_lexicographical_compare<signed char>/7              4.64 ns         4.10 ns
bm_ranges_lexicographical_compare<signed char>/8              5.21 ns         4.61 ns
bm_ranges_lexicographical_compare<signed char>/16             9.79 ns         7.42 ns
bm_ranges_lexicographical_compare<signed char>/64             38.9 ns         2.93 ns
bm_ranges_lexicographical_compare<signed char>/512             317 ns         7.31 ns
bm_ranges_lexicographical_compare<signed char>/4096           2500 ns         47.5 ns
bm_ranges_lexicographical_compare<signed char>/32768         19940 ns          496 ns
bm_ranges_lexicographical_compare<signed char>/262144       159166 ns         4393 ns
bm_ranges_lexicographical_compare<signed char>/1048576      638206 ns        19786 ns
bm_ranges_lexicographical_compare<int>/1                      1.10 ns         2.12 ns
bm_ranges_lexicographical_compare<int>/2                      1.64 ns         3.04 ns
bm_ranges_lexicographical_compare<int>/3                      2.23 ns         2.58 ns
bm_ranges_lexicographical_compare<int>/4                      2.81 ns         2.81 ns
bm_ranges_lexicographical_compare<int>/5                      3.35 ns         3.05 ns
bm_ranges_lexicographical_compare<int>/6                      3.94 ns         4.60 ns
bm_ranges_lexicographical_compare<int>/7                      4.60 ns         4.81 ns
bm_ranges_lexicographical_compare<int>/8                      5.19 ns         2.35 ns
bm_ranges_lexicographical_compare<int>/16                     9.85 ns         2.87 ns
bm_ranges_lexicographical_compare<int>/64                     38.9 ns         4.70 ns
bm_ranges_lexicographical_compare<int>/512                     318 ns         24.5 ns
bm_ranges_lexicographical_compare<int>/4096                   2494 ns          202 ns
bm_ranges_lexicographical_compare<int>/32768                 20000 ns         1939 ns
bm_ranges_lexicographical_compare<int>/262144               160433 ns        19730 ns
bm_ranges_lexicographical_compare<int>/1048576              642636 ns        80760 ns
```
2024-08-04 10:02:43 +02:00
Christopher Di Bella
d10dc5a06f
[libc++] Remove dedicated namespaces for ranges functions (#76543)
We originally put implementation-detail function objects into individual
namespaces for `std::ranges` without a good reason for doing so. This
practice was continued, presumably because there was prior art. Since
there's no reason to keep these namespaces, this commit removes them,
which will slightly impact binary size.

This commit does not apply to CPOs, some of which need additional work.
2024-08-01 08:54:06 -04:00
Nikolas Klauser
83bc7b5771
[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests (#87094)
This also adds a few tests that were missing.
2024-04-22 22:13:58 +02:00
Nikolas Klauser
935e699173
[libc++] Optimize ranges::minmax (#87335)
This allows Clang to vectorize the loop.
```
---------------------------------------------------------------------
Benchmark                                         old             new
---------------------------------------------------------------------
BM_std_minmax<char>/1                        0.659 ns         1.41 ns
BM_std_minmax<char>/2                         1.08 ns         2.16 ns
BM_std_minmax<char>/3                         2.16 ns         2.96 ns
BM_std_minmax<char>/4                         2.82 ns         3.81 ns
BM_std_minmax<char>/5                         3.43 ns         4.69 ns
BM_std_minmax<char>/6                         4.08 ns         5.63 ns
BM_std_minmax<char>/7                         4.75 ns         6.51 ns
BM_std_minmax<char>/8                         5.42 ns         7.41 ns
BM_std_minmax<char>/9                         6.05 ns         8.34 ns
BM_std_minmax<char>/10                        6.68 ns         9.29 ns
BM_std_minmax<char>/11                        7.47 ns         10.6 ns
BM_std_minmax<char>/12                        7.95 ns         11.4 ns
BM_std_minmax<char>/13                        8.64 ns         12.4 ns
BM_std_minmax<char>/14                        9.35 ns         13.4 ns
BM_std_minmax<char>/15                        10.1 ns         14.4 ns
BM_std_minmax<char>/16                        10.6 ns         2.25 ns
BM_std_minmax<char>/17                        11.3 ns         2.82 ns
BM_std_minmax<char>/18                        11.8 ns         3.71 ns
BM_std_minmax<char>/19                        12.6 ns         4.52 ns
BM_std_minmax<char>/20                        13.2 ns         5.47 ns
BM_std_minmax<char>/21                        14.1 ns         6.67 ns
BM_std_minmax<char>/22                        14.5 ns         7.78 ns
BM_std_minmax<char>/23                        15.1 ns         8.67 ns
BM_std_minmax<char>/24                        15.7 ns         9.68 ns
BM_std_minmax<char>/25                        16.4 ns         10.7 ns
BM_std_minmax<char>/26                        17.1 ns         11.7 ns
BM_std_minmax<char>/27                        17.8 ns         12.8 ns
BM_std_minmax<char>/28                        18.4 ns         14.1 ns
BM_std_minmax<char>/29                        19.0 ns         15.0 ns
BM_std_minmax<char>/30                        19.6 ns         16.0 ns
BM_std_minmax<char>/31                        20.2 ns         17.0 ns
BM_std_minmax<char>/32                        20.8 ns         2.46 ns
BM_std_minmax<char>/64                        41.5 ns         2.97 ns
BM_std_minmax<char>/512                        340 ns         6.05 ns
BM_std_minmax<char>/1024                       667 ns         8.83 ns
BM_std_minmax<char>/4000                      2571 ns         28.6 ns
BM_std_minmax<char>/4096                      2632 ns         25.8 ns
BM_std_minmax<char>/5500                      3554 ns         51.1 ns
BM_std_minmax<char>/64000                    41175 ns          480 ns
BM_std_minmax<char>/65536                    42039 ns          490 ns
BM_std_minmax<char>/70000                    44931 ns          528 ns
BM_std_minmax<short>/1                       0.708 ns         1.20 ns
BM_std_minmax<short>/2                        1.18 ns         1.78 ns
BM_std_minmax<short>/3                        1.98 ns         2.42 ns
BM_std_minmax<short>/4                        2.47 ns         3.05 ns
BM_std_minmax<short>/5                        3.09 ns         3.72 ns
BM_std_minmax<short>/6                        3.49 ns         4.37 ns
BM_std_minmax<short>/7                        4.24 ns         5.03 ns
BM_std_minmax<short>/8                        4.65 ns         2.12 ns
BM_std_minmax<short>/9                        5.34 ns         2.51 ns
BM_std_minmax<short>/10                       5.82 ns         3.18 ns
BM_std_minmax<short>/11                       6.36 ns         3.97 ns
BM_std_minmax<short>/12                       6.73 ns         4.68 ns
BM_std_minmax<short>/13                       7.59 ns         5.49 ns
BM_std_minmax<short>/14                       7.77 ns         6.45 ns
BM_std_minmax<short>/15                       8.54 ns         7.55 ns
BM_std_minmax<short>/16                       8.74 ns         2.38 ns
BM_std_minmax<short>/17                       9.59 ns         2.76 ns
BM_std_minmax<short>/18                       9.88 ns         3.37 ns
BM_std_minmax<short>/19                       10.7 ns         4.17 ns
BM_std_minmax<short>/20                       10.9 ns         4.88 ns
BM_std_minmax<short>/21                       12.1 ns         5.70 ns
BM_std_minmax<short>/22                       12.6 ns         6.64 ns
BM_std_minmax<short>/23                       13.5 ns         7.72 ns
BM_std_minmax<short>/24                       13.2 ns         2.87 ns
BM_std_minmax<short>/25                       14.2 ns         3.10 ns
BM_std_minmax<short>/26                       14.2 ns         3.59 ns
BM_std_minmax<short>/27                       15.4 ns         4.35 ns
BM_std_minmax<short>/28                       15.3 ns         5.10 ns
BM_std_minmax<short>/29                       16.2 ns         5.87 ns
BM_std_minmax<short>/30                       16.2 ns         6.88 ns
BM_std_minmax<short>/31                       17.0 ns         7.78 ns
BM_std_minmax<short>/32                       17.2 ns         3.45 ns
BM_std_minmax<short>/64                       34.1 ns         3.35 ns
BM_std_minmax<short>/512                       279 ns         8.37 ns
BM_std_minmax<short>/1024                      549 ns         14.2 ns
BM_std_minmax<short>/4000                     2111 ns         50.1 ns
BM_std_minmax<short>/4096                     2167 ns         47.9 ns
BM_std_minmax<short>/5500                     2895 ns         69.7 ns
BM_std_minmax<short>/64000                   33454 ns          953 ns
BM_std_minmax<short>/65536                   34474 ns          970 ns
BM_std_minmax<short>/70000                   36691 ns         1037 ns
BM_std_minmax<int>/1                         0.664 ns         1.17 ns
BM_std_minmax<int>/2                          1.11 ns         1.69 ns
BM_std_minmax<int>/3                          2.36 ns         2.29 ns
BM_std_minmax<int>/4                          2.53 ns         2.91 ns
BM_std_minmax<int>/5                          3.23 ns         3.56 ns
BM_std_minmax<int>/6                          3.56 ns         4.23 ns
BM_std_minmax<int>/7                          4.28 ns         4.91 ns
BM_std_minmax<int>/8                          4.60 ns         5.60 ns
BM_std_minmax<int>/9                          5.38 ns         6.31 ns
BM_std_minmax<int>/10                         5.69 ns         7.03 ns
BM_std_minmax<int>/11                         6.41 ns         7.70 ns
BM_std_minmax<int>/12                         6.73 ns         8.39 ns
BM_std_minmax<int>/13                         7.38 ns         9.07 ns
BM_std_minmax<int>/14                         7.74 ns         9.79 ns
BM_std_minmax<int>/15                         8.53 ns         10.5 ns
BM_std_minmax<int>/16                         8.79 ns         11.2 ns
BM_std_minmax<int>/17                         9.63 ns         12.0 ns
BM_std_minmax<int>/18                         9.84 ns         12.7 ns
BM_std_minmax<int>/19                         10.6 ns         13.5 ns
BM_std_minmax<int>/20                         11.0 ns         14.3 ns
BM_std_minmax<int>/21                         11.7 ns         15.0 ns
BM_std_minmax<int>/22                         12.0 ns         15.7 ns
BM_std_minmax<int>/23                         13.1 ns         16.5 ns
BM_std_minmax<int>/24                         13.0 ns         17.3 ns
BM_std_minmax<int>/25                         13.7 ns         17.9 ns
BM_std_minmax<int>/26                         14.0 ns         18.6 ns
BM_std_minmax<int>/27                         14.8 ns         19.4 ns
BM_std_minmax<int>/28                         15.1 ns         20.3 ns
BM_std_minmax<int>/29                         15.8 ns         20.9 ns
BM_std_minmax<int>/30                         16.1 ns         21.7 ns
BM_std_minmax<int>/31                         16.9 ns         22.5 ns
BM_std_minmax<int>/32                         17.2 ns         3.40 ns
BM_std_minmax<int>/64                         33.9 ns         4.04 ns
BM_std_minmax<int>/512                         275 ns         14.6 ns
BM_std_minmax<int>/1024                        541 ns         27.5 ns
BM_std_minmax<int>/4000                       2093 ns         96.3 ns
BM_std_minmax<int>/4096                       2146 ns         98.3 ns
BM_std_minmax<int>/5500                       2866 ns          157 ns
BM_std_minmax<int>/64000                     33619 ns         1954 ns
BM_std_minmax<int>/65536                     34252 ns         2009 ns
BM_std_minmax<int>/70000                     36618 ns         2125 ns
BM_std_minmax<long long>/1                   0.709 ns         1.19 ns
BM_std_minmax<long long>/2                    1.01 ns         1.65 ns
BM_std_minmax<long long>/3                    2.14 ns         2.21 ns
BM_std_minmax<long long>/4                    2.45 ns         2.83 ns
BM_std_minmax<long long>/5                    3.09 ns         3.47 ns
BM_std_minmax<long long>/6                    3.44 ns         4.11 ns
BM_std_minmax<long long>/7                    4.16 ns         4.79 ns
BM_std_minmax<long long>/8                    4.54 ns         5.47 ns
BM_std_minmax<long long>/9                    5.37 ns         6.20 ns
BM_std_minmax<long long>/10                   5.71 ns         6.93 ns
BM_std_minmax<long long>/11                   6.00 ns         7.60 ns
BM_std_minmax<long long>/12                   6.43 ns         8.27 ns
BM_std_minmax<long long>/13                   7.01 ns         8.94 ns
BM_std_minmax<long long>/14                   7.45 ns         9.65 ns
BM_std_minmax<long long>/15                   8.16 ns         10.4 ns
BM_std_minmax<long long>/16                   8.46 ns         5.22 ns
BM_std_minmax<long long>/17                   9.16 ns         5.22 ns
BM_std_minmax<long long>/18                   9.53 ns         5.52 ns
BM_std_minmax<long long>/19                   10.2 ns         6.02 ns
BM_std_minmax<long long>/20                   10.5 ns         6.89 ns
BM_std_minmax<long long>/21                   11.3 ns         7.83 ns
BM_std_minmax<long long>/22                   11.6 ns         8.59 ns
BM_std_minmax<long long>/23                   12.3 ns         9.91 ns
BM_std_minmax<long long>/24                   12.6 ns         10.1 ns
BM_std_minmax<long long>/25                   13.2 ns         12.0 ns
BM_std_minmax<long long>/26                   13.6 ns         13.5 ns
BM_std_minmax<long long>/27                   14.2 ns         14.8 ns
BM_std_minmax<long long>/28                   14.7 ns         15.9 ns
BM_std_minmax<long long>/29                   15.3 ns         16.6 ns
BM_std_minmax<long long>/30                   15.8 ns         17.3 ns
BM_std_minmax<long long>/31                   16.3 ns         18.2 ns
BM_std_minmax<long long>/32                   16.7 ns         7.18 ns
BM_std_minmax<long long>/64                   33.1 ns         11.5 ns
BM_std_minmax<long long>/512                   268 ns         71.0 ns
BM_std_minmax<long long>/1024                  532 ns          138 ns
BM_std_minmax<long long>/4000                 2056 ns          533 ns
BM_std_minmax<long long>/4096                 2112 ns          539 ns
BM_std_minmax<long long>/5500                 2823 ns          749 ns
BM_std_minmax<long long>/64000               32956 ns         8590 ns
BM_std_minmax<long long>/65536               33795 ns         8791 ns
BM_std_minmax<long long>/70000               36084 ns         9442 ns
BM_std_minmax<unsigned char>/1               0.714 ns         1.41 ns
BM_std_minmax<unsigned char>/2               0.955 ns         1.96 ns
BM_std_minmax<unsigned char>/3                1.90 ns         2.63 ns
BM_std_minmax<unsigned char>/4                2.40 ns         3.34 ns
BM_std_minmax<unsigned char>/5                2.87 ns         4.10 ns
BM_std_minmax<unsigned char>/6                3.47 ns         4.88 ns
BM_std_minmax<unsigned char>/7                4.04 ns         5.66 ns
BM_std_minmax<unsigned char>/8                4.65 ns         6.45 ns
BM_std_minmax<unsigned char>/9                5.18 ns         7.24 ns
BM_std_minmax<unsigned char>/10               5.80 ns         8.05 ns
BM_std_minmax<unsigned char>/11               6.24 ns         8.86 ns
BM_std_minmax<unsigned char>/12               6.78 ns         9.70 ns
BM_std_minmax<unsigned char>/13               7.30 ns         10.6 ns
BM_std_minmax<unsigned char>/14               7.86 ns         11.4 ns
BM_std_minmax<unsigned char>/15               8.46 ns         12.3 ns
BM_std_minmax<unsigned char>/16               9.00 ns         2.12 ns
BM_std_minmax<unsigned char>/17               9.58 ns         2.83 ns
BM_std_minmax<unsigned char>/18               10.1 ns         3.37 ns
BM_std_minmax<unsigned char>/19               10.7 ns         4.11 ns
BM_std_minmax<unsigned char>/20               11.2 ns         4.85 ns
BM_std_minmax<unsigned char>/21               11.9 ns         5.69 ns
BM_std_minmax<unsigned char>/22               12.3 ns         6.77 ns
BM_std_minmax<unsigned char>/23               13.1 ns         7.56 ns
BM_std_minmax<unsigned char>/24               13.5 ns         8.40 ns
BM_std_minmax<unsigned char>/25               14.2 ns         9.30 ns
BM_std_minmax<unsigned char>/26               14.4 ns         10.1 ns
BM_std_minmax<unsigned char>/27               15.0 ns         11.1 ns
BM_std_minmax<unsigned char>/28               15.3 ns         11.9 ns
BM_std_minmax<unsigned char>/29               16.2 ns         12.9 ns
BM_std_minmax<unsigned char>/30               16.5 ns         13.9 ns
BM_std_minmax<unsigned char>/31               17.2 ns         14.8 ns
BM_std_minmax<unsigned char>/32               17.6 ns         2.36 ns
BM_std_minmax<unsigned char>/64               35.6 ns         3.21 ns
BM_std_minmax<unsigned char>/512               288 ns         6.00 ns
BM_std_minmax<unsigned char>/1024              573 ns         8.80 ns
BM_std_minmax<unsigned char>/4000             2222 ns         28.6 ns
BM_std_minmax<unsigned char>/4096             2265 ns         25.9 ns
BM_std_minmax<unsigned char>/5500             3047 ns         48.8 ns
BM_std_minmax<unsigned char>/64000           35059 ns          480 ns
BM_std_minmax<unsigned char>/65536           35941 ns          491 ns
BM_std_minmax<unsigned char>/70000           38922 ns          525 ns
BM_std_minmax<unsigned short>/1              0.711 ns         1.18 ns
BM_std_minmax<unsigned short>/2              0.957 ns         1.65 ns
BM_std_minmax<unsigned short>/3               2.13 ns         2.21 ns
BM_std_minmax<unsigned short>/4               2.14 ns         2.78 ns
BM_std_minmax<unsigned short>/5               3.06 ns         3.29 ns
BM_std_minmax<unsigned short>/6               2.89 ns         3.87 ns
BM_std_minmax<unsigned short>/7               3.80 ns         4.55 ns
BM_std_minmax<unsigned short>/8               3.68 ns         2.02 ns
BM_std_minmax<unsigned short>/9               4.53 ns         2.40 ns
BM_std_minmax<unsigned short>/10              4.60 ns         2.94 ns
BM_std_minmax<unsigned short>/11              5.67 ns         3.67 ns
BM_std_minmax<unsigned short>/12              5.39 ns         4.22 ns
BM_std_minmax<unsigned short>/13              6.58 ns         4.78 ns
BM_std_minmax<unsigned short>/14              6.33 ns         5.54 ns
BM_std_minmax<unsigned short>/15              7.34 ns         6.30 ns
BM_std_minmax<unsigned short>/16              7.17 ns         2.25 ns
BM_std_minmax<unsigned short>/17              8.19 ns         2.61 ns
BM_std_minmax<unsigned short>/18              8.02 ns         3.19 ns
BM_std_minmax<unsigned short>/19              9.03 ns         3.72 ns
BM_std_minmax<unsigned short>/20              8.89 ns         4.36 ns
BM_std_minmax<unsigned short>/21              9.77 ns         5.10 ns
BM_std_minmax<unsigned short>/22              9.70 ns         5.55 ns
BM_std_minmax<unsigned short>/23              10.8 ns         6.29 ns
BM_std_minmax<unsigned short>/24              10.6 ns         2.41 ns
BM_std_minmax<unsigned short>/25              11.6 ns         2.75 ns
BM_std_minmax<unsigned short>/26              11.4 ns         3.26 ns
BM_std_minmax<unsigned short>/27              12.4 ns         3.86 ns
BM_std_minmax<unsigned short>/28              12.3 ns         4.45 ns
BM_std_minmax<unsigned short>/29              13.2 ns         5.07 ns
BM_std_minmax<unsigned short>/30              13.1 ns         5.77 ns
BM_std_minmax<unsigned short>/31              13.9 ns         6.65 ns
BM_std_minmax<unsigned short>/32              13.9 ns         2.72 ns
BM_std_minmax<unsigned short>/64              27.8 ns         3.25 ns
BM_std_minmax<unsigned short>/512              220 ns         8.30 ns
BM_std_minmax<unsigned short>/1024             435 ns         14.1 ns
BM_std_minmax<unsigned short>/4000            1703 ns         49.8 ns
BM_std_minmax<unsigned short>/4096            1746 ns         47.9 ns
BM_std_minmax<unsigned short>/5500            2350 ns         69.9 ns
BM_std_minmax<unsigned short>/64000          27388 ns          953 ns
BM_std_minmax<unsigned short>/65536          28040 ns          975 ns
BM_std_minmax<unsigned short>/70000          29967 ns         1040 ns
BM_std_minmax<unsigned int>/1                0.712 ns         1.18 ns
BM_std_minmax<unsigned int>/2                0.965 ns         1.65 ns
BM_std_minmax<unsigned int>/3                 2.13 ns         2.14 ns
BM_std_minmax<unsigned int>/4                 2.09 ns         2.64 ns
BM_std_minmax<unsigned int>/5                 3.02 ns         3.21 ns
BM_std_minmax<unsigned int>/6                 2.94 ns         3.81 ns
BM_std_minmax<unsigned int>/7                 3.91 ns         4.38 ns
BM_std_minmax<unsigned int>/8                 3.75 ns         4.93 ns
BM_std_minmax<unsigned int>/9                 4.71 ns         5.60 ns
BM_std_minmax<unsigned int>/10                4.59 ns         6.26 ns
BM_std_minmax<unsigned int>/11                5.57 ns         6.80 ns
BM_std_minmax<unsigned int>/12                5.43 ns         7.47 ns
BM_std_minmax<unsigned int>/13                6.45 ns         8.10 ns
BM_std_minmax<unsigned int>/14                6.32 ns         8.69 ns
BM_std_minmax<unsigned int>/15                7.29 ns         9.37 ns
BM_std_minmax<unsigned int>/16                7.12 ns         9.99 ns
BM_std_minmax<unsigned int>/17                8.24 ns         10.6 ns
BM_std_minmax<unsigned int>/18                8.00 ns         11.2 ns
BM_std_minmax<unsigned int>/19                8.94 ns         12.0 ns
BM_std_minmax<unsigned int>/20                8.91 ns         12.6 ns
BM_std_minmax<unsigned int>/21                9.73 ns         17.2 ns
BM_std_minmax<unsigned int>/22                9.75 ns         13.8 ns
BM_std_minmax<unsigned int>/23                10.6 ns         14.5 ns
BM_std_minmax<unsigned int>/24                10.6 ns         15.1 ns
BM_std_minmax<unsigned int>/25                11.5 ns         15.7 ns
BM_std_minmax<unsigned int>/26                11.4 ns         16.3 ns
BM_std_minmax<unsigned int>/27                12.3 ns         17.0 ns
BM_std_minmax<unsigned int>/28                12.3 ns         17.6 ns
BM_std_minmax<unsigned int>/29                13.2 ns         18.3 ns
BM_std_minmax<unsigned int>/30                13.2 ns         19.0 ns
BM_std_minmax<unsigned int>/31                14.0 ns         19.6 ns
BM_std_minmax<unsigned int>/32                14.0 ns         3.39 ns
BM_std_minmax<unsigned int>/64                27.6 ns         4.05 ns
BM_std_minmax<unsigned int>/512                221 ns         14.2 ns
BM_std_minmax<unsigned int>/1024               439 ns         25.5 ns
BM_std_minmax<unsigned int>/4000              1720 ns         96.3 ns
BM_std_minmax<unsigned int>/4096              1762 ns         97.8 ns
BM_std_minmax<unsigned int>/5500              2364 ns          146 ns
BM_std_minmax<unsigned int>/64000            27874 ns         1905 ns
BM_std_minmax<unsigned int>/65536            28012 ns         1961 ns
BM_std_minmax<unsigned int>/70000            29899 ns         2087 ns
BM_std_minmax<unsigned long long>/1          0.707 ns         1.18 ns
BM_std_minmax<unsigned long long>/2          0.909 ns         1.65 ns
BM_std_minmax<unsigned long long>/3           1.65 ns         2.70 ns
BM_std_minmax<unsigned long long>/4           1.93 ns         2.69 ns
BM_std_minmax<unsigned long long>/5           2.45 ns         3.34 ns
BM_std_minmax<unsigned long long>/6           2.78 ns         3.81 ns
BM_std_minmax<unsigned long long>/7           3.28 ns         4.43 ns
BM_std_minmax<unsigned long long>/8           3.70 ns         4.92 ns
BM_std_minmax<unsigned long long>/9           4.12 ns         5.64 ns
BM_std_minmax<unsigned long long>/10          4.44 ns         6.15 ns
BM_std_minmax<unsigned long long>/11          4.91 ns         6.81 ns
BM_std_minmax<unsigned long long>/12          5.31 ns         7.41 ns
BM_std_minmax<unsigned long long>/13          5.72 ns         7.96 ns
BM_std_minmax<unsigned long long>/14          6.05 ns         8.66 ns
BM_std_minmax<unsigned long long>/15          6.55 ns         9.37 ns
BM_std_minmax<unsigned long long>/16          6.89 ns         7.98 ns
BM_std_minmax<unsigned long long>/17          7.34 ns         8.13 ns
BM_std_minmax<unsigned long long>/18          7.73 ns         8.42 ns
BM_std_minmax<unsigned long long>/19          8.26 ns         8.63 ns
BM_std_minmax<unsigned long long>/20          8.54 ns         8.96 ns
BM_std_minmax<unsigned long long>/21          9.14 ns         9.37 ns
BM_std_minmax<unsigned long long>/22          9.39 ns         9.67 ns
BM_std_minmax<unsigned long long>/23          10.1 ns         10.1 ns
BM_std_minmax<unsigned long long>/24          10.4 ns         10.6 ns
BM_std_minmax<unsigned long long>/25          11.0 ns         11.3 ns
BM_std_minmax<unsigned long long>/26          11.3 ns         12.1 ns
BM_std_minmax<unsigned long long>/27          11.8 ns         14.2 ns
BM_std_minmax<unsigned long long>/28          12.1 ns         15.8 ns
BM_std_minmax<unsigned long long>/29          12.6 ns         17.4 ns
BM_std_minmax<unsigned long long>/30          13.1 ns         18.1 ns
BM_std_minmax<unsigned long long>/31          13.4 ns         18.8 ns
BM_std_minmax<unsigned long long>/32          13.8 ns         10.4 ns
BM_std_minmax<unsigned long long>/64          27.3 ns         15.5 ns
BM_std_minmax<unsigned long long>/512          222 ns         80.6 ns
BM_std_minmax<unsigned long long>/1024         443 ns          156 ns
BM_std_minmax<unsigned long long>/4000        1731 ns          591 ns
BM_std_minmax<unsigned long long>/4096        1752 ns          609 ns
BM_std_minmax<unsigned long long>/5500        2340 ns          819 ns
BM_std_minmax<unsigned long long>/64000      27166 ns         9652 ns
BM_std_minmax<unsigned long long>/65536      27869 ns         9876 ns
BM_std_minmax<unsigned long long>/70000      29920 ns        10680 ns
```
2024-04-06 17:22:07 +02:00
Konstantin Varlamov
1638657dce
[libc++][hardening] Categorize more 'valid-element-access' checks. (#71620) 2023-12-20 17:24:48 -08:00
varconst
cd0ad4216c [libc++][hardening][NFC] Introduce _LIBCPP_ASSERT_UNCATEGORIZED.
Replace most uses of `_LIBCPP_ASSERT` with
`_LIBCPP_ASSERT_UNCATEGORIZED`.

This is done as a prerequisite to introducing hardened mode to libc++.
The idea is to make enabling assertions an opt-in with (somewhat)
fine-grained controls over which categories of assertions are enabled.
The vast majority of assertions are currently uncategorized; the new
macro will allow turning on `_LIBCPP_ASSERT` (the underlying mechanism
for all kinds of assertions) without enabling all the uncategorized
assertions (in the future; this patch preserves the current behavior).

Differential Revision: https://reviews.llvm.org/D153816
2023-06-28 15:10:31 -07:00
Louis Dionne
5aa03b648b [libc++][NFC] Apply clang-format on large parts of the code base
This commit does a pass of clang-format over files in libc++ that
don't require major changes to conform to our style guide, or for
which we're not overly concerned about conflicting with in-flight
patches or hindering the git blame.

This roughly covers:
- benchmarks
- range algorithms
- concepts
- type traits

I did a manual verification of all the changes, and in particular I
applied clang-format on/off annotations in a few places where the
result was less readable after than before. This was not necessary
in a lot of places, however I did find that clang-format had pretty
bad taste when it comes to formatting concepts.

Differential Revision: https://reviews.llvm.org/D153140
2023-06-19 11:19:51 -04:00
Ian Anderson
79702f7f59 [libc++][Modules] Add missing includes and exports
Several headers are missing includes for things they use.

type_traits.is_enum needs to export type_traits.integral_constant so that clients can access its `value` member without explicitly including __type_traits/integral_constant.h themselves.

Make `subrange_fwd` a peer submodule to `subrange` rather than a submodule of it, and have `subrange` export `subrange_fwd`. That will make it easier to programmatically generate modules for the private detail headers, and it will accomplish the same effect that __ranges/subrange.h will make subrange_kind visible.

Reviewed By: Mordante, #libc

Differential Revision: https://reviews.llvm.org/D150055
2023-05-07 19:54:49 -05:00
Nikolas Klauser
4f15267d3d [libc++][NFC] Replace _LIBCPP_STD_VER > x with _LIBCPP_STD_VER >= x
This change is almost fully mechanical. The only interesting change is in `generate_feature_test_macro_components.py` to generate `_LIBCPP_STD_VER >=` instead. To avoid churn in the git-blame this commit should be added to the `.git-blame-ignore-revs` once committed.

Reviewed By: ldionne, var-const, #libc

Spies: jloser, libcxx-commits, arichardson, arphaman, wenlei

Differential Revision: https://reviews.llvm.org/D143962
2023-02-15 16:52:25 +01:00
Igor Zhukov
0a0d58f546 [libc++] <algorithm>: ranges::minmax should dereference iterators only once
Reviewed By: philnik, #libc

Differential Revision: https://reviews.llvm.org/D142864
2023-02-15 06:02:12 +07:00
Nikolas Klauser
934650b24f [libc++] Add [[clang::lifetimebound]] to min/max algorithms
Reviewed By: Mordante, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D142608
2023-01-30 18:01:46 +01:00
Nikolas Klauser
660b243120 [libc++] Add [[nodiscard]] extensions to ranges algorithms
This mirrors what we have done in the classic algorithms

Reviewed By: ldionne, #libc

Spies: libcxx-commits

Differential Revision: https://reviews.llvm.org/D137186
2022-11-05 16:38:46 +01:00
Nikolas Klauser
d5e26775d0 [libc++] Granularize the rest of memory
Reviewed By: ldionne, #libc

Spies: vitalybuka, paulkirth, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D132790
2022-09-05 12:36:41 +02:00
Vitaly Buka
bc8fd9c633 Revert "[libc++] Granularize the rest of memory"
Breaks buildbots.

This reverts commit 30adaa730c4768b5eb06719c808b2884fcf53cf3.
2022-09-02 19:42:49 -07:00
Nikolas Klauser
30adaa730c [libc++] Granularize the rest of memory
Reviewed By: ldionne, #libc

Spies: libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D132790
2022-09-02 21:42:41 +02:00
Louis Dionne
b8cb1dc9ea [libc++] Make <ranges> non-experimental
When we ship LLVM 16, <ranges> won't be considered experimental anymore.
We might as well do this sooner rather than later.

Differential Revision: https://reviews.llvm.org/D132151
2022-08-18 16:59:58 -04:00
Louis Dionne
a2c1603206 [libc++] Add a few missing min/max macro push/pop
Also, improve the test for nasty macros to define min and max, so this
will be caught in the future.

Differential Revision: https://reviews.llvm.org/D128655
2022-06-27 12:57:39 -04:00
Nikolas Klauser
58d9ab70ae [libc++][ranges] Implement ranges::minmax and ranges::minmax_element
Reviewed By: var-const, #libc, ldionne

Spies: sstefan1, ldionne, BRevzin, libcxx-commits, mgorny

Differential Revision: https://reviews.llvm.org/D120637
2022-04-14 15:37:22 +02:00