Louis Dionne
d423d80e56
[libc++][pstl] Promote CPU backends to top-level backends ( #88968 )
...
This patch removes the two-level backend dispatching mechanism we had in
the PSTL. Instead of selecting both a PSTL backend and a PSTL CPU
backend, we now only select a top-level PSTL backend. This greatly
simplifies the PSTL configuration layer.
While this patch technically removes some flexibility from the PSTL
configuration mechanism because CPU backends are not considered
separately, it opens the door to a much more powerful configuration
mechanism based on chained backends in a follow-up patch.
This is a step towards overhauling the PSTL dispatching mechanism.
2024-04-17 13:36:53 -04:00
Louis Dionne
d57907d0b4
[libc++] Add missing iterator requirement checks in the PSTL ( #88127 )
...
Also add tests for those, and add a few missing requirements to testing
iterators in the test suite.
2024-04-17 08:21:48 -04:00
Louis Dionne
5b811562a5
[libc++] Rename __cpu_traits functions ( #88741 )
...
Functions inside __cpu_traits were needlessly prefixed with __parallel,
which doesn't serve a real purpose anymore now that they are inside a
traits class.
2024-04-16 10:33:39 +02:00
Louis Dionne
a3ce29f7bb
[libc++][PSTL] Introduce cpu traits ( #88134 )
...
Currently, CPU backends in the PSTL are created by defining functions
in the __par_backend namespace. Then, the PSTL includes the CPU backend
that gets configured via CMake and gets those definitions.
This prevents CPU backends from easily co-existing and is a bit
confusing.
To solve this problem, this patch introduces the notion of __cpu_traits,
which is a cheap encapsulation of the basis operations required to
implement a CPU-based PSTL. Different backends can now define their own
tag and coexist, and the CPU-based PSTL will simply use __cpu_traits to
dispatch to the right implementation of e.g. __for_each.
Note that this patch doesn't change the actual implementation of the
backends in any way; it only modifies how that implementation is
accessed to implement PSTL algorithms.
This patch is a step towards #88131 .
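The traits-based dispatch described above can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration: the tag names, the unprefixed `cpu_traits` template, the single `for_each` basis operation, and the `fill_with` algorithm are assumptions, not libc++'s actual declarations. The point is only that each backend specializes one traits class for its own tag, so several backends can coexist and a generic algorithm picks one through the tag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical backend tags; the names are illustrative, not libc++'s.
struct serial_backend_tag {};
struct chunked_backend_tag {};

// The primary template is left undefined: each backend specializes it.
template <class Backend>
struct cpu_traits;

template <>
struct cpu_traits<serial_backend_tag> {
  // Basis operation: hand the whole range [first, last) to f in one call.
  template <class It, class F>
  static void for_each(It first, It last, F f) {
    f(first, last);
  }
};

template <>
struct cpu_traits<chunked_backend_tag> {
  // Basis operation: hand the range to f in chunks of at most 4 elements.
  // A real backend would give chunks to worker threads; this sketch runs
  // them one after another.
  template <class It, class F>
  static void for_each(It first, It last, F f) {
    constexpr std::ptrdiff_t chunk = 4;
    while (last - first > chunk) {
      f(first, first + chunk);
      first += chunk;
    }
    f(first, last);
  }
};

// A generic algorithm written once against the traits, not against a backend.
template <class Backend, class It, class T>
void fill_with(It first, It last, const T& value) {
  cpu_traits<Backend>::for_each(first, last, [&](It b, It e) {
    for (; b != e; ++b)
      *b = value;
  });
}

// Small helper that runs the same algorithm through either backend.
inline std::vector<int> filled(std::size_t n, int value, bool chunked) {
  std::vector<int> v(n);
  if (chunked)
    fill_with<chunked_backend_tag>(v.begin(), v.end(), value);
  else
    fill_with<serial_backend_tag>(v.begin(), v.end(), value);
  return v;
}
```

Both backends produce the same result for `filled(10, 7, true)` and `filled(10, 7, false)`; only the way the work is partitioned differs.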
2024-04-15 10:30:00 -04:00
Mark de Wever
1a895bd95e
[libc++] Marks a variable const. ( #88562 )
...
This removes a TODO from the code base.
2024-04-13 13:45:59 +02:00
Mark de Wever
0ad663ead1
[libc++] Removes Clang-16 support. ( #87810 )
...
With the release of Clang-18 we no longer officially support Clang-16.
2024-04-10 17:51:02 +02:00
Nikolas Klauser
935e699173
[libc++] Optimize ranges::minmax ( #87335 )
...
This allows Clang to vectorize the loop.
```
---------------------------------------------------------------------
Benchmark old new
---------------------------------------------------------------------
BM_std_minmax<char>/1 0.659 ns 1.41 ns
BM_std_minmax<char>/2 1.08 ns 2.16 ns
BM_std_minmax<char>/3 2.16 ns 2.96 ns
BM_std_minmax<char>/4 2.82 ns 3.81 ns
BM_std_minmax<char>/5 3.43 ns 4.69 ns
BM_std_minmax<char>/6 4.08 ns 5.63 ns
BM_std_minmax<char>/7 4.75 ns 6.51 ns
BM_std_minmax<char>/8 5.42 ns 7.41 ns
BM_std_minmax<char>/9 6.05 ns 8.34 ns
BM_std_minmax<char>/10 6.68 ns 9.29 ns
BM_std_minmax<char>/11 7.47 ns 10.6 ns
BM_std_minmax<char>/12 7.95 ns 11.4 ns
BM_std_minmax<char>/13 8.64 ns 12.4 ns
BM_std_minmax<char>/14 9.35 ns 13.4 ns
BM_std_minmax<char>/15 10.1 ns 14.4 ns
BM_std_minmax<char>/16 10.6 ns 2.25 ns
BM_std_minmax<char>/17 11.3 ns 2.82 ns
BM_std_minmax<char>/18 11.8 ns 3.71 ns
BM_std_minmax<char>/19 12.6 ns 4.52 ns
BM_std_minmax<char>/20 13.2 ns 5.47 ns
BM_std_minmax<char>/21 14.1 ns 6.67 ns
BM_std_minmax<char>/22 14.5 ns 7.78 ns
BM_std_minmax<char>/23 15.1 ns 8.67 ns
BM_std_minmax<char>/24 15.7 ns 9.68 ns
BM_std_minmax<char>/25 16.4 ns 10.7 ns
BM_std_minmax<char>/26 17.1 ns 11.7 ns
BM_std_minmax<char>/27 17.8 ns 12.8 ns
BM_std_minmax<char>/28 18.4 ns 14.1 ns
BM_std_minmax<char>/29 19.0 ns 15.0 ns
BM_std_minmax<char>/30 19.6 ns 16.0 ns
BM_std_minmax<char>/31 20.2 ns 17.0 ns
BM_std_minmax<char>/32 20.8 ns 2.46 ns
BM_std_minmax<char>/64 41.5 ns 2.97 ns
BM_std_minmax<char>/512 340 ns 6.05 ns
BM_std_minmax<char>/1024 667 ns 8.83 ns
BM_std_minmax<char>/4000 2571 ns 28.6 ns
BM_std_minmax<char>/4096 2632 ns 25.8 ns
BM_std_minmax<char>/5500 3554 ns 51.1 ns
BM_std_minmax<char>/64000 41175 ns 480 ns
BM_std_minmax<char>/65536 42039 ns 490 ns
BM_std_minmax<char>/70000 44931 ns 528 ns
BM_std_minmax<short>/1 0.708 ns 1.20 ns
BM_std_minmax<short>/2 1.18 ns 1.78 ns
BM_std_minmax<short>/3 1.98 ns 2.42 ns
BM_std_minmax<short>/4 2.47 ns 3.05 ns
BM_std_minmax<short>/5 3.09 ns 3.72 ns
BM_std_minmax<short>/6 3.49 ns 4.37 ns
BM_std_minmax<short>/7 4.24 ns 5.03 ns
BM_std_minmax<short>/8 4.65 ns 2.12 ns
BM_std_minmax<short>/9 5.34 ns 2.51 ns
BM_std_minmax<short>/10 5.82 ns 3.18 ns
BM_std_minmax<short>/11 6.36 ns 3.97 ns
BM_std_minmax<short>/12 6.73 ns 4.68 ns
BM_std_minmax<short>/13 7.59 ns 5.49 ns
BM_std_minmax<short>/14 7.77 ns 6.45 ns
BM_std_minmax<short>/15 8.54 ns 7.55 ns
BM_std_minmax<short>/16 8.74 ns 2.38 ns
BM_std_minmax<short>/17 9.59 ns 2.76 ns
BM_std_minmax<short>/18 9.88 ns 3.37 ns
BM_std_minmax<short>/19 10.7 ns 4.17 ns
BM_std_minmax<short>/20 10.9 ns 4.88 ns
BM_std_minmax<short>/21 12.1 ns 5.70 ns
BM_std_minmax<short>/22 12.6 ns 6.64 ns
BM_std_minmax<short>/23 13.5 ns 7.72 ns
BM_std_minmax<short>/24 13.2 ns 2.87 ns
BM_std_minmax<short>/25 14.2 ns 3.10 ns
BM_std_minmax<short>/26 14.2 ns 3.59 ns
BM_std_minmax<short>/27 15.4 ns 4.35 ns
BM_std_minmax<short>/28 15.3 ns 5.10 ns
BM_std_minmax<short>/29 16.2 ns 5.87 ns
BM_std_minmax<short>/30 16.2 ns 6.88 ns
BM_std_minmax<short>/31 17.0 ns 7.78 ns
BM_std_minmax<short>/32 17.2 ns 3.45 ns
BM_std_minmax<short>/64 34.1 ns 3.35 ns
BM_std_minmax<short>/512 279 ns 8.37 ns
BM_std_minmax<short>/1024 549 ns 14.2 ns
BM_std_minmax<short>/4000 2111 ns 50.1 ns
BM_std_minmax<short>/4096 2167 ns 47.9 ns
BM_std_minmax<short>/5500 2895 ns 69.7 ns
BM_std_minmax<short>/64000 33454 ns 953 ns
BM_std_minmax<short>/65536 34474 ns 970 ns
BM_std_minmax<short>/70000 36691 ns 1037 ns
BM_std_minmax<int>/1 0.664 ns 1.17 ns
BM_std_minmax<int>/2 1.11 ns 1.69 ns
BM_std_minmax<int>/3 2.36 ns 2.29 ns
BM_std_minmax<int>/4 2.53 ns 2.91 ns
BM_std_minmax<int>/5 3.23 ns 3.56 ns
BM_std_minmax<int>/6 3.56 ns 4.23 ns
BM_std_minmax<int>/7 4.28 ns 4.91 ns
BM_std_minmax<int>/8 4.60 ns 5.60 ns
BM_std_minmax<int>/9 5.38 ns 6.31 ns
BM_std_minmax<int>/10 5.69 ns 7.03 ns
BM_std_minmax<int>/11 6.41 ns 7.70 ns
BM_std_minmax<int>/12 6.73 ns 8.39 ns
BM_std_minmax<int>/13 7.38 ns 9.07 ns
BM_std_minmax<int>/14 7.74 ns 9.79 ns
BM_std_minmax<int>/15 8.53 ns 10.5 ns
BM_std_minmax<int>/16 8.79 ns 11.2 ns
BM_std_minmax<int>/17 9.63 ns 12.0 ns
BM_std_minmax<int>/18 9.84 ns 12.7 ns
BM_std_minmax<int>/19 10.6 ns 13.5 ns
BM_std_minmax<int>/20 11.0 ns 14.3 ns
BM_std_minmax<int>/21 11.7 ns 15.0 ns
BM_std_minmax<int>/22 12.0 ns 15.7 ns
BM_std_minmax<int>/23 13.1 ns 16.5 ns
BM_std_minmax<int>/24 13.0 ns 17.3 ns
BM_std_minmax<int>/25 13.7 ns 17.9 ns
BM_std_minmax<int>/26 14.0 ns 18.6 ns
BM_std_minmax<int>/27 14.8 ns 19.4 ns
BM_std_minmax<int>/28 15.1 ns 20.3 ns
BM_std_minmax<int>/29 15.8 ns 20.9 ns
BM_std_minmax<int>/30 16.1 ns 21.7 ns
BM_std_minmax<int>/31 16.9 ns 22.5 ns
BM_std_minmax<int>/32 17.2 ns 3.40 ns
BM_std_minmax<int>/64 33.9 ns 4.04 ns
BM_std_minmax<int>/512 275 ns 14.6 ns
BM_std_minmax<int>/1024 541 ns 27.5 ns
BM_std_minmax<int>/4000 2093 ns 96.3 ns
BM_std_minmax<int>/4096 2146 ns 98.3 ns
BM_std_minmax<int>/5500 2866 ns 157 ns
BM_std_minmax<int>/64000 33619 ns 1954 ns
BM_std_minmax<int>/65536 34252 ns 2009 ns
BM_std_minmax<int>/70000 36618 ns 2125 ns
BM_std_minmax<long long>/1 0.709 ns 1.19 ns
BM_std_minmax<long long>/2 1.01 ns 1.65 ns
BM_std_minmax<long long>/3 2.14 ns 2.21 ns
BM_std_minmax<long long>/4 2.45 ns 2.83 ns
BM_std_minmax<long long>/5 3.09 ns 3.47 ns
BM_std_minmax<long long>/6 3.44 ns 4.11 ns
BM_std_minmax<long long>/7 4.16 ns 4.79 ns
BM_std_minmax<long long>/8 4.54 ns 5.47 ns
BM_std_minmax<long long>/9 5.37 ns 6.20 ns
BM_std_minmax<long long>/10 5.71 ns 6.93 ns
BM_std_minmax<long long>/11 6.00 ns 7.60 ns
BM_std_minmax<long long>/12 6.43 ns 8.27 ns
BM_std_minmax<long long>/13 7.01 ns 8.94 ns
BM_std_minmax<long long>/14 7.45 ns 9.65 ns
BM_std_minmax<long long>/15 8.16 ns 10.4 ns
BM_std_minmax<long long>/16 8.46 ns 5.22 ns
BM_std_minmax<long long>/17 9.16 ns 5.22 ns
BM_std_minmax<long long>/18 9.53 ns 5.52 ns
BM_std_minmax<long long>/19 10.2 ns 6.02 ns
BM_std_minmax<long long>/20 10.5 ns 6.89 ns
BM_std_minmax<long long>/21 11.3 ns 7.83 ns
BM_std_minmax<long long>/22 11.6 ns 8.59 ns
BM_std_minmax<long long>/23 12.3 ns 9.91 ns
BM_std_minmax<long long>/24 12.6 ns 10.1 ns
BM_std_minmax<long long>/25 13.2 ns 12.0 ns
BM_std_minmax<long long>/26 13.6 ns 13.5 ns
BM_std_minmax<long long>/27 14.2 ns 14.8 ns
BM_std_minmax<long long>/28 14.7 ns 15.9 ns
BM_std_minmax<long long>/29 15.3 ns 16.6 ns
BM_std_minmax<long long>/30 15.8 ns 17.3 ns
BM_std_minmax<long long>/31 16.3 ns 18.2 ns
BM_std_minmax<long long>/32 16.7 ns 7.18 ns
BM_std_minmax<long long>/64 33.1 ns 11.5 ns
BM_std_minmax<long long>/512 268 ns 71.0 ns
BM_std_minmax<long long>/1024 532 ns 138 ns
BM_std_minmax<long long>/4000 2056 ns 533 ns
BM_std_minmax<long long>/4096 2112 ns 539 ns
BM_std_minmax<long long>/5500 2823 ns 749 ns
BM_std_minmax<long long>/64000 32956 ns 8590 ns
BM_std_minmax<long long>/65536 33795 ns 8791 ns
BM_std_minmax<long long>/70000 36084 ns 9442 ns
BM_std_minmax<unsigned char>/1 0.714 ns 1.41 ns
BM_std_minmax<unsigned char>/2 0.955 ns 1.96 ns
BM_std_minmax<unsigned char>/3 1.90 ns 2.63 ns
BM_std_minmax<unsigned char>/4 2.40 ns 3.34 ns
BM_std_minmax<unsigned char>/5 2.87 ns 4.10 ns
BM_std_minmax<unsigned char>/6 3.47 ns 4.88 ns
BM_std_minmax<unsigned char>/7 4.04 ns 5.66 ns
BM_std_minmax<unsigned char>/8 4.65 ns 6.45 ns
BM_std_minmax<unsigned char>/9 5.18 ns 7.24 ns
BM_std_minmax<unsigned char>/10 5.80 ns 8.05 ns
BM_std_minmax<unsigned char>/11 6.24 ns 8.86 ns
BM_std_minmax<unsigned char>/12 6.78 ns 9.70 ns
BM_std_minmax<unsigned char>/13 7.30 ns 10.6 ns
BM_std_minmax<unsigned char>/14 7.86 ns 11.4 ns
BM_std_minmax<unsigned char>/15 8.46 ns 12.3 ns
BM_std_minmax<unsigned char>/16 9.00 ns 2.12 ns
BM_std_minmax<unsigned char>/17 9.58 ns 2.83 ns
BM_std_minmax<unsigned char>/18 10.1 ns 3.37 ns
BM_std_minmax<unsigned char>/19 10.7 ns 4.11 ns
BM_std_minmax<unsigned char>/20 11.2 ns 4.85 ns
BM_std_minmax<unsigned char>/21 11.9 ns 5.69 ns
BM_std_minmax<unsigned char>/22 12.3 ns 6.77 ns
BM_std_minmax<unsigned char>/23 13.1 ns 7.56 ns
BM_std_minmax<unsigned char>/24 13.5 ns 8.40 ns
BM_std_minmax<unsigned char>/25 14.2 ns 9.30 ns
BM_std_minmax<unsigned char>/26 14.4 ns 10.1 ns
BM_std_minmax<unsigned char>/27 15.0 ns 11.1 ns
BM_std_minmax<unsigned char>/28 15.3 ns 11.9 ns
BM_std_minmax<unsigned char>/29 16.2 ns 12.9 ns
BM_std_minmax<unsigned char>/30 16.5 ns 13.9 ns
BM_std_minmax<unsigned char>/31 17.2 ns 14.8 ns
BM_std_minmax<unsigned char>/32 17.6 ns 2.36 ns
BM_std_minmax<unsigned char>/64 35.6 ns 3.21 ns
BM_std_minmax<unsigned char>/512 288 ns 6.00 ns
BM_std_minmax<unsigned char>/1024 573 ns 8.80 ns
BM_std_minmax<unsigned char>/4000 2222 ns 28.6 ns
BM_std_minmax<unsigned char>/4096 2265 ns 25.9 ns
BM_std_minmax<unsigned char>/5500 3047 ns 48.8 ns
BM_std_minmax<unsigned char>/64000 35059 ns 480 ns
BM_std_minmax<unsigned char>/65536 35941 ns 491 ns
BM_std_minmax<unsigned char>/70000 38922 ns 525 ns
BM_std_minmax<unsigned short>/1 0.711 ns 1.18 ns
BM_std_minmax<unsigned short>/2 0.957 ns 1.65 ns
BM_std_minmax<unsigned short>/3 2.13 ns 2.21 ns
BM_std_minmax<unsigned short>/4 2.14 ns 2.78 ns
BM_std_minmax<unsigned short>/5 3.06 ns 3.29 ns
BM_std_minmax<unsigned short>/6 2.89 ns 3.87 ns
BM_std_minmax<unsigned short>/7 3.80 ns 4.55 ns
BM_std_minmax<unsigned short>/8 3.68 ns 2.02 ns
BM_std_minmax<unsigned short>/9 4.53 ns 2.40 ns
BM_std_minmax<unsigned short>/10 4.60 ns 2.94 ns
BM_std_minmax<unsigned short>/11 5.67 ns 3.67 ns
BM_std_minmax<unsigned short>/12 5.39 ns 4.22 ns
BM_std_minmax<unsigned short>/13 6.58 ns 4.78 ns
BM_std_minmax<unsigned short>/14 6.33 ns 5.54 ns
BM_std_minmax<unsigned short>/15 7.34 ns 6.30 ns
BM_std_minmax<unsigned short>/16 7.17 ns 2.25 ns
BM_std_minmax<unsigned short>/17 8.19 ns 2.61 ns
BM_std_minmax<unsigned short>/18 8.02 ns 3.19 ns
BM_std_minmax<unsigned short>/19 9.03 ns 3.72 ns
BM_std_minmax<unsigned short>/20 8.89 ns 4.36 ns
BM_std_minmax<unsigned short>/21 9.77 ns 5.10 ns
BM_std_minmax<unsigned short>/22 9.70 ns 5.55 ns
BM_std_minmax<unsigned short>/23 10.8 ns 6.29 ns
BM_std_minmax<unsigned short>/24 10.6 ns 2.41 ns
BM_std_minmax<unsigned short>/25 11.6 ns 2.75 ns
BM_std_minmax<unsigned short>/26 11.4 ns 3.26 ns
BM_std_minmax<unsigned short>/27 12.4 ns 3.86 ns
BM_std_minmax<unsigned short>/28 12.3 ns 4.45 ns
BM_std_minmax<unsigned short>/29 13.2 ns 5.07 ns
BM_std_minmax<unsigned short>/30 13.1 ns 5.77 ns
BM_std_minmax<unsigned short>/31 13.9 ns 6.65 ns
BM_std_minmax<unsigned short>/32 13.9 ns 2.72 ns
BM_std_minmax<unsigned short>/64 27.8 ns 3.25 ns
BM_std_minmax<unsigned short>/512 220 ns 8.30 ns
BM_std_minmax<unsigned short>/1024 435 ns 14.1 ns
BM_std_minmax<unsigned short>/4000 1703 ns 49.8 ns
BM_std_minmax<unsigned short>/4096 1746 ns 47.9 ns
BM_std_minmax<unsigned short>/5500 2350 ns 69.9 ns
BM_std_minmax<unsigned short>/64000 27388 ns 953 ns
BM_std_minmax<unsigned short>/65536 28040 ns 975 ns
BM_std_minmax<unsigned short>/70000 29967 ns 1040 ns
BM_std_minmax<unsigned int>/1 0.712 ns 1.18 ns
BM_std_minmax<unsigned int>/2 0.965 ns 1.65 ns
BM_std_minmax<unsigned int>/3 2.13 ns 2.14 ns
BM_std_minmax<unsigned int>/4 2.09 ns 2.64 ns
BM_std_minmax<unsigned int>/5 3.02 ns 3.21 ns
BM_std_minmax<unsigned int>/6 2.94 ns 3.81 ns
BM_std_minmax<unsigned int>/7 3.91 ns 4.38 ns
BM_std_minmax<unsigned int>/8 3.75 ns 4.93 ns
BM_std_minmax<unsigned int>/9 4.71 ns 5.60 ns
BM_std_minmax<unsigned int>/10 4.59 ns 6.26 ns
BM_std_minmax<unsigned int>/11 5.57 ns 6.80 ns
BM_std_minmax<unsigned int>/12 5.43 ns 7.47 ns
BM_std_minmax<unsigned int>/13 6.45 ns 8.10 ns
BM_std_minmax<unsigned int>/14 6.32 ns 8.69 ns
BM_std_minmax<unsigned int>/15 7.29 ns 9.37 ns
BM_std_minmax<unsigned int>/16 7.12 ns 9.99 ns
BM_std_minmax<unsigned int>/17 8.24 ns 10.6 ns
BM_std_minmax<unsigned int>/18 8.00 ns 11.2 ns
BM_std_minmax<unsigned int>/19 8.94 ns 12.0 ns
BM_std_minmax<unsigned int>/20 8.91 ns 12.6 ns
BM_std_minmax<unsigned int>/21 9.73 ns 17.2 ns
BM_std_minmax<unsigned int>/22 9.75 ns 13.8 ns
BM_std_minmax<unsigned int>/23 10.6 ns 14.5 ns
BM_std_minmax<unsigned int>/24 10.6 ns 15.1 ns
BM_std_minmax<unsigned int>/25 11.5 ns 15.7 ns
BM_std_minmax<unsigned int>/26 11.4 ns 16.3 ns
BM_std_minmax<unsigned int>/27 12.3 ns 17.0 ns
BM_std_minmax<unsigned int>/28 12.3 ns 17.6 ns
BM_std_minmax<unsigned int>/29 13.2 ns 18.3 ns
BM_std_minmax<unsigned int>/30 13.2 ns 19.0 ns
BM_std_minmax<unsigned int>/31 14.0 ns 19.6 ns
BM_std_minmax<unsigned int>/32 14.0 ns 3.39 ns
BM_std_minmax<unsigned int>/64 27.6 ns 4.05 ns
BM_std_minmax<unsigned int>/512 221 ns 14.2 ns
BM_std_minmax<unsigned int>/1024 439 ns 25.5 ns
BM_std_minmax<unsigned int>/4000 1720 ns 96.3 ns
BM_std_minmax<unsigned int>/4096 1762 ns 97.8 ns
BM_std_minmax<unsigned int>/5500 2364 ns 146 ns
BM_std_minmax<unsigned int>/64000 27874 ns 1905 ns
BM_std_minmax<unsigned int>/65536 28012 ns 1961 ns
BM_std_minmax<unsigned int>/70000 29899 ns 2087 ns
BM_std_minmax<unsigned long long>/1 0.707 ns 1.18 ns
BM_std_minmax<unsigned long long>/2 0.909 ns 1.65 ns
BM_std_minmax<unsigned long long>/3 1.65 ns 2.70 ns
BM_std_minmax<unsigned long long>/4 1.93 ns 2.69 ns
BM_std_minmax<unsigned long long>/5 2.45 ns 3.34 ns
BM_std_minmax<unsigned long long>/6 2.78 ns 3.81 ns
BM_std_minmax<unsigned long long>/7 3.28 ns 4.43 ns
BM_std_minmax<unsigned long long>/8 3.70 ns 4.92 ns
BM_std_minmax<unsigned long long>/9 4.12 ns 5.64 ns
BM_std_minmax<unsigned long long>/10 4.44 ns 6.15 ns
BM_std_minmax<unsigned long long>/11 4.91 ns 6.81 ns
BM_std_minmax<unsigned long long>/12 5.31 ns 7.41 ns
BM_std_minmax<unsigned long long>/13 5.72 ns 7.96 ns
BM_std_minmax<unsigned long long>/14 6.05 ns 8.66 ns
BM_std_minmax<unsigned long long>/15 6.55 ns 9.37 ns
BM_std_minmax<unsigned long long>/16 6.89 ns 7.98 ns
BM_std_minmax<unsigned long long>/17 7.34 ns 8.13 ns
BM_std_minmax<unsigned long long>/18 7.73 ns 8.42 ns
BM_std_minmax<unsigned long long>/19 8.26 ns 8.63 ns
BM_std_minmax<unsigned long long>/20 8.54 ns 8.96 ns
BM_std_minmax<unsigned long long>/21 9.14 ns 9.37 ns
BM_std_minmax<unsigned long long>/22 9.39 ns 9.67 ns
BM_std_minmax<unsigned long long>/23 10.1 ns 10.1 ns
BM_std_minmax<unsigned long long>/24 10.4 ns 10.6 ns
BM_std_minmax<unsigned long long>/25 11.0 ns 11.3 ns
BM_std_minmax<unsigned long long>/26 11.3 ns 12.1 ns
BM_std_minmax<unsigned long long>/27 11.8 ns 14.2 ns
BM_std_minmax<unsigned long long>/28 12.1 ns 15.8 ns
BM_std_minmax<unsigned long long>/29 12.6 ns 17.4 ns
BM_std_minmax<unsigned long long>/30 13.1 ns 18.1 ns
BM_std_minmax<unsigned long long>/31 13.4 ns 18.8 ns
BM_std_minmax<unsigned long long>/32 13.8 ns 10.4 ns
BM_std_minmax<unsigned long long>/64 27.3 ns 15.5 ns
BM_std_minmax<unsigned long long>/512 222 ns 80.6 ns
BM_std_minmax<unsigned long long>/1024 443 ns 156 ns
BM_std_minmax<unsigned long long>/4000 1731 ns 591 ns
BM_std_minmax<unsigned long long>/4096 1752 ns 609 ns
BM_std_minmax<unsigned long long>/5500 2340 ns 819 ns
BM_std_minmax<unsigned long long>/64000 27166 ns 9652 ns
BM_std_minmax<unsigned long long>/65536 27869 ns 9876 ns
BM_std_minmax<unsigned long long>/70000 29920 ns 10680 ns
```
2024-04-06 17:22:07 +02:00
Nikolas Klauser
f5960c168d
[libc++][NFC] Make __desugars_to a variable template and rename the header to desugars_to.h ( #87337 )
...
This improves compile times and memory usage slightly and removes some
boilerplate.
2024-04-04 23:02:19 +02:00
A. Jiang
04dbf7ad44
[libc++][ranges] Avoid using distance in ranges::contains_subrange ( #87155 )
...
Both `std::distance` and `ranges::distance` are inefficient for
non-sized ranges. Also, computing the length of the range using the
`int` type is seriously problematic.
This patch avoids both using `distance` and computing the length of
non-sized ranges.
Fixes #86833 .
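As a rough illustration of the idea, the sketch below answers "does the haystack contain the needle?" without measuring either range. It is written against plain iterators and `std::search` rather than the ranges machinery, and the function names are hypothetical; it is not the code from the patch. An empty needle is detected by comparing its iterators, so no length is computed even for a forward-only container.

```cpp
#include <algorithm>
#include <forward_list>

// Sketch only: an iterator-based stand-in for the ranges algorithm.
template <class It1, class It2>
bool contains_subrange(It1 first1, It1 last1, It2 first2, It2 last2) {
  // An empty needle is contained in every haystack. Comparing the two
  // iterators is O(1) and needs no length computation.
  if (first2 == last2)
    return true;
  // std::search returns last1 when the needle does not occur.
  return std::search(first1, last1, first2, last2) != last1;
}

// Usage on a forward-only container, which has no O(1) size.
inline bool example() {
  std::forward_list<int> haystack{1, 2, 3, 4, 5};
  std::forward_list<int> needle{3, 4};
  return contains_subrange(haystack.begin(), haystack.end(),
                           needle.begin(), needle.end());
}
```

The explicit emptiness check matters when both ranges are empty: `std::search` then returns `first1`, which equals `last1`, and the result would otherwise be reported as "not found".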
2024-04-02 17:21:15 -07:00
Nikolas Klauser
985c1a44f8
[libc++] Optimize the two range overload of mismatch ( #86853 )
...
```
-----------------------------------------------------------------------------
Benchmark old new
-----------------------------------------------------------------------------
bm_mismatch_two_range_overload<char>/1 0.941 ns 1.88 ns
bm_mismatch_two_range_overload<char>/2 1.43 ns 2.15 ns
bm_mismatch_two_range_overload<char>/3 1.95 ns 2.55 ns
bm_mismatch_two_range_overload<char>/4 2.58 ns 2.90 ns
bm_mismatch_two_range_overload<char>/5 3.75 ns 3.31 ns
bm_mismatch_two_range_overload<char>/6 5.00 ns 3.83 ns
bm_mismatch_two_range_overload<char>/7 5.59 ns 4.35 ns
bm_mismatch_two_range_overload<char>/8 6.37 ns 4.84 ns
bm_mismatch_two_range_overload<char>/16 11.8 ns 6.72 ns
bm_mismatch_two_range_overload<char>/64 45.5 ns 2.59 ns
bm_mismatch_two_range_overload<char>/512 366 ns 12.6 ns
bm_mismatch_two_range_overload<char>/4096 2890 ns 91.6 ns
bm_mismatch_two_range_overload<char>/32768 23038 ns 758 ns
bm_mismatch_two_range_overload<char>/262144 142813 ns 6573 ns
bm_mismatch_two_range_overload<char>/1048576 366679 ns 26710 ns
bm_mismatch_two_range_overload<short>/1 0.934 ns 1.88 ns
bm_mismatch_two_range_overload<short>/2 1.30 ns 2.58 ns
bm_mismatch_two_range_overload<short>/3 1.76 ns 3.28 ns
bm_mismatch_two_range_overload<short>/4 2.24 ns 3.98 ns
bm_mismatch_two_range_overload<short>/5 2.80 ns 4.92 ns
bm_mismatch_two_range_overload<short>/6 3.58 ns 6.01 ns
bm_mismatch_two_range_overload<short>/7 4.29 ns 7.03 ns
bm_mismatch_two_range_overload<short>/8 4.67 ns 7.39 ns
bm_mismatch_two_range_overload<short>/16 9.86 ns 13.1 ns
bm_mismatch_two_range_overload<short>/64 38.9 ns 4.55 ns
bm_mismatch_two_range_overload<short>/512 348 ns 27.7 ns
bm_mismatch_two_range_overload<short>/4096 2881 ns 225 ns
bm_mismatch_two_range_overload<short>/32768 23111 ns 1715 ns
bm_mismatch_two_range_overload<short>/262144 184846 ns 14416 ns
bm_mismatch_two_range_overload<short>/1048576 742885 ns 57264 ns
bm_mismatch_two_range_overload<int>/1 0.838 ns 1.19 ns
bm_mismatch_two_range_overload<int>/2 1.19 ns 1.65 ns
bm_mismatch_two_range_overload<int>/3 1.83 ns 2.06 ns
bm_mismatch_two_range_overload<int>/4 2.38 ns 2.42 ns
bm_mismatch_two_range_overload<int>/5 3.60 ns 2.47 ns
bm_mismatch_two_range_overload<int>/6 3.68 ns 3.05 ns
bm_mismatch_two_range_overload<int>/7 4.32 ns 3.36 ns
bm_mismatch_two_range_overload<int>/8 5.18 ns 3.58 ns
bm_mismatch_two_range_overload<int>/16 10.6 ns 2.84 ns
bm_mismatch_two_range_overload<int>/64 39.0 ns 7.78 ns
bm_mismatch_two_range_overload<int>/512 247 ns 53.9 ns
bm_mismatch_two_range_overload<int>/4096 1927 ns 429 ns
bm_mismatch_two_range_overload<int>/32768 15569 ns 3393 ns
bm_mismatch_two_range_overload<int>/262144 125413 ns 28504 ns
bm_mismatch_two_range_overload<int>/1048576 504549 ns 112729 ns
```
2024-04-01 18:21:51 +02:00
Nikolas Klauser
1679b27959
[libc++] Refactor __tuple_like and __pair_like ( #85206 )
...
The exposition-only type trait `pair-like` includes `ranges::subrange`,
but every single use of it excludes `ranges::subrange` from the list. This
patch introduces two new traits `__tuple_like_no_subrange` and
`__pair_like_no_subrange`, which exclude `ranges::subrange` from the
possible matches. `__pair_like` is no longer required, and thus removed.
`__tuple_like` is implemented as `__tuple_like_no_subrange` or a
`ranges::subrange` specialization.
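The relationship between these traits can be sketched with variable templates. This is an assumption-laden illustration, not the libc++ source: the names drop the reserved double-underscore prefix, `my_subrange` is a hypothetical stand-in for `ranges::subrange` so the sketch compiles as C++17, and only `std::tuple`, `std::pair` and `std::array` are treated as tuple-like.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical stand-in for ranges::subrange.
template <class It>
struct my_subrange {
  It first;
  It last;
};

// Tuple-like types, with subrange deliberately left out.
template <class T>
inline constexpr bool tuple_like_no_subrange = false;
template <class... Ts>
inline constexpr bool tuple_like_no_subrange<std::tuple<Ts...>> = true;
template <class A, class B>
inline constexpr bool tuple_like_no_subrange<std::pair<A, B>> = true;
template <class T, std::size_t N>
inline constexpr bool tuple_like_no_subrange<std::array<T, N>> = true;

template <class T>
inline constexpr bool is_subrange = false;
template <class It>
inline constexpr bool is_subrange<my_subrange<It>> = true;

// The full trait: the no-subrange trait or a subrange specialization.
template <class T>
inline constexpr bool tuple_like = tuple_like_no_subrange<T> || is_subrange<T>;

// Pair-like without subrange: tuple-like (no subrange) of size 2. The bool
// parameter keeps std::tuple_size from being named for non-tuple types.
template <class T, bool = tuple_like_no_subrange<T>>
inline constexpr bool pair_like_no_subrange = false;
template <class T>
inline constexpr bool pair_like_no_subrange<T, true> = (std::tuple_size_v<T> == 2);

static_assert(tuple_like<my_subrange<int*>>, "subrange is tuple-like");
static_assert(!tuple_like_no_subrange<my_subrange<int*>>, "but excluded here");
```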
2024-04-01 08:46:57 +02:00
Nikolas Klauser
beaff78528
[libc++] Optimize the std::mismatch tail ( #83440 )
...
This adds vectorization to the last 0-3 vectors and, if the range is
large enough, the remaining elements that don't fill a vector
completely.
```
-----------------------------------------------------------------------
Benchmark old full vectors partial vector
-----------------------------------------------------------------------
bm_mismatch<char>/1 1.40 ns 1.62 ns 2.09 ns
bm_mismatch<char>/2 1.88 ns 2.10 ns 2.33 ns
bm_mismatch<char>/3 2.67 ns 2.56 ns 2.72 ns
bm_mismatch<char>/4 3.01 ns 3.20 ns 3.70 ns
bm_mismatch<char>/5 3.51 ns 3.73 ns 3.64 ns
bm_mismatch<char>/6 4.71 ns 4.85 ns 4.37 ns
bm_mismatch<char>/7 5.12 ns 5.33 ns 4.37 ns
bm_mismatch<char>/8 5.79 ns 6.02 ns 4.75 ns
bm_mismatch<char>/15 9.20 ns 10.5 ns 7.23 ns
bm_mismatch<char>/16 10.2 ns 10.1 ns 7.46 ns
bm_mismatch<char>/17 10.2 ns 10.8 ns 7.57 ns
bm_mismatch<char>/31 17.6 ns 17.1 ns 10.8 ns
bm_mismatch<char>/32 17.4 ns 1.64 ns 1.64 ns
bm_mismatch<char>/33 23.3 ns 2.10 ns 2.33 ns
bm_mismatch<char>/63 31.8 ns 16.9 ns 2.33 ns
bm_mismatch<char>/64 32.6 ns 2.10 ns 2.10 ns
bm_mismatch<char>/65 33.6 ns 2.57 ns 2.80 ns
bm_mismatch<char>/127 67.3 ns 18.1 ns 3.27 ns
bm_mismatch<char>/128 2.17 ns 2.14 ns 2.57 ns
bm_mismatch<char>/129 2.36 ns 2.80 ns 3.27 ns
bm_mismatch<char>/255 67.5 ns 19.6 ns 4.68 ns
bm_mismatch<char>/256 3.76 ns 3.71 ns 3.97 ns
bm_mismatch<char>/257 3.77 ns 4.04 ns 4.43 ns
bm_mismatch<char>/511 70.8 ns 22.1 ns 7.47 ns
bm_mismatch<char>/512 7.27 ns 7.30 ns 6.95 ns
bm_mismatch<char>/513 7.11 ns 7.05 ns 6.96 ns
bm_mismatch<char>/1023 75.9 ns 27.4 ns 13.3 ns
bm_mismatch<char>/1024 13.9 ns 13.8 ns 12.4 ns
bm_mismatch<char>/1025 13.6 ns 13.6 ns 12.8 ns
bm_mismatch<char>/2047 87.3 ns 37.5 ns 25.4 ns
bm_mismatch<char>/2048 26.8 ns 27.4 ns 24.0 ns
bm_mismatch<char>/2049 26.7 ns 27.3 ns 25.5 ns
bm_mismatch<char>/4095 112 ns 64.7 ns 48.7 ns
bm_mismatch<char>/4096 53.0 ns 54.2 ns 46.8 ns
bm_mismatch<char>/4097 52.7 ns 54.2 ns 48.4 ns
bm_mismatch<char>/8191 160 ns 118 ns 98.4 ns
bm_mismatch<char>/8192 107 ns 108 ns 96.0 ns
bm_mismatch<char>/8193 106 ns 108 ns 97.2 ns
bm_mismatch<char>/16383 283 ns 234 ns 215 ns
bm_mismatch<char>/16384 227 ns 223 ns 217 ns
bm_mismatch<char>/16385 221 ns 221 ns 215 ns
bm_mismatch<char>/32767 547 ns 499 ns 488 ns
bm_mismatch<char>/32768 495 ns 492 ns 492 ns
bm_mismatch<char>/32769 491 ns 489 ns 488 ns
bm_mismatch<char>/65535 1028 ns 979 ns 971 ns
bm_mismatch<char>/65536 976 ns 970 ns 974 ns
bm_mismatch<char>/65537 970 ns 965 ns 971 ns
bm_mismatch<char>/131071 2031 ns 1948 ns 2005 ns
bm_mismatch<char>/131072 1973 ns 1955 ns 1974 ns
bm_mismatch<char>/131073 1989 ns 1932 ns 2001 ns
bm_mismatch<char>/262143 4469 ns 4244 ns 4223 ns
bm_mismatch<char>/262144 4443 ns 4183 ns 4243 ns
bm_mismatch<char>/262145 4400 ns 4232 ns 4246 ns
bm_mismatch<char>/524287 10169 ns 9733 ns 9592 ns
bm_mismatch<char>/524288 10154 ns 9664 ns 9843 ns
bm_mismatch<char>/524289 10113 ns 9641 ns 10003 ns
bm_mismatch<short>/1 1.86 ns 2.53 ns 2.32 ns
bm_mismatch<short>/2 2.57 ns 2.77 ns 2.55 ns
bm_mismatch<short>/3 3.26 ns 3.00 ns 2.79 ns
bm_mismatch<short>/4 3.95 ns 3.39 ns 3.15 ns
bm_mismatch<short>/5 4.83 ns 3.97 ns 3.72 ns
bm_mismatch<short>/6 5.43 ns 4.34 ns 4.03 ns
bm_mismatch<short>/7 6.11 ns 4.73 ns 4.44 ns
bm_mismatch<short>/8 6.84 ns 5.02 ns 4.79 ns
bm_mismatch<short>/15 11.5 ns 7.12 ns 6.50 ns
bm_mismatch<short>/16 13.9 ns 1.87 ns 2.11 ns
bm_mismatch<short>/17 14.0 ns 3.00 ns 2.47 ns
bm_mismatch<short>/31 23.1 ns 7.87 ns 2.47 ns
bm_mismatch<short>/32 23.8 ns 2.57 ns 2.81 ns
bm_mismatch<short>/33 24.5 ns 3.70 ns 2.94 ns
bm_mismatch<short>/63 44.8 ns 9.37 ns 3.46 ns
bm_mismatch<short>/64 2.32 ns 2.57 ns 2.64 ns
bm_mismatch<short>/65 2.52 ns 3.02 ns 3.51 ns
bm_mismatch<short>/127 45.6 ns 9.97 ns 5.18 ns
bm_mismatch<short>/128 3.85 ns 3.93 ns 3.94 ns
bm_mismatch<short>/129 3.82 ns 4.20 ns 4.70 ns
bm_mismatch<short>/255 50.4 ns 12.6 ns 8.07 ns
bm_mismatch<short>/256 7.23 ns 6.91 ns 6.98 ns
bm_mismatch<short>/257 7.24 ns 7.19 ns 7.55 ns
bm_mismatch<short>/511 52.3 ns 17.8 ns 14.0 ns
bm_mismatch<short>/512 13.6 ns 13.7 ns 13.6 ns
bm_mismatch<short>/513 13.9 ns 13.8 ns 18.5 ns
bm_mismatch<short>/1023 60.9 ns 30.9 ns 26.3 ns
bm_mismatch<short>/1024 26.7 ns 27.7 ns 25.7 ns
bm_mismatch<short>/1025 27.7 ns 27.6 ns 25.3 ns
bm_mismatch<short>/2047 88.4 ns 58.0 ns 51.6 ns
bm_mismatch<short>/2048 52.8 ns 55.3 ns 50.6 ns
bm_mismatch<short>/2049 55.2 ns 54.8 ns 48.7 ns
bm_mismatch<short>/4095 153 ns 113 ns 102 ns
bm_mismatch<short>/4096 105 ns 110 ns 101 ns
bm_mismatch<short>/4097 110 ns 110 ns 99.1 ns
bm_mismatch<short>/8191 277 ns 219 ns 206 ns
bm_mismatch<short>/8192 226 ns 214 ns 250 ns
bm_mismatch<short>/8193 226 ns 207 ns 208 ns
bm_mismatch<short>/16383 519 ns 492 ns 488 ns
bm_mismatch<short>/16384 494 ns 492 ns 492 ns
bm_mismatch<short>/16385 492 ns 488 ns 489 ns
bm_mismatch<short>/32767 1007 ns 968 ns 964 ns
bm_mismatch<short>/32768 977 ns 972 ns 970 ns
bm_mismatch<short>/32769 972 ns 962 ns 967 ns
bm_mismatch<short>/65535 1978 ns 1918 ns 1956 ns
bm_mismatch<short>/65536 1940 ns 1927 ns 1970 ns
bm_mismatch<short>/65537 1937 ns 1922 ns 1959 ns
bm_mismatch<short>/131071 4524 ns 4193 ns 4304 ns
bm_mismatch<short>/131072 4445 ns 4196 ns 4306 ns
bm_mismatch<short>/131073 4452 ns 4278 ns 4311 ns
bm_mismatch<short>/262143 9801 ns 10188 ns 9634 ns
bm_mismatch<short>/262144 9738 ns 10151 ns 9651 ns
bm_mismatch<short>/262145 9716 ns 10171 ns 9715 ns
bm_mismatch<short>/524287 19944 ns 20718 ns 20044 ns
bm_mismatch<short>/524288 21139 ns 20647 ns 20008 ns
bm_mismatch<short>/524289 21162 ns 19512 ns 20068 ns
bm_mismatch<int>/1 1.40 ns 1.84 ns 1.87 ns
bm_mismatch<int>/2 1.87 ns 2.08 ns 2.09 ns
bm_mismatch<int>/3 2.36 ns 2.31 ns 2.87 ns
bm_mismatch<int>/4 3.06 ns 2.72 ns 2.95 ns
bm_mismatch<int>/5 3.66 ns 3.37 ns 3.42 ns
bm_mismatch<int>/6 4.55 ns 3.65 ns 3.73 ns
bm_mismatch<int>/7 5.03 ns 3.93 ns 3.94 ns
bm_mismatch<int>/8 5.67 ns 1.86 ns 1.87 ns
bm_mismatch<int>/15 9.89 ns 4.41 ns 2.34 ns
bm_mismatch<int>/16 10.1 ns 2.33 ns 2.34 ns
bm_mismatch<int>/17 10.2 ns 3.34 ns 2.86 ns
bm_mismatch<int>/31 17.2 ns 5.54 ns 3.28 ns
bm_mismatch<int>/32 2.16 ns 2.15 ns 2.58 ns
bm_mismatch<int>/33 2.36 ns 3.01 ns 3.28 ns
bm_mismatch<int>/63 17.7 ns 6.50 ns 4.93 ns
bm_mismatch<int>/64 3.81 ns 3.58 ns 3.90 ns
bm_mismatch<int>/65 3.74 ns 4.36 ns 4.45 ns
bm_mismatch<int>/127 19.5 ns 9.56 ns 7.74 ns
bm_mismatch<int>/128 7.30 ns 6.41 ns 6.85 ns
bm_mismatch<int>/129 7.09 ns 7.04 ns 7.06 ns
bm_mismatch<int>/255 24.7 ns 14.8 ns 13.3 ns
bm_mismatch<int>/256 14.0 ns 12.1 ns 12.3 ns
bm_mismatch<int>/257 13.8 ns 12.7 ns 12.8 ns
bm_mismatch<int>/511 34.3 ns 26.3 ns 24.8 ns
bm_mismatch<int>/512 27.6 ns 23.6 ns 23.9 ns
bm_mismatch<int>/513 27.3 ns 24.4 ns 25.1 ns
bm_mismatch<int>/1023 62.5 ns 50.9 ns 48.3 ns
bm_mismatch<int>/1024 54.4 ns 46.1 ns 46.6 ns
bm_mismatch<int>/1025 54.2 ns 48.4 ns 47.5 ns
bm_mismatch<int>/2047 116 ns 97.8 ns 94.1 ns
bm_mismatch<int>/2048 108 ns 92.6 ns 92.4 ns
bm_mismatch<int>/2049 108 ns 104 ns 94.0 ns
bm_mismatch<int>/4095 233 ns 222 ns 205 ns
bm_mismatch<int>/4096 226 ns 223 ns 225 ns
bm_mismatch<int>/4097 221 ns 219 ns 210 ns
bm_mismatch<int>/8191 499 ns 485 ns 488 ns
bm_mismatch<int>/8192 496 ns 490 ns 495 ns
bm_mismatch<int>/8193 491 ns 485 ns 488 ns
bm_mismatch<int>/16383 982 ns 962 ns 964 ns
bm_mismatch<int>/16384 974 ns 971 ns 971 ns
bm_mismatch<int>/16385 971 ns 961 ns 968 ns
bm_mismatch<int>/32767 2003 ns 1959 ns 1920 ns
bm_mismatch<int>/32768 1996 ns 1947 ns 1928 ns
bm_mismatch<int>/32769 1990 ns 1945 ns 1926 ns
bm_mismatch<int>/65535 4434 ns 4275 ns 4312 ns
bm_mismatch<int>/65536 4437 ns 4267 ns 4321 ns
bm_mismatch<int>/65537 4442 ns 4261 ns 4321 ns
bm_mismatch<int>/131071 9673 ns 9648 ns 9465 ns
bm_mismatch<int>/131072 9667 ns 9671 ns 9465 ns
bm_mismatch<int>/131073 9661 ns 9653 ns 9464 ns
bm_mismatch<int>/262143 20595 ns 19605 ns 19064 ns
bm_mismatch<int>/262144 19894 ns 19572 ns 19009 ns
bm_mismatch<int>/262145 19851 ns 19656 ns 18999 ns
bm_mismatch<int>/524287 39556 ns 39364 ns 38131 ns
bm_mismatch<int>/524288 39678 ns 39573 ns 38183 ns
bm_mismatch<int>/524289 40168 ns 39301 ns 38121 ns
```
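The tail handling described in this commit message can be pictured with a portable sketch. The details are assumptions made for illustration: the block width `kLanes`, the 4-blocks-per-iteration main loop, and the choice to cover a partial final block by re-checking the last `kLanes` elements with an overlapping block are one plausible reading, not the patch's actual code. The fixed-width block comparison stands in for a SIMD compare.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kLanes = 8;  // hypothetical vector width, in elements

// Compares one block of kLanes elements and returns the offset of the first
// difference, or kLanes if the blocks are equal. The compare loop has a fixed
// trip count and no early exit, which is the shape a SIMD compare would have.
inline std::size_t first_diff_in_block(const int* a, const int* b) {
  unsigned mask = 0;
  for (std::size_t i = 0; i != kLanes; ++i)
    mask |= static_cast<unsigned>(a[i] != b[i]) << i;
  std::size_t pos = 0;
  while (pos != kLanes && ((mask >> pos) & 1u) == 0)
    ++pos;
  return pos;
}

// Returns the index of the first mismatch between a[0, n) and b[0, n), or n.
inline std::size_t mismatch_index(const int* a, const int* b, std::size_t n) {
  std::size_t i = 0;
  // Main loop: four blocks per iteration.
  while (n - i >= 4 * kLanes) {
    for (std::size_t k = 0; k != 4; ++k) {
      const std::size_t d = first_diff_in_block(a + i, b + i);
      if (d != kLanes)
        return i + d;
      i += kLanes;
    }
  }
  // Tail, part 1: the remaining 0-3 full blocks.
  while (n - i >= kLanes) {
    const std::size_t d = first_diff_in_block(a + i, b + i);
    if (d != kLanes)
      return i + d;
    i += kLanes;
  }
  if (i == n)
    return n;
  // Tail, part 2: fewer than kLanes elements remain. If the range is large
  // enough, re-check the last kLanes elements as one overlapping block.
  // Everything before i is already known to be equal, so any difference
  // found lies at or after i.
  if (n >= kLanes) {
    const std::size_t start = n - kLanes;
    const std::size_t d = first_diff_in_block(a + start, b + start);
    return d != kLanes ? start + d : n;
  }
  // Range shorter than one block: plain scalar loop.
  for (; i != n; ++i)
    if (a[i] != b[i])
      return i;
  return n;
}

// Helper: two ranges of n zeros that differ only at index diff (if diff < n).
inline std::size_t mismatch_at(std::size_t n, std::size_t diff) {
  std::vector<int> a(n, 0), b(n, 0);
  if (diff < n)
    b[diff] = 1;
  return mismatch_index(a.data(), b.data(), n);
}
```

The overlapping block trades a few redundant comparisons for avoiding an element-by-element loop, which fits the benchmark pattern above where sizes just below a multiple of the vector width improve the most.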
2024-03-29 19:29:54 +01:00
Nikolas Klauser
c388690a8b
[libc++][NFC] Simplify copy and move lowering to memmove a bit ( #83574 )
...
We introduced `__constexpr_memmove` a while ago, which simplified the
implementation of the copy and move lowering a bit. This allows us to
remove some of the boilerplate.
2024-03-27 16:54:50 +01:00
Nikolas Klauser
b68e2eba0b
[libc++] Vectorize mismatch ( #73255 )
...
```
---------------------------------------------------
Benchmark old new
---------------------------------------------------
bm_mismatch<char>/1 0.835 ns 2.37 ns
bm_mismatch<char>/2 1.44 ns 2.60 ns
bm_mismatch<char>/3 2.06 ns 2.83 ns
bm_mismatch<char>/4 2.60 ns 3.29 ns
bm_mismatch<char>/5 3.15 ns 3.77 ns
bm_mismatch<char>/6 3.82 ns 4.17 ns
bm_mismatch<char>/7 4.29 ns 4.52 ns
bm_mismatch<char>/8 4.78 ns 4.86 ns
bm_mismatch<char>/16 9.06 ns 7.54 ns
bm_mismatch<char>/64 31.7 ns 19.1 ns
bm_mismatch<char>/512 249 ns 8.16 ns
bm_mismatch<char>/4096 1956 ns 44.2 ns
bm_mismatch<char>/32768 15498 ns 501 ns
bm_mismatch<char>/262144 123965 ns 4479 ns
bm_mismatch<char>/1048576 495668 ns 21306 ns
bm_mismatch<short>/1 0.710 ns 2.12 ns
bm_mismatch<short>/2 1.03 ns 2.66 ns
bm_mismatch<short>/3 1.29 ns 3.56 ns
bm_mismatch<short>/4 1.68 ns 4.29 ns
bm_mismatch<short>/5 1.96 ns 5.18 ns
bm_mismatch<short>/6 2.59 ns 5.91 ns
bm_mismatch<short>/7 2.86 ns 6.63 ns
bm_mismatch<short>/8 3.19 ns 7.33 ns
bm_mismatch<short>/16 5.48 ns 13.0 ns
bm_mismatch<short>/64 16.6 ns 4.06 ns
bm_mismatch<short>/512 130 ns 13.8 ns
bm_mismatch<short>/4096 985 ns 93.8 ns
bm_mismatch<short>/32768 7846 ns 1002 ns
bm_mismatch<short>/262144 63217 ns 10637 ns
bm_mismatch<short>/1048576 251782 ns 42471 ns
bm_mismatch<int>/1 0.716 ns 1.91 ns
bm_mismatch<int>/2 1.21 ns 2.49 ns
bm_mismatch<int>/3 1.38 ns 3.46 ns
bm_mismatch<int>/4 1.71 ns 4.04 ns
bm_mismatch<int>/5 2.00 ns 4.98 ns
bm_mismatch<int>/6 2.43 ns 5.67 ns
bm_mismatch<int>/7 3.05 ns 6.38 ns
bm_mismatch<int>/8 3.22 ns 7.09 ns
bm_mismatch<int>/16 5.18 ns 12.8 ns
bm_mismatch<int>/64 16.6 ns 5.28 ns
bm_mismatch<int>/512 129 ns 25.2 ns
bm_mismatch<int>/4096 1009 ns 201 ns
bm_mismatch<int>/32768 7776 ns 2144 ns
bm_mismatch<int>/262144 62371 ns 20551 ns
bm_mismatch<int>/1048576 254750 ns 90097 ns
```
2024-03-23 15:28:22 +01:00
Xiaoyang Liu
c3747883a0
[libc++][ranges] use static operator() for C++23 ranges ( #86052 )
...
## Abstract
This pull request converts the `operator()` of all CPOs and niebloids
related to C++23 ranges to `static`.
## Motivation
In `libc++`, CPOs and niebloids are implemented as function objects.
Currently, the `operator()` for such a function object is a
`const`-qualified member function. This means that even if the function
object has no data members, an extra register is used to pass in the
`this` pointer when calling `operator()`, unless the compiler can inline
the function call. Declaring `operator()` as `static` would optimize
away the unnecessary `this` pointer passing for stateless function
objects, since there is no object instance state that needs to be
accessed.
## Reference
- [P1169R4: static `operator()`](https://wg21.link/P1169R4 )
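A minimal sketch of the difference described under Motivation. The names `add_one_fn`, `add_one`, `DEMO_STATIC_CALL` and `DEMO_CALL_CONST` are hypothetical and are not libc++ identifiers; the feature-test guard is an assumption made here so the sketch also compiles before C++23.

```cpp
// Use a static call operator when the compiler advertises P1169 support,
// and fall back to the classic const-qualified member otherwise.
#if defined(__cpp_static_call_operator) && __cpp_static_call_operator >= 202207L
#  define DEMO_STATIC_CALL static
#  define DEMO_CALL_CONST
#else
#  define DEMO_STATIC_CALL
#  define DEMO_CALL_CONST const
#endif

// A stateless function object, in the style of a niebloid or CPO.
struct add_one_fn {
  // With `static`, no `this` pointer has to be passed to the call.
  DEMO_STATIC_CALL constexpr int operator()(int x) DEMO_CALL_CONST noexcept { return x + 1; }
};

inline constexpr add_one_fn add_one{};

// The call syntax is identical in both configurations.
static_assert(add_one(41) == 42, "");
```

Callers do not change: only the declaration of `operator()` differs between the two configurations.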
2024-03-23 00:32:02 +01:00
Nikolas Klauser
4ea850b52f
[libc++] Remove __unconstrained_reverse_iterator ( #85582 )
...
`__unconstrained_reverse_iterator` has outlived its usefulness, since
the standard and subsequently the compilers have been fixed.
2024-03-18 14:19:51 +01:00
Nikolas Klauser
a2fe410581
[libc++][NFC] Simplify the implementation of equal a bit ( #84754 )
...
We can simplify the implementation of the two-range overload of `equal`
a bit since we can now use `if constexpr`.
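As a rough illustration of the kind of simplification `if constexpr` allows, here is a sketch of a two-range `equal`. This is not the libc++ implementation; `equal_sketch` is a hypothetical name, and the sketch assumes C++17.

```cpp
#include <functional>
#include <iterator>
#include <type_traits>

// Hypothetical two-range equal: one function body instead of tag-dispatched overloads.
template <class It1, class It2, class Pred = std::equal_to<>>
constexpr bool equal_sketch(It1 first1, It1 last1, It2 first2, It2 last2, Pred pred = Pred{}) {
  using Cat1 = typename std::iterator_traits<It1>::iterator_category;
  using Cat2 = typename std::iterator_traits<It2>::iterator_category;
  if constexpr (std::is_base_of_v<std::random_access_iterator_tag, Cat1> &&
                std::is_base_of_v<std::random_access_iterator_tag, Cat2>) {
    // Both lengths are cheap to compute, so reject mismatched sizes up front.
    if (last1 - first1 != last2 - first2)
      return false;
    for (; first1 != last1; ++first1, ++first2)
      if (!pred(*first1, *first2))
        return false;
    return true;
  } else {
    // Otherwise walk both ranges and check that they end together.
    for (; first1 != last1 && first2 != last2; ++first1, ++first2)
      if (!pred(*first1, *first2))
        return false;
    return first1 == last1 && first2 == last2;
  }
}
```

Before `if constexpr`, the two branches would typically live in separate overloads selected by iterator category.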
2024-03-18 08:32:09 +01:00
Nikolas Klauser
580f60484e
[libc++][NFC] Merge is{,_nothrow,_trivially}{,_copy,_move,_default}{_assignable,_constructible} ( #85308 )
...
These headers have become very small by using compiler builtins, often
containing only two declarations. This merges these headers, since
there doesn't seem to be much of a benefit keeping them separate.
Specifically, `is{,_nothrow,_trivially}_{assignable,constructible}` are
kept and the `copy`, `move` and `default` versions of these type traits
are moved into the respective headers.
2024-03-18 08:29:44 +01:00
Nikolas Klauser
07b18c5e1b
[libc++] Optimize ranges::fill{,_n} for vector<bool>::iterator ( #84642 )
...
```
------------------------------------------------------
Benchmark old new
------------------------------------------------------
bm_ranges_fill_n/1 1.64 ns 3.06 ns
bm_ranges_fill_n/2 3.45 ns 3.06 ns
bm_ranges_fill_n/3 4.88 ns 3.06 ns
bm_ranges_fill_n/4 6.46 ns 3.06 ns
bm_ranges_fill_n/5 8.03 ns 3.06 ns
bm_ranges_fill_n/6 9.65 ns 3.07 ns
bm_ranges_fill_n/7 11.5 ns 3.06 ns
bm_ranges_fill_n/8 13.0 ns 3.06 ns
bm_ranges_fill_n/16 25.9 ns 3.06 ns
bm_ranges_fill_n/64 103 ns 4.62 ns
bm_ranges_fill_n/512 711 ns 4.40 ns
bm_ranges_fill_n/4096 5642 ns 9.86 ns
bm_ranges_fill_n/32768 45135 ns 33.6 ns
bm_ranges_fill_n/262144 360818 ns 243 ns
bm_ranges_fill_n/1048576 1442828 ns 982 ns
bm_ranges_fill/1 1.63 ns 3.17 ns
bm_ranges_fill/2 3.43 ns 3.28 ns
bm_ranges_fill/3 4.97 ns 3.31 ns
bm_ranges_fill/4 6.53 ns 3.27 ns
bm_ranges_fill/5 8.12 ns 3.33 ns
bm_ranges_fill/6 9.76 ns 3.32 ns
bm_ranges_fill/7 11.6 ns 3.29 ns
bm_ranges_fill/8 13.2 ns 3.26 ns
bm_ranges_fill/16 26.3 ns 3.26 ns
bm_ranges_fill/64 104 ns 4.92 ns
bm_ranges_fill/512 716 ns 4.47 ns
bm_ranges_fill/4096 5772 ns 8.21 ns
bm_ranges_fill/32768 45778 ns 33.1 ns
bm_ranges_fill/262144 351422 ns 241 ns
bm_ranges_fill/1048576 1404710 ns 965 ns
```
2024-03-17 20:00:54 +01:00
Nikolas Klauser
76a2472715
[libc++] Refactor more __enable_ifs to the canonical style ( #81457 )
...
This brings the code base closer to having only a single style of
`enable_if`s.
2024-02-20 01:47:38 +01:00
ZijunZhaoCCK
a6b846ae1e
[libc++][ranges] Implement ranges::contains_subrange ( #66963 )
2024-02-13 15:42:37 -08:00
Louis Dionne
7b4622514d
[libc++] Fix missing and incorrect push/pop macros ( #79204 )
...
We recently noticed that the unwrap_iter.h file was pushing macros, but
it was pushing them again instead of popping them at the end of the
file. This led to libc++ basically swallowing any custom definition of
these macros in user code:
    #define min HELLO
    #include <algorithm>
    // min is not HELLO anymore, it's not defined
While investigating this issue, I noticed that our push/pop pragmas were
actually entirely wrong too. Indeed, instead of pushing macros like
`move`, we'd push `move(int, int)` in the pragma, which is not a valid
macro name. As a result, we would not actually push macros like `move`
-- instead we'd simply undefine them. This led to the following code not
working:
    #define move HELLO
    #include <algorithm>
    // move is not HELLO anymore
Fixing the pragma push/pop incantations led to a cascade of issues
because we use identifiers like `move` in a large number of places, and
all of these headers would now need to do the push/pop dance.
This patch fixes all these issues. First, it adds a check that we don't
swallow important names like min, max, move or refresh as explained
above. This is done by augmenting the existing
system_reserved_names.gen.py test to also check that the macros are what
we expect after including each header.
Second, it fixes the push/pop pragmas to work properly and adds missing
pragmas to all the files I could detect a failure in via the newly added
test.
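For readers unfamiliar with the push/pop dance, the following is a minimal sketch of the pattern. The `demo` namespace and its functions stand in for a hypothetical library header; they are not libc++ code. The pragmas take the bare macro name as a string.

```cpp
// User code defines a macro that collides with a library identifier.
#define min 42

// ---- start of a hypothetical library header ----
#pragma push_macro("min") // save the user's definition (bare name, not "min(int, int)")
#undef min

namespace demo {
inline int min(int a, int b) { return b < a ? b : a; }
inline int smaller_of(int a, int b) { return min(a, b); }
} // namespace demo

#pragma pop_macro("min") // restore the user's definition; pushing again here loses it
// ---- end of the hypothetical library header ----

// After the header, `min` expands to the user's definition again.
inline int user_min_macro_value() { return min; }

// Clean up so the demo macro does not leak into later code.
#undef min
```

If the closing pragma were a second `push_macro` instead of `pop_macro`, `min` would stay undefined after the header, which is the symptom described above.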
rdar://121365472
2024-01-25 15:48:46 -05:00
Louis Dionne
03a9f07e18
[libc++][NFC] Fix leftover && in comment
2024-01-24 09:41:02 -05:00
Konstantin Varlamov
8938bc0ad0
[libc++][hardening] Categorize assertions related to strict weak ordering ( #77405 )
...
If a user passes a comparator that doesn't satisfy strict weak ordering
(see https://eel.is/c++draft/algorithms#alg.sorting.general ) to
a sorting algorithm, the algorithm can produce an incorrect result or
even lead
to an out-of-bounds access. Unfortunately, comprehensively validating
that a given comparator indeed satisfies the strict weak ordering
requirement is prohibitively expensive (see [the related
RFC](https://discourse.llvm.org/t/rfc-strict-weak-ordering-checks-in-the-debug-libc/70217 )).
As a result, we have three independent sets of checks:
- assertions that catch out-of-bounds accesses within the algorithms'
implementation. These are relatively cheap; however, they cannot catch
the underlying cause and cannot prevent the case where an invalid
comparator would result in an incorrectly-sorted sequence without
actually triggering an OOB access;
- debug comparators that wrap a given comparator and on each comparison
check that if `(a < b)`, then `!(b < a)`, where `<` stands for the
user-provided comparator. This performs up to 2x the number of comparisons
but doesn't affect the algorithmic complexity. While this approach can
find more issues, it is still a heuristic;
- a comprehensive check of the comparator that validates up to 100
elements in the resulting sorted sequence (see the RFC above for
details). The check is expensive but the 100 element limit can somewhat
compensate for that, especially for large values of `N`.
The first set of checks is enabled in the fast hardening mode while the
other two are only enabled in the debug mode.
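The second kind of check (the debug comparator) can be sketched roughly as follows. This is an illustration only, not the libc++ implementation: `asymmetry_checking_comp` and `sort_reports_no_violation` are hypothetical names, and the sketch records a violation in a flag rather than triggering an assertion.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Hypothetical wrapper: on each comparison, checks that (a < b) implies !(b < a).
template <class Comp>
struct asymmetry_checking_comp {
  Comp comp;
  bool* violated; // set to true when both comp(a, b) and comp(b, a) hold

  template <class T, class U>
  bool operator()(const T& a, const U& b) const {
    bool result = comp(a, b);
    // The reverse comparison only runs when the first one returned true,
    // so at most twice as many comparisons are performed.
    if (result && comp(b, a))
      *violated = true;
    return result;
  }
};

// Sorts with a valid comparator and reports whether the check stayed silent.
inline bool sort_reports_no_violation(std::vector<int>& v) {
  bool violated = false;
  asymmetry_checking_comp<std::less<int>> checked{std::less<int>{}, &violated};
  std::sort(v.begin(), v.end(), checked);
  return !violated;
}
```

A comparator such as `std::less_equal<int>` trips the check as soon as two equal elements are compared, since both `a <= b` and `b <= a` hold.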
This patch also removes the
`_LIBCPP_DEBUG_STRICT_WEAK_ORDERING_CHECK` macro that
previously was used to selectively enable the 100-element check.
Now this check is enabled unconditionally in the debug mode.
Also, introduce a new category
`_LIBCPP_ASSERT_SEMANTIC_REQUIREMENT`. This category is
intended for checking the semantic requirements from the Standard.
Typically, these are hard or impossible to completely validate, so
these checks are expected to be heuristic in nature and potentially
quite expensive.
See https://reviews.llvm.org/D150264 for additional background.
Fixes #71496
2024-01-22 23:31:58 -08:00
Konstantin Varlamov
dc57752031
[libc++][hardening] Categorize assertions that produce incorrect results ( #77183 )
...
Introduce a new `argument-within-domain` category that covers cases
where the given arguments make it impossible to produce a correct result
(or create a valid object in case of constructors). While the incorrect
result doesn't create an immediate problem within the library (like e.g.
a null pointer dereference would), it always indicates a logic error in
user code and is highly likely to lead to a bug in the program once the
value is used.
2024-01-20 23:38:02 -08:00
Konstantin Varlamov
4f215fdd62
[libc++][hardening] Categorize more assertions. ( #75918 )
...
Also introduce `_LIBCPP_ASSERT_PEDANTIC` for assertions whose violation
results in a no-op or other benign behavior, but which may nevertheless
indicate a bug in the invoking code.
2024-01-05 16:29:23 -08:00
Nikolas Klauser
b203d5320d
[libc++] Optimize std::find if types are integral and have the same signedness ( #70345 )
...
Fixes #70238
2023-12-23 11:21:27 +01:00
Konstantin Varlamov
1638657dce
[libc++][hardening] Categorize more 'valid-element-access' checks. ( #71620 )
2023-12-20 17:24:48 -08:00
Christopher Di Bella
3903438860
[libcxx] adds ranges::fold_left_with_iter and ranges::fold_left ( #75259 )
...
Notable things in this commit:
* refactors `__indirect_binary_left_foldable`, making it slightly
different (but equivalent) to _`indirect-binary-left-foldable`_, which
improves readability (a [patch to the Working Paper][patch] was made)
* omits `__cpo` namespace, since it is not required for implementing
niebloids (a cleanup should happen in 2024)
* puts tests ensuring invocable robustness and dangling correctness
inside the correctness testing to ensure that the algorithms' results
are still correct
[patch]: https://github.com/cplusplus/draft/pull/6734
2023-12-19 21:57:50 -08:00
ZijunZhaoCCK
fdd089b500
[libc++] Implement ranges::contains ( #65148 )
...
Differential Revision: https://reviews.llvm.org/D159232
```
Running ./ranges_contains.libcxx.out
Run on (10 X 24.121 MHz CPU s)
CPU Caches:
L1 Data 64 KiB (x10)
L1 Instruction 128 KiB (x10)
L2 Unified 4096 KiB (x5)
Load Average: 3.37, 6.77, 5.27
--------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------
bm_contains_char/16 1.88 ns 1.87 ns 371607095
bm_contains_char/256 7.48 ns 7.47 ns 93292285
bm_contains_char/4096 99.7 ns 99.6 ns 7013185
bm_contains_char/65536 1296 ns 1294 ns 540436
bm_contains_char/1048576 23887 ns 23860 ns 29302
bm_contains_char/16777216 389420 ns 389095 ns 1796
bm_contains_int/16 7.14 ns 7.14 ns 97776288
bm_contains_int/256 90.4 ns 90.3 ns 7558089
bm_contains_int/4096 1294 ns 1290 ns 543052
bm_contains_int/65536 20482 ns 20443 ns 34334
bm_contains_int/1048576 328817 ns 327965 ns 2147
bm_contains_int/16777216 5246279 ns 5239361 ns 133
bm_contains_bool/16 2.19 ns 2.19 ns 322565780
bm_contains_bool/256 3.42 ns 3.41 ns 205025467
bm_contains_bool/4096 22.1 ns 22.1 ns 31780479
bm_contains_bool/65536 333 ns 332 ns 2106606
bm_contains_bool/1048576 5126 ns 5119 ns 135901
bm_contains_bool/16777216 81656 ns 81574 ns 8569
```
---------
Co-authored-by: Nathan Gauër <brioche@google.com>
2023-12-19 16:34:19 -08:00
Louis Dionne
9783f28cbb
[libc++] Format the code base ( #74334 )
...
This patch runs clang-format on all of libcxx/include and libcxx/src, in
accordance with the RFC discussed at [1]. Follow-up patches will format
the benchmarks, the test suite and remaining parts of the code. I'm
splitting this one into its own patch so the diff is a bit easier to
review.
This patch was generated with:
    find libcxx/include libcxx/src -type f \
      | grep -v 'module.modulemap.in' \
      | grep -v 'CMakeLists.txt' \
      | grep -v 'README.txt' \
      | grep -v 'libcxx.imp' \
      | grep -v '__config_site.in' \
      | xargs clang-format -i
A Git merge driver is available in libcxx/utils/clang-format-merge-driver.sh
to help resolve merge and rebase issues across these formatting changes.
[1]: https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
2023-12-18 14:01:33 -05:00
Nikolas Klauser
f7407411a1
[libc++] Optimize std::find for segmented iterators ( #67224 )
...
```
--------------------------------------------------------------------------
Benchmark old new
--------------------------------------------------------------------------
bm_find<std::deque<char>>/1 6.06 ns 10.6 ns
bm_find<std::deque<char>>/2 15.5 ns 10.6 ns
bm_find<std::deque<char>>/3 19.0 ns 10.6 ns
bm_find<std::deque<char>>/4 20.8 ns 10.6 ns
bm_find<std::deque<char>>/5 22.0 ns 10.6 ns
bm_find<std::deque<char>>/6 23.0 ns 10.5 ns
bm_find<std::deque<char>>/7 24.8 ns 10.7 ns
bm_find<std::deque<char>>/8 25.7 ns 10.6 ns
bm_find<std::deque<char>>/16 28.3 ns 10.6 ns
bm_find<std::deque<char>>/64 44.2 ns 27.0 ns
bm_find<std::deque<char>>/512 133 ns 37.6 ns
bm_find<std::deque<char>>/4096 867 ns 53.1 ns
bm_find<std::deque<char>>/32768 6838 ns 160 ns
bm_find<std::deque<char>>/262144 52897 ns 1495 ns
bm_find<std::deque<char>>/1048576 215621 ns 6077 ns
bm_find<std::deque<short>>/1 6.03 ns 6.28 ns
bm_find<std::deque<short>>/2 15.8 ns 15.8 ns
bm_find<std::deque<short>>/3 20.5 ns 20.3 ns
bm_find<std::deque<short>>/4 21.0 ns 21.0 ns
bm_find<std::deque<short>>/5 23.0 ns 22.1 ns
bm_find<std::deque<short>>/6 22.6 ns 23.0 ns
bm_find<std::deque<short>>/7 23.4 ns 23.7 ns
bm_find<std::deque<short>>/8 24.4 ns 24.9 ns
bm_find<std::deque<short>>/16 26.6 ns 27.2 ns
bm_find<std::deque<short>>/64 43.2 ns 40.9 ns
bm_find<std::deque<short>>/512 124 ns 90.7 ns
bm_find<std::deque<short>>/4096 845 ns 525 ns
bm_find<std::deque<short>>/32768 7273 ns 3194 ns
bm_find<std::deque<short>>/262144 53710 ns 24385 ns
bm_find<std::deque<short>>/1048576 216086 ns 96195 ns
bm_find<std::deque<int>>/1 6.03 ns 10.3 ns
bm_find<std::deque<int>>/2 15.6 ns 10.3 ns
bm_find<std::deque<int>>/3 19.1 ns 10.3 ns
bm_find<std::deque<int>>/4 22.3 ns 10.3 ns
bm_find<std::deque<int>>/5 23.5 ns 10.4 ns
bm_find<std::deque<int>>/6 23.1 ns 10.3 ns
bm_find<std::deque<int>>/7 23.7 ns 10.2 ns
bm_find<std::deque<int>>/8 24.5 ns 10.2 ns
bm_find<std::deque<int>>/16 27.9 ns 26.6 ns
bm_find<std::deque<int>>/64 42.6 ns 32.2 ns
bm_find<std::deque<int>>/512 123 ns 43.0 ns
bm_find<std::deque<int>>/4096 874 ns 93.5 ns
bm_find<std::deque<int>>/32768 7031 ns 751 ns
bm_find<std::deque<int>>/262144 57723 ns 6169 ns
bm_find<std::deque<int>>/1048576 230867 ns 35851 ns
bm_ranges_find<std::deque<char>>/1 5.97 ns 10.6 ns
bm_ranges_find<std::deque<char>>/2 16.0 ns 10.5 ns
bm_ranges_find<std::deque<char>>/3 19.5 ns 10.5 ns
bm_ranges_find<std::deque<char>>/4 21.1 ns 10.6 ns
bm_ranges_find<std::deque<char>>/5 22.8 ns 10.5 ns
bm_ranges_find<std::deque<char>>/6 22.8 ns 10.6 ns
bm_ranges_find<std::deque<char>>/7 23.4 ns 10.8 ns
bm_ranges_find<std::deque<char>>/8 24.1 ns 10.5 ns
bm_ranges_find<std::deque<char>>/16 26.9 ns 10.6 ns
bm_ranges_find<std::deque<char>>/64 50.2 ns 27.2 ns
bm_ranges_find<std::deque<char>>/512 126 ns 38.3 ns
bm_ranges_find<std::deque<char>>/4096 868 ns 53.8 ns
bm_ranges_find<std::deque<char>>/32768 6695 ns 161 ns
bm_ranges_find<std::deque<char>>/262144 54411 ns 1497 ns
bm_ranges_find<std::deque<char>>/1048576 241699 ns 6042 ns
bm_ranges_find<std::deque<short>>/1 6.39 ns 6.31 ns
bm_ranges_find<std::deque<short>>/2 15.8 ns 15.9 ns
bm_ranges_find<std::deque<short>>/3 19.0 ns 19.8 ns
bm_ranges_find<std::deque<short>>/4 20.8 ns 20.9 ns
bm_ranges_find<std::deque<short>>/5 21.8 ns 22.1 ns
bm_ranges_find<std::deque<short>>/6 23.0 ns 23.0 ns
bm_ranges_find<std::deque<short>>/7 23.2 ns 23.9 ns
bm_ranges_find<std::deque<short>>/8 23.7 ns 24.4 ns
bm_ranges_find<std::deque<short>>/16 26.6 ns 26.8 ns
bm_ranges_find<std::deque<short>>/64 43.4 ns 39.7 ns
bm_ranges_find<std::deque<short>>/512 131 ns 90.5 ns
bm_ranges_find<std::deque<short>>/4096 851 ns 523 ns
bm_ranges_find<std::deque<short>>/32768 7370 ns 3166 ns
bm_ranges_find<std::deque<short>>/262144 60778 ns 24814 ns
bm_ranges_find<std::deque<short>>/1048576 229288 ns 99273 ns
bm_ranges_find<std::deque<int>>/1 6.43 ns 10.2 ns
bm_ranges_find<std::deque<int>>/2 16.6 ns 10.2 ns
bm_ranges_find<std::deque<int>>/3 19.6 ns 10.2 ns
bm_ranges_find<std::deque<int>>/4 21.0 ns 10.2 ns
bm_ranges_find<std::deque<int>>/5 21.9 ns 10.4 ns
bm_ranges_find<std::deque<int>>/6 22.7 ns 10.2 ns
bm_ranges_find<std::deque<int>>/7 23.9 ns 10.2 ns
bm_ranges_find<std::deque<int>>/8 23.8 ns 10.2 ns
bm_ranges_find<std::deque<int>>/16 27.2 ns 27.1 ns
bm_ranges_find<std::deque<int>>/64 42.4 ns 32.4 ns
bm_ranges_find<std::deque<int>>/512 122 ns 43.0 ns
bm_ranges_find<std::deque<int>>/4096 895 ns 93.7 ns
bm_ranges_find<std::deque<int>>/32768 6890 ns 756 ns
bm_ranges_find<std::deque<int>>/262144 54025 ns 6102 ns
bm_ranges_find<std::deque<int>>/1048576 221558 ns 32783 ns
```
2023-12-15 17:10:16 +01:00
Stephan T. Lavavej
bfdc562d0c
[libc++] Fix copy-paste damage in ranges::rotate_copy and its test ( #74544 )
...
Found while running libc++'s tests with MSVC's STL.
`ranges::rotate_copy` takes `forward_iterator`s as this test's comment
banner correctly depicts. However, this test had bogus assertions
expecting that `ranges::rotate_copy` would be constrained away for
not-quite-**bidi** iterators. @philnik777 confirmed that these were
copy-paste relics from the `ranges::reverse_copy` test.
I fixed this by replacing the assertions with the test types that aren't
quite **forward** iterators/ranges. Additionally, I noticed that the
top-level `test()` function was missing coverage with the weakest
possible `forward_iterator<int*>`.
This revealed that the product code in `ranges_rotate_copy.h` was
similarly damaged. In addition to fixing it by taking `forward_iterator`
and `forward_range` as depicted in the Standard, this drops the
inclusion of `<__iterator/reverse_iterator.h>` as this algorithm doesn't
need `std::__reverse_range`.
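A short sketch of why forward iterators are enough for this algorithm. It uses the classic `std::rotate_copy`, which has the same forward-iterator requirement, so that the example does not need C++20; `rotate_forward_list_copy` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <forward_list>
#include <iterator>
#include <vector>

// Rotates a singly linked list (forward iterators only) into a vector.
inline std::vector<int> rotate_forward_list_copy() {
  std::forward_list<int> input{1, 2, 3, 4, 5};
  auto middle = std::next(input.begin(), 2); // points at the element 3
  std::vector<int> out;
  // Copies [middle, last) first, then [first, middle); no backward traversal is needed.
  std::rotate_copy(input.begin(), middle, input.end(), std::back_inserter(out));
  return out; // 3 4 5 1 2
}
```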
2023-12-06 02:29:09 -08:00
Louis Dionne
77a00c0d54
[libc++] Replace uses of _VSTD:: by std:: ( #74331 )
...
As part of the upcoming clang-formatting of libc++, this patch performs
the long desired removal of the _VSTD macro.
See https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
for the clang-format proposal.
2023-12-05 11:19:15 -05:00
Louis Dionne
4c19854222
[libc++] Rename _LIBCPP_INLINE_VISIBILITY to _LIBCPP_HIDE_FROM_ABI ( #74095 )
...
In preparation for running clang-format on the whole code base, we are
also removing mentions of the legacy _LIBCPP_INLINE_VISIBILITY macro in
favor of the newer _LIBCPP_HIDE_FROM_ABI.
We're still leaving the definition of _LIBCPP_INLINE_VISIBILITY to avoid
creating needless breakage in case some older patches are checked-in
with mentions of the old macro. After we branch for LLVM 18, we can do
another pass to clean up remaining uses of the macro that might have
gotten introduced by mistake (if any) and remove the macro itself at the
same time. This is just a minor convenience to smooth out the transition
as much as possible.
See
https://discourse.llvm.org/t/rfc-clang-formatting-all-of-libc-once-and-for-all
for the clang-format proposal.
2023-12-04 10:25:14 -05:00
Nikolas Klauser
ed27a4edb0
[libc++][PSTL] Implement std::equal ( #72448 )
...
Differential Revision: https://reviews.llvm.org/D157131
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2023-11-28 16:02:18 -05:00
Louis Dionne
936180a5e8
[libc++][NFC] Fix typo in comment
2023-11-27 10:35:07 -05:00
philnik777
1314e8774f
[libc++] Add missing headers to the modulemap ( #71127 )
...
I don't know when, but at some point we lost test coverage to ensure that
all the headers are in the modulemap. This adds a test to make sure all
the headers (excluding a few which shouldn't be part of the modulemap)
are at least mentioned. This also fixes a few headers which bit-rotted
while we were missing the coverage.
2023-11-27 00:14:59 +01:00
Anton Rydahl
aea7929b0a
[libc++] Unify __is_trivial_equality_predicate and __is_trivial_plus_operation into __desugars_to ( #68642 )
...
When working on an OpenMP offloading backend for standard parallel
algorithms (https://github.com/llvm/llvm-project/pull/66968 ) we noticed
the need for a generalization of `__is_trivial_plus_operation`. This patch
merges `__is_trivial_equality_predicate` and `__is_trivial_plus_operation`
into `__desugars_to`, and in the future we might extend the latter to support
other binary operations as well.
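The idea of a single trait keyed on an operation tag can be sketched as follows. All names in the `demo` namespace are hypothetical and only mirror the shape of the idea; the actual libc++ declarations may differ.

```cpp
#include <functional>
#include <type_traits>

namespace demo {
// Tags naming the canonical operation a function object may be equivalent to.
struct equal_tag {};
struct plus_tag {};

// Primary template: by default, nothing is known about a function object.
template <class OperationTag, class Func, class... Args>
struct desugars_to : std::false_type {};

// std::equal_to<T> called with two T is just `a == b`.
template <class T>
struct desugars_to<equal_tag, std::equal_to<T>, T, T> : std::true_type {};

// std::plus<T> called with two T is just `a + b`.
template <class T>
struct desugars_to<plus_tag, std::plus<T>, T, T> : std::true_type {};
} // namespace demo

static_assert(demo::desugars_to<demo::plus_tag, std::plus<int>, int, int>::value, "");
static_assert(!demo::desugars_to<demo::plus_tag, std::minus<int>, int, int>::value, "");
```

Supporting another binary operation then only requires a new tag and a specialization, rather than a new trait.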
Co-authored-by: Louis Dionne <ldionne.2@gmail.com>
2023-11-23 13:55:55 -05:00
Nikolas Klauser
c81bfc61da
[libc++] Optimize for_each for segmented iterators
...
```
---------------------------------------------------
Benchmark old new
---------------------------------------------------
bm_for_each/1 3.00 ns 2.98 ns
bm_for_each/2 4.53 ns 4.57 ns
bm_for_each/3 5.82 ns 5.82 ns
bm_for_each/4 6.94 ns 6.91 ns
bm_for_each/5 7.55 ns 7.75 ns
bm_for_each/6 7.06 ns 7.45 ns
bm_for_each/7 6.69 ns 7.14 ns
bm_for_each/8 6.86 ns 4.06 ns
bm_for_each/16 11.5 ns 5.73 ns
bm_for_each/64 43.7 ns 4.06 ns
bm_for_each/512 356 ns 7.98 ns
bm_for_each/4096 2787 ns 53.6 ns
bm_for_each/32768 20836 ns 438 ns
bm_for_each/262144 195362 ns 4945 ns
bm_for_each/1048576 685482 ns 19822 ns
```
Reviewed By: ldionne, Mordante, #libc
Spies: bgraur, sberg, arichardson, libcxx-commits
Differential Revision: https://reviews.llvm.org/D151274
2023-11-14 23:55:24 +01:00
Louis Dionne
acb9156266
[libc++][NFC] Fix license comment typo
...
Fixes #72024
2023-11-11 08:24:19 -10:00
Konstantin Varlamov
64d413efdd
[libc++][hardening] Rework macros for enabling the hardening mode. ( #70575 )
...
1. Instead of using individual "boolean" macros, have an "enum" macro
`_LIBCPP_HARDENING_MODE`. This avoids issues with macros being
mutually exclusive and makes overriding the hardening mode within a TU
more straightforward.
2. Rename the safe mode to debug-lite.
This brings the code in line with the RFC:
https://discourse.llvm.org/t/rfc-hardening-in-libc/73925
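A sketch of the "enum" macro approach from point 1. Every macro name and value below is hypothetical and chosen for illustration; the real libc++ macros and their values are not reproduced here.

```cpp
// Hypothetical mode values; exactly one of them is selected at a time.
#define DEMO_MODE_NONE 1
#define DEMO_MODE_FAST 2
#define DEMO_MODE_DEBUG 3

// A single macro selects the mode, so two modes can never be enabled together.
#ifndef DEMO_HARDENING_MODE
#  define DEMO_HARDENING_MODE DEMO_MODE_FAST
#endif

#if DEMO_HARDENING_MODE != DEMO_MODE_NONE && DEMO_HARDENING_MODE != DEMO_MODE_FAST && \
    DEMO_HARDENING_MODE != DEMO_MODE_DEBUG
#  error "DEMO_HARDENING_MODE must be one of the DEMO_MODE_* values"
#endif

// Individual checks are derived from the one selected mode.
constexpr bool cheap_checks_enabled     = DEMO_HARDENING_MODE >= DEMO_MODE_FAST;
constexpr bool expensive_checks_enabled = DEMO_HARDENING_MODE == DEMO_MODE_DEBUG;
```

With separate boolean macros, a translation unit would have to undefine one macro and define another to change the mode; here it only needs to define `DEMO_HARDENING_MODE` to a different value before the header is included.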
Fixes #65101
2023-11-08 09:10:00 -10:00
Louis Dionne
02540b2f6d
[libc++] Make sure ranges algorithms and views handle boolean-testable correctly ( #69378 )
...
Before this patch, we would fail to implicitly convert the result of
predicates to bool, which means we'd potentially perform a copy or move
construction of the boolean-testable, which isn't allowed. The same
holds true for comparing iterators against sentinels, which is allowed
to return a boolean-testable type.
We already had tests aiming to ensure correct handling of these types,
but they failed to provide appropriate coverage in several cases due to
guaranteed RVO. This patch fixes the tests, adds tests for missing
algorithms and views, and fixes the actual problems in the code.
Fixes #69074
2023-11-06 21:19:49 -10:00
Louis Dionne
979c19ab12
[libc++] Fix complexity guarantee in ranges::clamp ( #68413 )
...
This patch prevents us from calling the projection more than 3 times in
ranges::clamp, as required by the Standard.
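One way to stay within three projection calls is to project the value once and reuse the result. The following is a sketch under that assumption, not the libc++ code; `clamp_sketch` is a hypothetical name and C++17 is assumed for `std::invoke`.

```cpp
#include <functional>

// Hypothetical clamp with a projection: the projection runs at most three times.
template <class T, class Comp, class Proj>
const T& clamp_sketch(const T& value, const T& low, const T& high, Comp comp, Proj proj) {
  auto&& projected = std::invoke(proj, value);                // projection call 1
  if (std::invoke(comp, projected, std::invoke(proj, low)))   // projection call 2
    return low;
  if (std::invoke(comp, std::invoke(proj, high), projected))  // projection call 3
    return high;
  return value;
}
```

Projecting `value` separately in each comparison would cost four projection calls whenever the first comparison fails.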
Fixes #64717
2023-11-01 10:43:05 -04:00
Rajveer Singh Bharadwaj
dd4891318c
[libc++] Fix _CopySegment helper in ranges::copy(join_view, out) when called in a static assertion context ( #69593 )
...
Resolves Issue #69083
The `_CopySegment` helper for `ranges::copy(join_view, out)` is not
`constexpr`, which causes `libc++` to reject the call in a static
assertion context, as in the issue snippet.
2023-10-27 11:07:12 +02:00
Nikolas Klauser
5d7f346bd3
[libc++][PSTL] Implement std::rotate_copy
...
Reviewed By: #libc, ldionne
Spies: ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D155025
2023-10-24 14:02:37 +02:00
Nikolas Klauser
d2a46e6480
[libc++][PSTL] Implement std::move
...
Reviewed By: #libc, ldionne
Spies: ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D155330
2023-10-22 10:25:49 +02:00
Daniel Kutenin
ea9af5e7fd
[libc++] Add assertions for potential OOB reads in std::nth_element ( #67023 )
...
Same as https://reviews.llvm.org/D147089 but for std::nth_element
2023-10-18 20:22:17 -07:00
Louis Dionne
70fedaf89b
[libc++][NFC] Fix slightly incorrect comment in PSTL documentation
2023-10-13 17:27:44 -07:00
Anton Rydahl
f2b79ed9c6
[libcxx] Refactoring SIMD function names in PSTL CPU backend ( #69029 )
...
This PR addresses a smaller detail discussed in the code review for
https://github.com/llvm/llvm-project/pull/66968 . Currently, some
functions in the `libc++` PSTL CPU backend have a digit appended to
their name to indicate the number of input iterator arguments. However,
there is no need to change the name for each version as overloading can
be used instead. This PR will make the naming more consistent in the
CPU and the proposed OpenMP backend.
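A small sketch of the naming change: one overloaded name instead of digit-suffixed variants. The `demo::simd_walk` functions are hypothetical stand-ins written as plain loops; they are not the backend's actual functions.

```cpp
#include <cstddef>

namespace demo {
// One-sequence form (the kind of function that would otherwise carry a digit suffix).
template <class It, class F>
void simd_walk(It first, std::size_t n, F f) {
  for (std::size_t i = 0; i < n; ++i)
    f(first[i]);
}

// Two-sequence form: same name, selected by its parameter list.
template <class It1, class It2, class F>
void simd_walk(It1 first1, std::size_t n, It2 first2, F f) {
  for (std::size_t i = 0; i < n; ++i)
    f(first1[i], first2[i]);
}
} // namespace demo
```

The number of arguments at the call site picks the overload, so callers no longer encode the iterator count in the function name.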
2023-10-13 17:08:15 -07:00