llvm-project

Author	SHA1	Message	Date
Nikolas Klauser	a2042521a0	[libc++] Remove _AlgPolicy from std::copy and algorithms using std::copy (#115887 ) `std::copy` doesn't use the `_AlgPolicy` for anything other than calling itself with it, so we can just remove the argument. This also removes the need in a few other algorithms which had an `_AlgPolicy` argument only to call `copy`.	2024-11-12 23:03:52 +01:00
Nikolas Klauser	5b67372aec	[libc++] Remove a few unused includes from <__algorithm/find_end.h>	2024-11-12 22:11:15 +01:00
Nikolas Klauser	eab7be5d42	[libc++] Forward more algorithms to the classic algorithms (#114674 ) This partially addresses #105687.	2024-11-06 12:10:06 +01:00
Nikolas Klauser	c6f3b7bcd0	[libc++] Refactor the configuration macros to being always defined (#112094 ) This is a follow-up to #89178. This updates the `<__config_site>` macros.	2024-11-06 10:39:19 +01:00
Nikolas Klauser	e99c4906e4	[libc++] Granularize <cstddef> includes (#108696 )	2024-10-31 02:20:10 +01:00
Nikolas Klauser	0fb76bae6b	Reapply "[libc++] Simplify the implementation of std::sort a bit (#104902 )" (#114023 ) This reverts commit ef44e4659878f2. The patch was originally reverted because it was deemed to introduce a performance regression for small inputs, however it also fixed a previous performance regression for larger inputs. So overall, this patch is desirable.	2024-10-30 11:51:55 +01:00
Louis Dionne	8e6bba230e	[libc++][NFC] Rename fold.h to ranges_fold.h (#109696 ) This follows the pattern we use consistently for ranges algorithms. This is a re-application of 24bc3244d4e which had been reverted in f11abac65 due to unrelated failures.	2024-09-30 08:30:16 -04:00
Chris B	f11abac652	Revert "[libc++][modules] Rewrite the modulemap to have fewer top-level modules (#107638 )" (#110384 ) This reverts 3 commits: 45a09d1811d5d6597385ef02ecf2d4b7320c37c5 24bc3244d4e221f4e6740f45e2bf15a1441a3076 bc6bd3bc1e99c7ec9e22dff23b4f4373fa02cae3 The GitHub pre-merge CI has been broken since this PR went in. This change reverts it to see if I can get the pre-merge CI working again.	2024-09-28 21:47:09 -05:00
Louis Dionne	24bc3244d4	[libc++][NFC] Rename fold.h to ranges_fold.h (#109696 ) This follows the pattern we use consistently for ranges algorithms.	2024-09-27 01:02:21 -04:00
Louis Dionne	ef44e46598	Revert "[libc++] Simplify the implementation of std::sort a bit (#104902 )" This reverts commit d4ffccfce103b01401b8a9222e373f2d404f8439, which caused a performance regression that needs to be investigated further.	2024-09-20 09:48:58 -04:00
Thurston Dang	0ea40bf021	Revert "[libc++] Explicitly convert to masks in SIMD code (#107983 )" This reverts commit 1603f99a37c5b179a21dbb8000c39a471a950927. Reason: buildbot breakage e.g., https://lab.llvm.org/buildbot/#/builders/55/builds/2061 llvm-libc++-shared.cfg.in :: std/algorithms/alg.nonmodifying/alg.starts_with/ranges.starts_with.pass.cpp llvm-libc++-shared.cfg.in :: std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp llvm-libc++-shared.cfg.in :: std/algorithms/alg.nonmodifying/mismatch/ranges_mismatch.pass.cpp ... (Buildbot re-run passed with the previous revision, 1fc288bf481726393c73133eef9aa73c0f78312e)	2024-09-17 21:52:33 +00:00
Nikolas Klauser	1603f99a37	[libc++] Explicitly convert to masks in SIMD code (#107983 ) This makes it clearer when we use masks and avoids MSan complaining.	2024-09-17 12:04:54 +02:00
Louis Dionne	09e3a36058	[libc++][modules] Fix missing and incorrect includes (#108850 ) This patch adds a large number of missing includes in the libc++ headers and the test suite. Those were found as part of the effort to move towards a mostly monolithic top-level std module.	2024-09-16 15:06:20 -04:00
A. Jiang	94e7c0b051	[libc++] Remove get_temporary_buffer and return_temporary_buffer (#100914 ) Works towards P0619R4 / #99985. The use of `std::get_temporary_buffer` and `std::return_temporary_buffer` are replaced with `unique_ptr`-based RAII buffer holder. Escape hatches: - `_LIBCPP_ENABLE_CXX20_REMOVED_TEMPORARY_BUFFER` restores `std::get_temporary_buffer` and `std::return_temporary_buffer`. Drive-by changes: - In `<syncstream>`, states that `get_temporary_buffer` is now removed, because `<syncstream>` is added in C++20.	2024-09-16 11:53:05 -04:00
Nikolas Klauser	17e0686ab1	[libc++][NFC] Use [[__nodiscard__]] unconditionally (#80454 ) `__has_cpp_attribute(__nodiscard__)` is always true now, so we might as well replace `_LIBCPP_NODISCARD`. It's one less macro that can result in bad diagnostics.	2024-09-12 21:18:43 +02:00
Louis Dionne	d6832a611a	[libc++][modules] Modularize <cstddef> (#107254 ) Many headers include `<cstddef>` just for size_t, and pulling in additional content (e.g. the traits used for std::byte) is unnecessary. To solve this problem, this patch splits up `<cstddef>` into subcomponents so that headers can include only the parts that they actually require. This has the added benefit of making the modules build a lot stricter with respect to IWYU, and also providing a canonical location where we define `std::size_t` and friends (which were previously defined in multiple headers like `<cstddef>` and `<ctime>`). After this patch, there's still many places in the codebase where we include `<cstddef>` when `<__cstddef/size_t.h>` would be sufficient. This patch focuses on removing `<cstddef>` includes from __type_traits to make these headers non-circular with `<cstddef>`. Additional refactorings can be tackled separately.	2024-09-05 08:28:33 -04:00
Louis Dionne	0df78123fd	[libc++] Add missing include to three_way_comp_ref_type.h We were using a `_LIBCPP_ASSERT_FOO` macro without including `<__assert>`. rdar://134425695	2024-08-27 14:22:49 -04:00
Nikolas Klauser	d4ffccfce1	[libc++] Simplify the implementation of std::sort a bit (#104902 ) This does a few things to canonicalize the library a bit. Specifically - use `__desugars_to_v` instead of the custom `__is_simple_comparator` - make `__use_branchless_sort` an inline variable - remove the `_maybe_branchless` versions of the `__sortN` functions and overload based on whether we can do branchless sorting instead.	2024-08-27 16:54:05 +02:00
Louis Dionne	f73050e722	[libc++] Fix several double-moves in the code base (#104616 ) This patch hardens the "test iterators" we use to test algorithms by ensuring that they don't get double-moved. As a result of this hardening, the tests started reporting multiple failures where we would double-move iterators, which are being fixed in this patch. In particular: - Fixed a double-move in pstl.partition - Add coverage for begin()/end() in subrange tests - Fix tests for ranges::ends_with and ranges::contains, which were incorrectly calling begin() twice on the same subrange containing non-copyable input iterators. Fixes #100709	2024-08-20 14:36:11 -04:00
Louis Dionne	257831582c	[libc++] Check correctly ref-qualified __is_callable in algorithms (#101553 ) We were only checking that the comparator was rvalue callable, when in reality the algorithms always call comparators as lvalues. This patch also refactors the tests for callable requirements and expands it to a few missing algorithms. This is take 2 of #73451, which was reverted because it broke some CI bots. The issue was that we checked __is_callable with arguments in the wrong order inside std::upper_bound. This has now been fixed and a test was added. Fixes #69554	2024-08-05 11:23:06 -04:00
Nikolas Klauser	d07fdf9779	[libc++] Optimize lexicographical_compare (#65279 ) If the comparison operation is equivalent to < and that is a total order, we know that we can use equality comparison on that type instead to extract some information. Furthermore, if equality comparison on that type is trivial, the user can't observe that we're calling it. So instead of using the user-provided total order, we use std::mismatch, which uses equality comparison (and is vertorized). Additionally, if the type is trivially lexicographically comparable, we can go one step further and use std::memcmp directly instead of calling std::mismatch. Benchmarks: ``` ------------------------------------------------------------------------------------- Benchmark old new ------------------------------------------------------------------------------------- bm_lexicographical_compare<unsigned char>/1 1.17 ns 2.34 ns bm_lexicographical_compare<unsigned char>/2 1.64 ns 2.57 ns bm_lexicographical_compare<unsigned char>/3 2.23 ns 2.58 ns bm_lexicographical_compare<unsigned char>/4 2.82 ns 2.57 ns bm_lexicographical_compare<unsigned char>/5 3.34 ns 2.11 ns bm_lexicographical_compare<unsigned char>/6 3.94 ns 2.21 ns bm_lexicographical_compare<unsigned char>/7 4.56 ns 2.11 ns bm_lexicographical_compare<unsigned char>/8 5.25 ns 2.11 ns bm_lexicographical_compare<unsigned char>/16 9.88 ns 2.11 ns bm_lexicographical_compare<unsigned char>/64 38.9 ns 2.36 ns bm_lexicographical_compare<unsigned char>/512 317 ns 6.54 ns bm_lexicographical_compare<unsigned char>/4096 2517 ns 41.4 ns bm_lexicographical_compare<unsigned char>/32768 20052 ns 488 ns bm_lexicographical_compare<unsigned char>/262144 159579 ns 4409 ns bm_lexicographical_compare<unsigned char>/1048576 640456 ns 20342 ns bm_lexicographical_compare<signed char>/1 1.18 ns 2.37 ns bm_lexicographical_compare<signed char>/2 1.65 ns 2.60 ns bm_lexicographical_compare<signed char>/3 2.23 ns 2.83 ns bm_lexicographical_compare<signed char>/4 2.81 ns 3.06 ns bm_lexicographical_compare<signed char>/5 3.35 ns 3.30 ns bm_lexicographical_compare<signed char>/6 3.90 ns 3.99 ns bm_lexicographical_compare<signed char>/7 4.56 ns 3.78 ns bm_lexicographical_compare<signed char>/8 5.20 ns 4.02 ns bm_lexicographical_compare<signed char>/16 9.80 ns 6.21 ns bm_lexicographical_compare<signed char>/64 39.0 ns 3.16 ns bm_lexicographical_compare<signed char>/512 318 ns 7.58 ns bm_lexicographical_compare<signed char>/4096 2514 ns 47.4 ns bm_lexicographical_compare<signed char>/32768 20096 ns 504 ns bm_lexicographical_compare<signed char>/262144 156617 ns 4146 ns bm_lexicographical_compare<signed char>/1048576 624265 ns 19810 ns bm_lexicographical_compare<int>/1 1.15 ns 2.12 ns bm_lexicographical_compare<int>/2 1.60 ns 2.36 ns bm_lexicographical_compare<int>/3 2.21 ns 2.59 ns bm_lexicographical_compare<int>/4 2.74 ns 2.83 ns bm_lexicographical_compare<int>/5 3.26 ns 3.06 ns bm_lexicographical_compare<int>/6 3.81 ns 4.53 ns bm_lexicographical_compare<int>/7 4.41 ns 4.72 ns bm_lexicographical_compare<int>/8 5.08 ns 2.36 ns bm_lexicographical_compare<int>/16 9.54 ns 3.08 ns bm_lexicographical_compare<int>/64 37.8 ns 4.71 ns bm_lexicographical_compare<int>/512 309 ns 24.6 ns bm_lexicographical_compare<int>/4096 2422 ns 204 ns bm_lexicographical_compare<int>/32768 19362 ns 1947 ns bm_lexicographical_compare<int>/262144 155727 ns 19793 ns bm_lexicographical_compare<int>/1048576 623614 ns 80180 ns bm_ranges_lexicographical_compare<unsigned char>/1 1.07 ns 2.35 ns bm_ranges_lexicographical_compare<unsigned char>/2 1.72 ns 2.13 ns bm_ranges_lexicographical_compare<unsigned char>/3 2.46 ns 2.12 ns bm_ranges_lexicographical_compare<unsigned char>/4 3.17 ns 2.12 ns bm_ranges_lexicographical_compare<unsigned char>/5 3.86 ns 2.12 ns bm_ranges_lexicographical_compare<unsigned char>/6 4.55 ns 2.12 ns bm_ranges_lexicographical_compare<unsigned char>/7 5.25 ns 2.12 ns bm_ranges_lexicographical_compare<unsigned char>/8 5.95 ns 2.13 ns bm_ranges_lexicographical_compare<unsigned char>/16 11.7 ns 2.13 ns bm_ranges_lexicographical_compare<unsigned char>/64 45.5 ns 2.36 ns bm_ranges_lexicographical_compare<unsigned char>/512 366 ns 6.35 ns bm_ranges_lexicographical_compare<unsigned char>/4096 2886 ns 40.9 ns bm_ranges_lexicographical_compare<unsigned char>/32768 23054 ns 489 ns bm_ranges_lexicographical_compare<unsigned char>/262144 185302 ns 4339 ns bm_ranges_lexicographical_compare<unsigned char>/1048576 741576 ns 19430 ns bm_ranges_lexicographical_compare<signed char>/1 1.10 ns 2.12 ns bm_ranges_lexicographical_compare<signed char>/2 1.66 ns 2.35 ns bm_ranges_lexicographical_compare<signed char>/3 2.23 ns 2.58 ns bm_ranges_lexicographical_compare<signed char>/4 2.82 ns 2.82 ns bm_ranges_lexicographical_compare<signed char>/5 3.34 ns 3.06 ns bm_ranges_lexicographical_compare<signed char>/6 3.92 ns 3.99 ns bm_ranges_lexicographical_compare<signed char>/7 4.64 ns 4.10 ns bm_ranges_lexicographical_compare<signed char>/8 5.21 ns 4.61 ns bm_ranges_lexicographical_compare<signed char>/16 9.79 ns 7.42 ns bm_ranges_lexicographical_compare<signed char>/64 38.9 ns 2.93 ns bm_ranges_lexicographical_compare<signed char>/512 317 ns 7.31 ns bm_ranges_lexicographical_compare<signed char>/4096 2500 ns 47.5 ns bm_ranges_lexicographical_compare<signed char>/32768 19940 ns 496 ns bm_ranges_lexicographical_compare<signed char>/262144 159166 ns 4393 ns bm_ranges_lexicographical_compare<signed char>/1048576 638206 ns 19786 ns bm_ranges_lexicographical_compare<int>/1 1.10 ns 2.12 ns bm_ranges_lexicographical_compare<int>/2 1.64 ns 3.04 ns bm_ranges_lexicographical_compare<int>/3 2.23 ns 2.58 ns bm_ranges_lexicographical_compare<int>/4 2.81 ns 2.81 ns bm_ranges_lexicographical_compare<int>/5 3.35 ns 3.05 ns bm_ranges_lexicographical_compare<int>/6 3.94 ns 4.60 ns bm_ranges_lexicographical_compare<int>/7 4.60 ns 4.81 ns bm_ranges_lexicographical_compare<int>/8 5.19 ns 2.35 ns bm_ranges_lexicographical_compare<int>/16 9.85 ns 2.87 ns bm_ranges_lexicographical_compare<int>/64 38.9 ns 4.70 ns bm_ranges_lexicographical_compare<int>/512 318 ns 24.5 ns bm_ranges_lexicographical_compare<int>/4096 2494 ns 202 ns bm_ranges_lexicographical_compare<int>/32768 20000 ns 1939 ns bm_ranges_lexicographical_compare<int>/262144 160433 ns 19730 ns bm_ranges_lexicographical_compare<int>/1048576 642636 ns 80760 ns ```	2024-08-04 10:02:43 +02:00
Louis Dionne	451bba6fbf	[libc++] Revert "Check correctly ref-qualified __is_callable in algorithms (#73451 )" This reverts commit 8d151f804ff43aaed1edf810bb2a07607b8bba14, which broke some build bots. I think that is caused by an invalid argument order when checking __is_comparable in upper_bound.	2024-08-01 15:56:06 -04:00
Nhat Nguyen	8d151f804f	[libc++] Check correctly ref-qualified __is_callable in algorithms (#73451 ) We were only checking that the comparator was rvalue callable, when in reality the algorithms always call comparators as lvalues. This patch also refactors the tests for callable requirements and expands it to a few missing algorithms. Fixes #69554	2024-08-01 14:08:21 -04:00
Christopher Di Bella	d10dc5a06f	[libc++] Remove dedicated namespaces for ranges functions (#76543 ) We originally put implementation-detail function objects into individual namespaces for `std::ranges` without a good reason for doing so. This practice was continued, presumably because there was prior art. Since there's no reason to keep these namespaces, this commit removes them, which will slightly impact binary size. This commit does not apply to CPOs, some of which need additional work.	2024-08-01 08:54:06 -04:00
Hewill Kang	5b6b48800e	[libc++][NFC] Remove two unused implementation details `__find_end` (#100685 ) Those two `__find_end` functions are no longer used after 101d1e9b3c86. After that commit, `std::find_end` started dispatching to `__find_end_classic`, and `ranges::find_end` to `__find_end_impl`, which means that the two `__find_end` functions were no longer necessary. Fixes #100569	2024-07-31 10:34:19 -04:00
nicole mazzuca	04760bfadb	[libc++][ranges] P1223R5: `find_last` (#99312 ) Implements [P1223R5][] completely. Includes an implementation of `find_last`, `find_last_if`, and `find_last_if_not`. [P1223R5]: https://wg21.link/p1223r5	2024-07-19 09:42:16 -07:00
Iuri Chaer	a0662176a9	[libc++] Speed up set_intersection() by fast-forwarding over ranges of non-matching elements with one-sided binary search. (#75230 ) One-sided binary search, aka meta binary search, has been in the public domain for decades, and has the general advantage of being constant time in the best case, with the downside of executing at most 2log(N) comparisons vs classic binary search's exact log(N). There are two scenarios in which it really shines: the first one is when operating over non-random-access iterators, because the classic algorithm requires knowing the container's size upfront, which adds N iterator increments to the complexity. The second one is when traversing the container in order, trying to fast-forward to the next value: in that case the classic algorithm requires at least O(Nlog(N)) comparisons and, for non-random-access iterators, O(N^2) iterator increments, whereas the one-sided version will yield O(N) operations on both counts, with a best-case of O(log(N)) comparisons which is very common in practice.	2024-07-18 16:11:24 -04:00
Louis Dionne	e2c2ffbe7a	[libc++][NFC] Run clang-format on libcxx/include again (#95874 ) As time went by, a few files have become mis-formatted w.r.t. clang-format. This was made worse by the fact that formatting was not being enforced in extensionless headers. This commit simply brings all of libcxx/include in-line with clang-format again. We might have to do this from time to time as we update our clang-format version, but frankly this is really low effort now that we've formatted everything once.	2024-06-18 09:13:45 -04:00
Louis Dionne	acb896a344	[libc++] Remove unnecessary #ifdef guards around PSTL implementation details (#95268 ) We want the PSTL implementation details to be available regardless of the Standard mode or whether the experimental PSTL is enabled. This patch guards the inclusion of the PSTL to the top-level headers that define the public API in `__numeric` and `__algorithm`.	2024-06-12 17:25:43 -04:00
Louis Dionne	fe4cd104a8	[libc++][NFC] Fix typo in concept PSTL concept check	2024-06-12 12:31:05 -04:00
Louis Dionne	9540950a45	[libc++] Overhaul the PSTL dispatching mechanism (#88131 ) The experimental PSTL's current dispatching mechanism was designed with flexibility in mind. However, while reviewing the in-progress OpenMP backend, I realized that the dispatching mechanism based on ADL and default definitions in the frontend had several downsides. To name a few: 1. The dispatching of an algorithm to the back-end and its default implementation is bundled together via `_LIBCPP_PSTL_CUSTOMIZATION_POINT`. This makes the dispatching really confusing and leads to annoyances such as variable shadowing and weird lambda captures in the front-end. 2. The distinction between back-end functions and front-end algorithms is not as clear as it could be, which led us to call one where we meant the other in a few cases. This is bad due to the exception requirements of the PSTL: calling a front-end algorithm inside the implementation of a back-end is incorrect for exception-safety. 3. There are two levels of back-end dispatching in the PSTL, which treat CPU backends as a special case. This was confusing and not as flexible as we'd like. For example, there was no straightforward way to dispatch all uses of `unseq` to a specific back-end from the OpenMP backend, or for CPU backends to fall back on each other. This patch rewrites the backend dispatching mechanism to solve these problems, but doesn't touch any of the actual implementation of algorithms. Specifically, this rewrite has the following characteristics: - There is a single level of backend dispatching, however partial backends can be stacked to provide a full implementation of the PSTL. The two-level dispatching that was used for CPU-based backends is handled by providing CPU-based basis operations as simple helpers that can easily be reused when defining any PSTL backend. - The default definitions for algorithms are separated from their dispatching logic. - The front-end is thus simplified a whole lot and made very consistent for all algorithms, which makes it easier to audit the front-end for things like exception-correctness, appropriate forwarding, etc. Fixes #70718	2024-06-12 12:24:34 -04:00
Zibi Sarbinowski	ffc3a6b286	[libc++] Fix endianness for algorithm mismatch (#93082 ) This PR is required to fix `std/algorithms/alg.nonmodifying/mismatch/mismatch.pass.cpp` test for big endian platrofrms such as z/OS.	2024-06-11 08:29:12 -04:00
Louis Dionne	e406d5ed9c	[libc++][pstl] Merge all frontend functions for the PSTL (#89219 ) This is an intermediate step towards the PSTL dispatching mechanism rework. It will make it a lot easier to track the upcoming front-end changes. After the rework, there are basically no implementation details in the front-end, so the definition of each algorithm will become much simpler. Otherwise, it wouldn't make sense to define all the algorithms in the same header.	2024-05-27 17:51:12 -04:00
Louis Dionne	72417920d3	[libc++] Remove a few unused includes of trivially_copyable.h (#93200 )	2024-05-23 15:58:51 -04:00
Louis Dionne	bd3f5a4bd3	[libc++][pstl] Improve exception handling (#88998 ) There were various places where we incorrectly handled exceptions in the PSTL. Typical issues were missing `noexcept` and taking iterators by value instead of by reference. This patch fixes those inconsistent and incorrect instances, and adds proper tests for all of those. Note that the previous tests were often incorrectly turned into no-ops by the compiler due to copy ellision, which doesn't happen with these new tests.	2024-05-22 12:39:21 -07:00
zibi2	af57ad6536	[libc++][z/OS] Correct a definition of __native_vector_size (#91995 ) Fix `std/ranges/range.adaptors/range.lazy.split/general.pass.cpp` which started failing on z/OS after this [commit](https://github.com/llvm/llvm-project/commit/985c1a44f8d49e0af). This test case is passing on other platforms such as AIX. This is because the `__ALTIVEC__` macro is defined and `__mismatch` under `_LIBCPP_VECTORIZE_ALGORITHMS` guard is compiled out. However, on z/OS `_LIBCPP_VECTORIZE_ALGORITHMS` is defined. Analyzing the algorithm of `__mismatch` shows that the culprit is the definition of `__native_vector_size` which was defined wrongly as 1. This PR corrects the definition of `__native_vector_size` and fixes the affected test.	2024-05-16 08:58:44 -04:00
Nikolas Klauser	05cc2d5fe1	[libc++] Vectorize std::mismatch with trivially equality comparable types (#87716 )	2024-05-11 23:32:48 +02:00
Nikolas Klauser	840032419d	[libc++][NFC] Rename __find_impl to __find (#90163 ) For most algorithms we've just added underscores to the detail function. This changes `std::find` to match that pattern.	2024-04-27 09:51:59 +02:00
Nikolas Klauser	83bc7b5771	[libc++] Remove _LIBCPP_DISABLE_NODISCARD_EXTENSIONS and refactor the tests (#87094 ) This also adds a few tests that were missing.	2024-04-22 22:13:58 +02:00
Louis Dionne	0e08bce142	[libc++][pstl] Move the CPU algorithm implementations to __pstl (#89109 ) This colocates the CPU algorithms closer to the rest of the PSTL implementation details.	2024-04-18 07:50:57 -04:00
Louis Dionne	d423d80e56	[libc++][pstl] Promote CPU backends to top-level backends (#88968 ) This patch removes the two-level backend dispatching mechanism we had in the PSTL. Instead of selecting both a PSTL backend and a PSTL CPU backend, we now only select a top-level PSTL backend. This greatly simplifies the PSTL configuration layer. While this patch technically removes some flexibility from the PSTL configuration mechanism because CPU backends are not considered separately, it opens the door to a much more powerful configuration mechanism based on chained backends in a follow-up patch. This is a step towards overhauling the PSTL dispatching mechanism.	2024-04-17 13:36:53 -04:00
Louis Dionne	d57907d0b4	[libc++] Add missing iterator requirement checks in the PSTL (#88127 ) Also add tests for those, and add a few missing requirements to testing iterators in the test suite.	2024-04-17 08:21:48 -04:00
Louis Dionne	5b811562a5	[libc++] Rename __cpu_traits functions (#88741 ) Functions inside __cpu_traits were needlessly prefixed with __parallel, which doesn't serve a real purpose anymore now that they are inside a traits class.	2024-04-16 10:33:39 +02:00
Louis Dionne	a3ce29f7bb	[libc++][PSTL] Introduce cpu traits (#88134 ) Currently, CPU backends in the PSTL are created by defining functions in the __par_backend namespace. Then, the PSTL includes the CPU backend that gets configured via CMake and gets those definitions. This prevents CPU backends from easily co-existing and is a bit confusing. To solve this problem, this patch introduces the notion of __cpu_traits, which is a cheap encapsulation of the basis operations required to implement a CPU-based PSTL. Different backends can now define their own tag and coexist, and the CPU-based PSTL will simply use __cpu_traits to dispatch to the right implementation of e.g. __for_each. Note that this patch doesn't change the actual implementation of the backends in any way, it only modifies how that implementation is accessed to implement PSTL algorithms. This patch is a step towards #88131.	2024-04-15 10:30:00 -04:00
Mark de Wever	1a895bd95e	[libc++] Marks a variable const. (#88562 ) This removes a TODO from the code base.	2024-04-13 13:45:59 +02:00
Mark de Wever	0ad663ead1	[libc++] Removes Clang-16 support. (#87810 ) With the release of Clang-18 we no longer officially support Clang-16.	2024-04-10 17:51:02 +02:00
Nikolas Klauser	935e699173	[libc++] Optimize ranges::minmax (#87335 ) This allows Clang to vectorize the loop. ``` --------------------------------------------------------------------- Benchmark old new --------------------------------------------------------------------- BM_std_minmax<char>/1 0.659 ns 1.41 ns BM_std_minmax<char>/2 1.08 ns 2.16 ns BM_std_minmax<char>/3 2.16 ns 2.96 ns BM_std_minmax<char>/4 2.82 ns 3.81 ns BM_std_minmax<char>/5 3.43 ns 4.69 ns BM_std_minmax<char>/6 4.08 ns 5.63 ns BM_std_minmax<char>/7 4.75 ns 6.51 ns BM_std_minmax<char>/8 5.42 ns 7.41 ns BM_std_minmax<char>/9 6.05 ns 8.34 ns BM_std_minmax<char>/10 6.68 ns 9.29 ns BM_std_minmax<char>/11 7.47 ns 10.6 ns BM_std_minmax<char>/12 7.95 ns 11.4 ns BM_std_minmax<char>/13 8.64 ns 12.4 ns BM_std_minmax<char>/14 9.35 ns 13.4 ns BM_std_minmax<char>/15 10.1 ns 14.4 ns BM_std_minmax<char>/16 10.6 ns 2.25 ns BM_std_minmax<char>/17 11.3 ns 2.82 ns BM_std_minmax<char>/18 11.8 ns 3.71 ns BM_std_minmax<char>/19 12.6 ns 4.52 ns BM_std_minmax<char>/20 13.2 ns 5.47 ns BM_std_minmax<char>/21 14.1 ns 6.67 ns BM_std_minmax<char>/22 14.5 ns 7.78 ns BM_std_minmax<char>/23 15.1 ns 8.67 ns BM_std_minmax<char>/24 15.7 ns 9.68 ns BM_std_minmax<char>/25 16.4 ns 10.7 ns BM_std_minmax<char>/26 17.1 ns 11.7 ns BM_std_minmax<char>/27 17.8 ns 12.8 ns BM_std_minmax<char>/28 18.4 ns 14.1 ns BM_std_minmax<char>/29 19.0 ns 15.0 ns BM_std_minmax<char>/30 19.6 ns 16.0 ns BM_std_minmax<char>/31 20.2 ns 17.0 ns BM_std_minmax<char>/32 20.8 ns 2.46 ns BM_std_minmax<char>/64 41.5 ns 2.97 ns BM_std_minmax<char>/512 340 ns 6.05 ns BM_std_minmax<char>/1024 667 ns 8.83 ns BM_std_minmax<char>/4000 2571 ns 28.6 ns BM_std_minmax<char>/4096 2632 ns 25.8 ns BM_std_minmax<char>/5500 3554 ns 51.1 ns BM_std_minmax<char>/64000 41175 ns 480 ns BM_std_minmax<char>/65536 42039 ns 490 ns BM_std_minmax<char>/70000 44931 ns 528 ns BM_std_minmax<short>/1 0.708 ns 1.20 ns BM_std_minmax<short>/2 1.18 ns 1.78 ns BM_std_minmax<short>/3 1.98 ns 2.42 ns BM_std_minmax<short>/4 2.47 ns 3.05 ns BM_std_minmax<short>/5 3.09 ns 3.72 ns BM_std_minmax<short>/6 3.49 ns 4.37 ns BM_std_minmax<short>/7 4.24 ns 5.03 ns BM_std_minmax<short>/8 4.65 ns 2.12 ns BM_std_minmax<short>/9 5.34 ns 2.51 ns BM_std_minmax<short>/10 5.82 ns 3.18 ns BM_std_minmax<short>/11 6.36 ns 3.97 ns BM_std_minmax<short>/12 6.73 ns 4.68 ns BM_std_minmax<short>/13 7.59 ns 5.49 ns BM_std_minmax<short>/14 7.77 ns 6.45 ns BM_std_minmax<short>/15 8.54 ns 7.55 ns BM_std_minmax<short>/16 8.74 ns 2.38 ns BM_std_minmax<short>/17 9.59 ns 2.76 ns BM_std_minmax<short>/18 9.88 ns 3.37 ns BM_std_minmax<short>/19 10.7 ns 4.17 ns BM_std_minmax<short>/20 10.9 ns 4.88 ns BM_std_minmax<short>/21 12.1 ns 5.70 ns BM_std_minmax<short>/22 12.6 ns 6.64 ns BM_std_minmax<short>/23 13.5 ns 7.72 ns BM_std_minmax<short>/24 13.2 ns 2.87 ns BM_std_minmax<short>/25 14.2 ns 3.10 ns BM_std_minmax<short>/26 14.2 ns 3.59 ns BM_std_minmax<short>/27 15.4 ns 4.35 ns BM_std_minmax<short>/28 15.3 ns 5.10 ns BM_std_minmax<short>/29 16.2 ns 5.87 ns BM_std_minmax<short>/30 16.2 ns 6.88 ns BM_std_minmax<short>/31 17.0 ns 7.78 ns BM_std_minmax<short>/32 17.2 ns 3.45 ns BM_std_minmax<short>/64 34.1 ns 3.35 ns BM_std_minmax<short>/512 279 ns 8.37 ns BM_std_minmax<short>/1024 549 ns 14.2 ns BM_std_minmax<short>/4000 2111 ns 50.1 ns BM_std_minmax<short>/4096 2167 ns 47.9 ns BM_std_minmax<short>/5500 2895 ns 69.7 ns BM_std_minmax<short>/64000 33454 ns 953 ns BM_std_minmax<short>/65536 34474 ns 970 ns BM_std_minmax<short>/70000 36691 ns 1037 ns BM_std_minmax<int>/1 0.664 ns 1.17 ns BM_std_minmax<int>/2 1.11 ns 1.69 ns BM_std_minmax<int>/3 2.36 ns 2.29 ns BM_std_minmax<int>/4 2.53 ns 2.91 ns BM_std_minmax<int>/5 3.23 ns 3.56 ns BM_std_minmax<int>/6 3.56 ns 4.23 ns BM_std_minmax<int>/7 4.28 ns 4.91 ns BM_std_minmax<int>/8 4.60 ns 5.60 ns BM_std_minmax<int>/9 5.38 ns 6.31 ns BM_std_minmax<int>/10 5.69 ns 7.03 ns BM_std_minmax<int>/11 6.41 ns 7.70 ns BM_std_minmax<int>/12 6.73 ns 8.39 ns BM_std_minmax<int>/13 7.38 ns 9.07 ns BM_std_minmax<int>/14 7.74 ns 9.79 ns BM_std_minmax<int>/15 8.53 ns 10.5 ns BM_std_minmax<int>/16 8.79 ns 11.2 ns BM_std_minmax<int>/17 9.63 ns 12.0 ns BM_std_minmax<int>/18 9.84 ns 12.7 ns BM_std_minmax<int>/19 10.6 ns 13.5 ns BM_std_minmax<int>/20 11.0 ns 14.3 ns BM_std_minmax<int>/21 11.7 ns 15.0 ns BM_std_minmax<int>/22 12.0 ns 15.7 ns BM_std_minmax<int>/23 13.1 ns 16.5 ns BM_std_minmax<int>/24 13.0 ns 17.3 ns BM_std_minmax<int>/25 13.7 ns 17.9 ns BM_std_minmax<int>/26 14.0 ns 18.6 ns BM_std_minmax<int>/27 14.8 ns 19.4 ns BM_std_minmax<int>/28 15.1 ns 20.3 ns BM_std_minmax<int>/29 15.8 ns 20.9 ns BM_std_minmax<int>/30 16.1 ns 21.7 ns BM_std_minmax<int>/31 16.9 ns 22.5 ns BM_std_minmax<int>/32 17.2 ns 3.40 ns BM_std_minmax<int>/64 33.9 ns 4.04 ns BM_std_minmax<int>/512 275 ns 14.6 ns BM_std_minmax<int>/1024 541 ns 27.5 ns BM_std_minmax<int>/4000 2093 ns 96.3 ns BM_std_minmax<int>/4096 2146 ns 98.3 ns BM_std_minmax<int>/5500 2866 ns 157 ns BM_std_minmax<int>/64000 33619 ns 1954 ns BM_std_minmax<int>/65536 34252 ns 2009 ns BM_std_minmax<int>/70000 36618 ns 2125 ns BM_std_minmax<long long>/1 0.709 ns 1.19 ns BM_std_minmax<long long>/2 1.01 ns 1.65 ns BM_std_minmax<long long>/3 2.14 ns 2.21 ns BM_std_minmax<long long>/4 2.45 ns 2.83 ns BM_std_minmax<long long>/5 3.09 ns 3.47 ns BM_std_minmax<long long>/6 3.44 ns 4.11 ns BM_std_minmax<long long>/7 4.16 ns 4.79 ns BM_std_minmax<long long>/8 4.54 ns 5.47 ns BM_std_minmax<long long>/9 5.37 ns 6.20 ns BM_std_minmax<long long>/10 5.71 ns 6.93 ns BM_std_minmax<long long>/11 6.00 ns 7.60 ns BM_std_minmax<long long>/12 6.43 ns 8.27 ns BM_std_minmax<long long>/13 7.01 ns 8.94 ns BM_std_minmax<long long>/14 7.45 ns 9.65 ns BM_std_minmax<long long>/15 8.16 ns 10.4 ns BM_std_minmax<long long>/16 8.46 ns 5.22 ns BM_std_minmax<long long>/17 9.16 ns 5.22 ns BM_std_minmax<long long>/18 9.53 ns 5.52 ns BM_std_minmax<long long>/19 10.2 ns 6.02 ns BM_std_minmax<long long>/20 10.5 ns 6.89 ns BM_std_minmax<long long>/21 11.3 ns 7.83 ns BM_std_minmax<long long>/22 11.6 ns 8.59 ns BM_std_minmax<long long>/23 12.3 ns 9.91 ns BM_std_minmax<long long>/24 12.6 ns 10.1 ns BM_std_minmax<long long>/25 13.2 ns 12.0 ns BM_std_minmax<long long>/26 13.6 ns 13.5 ns BM_std_minmax<long long>/27 14.2 ns 14.8 ns BM_std_minmax<long long>/28 14.7 ns 15.9 ns BM_std_minmax<long long>/29 15.3 ns 16.6 ns BM_std_minmax<long long>/30 15.8 ns 17.3 ns BM_std_minmax<long long>/31 16.3 ns 18.2 ns BM_std_minmax<long long>/32 16.7 ns 7.18 ns BM_std_minmax<long long>/64 33.1 ns 11.5 ns BM_std_minmax<long long>/512 268 ns 71.0 ns BM_std_minmax<long long>/1024 532 ns 138 ns BM_std_minmax<long long>/4000 2056 ns 533 ns BM_std_minmax<long long>/4096 2112 ns 539 ns BM_std_minmax<long long>/5500 2823 ns 749 ns BM_std_minmax<long long>/64000 32956 ns 8590 ns BM_std_minmax<long long>/65536 33795 ns 8791 ns BM_std_minmax<long long>/70000 36084 ns 9442 ns BM_std_minmax<unsigned char>/1 0.714 ns 1.41 ns BM_std_minmax<unsigned char>/2 0.955 ns 1.96 ns BM_std_minmax<unsigned char>/3 1.90 ns 2.63 ns BM_std_minmax<unsigned char>/4 2.40 ns 3.34 ns BM_std_minmax<unsigned char>/5 2.87 ns 4.10 ns BM_std_minmax<unsigned char>/6 3.47 ns 4.88 ns BM_std_minmax<unsigned char>/7 4.04 ns 5.66 ns BM_std_minmax<unsigned char>/8 4.65 ns 6.45 ns BM_std_minmax<unsigned char>/9 5.18 ns 7.24 ns BM_std_minmax<unsigned char>/10 5.80 ns 8.05 ns BM_std_minmax<unsigned char>/11 6.24 ns 8.86 ns BM_std_minmax<unsigned char>/12 6.78 ns 9.70 ns BM_std_minmax<unsigned char>/13 7.30 ns 10.6 ns BM_std_minmax<unsigned char>/14 7.86 ns 11.4 ns BM_std_minmax<unsigned char>/15 8.46 ns 12.3 ns BM_std_minmax<unsigned char>/16 9.00 ns 2.12 ns BM_std_minmax<unsigned char>/17 9.58 ns 2.83 ns BM_std_minmax<unsigned char>/18 10.1 ns 3.37 ns BM_std_minmax<unsigned char>/19 10.7 ns 4.11 ns BM_std_minmax<unsigned char>/20 11.2 ns 4.85 ns BM_std_minmax<unsigned char>/21 11.9 ns 5.69 ns BM_std_minmax<unsigned char>/22 12.3 ns 6.77 ns BM_std_minmax<unsigned char>/23 13.1 ns 7.56 ns BM_std_minmax<unsigned char>/24 13.5 ns 8.40 ns BM_std_minmax<unsigned char>/25 14.2 ns 9.30 ns BM_std_minmax<unsigned char>/26 14.4 ns 10.1 ns BM_std_minmax<unsigned char>/27 15.0 ns 11.1 ns BM_std_minmax<unsigned char>/28 15.3 ns 11.9 ns BM_std_minmax<unsigned char>/29 16.2 ns 12.9 ns BM_std_minmax<unsigned char>/30 16.5 ns 13.9 ns BM_std_minmax<unsigned char>/31 17.2 ns 14.8 ns BM_std_minmax<unsigned char>/32 17.6 ns 2.36 ns BM_std_minmax<unsigned char>/64 35.6 ns 3.21 ns BM_std_minmax<unsigned char>/512 288 ns 6.00 ns BM_std_minmax<unsigned char>/1024 573 ns 8.80 ns BM_std_minmax<unsigned char>/4000 2222 ns 28.6 ns BM_std_minmax<unsigned char>/4096 2265 ns 25.9 ns BM_std_minmax<unsigned char>/5500 3047 ns 48.8 ns BM_std_minmax<unsigned char>/64000 35059 ns 480 ns BM_std_minmax<unsigned char>/65536 35941 ns 491 ns BM_std_minmax<unsigned char>/70000 38922 ns 525 ns BM_std_minmax<unsigned short>/1 0.711 ns 1.18 ns BM_std_minmax<unsigned short>/2 0.957 ns 1.65 ns BM_std_minmax<unsigned short>/3 2.13 ns 2.21 ns BM_std_minmax<unsigned short>/4 2.14 ns 2.78 ns BM_std_minmax<unsigned short>/5 3.06 ns 3.29 ns BM_std_minmax<unsigned short>/6 2.89 ns 3.87 ns BM_std_minmax<unsigned short>/7 3.80 ns 4.55 ns BM_std_minmax<unsigned short>/8 3.68 ns 2.02 ns BM_std_minmax<unsigned short>/9 4.53 ns 2.40 ns BM_std_minmax<unsigned short>/10 4.60 ns 2.94 ns BM_std_minmax<unsigned short>/11 5.67 ns 3.67 ns BM_std_minmax<unsigned short>/12 5.39 ns 4.22 ns BM_std_minmax<unsigned short>/13 6.58 ns 4.78 ns BM_std_minmax<unsigned short>/14 6.33 ns 5.54 ns BM_std_minmax<unsigned short>/15 7.34 ns 6.30 ns BM_std_minmax<unsigned short>/16 7.17 ns 2.25 ns BM_std_minmax<unsigned short>/17 8.19 ns 2.61 ns BM_std_minmax<unsigned short>/18 8.02 ns 3.19 ns BM_std_minmax<unsigned short>/19 9.03 ns 3.72 ns BM_std_minmax<unsigned short>/20 8.89 ns 4.36 ns BM_std_minmax<unsigned short>/21 9.77 ns 5.10 ns BM_std_minmax<unsigned short>/22 9.70 ns 5.55 ns BM_std_minmax<unsigned short>/23 10.8 ns 6.29 ns BM_std_minmax<unsigned short>/24 10.6 ns 2.41 ns BM_std_minmax<unsigned short>/25 11.6 ns 2.75 ns BM_std_minmax<unsigned short>/26 11.4 ns 3.26 ns BM_std_minmax<unsigned short>/27 12.4 ns 3.86 ns BM_std_minmax<unsigned short>/28 12.3 ns 4.45 ns BM_std_minmax<unsigned short>/29 13.2 ns 5.07 ns BM_std_minmax<unsigned short>/30 13.1 ns 5.77 ns BM_std_minmax<unsigned short>/31 13.9 ns 6.65 ns BM_std_minmax<unsigned short>/32 13.9 ns 2.72 ns BM_std_minmax<unsigned short>/64 27.8 ns 3.25 ns BM_std_minmax<unsigned short>/512 220 ns 8.30 ns BM_std_minmax<unsigned short>/1024 435 ns 14.1 ns BM_std_minmax<unsigned short>/4000 1703 ns 49.8 ns BM_std_minmax<unsigned short>/4096 1746 ns 47.9 ns BM_std_minmax<unsigned short>/5500 2350 ns 69.9 ns BM_std_minmax<unsigned short>/64000 27388 ns 953 ns BM_std_minmax<unsigned short>/65536 28040 ns 975 ns BM_std_minmax<unsigned short>/70000 29967 ns 1040 ns BM_std_minmax<unsigned int>/1 0.712 ns 1.18 ns BM_std_minmax<unsigned int>/2 0.965 ns 1.65 ns BM_std_minmax<unsigned int>/3 2.13 ns 2.14 ns BM_std_minmax<unsigned int>/4 2.09 ns 2.64 ns BM_std_minmax<unsigned int>/5 3.02 ns 3.21 ns BM_std_minmax<unsigned int>/6 2.94 ns 3.81 ns BM_std_minmax<unsigned int>/7 3.91 ns 4.38 ns BM_std_minmax<unsigned int>/8 3.75 ns 4.93 ns BM_std_minmax<unsigned int>/9 4.71 ns 5.60 ns BM_std_minmax<unsigned int>/10 4.59 ns 6.26 ns BM_std_minmax<unsigned int>/11 5.57 ns 6.80 ns BM_std_minmax<unsigned int>/12 5.43 ns 7.47 ns BM_std_minmax<unsigned int>/13 6.45 ns 8.10 ns BM_std_minmax<unsigned int>/14 6.32 ns 8.69 ns BM_std_minmax<unsigned int>/15 7.29 ns 9.37 ns BM_std_minmax<unsigned int>/16 7.12 ns 9.99 ns BM_std_minmax<unsigned int>/17 8.24 ns 10.6 ns BM_std_minmax<unsigned int>/18 8.00 ns 11.2 ns BM_std_minmax<unsigned int>/19 8.94 ns 12.0 ns BM_std_minmax<unsigned int>/20 8.91 ns 12.6 ns BM_std_minmax<unsigned int>/21 9.73 ns 17.2 ns BM_std_minmax<unsigned int>/22 9.75 ns 13.8 ns BM_std_minmax<unsigned int>/23 10.6 ns 14.5 ns BM_std_minmax<unsigned int>/24 10.6 ns 15.1 ns BM_std_minmax<unsigned int>/25 11.5 ns 15.7 ns BM_std_minmax<unsigned int>/26 11.4 ns 16.3 ns BM_std_minmax<unsigned int>/27 12.3 ns 17.0 ns BM_std_minmax<unsigned int>/28 12.3 ns 17.6 ns BM_std_minmax<unsigned int>/29 13.2 ns 18.3 ns BM_std_minmax<unsigned int>/30 13.2 ns 19.0 ns BM_std_minmax<unsigned int>/31 14.0 ns 19.6 ns BM_std_minmax<unsigned int>/32 14.0 ns 3.39 ns BM_std_minmax<unsigned int>/64 27.6 ns 4.05 ns BM_std_minmax<unsigned int>/512 221 ns 14.2 ns BM_std_minmax<unsigned int>/1024 439 ns 25.5 ns BM_std_minmax<unsigned int>/4000 1720 ns 96.3 ns BM_std_minmax<unsigned int>/4096 1762 ns 97.8 ns BM_std_minmax<unsigned int>/5500 2364 ns 146 ns BM_std_minmax<unsigned int>/64000 27874 ns 1905 ns BM_std_minmax<unsigned int>/65536 28012 ns 1961 ns BM_std_minmax<unsigned int>/70000 29899 ns 2087 ns BM_std_minmax<unsigned long long>/1 0.707 ns 1.18 ns BM_std_minmax<unsigned long long>/2 0.909 ns 1.65 ns BM_std_minmax<unsigned long long>/3 1.65 ns 2.70 ns BM_std_minmax<unsigned long long>/4 1.93 ns 2.69 ns BM_std_minmax<unsigned long long>/5 2.45 ns 3.34 ns BM_std_minmax<unsigned long long>/6 2.78 ns 3.81 ns BM_std_minmax<unsigned long long>/7 3.28 ns 4.43 ns BM_std_minmax<unsigned long long>/8 3.70 ns 4.92 ns BM_std_minmax<unsigned long long>/9 4.12 ns 5.64 ns BM_std_minmax<unsigned long long>/10 4.44 ns 6.15 ns BM_std_minmax<unsigned long long>/11 4.91 ns 6.81 ns BM_std_minmax<unsigned long long>/12 5.31 ns 7.41 ns BM_std_minmax<unsigned long long>/13 5.72 ns 7.96 ns BM_std_minmax<unsigned long long>/14 6.05 ns 8.66 ns BM_std_minmax<unsigned long long>/15 6.55 ns 9.37 ns BM_std_minmax<unsigned long long>/16 6.89 ns 7.98 ns BM_std_minmax<unsigned long long>/17 7.34 ns 8.13 ns BM_std_minmax<unsigned long long>/18 7.73 ns 8.42 ns BM_std_minmax<unsigned long long>/19 8.26 ns 8.63 ns BM_std_minmax<unsigned long long>/20 8.54 ns 8.96 ns BM_std_minmax<unsigned long long>/21 9.14 ns 9.37 ns BM_std_minmax<unsigned long long>/22 9.39 ns 9.67 ns BM_std_minmax<unsigned long long>/23 10.1 ns 10.1 ns BM_std_minmax<unsigned long long>/24 10.4 ns 10.6 ns BM_std_minmax<unsigned long long>/25 11.0 ns 11.3 ns BM_std_minmax<unsigned long long>/26 11.3 ns 12.1 ns BM_std_minmax<unsigned long long>/27 11.8 ns 14.2 ns BM_std_minmax<unsigned long long>/28 12.1 ns 15.8 ns BM_std_minmax<unsigned long long>/29 12.6 ns 17.4 ns BM_std_minmax<unsigned long long>/30 13.1 ns 18.1 ns BM_std_minmax<unsigned long long>/31 13.4 ns 18.8 ns BM_std_minmax<unsigned long long>/32 13.8 ns 10.4 ns BM_std_minmax<unsigned long long>/64 27.3 ns 15.5 ns BM_std_minmax<unsigned long long>/512 222 ns 80.6 ns BM_std_minmax<unsigned long long>/1024 443 ns 156 ns BM_std_minmax<unsigned long long>/4000 1731 ns 591 ns BM_std_minmax<unsigned long long>/4096 1752 ns 609 ns BM_std_minmax<unsigned long long>/5500 2340 ns 819 ns BM_std_minmax<unsigned long long>/64000 27166 ns 9652 ns BM_std_minmax<unsigned long long>/65536 27869 ns 9876 ns BM_std_minmax<unsigned long long>/70000 29920 ns 10680 ns ```	2024-04-06 17:22:07 +02:00
Nikolas Klauser	f5960c168d	[libc++][NFC] Make __desugars_to a variable template and rename the header to desugars_to.h (#87337 ) This improves compile times and memory usage slightly and removes some boilerplate.	2024-04-04 23:02:19 +02:00
A. Jiang	04dbf7ad44	[libc++][ranges] Avoid using `distance` in `ranges::contains_subrange` (#87155 ) Both `std::distance` or `ranges::distance` are inefficient for non-sized ranges. Also, calculating the range using `int` type is seriously problematic. This patch avoids using `distance` and calculation of the length of non-sized ranges. Fixes #86833.	2024-04-02 17:21:15 -07:00
Nikolas Klauser	985c1a44f8	[libc++] Optimize the two range overload of mismatch (#86853 ) ``` ----------------------------------------------------------------------------- Benchmark old new ----------------------------------------------------------------------------- bm_mismatch_two_range_overload<char>/1 0.941 ns 1.88 ns bm_mismatch_two_range_overload<char>/2 1.43 ns 2.15 ns bm_mismatch_two_range_overload<char>/3 1.95 ns 2.55 ns bm_mismatch_two_range_overload<char>/4 2.58 ns 2.90 ns bm_mismatch_two_range_overload<char>/5 3.75 ns 3.31 ns bm_mismatch_two_range_overload<char>/6 5.00 ns 3.83 ns bm_mismatch_two_range_overload<char>/7 5.59 ns 4.35 ns bm_mismatch_two_range_overload<char>/8 6.37 ns 4.84 ns bm_mismatch_two_range_overload<char>/16 11.8 ns 6.72 ns bm_mismatch_two_range_overload<char>/64 45.5 ns 2.59 ns bm_mismatch_two_range_overload<char>/512 366 ns 12.6 ns bm_mismatch_two_range_overload<char>/4096 2890 ns 91.6 ns bm_mismatch_two_range_overload<char>/32768 23038 ns 758 ns bm_mismatch_two_range_overload<char>/262144 142813 ns 6573 ns bm_mismatch_two_range_overload<char>/1048576 366679 ns 26710 ns bm_mismatch_two_range_overload<short>/1 0.934 ns 1.88 ns bm_mismatch_two_range_overload<short>/2 1.30 ns 2.58 ns bm_mismatch_two_range_overload<short>/3 1.76 ns 3.28 ns bm_mismatch_two_range_overload<short>/4 2.24 ns 3.98 ns bm_mismatch_two_range_overload<short>/5 2.80 ns 4.92 ns bm_mismatch_two_range_overload<short>/6 3.58 ns 6.01 ns bm_mismatch_two_range_overload<short>/7 4.29 ns 7.03 ns bm_mismatch_two_range_overload<short>/8 4.67 ns 7.39 ns bm_mismatch_two_range_overload<short>/16 9.86 ns 13.1 ns bm_mismatch_two_range_overload<short>/64 38.9 ns 4.55 ns bm_mismatch_two_range_overload<short>/512 348 ns 27.7 ns bm_mismatch_two_range_overload<short>/4096 2881 ns 225 ns bm_mismatch_two_range_overload<short>/32768 23111 ns 1715 ns bm_mismatch_two_range_overload<short>/262144 184846 ns 14416 ns bm_mismatch_two_range_overload<short>/1048576 742885 ns 57264 ns bm_mismatch_two_range_overload<int>/1 0.838 ns 1.19 ns bm_mismatch_two_range_overload<int>/2 1.19 ns 1.65 ns bm_mismatch_two_range_overload<int>/3 1.83 ns 2.06 ns bm_mismatch_two_range_overload<int>/4 2.38 ns 2.42 ns bm_mismatch_two_range_overload<int>/5 3.60 ns 2.47 ns bm_mismatch_two_range_overload<int>/6 3.68 ns 3.05 ns bm_mismatch_two_range_overload<int>/7 4.32 ns 3.36 ns bm_mismatch_two_range_overload<int>/8 5.18 ns 3.58 ns bm_mismatch_two_range_overload<int>/16 10.6 ns 2.84 ns bm_mismatch_two_range_overload<int>/64 39.0 ns 7.78 ns bm_mismatch_two_range_overload<int>/512 247 ns 53.9 ns bm_mismatch_two_range_overload<int>/4096 1927 ns 429 ns bm_mismatch_two_range_overload<int>/32768 15569 ns 3393 ns bm_mismatch_two_range_overload<int>/262144 125413 ns 28504 ns bm_mismatch_two_range_overload<int>/1048576 504549 ns 112729 ns ```	2024-04-01 18:21:51 +02:00

1 2 3 4 5 ...

359 Commits