llvm-project

Author	SHA1	Message	Date
Louis Dionne	6a54dfbfe5	[libc++][NFC] Add missing license headers Also standardize the license comment in several files where it was different from what we normally do.	2024-07-31 12:58:09 -04:00
Mark de Wever	09addf9cbe	[libc++][format] Fixes UTF-8 continuation. The mask used to check whether a code unit is a valid continuation was incorrect and accepts non-continuation code points. This fixes the issue. Reviewed By: ldionne, tahonermann, #libc Differential Revision: https://reviews.llvm.org/D149672	2023-06-20 19:28:02 +02:00
Louis Dionne	520c7fbbd0	[libc++] Mark slow tests as unsupported on GCC Some tests in our test suite are unbelievably slow on GCC due to the use of the always_inline attribute. See [1] for more details. This patch introduces the GCC-ALWAYS_INLINE-FIXME lit feature to disable tests that are plagued by that issue. At the same time, it moves several existing tests from ad-hoc `UNSUPPORTED: gcc-12` markup to the new GCC-ALWAYS_INLINE-FIXME feature, and marks the slowest tests reported by the CI as `UNSUPPORTED: GCC-ALWAYS_INLINE-FIXME`. [1]: https://discourse.llvm.org/t/rfc-stop-supporting-extern-instantiations-with-gcc/71277/1 Differential Revision: https://reviews.llvm.org/D152736	2023-06-13 10:20:30 -07:00
Mark de Wever	48985f58b4	[libc++][format][test] Adds Windows support. These tests pass on Windows without additional changes. This has been tested in D150593.	2023-05-27 13:57:26 +02:00
Mark de Wever	dff62f5251	[libc++][format] Removes the experimental status. The code has been quite ready for a while now and there are no more ABI breaking papers. So this is a good time to mark the feature as stable. Reviewed By: #libc, ldionne Differential Revision: https://reviews.llvm.org/D150802	2023-05-24 17:16:22 +02:00
Jake Egan	7ad7b3275f	[AIX] Adjust support of format function tests escaped_output.unicode.pass.cpp is failing only on 32-bit AIX. The rest are passing. Reviewed by: #libc, Mordante Differential Revision: https://reviews.llvm.org/D149078	2023-05-08 13:41:00 -04:00
Mark de Wever	68c3d66a97	[libc++][format] Improves width estimate.
As obvious from the paper's title this is an LWG issue and thus retroactively applied to C++20.
This change may change the output for certain code points:
1 Considers 8477 extra codepoints as having a width of 2 (as of Unicode 15) (mostly Tangut Ideographs)
2 Change the width of 85 unassigned code points from 2 to 1
3 Change the width of 8 codepoints (in the range U+3248 CIRCLED NUMBER TEN ON BLACK SQUARE ... U+324F CIRCLED NUMBER EIGHTY ON BLACK SQUARE) from 2 to 1, because it seems questionable to make an exception for those without input from Unicode
Note that libc++ already uses Unicode 15, while the Standard requires Unicode 12. (The last time I checked MSVC STL used Unicode 14.) So in practice the only notable change is item 3.
Implements P2675 LWG3780: The Paper format's width estimation is too approximate and not forward compatible
Benchmark before these changes
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3928 ns         3928 ns       178131
BM_unicode_text<char>          75231 ns        75230 ns         9158
BM_cyrillic_text<char>         59837 ns        59834 ns        11529
BM_japanese_text<char>         39842 ns        39832 ns        17501
BM_emoji_text<char>             3931 ns         3930 ns       177750
BM_ascii_text<wchar_t>          4024 ns         4024 ns       174190
BM_unicode_text<wchar_t>       63756 ns        63751 ns        11136
BM_cyrillic_text<wchar_t>      44639 ns        44638 ns        15597
BM_japanese_text<wchar_t>      34425 ns        34424 ns        20283
BM_emoji_text<wchar_t>          3937 ns         3937 ns       177684
Benchmark after these changes
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3914 ns         3913 ns       178814
BM_unicode_text<char>          70380 ns        70378 ns         9694
BM_cyrillic_text<char>         51889 ns        51877 ns        13488
BM_japanese_text<char>         41707 ns        41705 ns        16723
BM_emoji_text<char>             3908 ns         3907 ns       177912
BM_ascii_text<wchar_t>          3949 ns         3948 ns       177525
BM_unicode_text<wchar_t>       64591 ns        64587 ns        10649
BM_cyrillic_text<wchar_t>      44089 ns        44078 ns        15721
BM_japanese_text<wchar_t>      39369 ns        39367 ns        17779
BM_emoji_text<wchar_t>          3936 ns         3934 ns       177821
Benchmarks without "if(__code_point < (__entries[0] >> 14))"
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3922 ns         3922 ns       178587
BM_unicode_text<char>          94474 ns        94474 ns         7351
BM_cyrillic_text<char>         69202 ns        69200 ns        10157
BM_japanese_text<char>         42735 ns        42692 ns        16382
BM_emoji_text<char>             3920 ns         3919 ns       178704
BM_ascii_text<wchar_t>          3951 ns         3950 ns       177224
BM_unicode_text<wchar_t>       81003 ns        80988 ns         8668
BM_cyrillic_text<wchar_t>      57020 ns        57018 ns        12048
BM_japanese_text<wchar_t>      39695 ns        39687 ns        17582
BM_emoji_text<wchar_t>          3977 ns         3976 ns       176479
This optimization does carry its weight for the Unicode and Cyrillic test. For the Japanese tests the gains are minor and for emoji it seems to have no effect.
Reviewed By: ldionne, tahonermann, #libc Differential Revision: https://reviews.llvm.org/D144499	2023-04-20 21:18:33 +02:00
Louis Dionne	f0fc8c4878	[libc++] Use named Lit features to flag back-deployment XFAILs Instead of writing something like `XFAIL: use_system_cxx_lib && target=...` to XFAIL back-deployment tests, introduce named Lit features like `availability-shared_mutex-missing` to represent those. This makes the XFAIL annotations leaner, and solves the problem of XFAIL comments potentially getting out of sync. This would also make it easier for another vendor to add their own annotations to the test suite by simply changing how the feature is defined for their OS releases, instead of having to modify hundreds of tests to add repetitive annotations. This doesn't touch all annotations -- only annotations that were widely duplicated are given named features (e.g. when filesystem or shared_mutex were introduced). I still think it probably doesn't make sense to have a named feature for every single fix we make to the dylib. This is in essence a revert of 2659663, but since then the test suite has changed significantly. Back when I did 2659663, the configuration files we have for the test suite right now were being bootstrapped and it wasn't clear how to provide these features for back-deployment in that context. Since then, we have a streamlined way of defining these features in `features.py` and that doesn't impact the ability for a configuration file to stay minimal. The original motivation for this change was that I am about to propose a change that would touch essentially all XFAIL annotations for back-deployment in the test suite, and this greatly reduces the number of lines changed by that upcoming change, in addition to making the test suite generally better. Differential Revision: https://reviews.llvm.org/D146359	2023-03-27 12:44:26 -04:00
Louis Dionne	3d334df587	[libc++] Remove availability markup for std::format std::format is currently experimental, so there is technically no deployment target requirement for it (since the only symbols required for it are in `libc++experimental.a`). However, some parts of std::format depend indirectly on the floating point std::to_chars implementation, which does have deployment target requirements. This patch removes all the availability markup for std::format and updates the XFAILs in the tests to properly explain why they fail on old deployment targets, when they do. It also changes a couple of tests to avoid depending on floating-point std::to_chars when it isn't fundamental to the test. Finally, some tests are marked as XFAIL but I added a comment saying TODO FMT This test should not require std::to_chars(floating-point) These tests do not fundamentally depend on floating-point std::to_chars, however they end up failing because calling std::format even without a floating-point argument to format will end up requiring floating-point std::to_chars. I believe this is an implementation artifact that could be avoided in all cases where we know the format string at compile-time. In the tests, I added the TODO comment only to the places where we could do better and actually avoid relying on floating-point std::to_chars because we know the format string at compile-time. Differential Revision: https://reviews.llvm.org/D134598	2023-03-22 16:32:26 -04:00
Mark de Wever	a48007355a	[libc++][format] Implements string escaping. Implements parts of - P2286R8 Formatting Ranges Reviewed By: #libc, tahonermann Differential Revision: https://reviews.llvm.org/D134036	2022-10-20 17:29:34 +02:00
Mark de Wever	6195bdb9f1	[NFC][libc++][format] Improves tests. This is mainly to improve the readability of the tests. As a side effect, the tests run faster too. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D135288	2022-10-11 19:23:29 +02:00
Mark de Wever	857a78c04d	[libc++] Implements Unicode grapheme clustering This implements the Grapheme clustering as required by P1868R2 width: clarifying units of width and precision in std::format This was omitted in the initial patch, but the paper was marked as completed. This really completes the paper. Reviewed By: ldionne, #libc Differential Revision: https://reviews.llvm.org/D126971	2022-07-20 18:38:32 +02:00
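
Commit 09addf9cbe describes a mask that wrongly accepted non-continuation code units. The following is a minimal sketch of the general technique, not the libc++ source: a UTF-8 continuation code unit has the bit pattern 10xxxxxx, so the check has to inspect the top two bits. The names `is_continuation` and `top_bit_only` are hypothetical, and `top_bit_only` only illustrates one way a mask can be too permissive; the log does not say what the previous mask was.

```cpp
#include <cstdint>

// Correct check: keep the top two bits and require them to be exactly 10.
constexpr bool is_continuation(std::uint8_t code_unit) {
  return (code_unit & 0xC0u) == 0x80u;
}

// Hypothetical too-permissive check (an assumption for illustration only):
// testing just the top bit also accepts lead bytes such as 0xC3.
constexpr bool top_bit_only(std::uint8_t code_unit) {
  return (code_unit & 0x80u) == 0x80u;
}

// "é" is encoded as 0xC3 0xA9: a lead byte followed by one continuation.
static_assert(!is_continuation(0xC3), "lead byte is not a continuation");
static_assert(is_continuation(0xA9), "second byte is a continuation");
static_assert(!is_continuation(0x41), "ASCII 'A' is not a continuation");
static_assert(top_bit_only(0xC3), "the loose mask wrongly accepts a lead byte");
```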

12 Commits
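
The benchmarks in commit 68c3d66a97 measure an early exit of the form `if(__code_point < (__entries[0] >> 14))`. The sketch below shows how such a check can sit in front of a binary search over packed range entries. It is an illustration under assumptions, not the libc++ implementation: the three-entry table, the name `estimated_width`, and the packing (first code point in the upper 18 bits, range length minus one in the lower 14 bits) are chosen here for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// Hypothetical miniature table of width-2 ranges, sorted by first code point.
// Each entry is (first << 14) | (last - first).
constexpr std::uint32_t entries[] = {
    (0x1100u << 14) | 0x005Fu, // U+1100..U+115F Hangul Jamo
    (0x3041u << 14) | 0x0055u, // U+3041..U+3096 Hiragana
    (0xAC00u << 14) | 0x2BA3u, // U+AC00..U+D7A3 Hangul Syllables
};

inline int estimated_width(char32_t code_point) {
  const std::uint32_t cp = static_cast<std::uint32_t>(code_point);

  // Early exit: everything below the first range (all of ASCII, Latin,
  // Cyrillic, ...) has width 1 and never reaches the binary search.
  if (cp < (entries[0] >> 14))
    return 1;

  // In this sketch only 18 bits are available for the code point, so larger
  // values cannot be in the table and are treated as width 1.
  if (cp > 0x3FFFFu)
    return 1;

  // Find the last entry whose first code point is <= cp.
  const std::uint32_t key = (cp << 14) | 0x3FFFu;
  const auto it = std::upper_bound(std::begin(entries), std::end(entries), key);
  assert(it != std::begin(entries)); // guaranteed by the early exit above

  const std::uint32_t entry = *(it - 1);
  const std::uint32_t first = entry >> 14;
  const std::uint32_t last = first + (entry & 0x3FFFu);
  return cp <= last ? 2 : 1;
}
```

With this layout the early exit costs a single comparison, which fits the commit's observation that text made of low code points benefits the most while text that mostly falls inside the table gains little.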