12 Commits

Author SHA1 Message Date
Louis Dionne
6a54dfbfe5 [libc++][NFC] Add missing license headers
Also standardize the license comment in several files where it was
different from what we normally do.
2024-07-31 12:58:09 -04:00
Mark de Wever
09addf9cbe [libc++][format] Fixes UTF-8 continuation.
The mask used to check whether a code unit is a valid continuation was
incorrect and accepts non-continuation code points. This fixes the
issue.

Reviewed By: ldionne, tahonermann, #libc

Differential Revision: https://reviews.llvm.org/D149672
2023-06-20 19:28:02 +02:00
Louis Dionne
520c7fbbd0 [libc++] Mark slow tests as unsupported on GCC
Some tests in our test suite are unbelievably slow on GCC due to the
use of the always_inline attribute. See [1] for more details.

This patch introduces the GCC-ALWAYS_INLINE-FIXME lit feature to
disable tests that are plagued by that issue. At the same time, it
moves several existing tests from ad-hoc `UNSUPPORTED: gcc-12` markup
to the new GCC-ALWAYS_INLINE-FIXME feature, and marks the slowest tests
reported by the CI as `UNSUPPORTED: GCC-ALWAYS_INLINE-FIXME`.

[1]: https://discourse.llvm.org/t/rfc-stop-supporting-extern-instantiations-with-gcc/71277/1

Differential Revision: https://reviews.llvm.org/D152736
2023-06-13 10:20:30 -07:00
Mark de Wever
48985f58b4 [libc++][format][test] Adds Windows support.
These tests pass on Windows without additional changes. This has been
tested in D150593.
2023-05-27 13:57:26 +02:00
Mark de Wever
dff62f5251 [libc++][format] Removes the experimental status.
The code has been quite ready for a while now and there are no more ABI
breaking papers. So this is a good time to mark the feature as stable.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D150802
2023-05-24 17:16:22 +02:00
Jake Egan
7ad7b3275f [AIX] Adjust support of format function tests
escaped_output.unicode.pass.cpp is failing only on 32-bit AIX. The rest are passing.

Reviewed by: #libc, Mordante

Differential Revision: https://reviews.llvm.org/D149078
2023-05-08 13:41:00 -04:00
Mark de Wever
68c3d66a97 [libc++][format] Improves width estimate.
As obvious from the paper's title this is an LWG issue and thus retroactively
applied to C++20. This change may the output for certain code points:
1 Considers 8477 extra codepoints as having a width 2 (as of Unicode 15)
  (mostly Tangut Ideographs)
2 Change the width of 85 unassigned code points from 2 to 1
3 Change the width of 8 codepoints (in the range U+3248 CIRCLED NUMBER
  TEN ON BLACK SQUARE ... U+324F CIRCLED NUMBER EIGHTY ON BLACK
  SQUARE) from 2 to 1, because it seems questionable to make an exception
  for those without input from Unicode

Note that libc++ already uses Unicode 15, while the Standard requires Unicode 12.
(The last time I checked MSVC STL used Unicode 14.)

So in practice the only notable change is item 3.

Implements
  P2675 LWG3780: The Paper
  format's width estimation is too approximate and not forward compatible

Benchmark before these changes
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3928 ns         3928 ns       178131
BM_unicode_text<char>          75231 ns        75230 ns         9158
BM_cyrillic_text<char>         59837 ns        59834 ns        11529
BM_japanese_text<char>         39842 ns        39832 ns        17501
BM_emoji_text<char>             3931 ns         3930 ns       177750
BM_ascii_text<wchar_t>          4024 ns         4024 ns       174190
BM_unicode_text<wchar_t>       63756 ns        63751 ns        11136
BM_cyrillic_text<wchar_t>      44639 ns        44638 ns        15597
BM_japanese_text<wchar_t>      34425 ns        34424 ns        20283
BM_emoji_text<wchar_t>          3937 ns         3937 ns       177684

Benchmark after these changes
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3914 ns         3913 ns       178814
BM_unicode_text<char>          70380 ns        70378 ns         9694
BM_cyrillic_text<char>         51889 ns        51877 ns        13488
BM_japanese_text<char>         41707 ns        41705 ns        16723
BM_emoji_text<char>             3908 ns         3907 ns       177912
BM_ascii_text<wchar_t>          3949 ns         3948 ns       177525
BM_unicode_text<wchar_t>       64591 ns        64587 ns        10649
BM_cyrillic_text<wchar_t>      44089 ns        44078 ns        15721
BM_japanese_text<wchar_t>      39369 ns        39367 ns        17779
BM_emoji_text<wchar_t>          3936 ns         3934 ns       177821

Benchmarks without "if(__code_point < (__entries[0] >> 14))"
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_ascii_text<char>             3922 ns         3922 ns       178587
BM_unicode_text<char>          94474 ns        94474 ns         7351
BM_cyrillic_text<char>         69202 ns        69200 ns        10157
BM_japanese_text<char>         42735 ns        42692 ns        16382
BM_emoji_text<char>             3920 ns         3919 ns       178704
BM_ascii_text<wchar_t>          3951 ns         3950 ns       177224
BM_unicode_text<wchar_t>       81003 ns        80988 ns         8668
BM_cyrillic_text<wchar_t>      57020 ns        57018 ns        12048
BM_japanese_text<wchar_t>      39695 ns        39687 ns        17582
BM_emoji_text<wchar_t>          3977 ns         3976 ns       176479

This optimization does carry its weight for the Unicode and Cyrillic
test. For the Japanese tests the gains are minor and for emoji it seems
to have no effect.

Reviewed By: ldionne, tahonermann, #libc

Differential Revision: https://reviews.llvm.org/D144499
2023-04-20 21:18:33 +02:00
Louis Dionne
f0fc8c4878 [libc++] Use named Lit features to flag back-deployment XFAILs
Instead of writing something like `XFAIL: use_system_cxx_lib && target=...`
to XFAIL back-deployment tests, introduce named Lit features like
`availability-shared_mutex-missing` to represent those. This makes the
XFAIL annotations leaner, and solves the problem of XFAIL comments
potentially getting out of sync. This would also make it easier for
another vendor to add their own annotations to the test suite by simply
changing how the feature is defined for their OS releases, instead
of having to modify hundreds of tests to add repetitive annotations.

This doesn't touch *all* annotations -- only annotations that were widely
duplicated are given named features (e.g. when filesystem or shared_mutex
were introduced). I still think it probably doesn't make sense to have a
named feature for every single fix we make to the dylib.

This is in essence a revert of 2659663, but since then the test suite
has changed significantly. Back when I did 2659663, the configuration
files we have for the test suite right now were being bootstrapped and
it wasn't clear how to provide these features for back-deployment in
that context. Since then, we have a streamlined way of defining these
features in `features.py` and that doesn't impact the ability for a
configuration file to stay minimal.

The original motivation for this change was that I am about to propose
a change that would touch essentially all XFAIL annotations for back-deployment
in the test suite, and this greatly reduces the number of lines changed
by that upcoming change, in addition to making the test suite generally
better.

Differential Revision: https://reviews.llvm.org/D146359
2023-03-27 12:44:26 -04:00
Louis Dionne
3d334df587 [libc++] Remove availability markup for std::format
std::format is currently experimental, so there is technically no
deployment target requirement for it (since the only symbols required
for it are in `libc++experimental.a`).

However, some parts of std::format depend indirectly on the floating
point std::to_chars implementation, which does have deployment target
requirements.

This patch removes all the availability format for std::format and
updates the XFAILs in the tests to properly explain why they fail
on old deployment targets, when they do. It also changes a couple
of tests to avoid depending on floating-point std::to_chars when
it isn't fundamental to the test.

Finally, some tests are marked as XFAIL but I added a comment saying

   TODO FMT This test should not require std::to_chars(floating-point)

These tests do not fundamentally depend on floating-point std::to_chars,
however they end up failing because calling std::format even without a
floating-point argument to format will end up requiring floating-point
std::to_chars. I believe this is an implementation artifact that could
be avoided in all cases where we know the format string at compile-time.
In the tests, I added the TODO comment only to the places where we could
do better and actually avoid relying on floating-point std::to_chars
because we know the format string at compile-time.

Differential Revision: https://reviews.llvm.org/D134598
2023-03-22 16:32:26 -04:00
Mark de Wever
a48007355a [libc++][format] Implements string escaping.
Implements parts of
- P2286R8 Formatting Ranges

Reviewed By: #libc, tahonermann

Differential Revision: https://reviews.llvm.org/D134036
2022-10-20 17:29:34 +02:00
Mark de Wever
6195bdb9f1 [NFC][libc++][format] Improves tests.
This is mainly to improve the readability of the tests. As a side
effects the tests run faster too,

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D135288
2022-10-11 19:23:29 +02:00
Mark de Wever
857a78c04d [libc++] Implements Unicode grapheme clustering
This implements the Grapheme clustering as required by
P1868R2 width: clarifying units of width and precision in std::format

This was omitted in the initial patch, but the paper was marked as completed. This really completes the paper.

Reviewed By: ldionne, #libc

Differential Revision: https://reviews.llvm.org/D126971
2022-07-20 18:38:32 +02:00