2 Commits

Author SHA1 Message Date
Mark de Wever
20341c3ad6 [libc++][format] Adds a UTF transcoder.
This is a preparation for

  P2093R14 Formatted output

When the output of print is to the terminal it needs to use the native
API. This means transcoding UTF-8 to UTF-16 on Windows. The encoder's
interface is modeled after

 P2728 Unicode in the Library, Part 1: UTF Transcoding

But only the required part for P2093R14 is implemented.

On Windows wchar_t is 16 bits, in order to test on platforms where
wchar_t is 32 bits the transcoder has support for char16_t. It also adds
and UTF-8 to UTF-32 encoder which is useful for other tests.

Note it is possible to use <codecvt> for transcoding, but that header is
deprecated. So rather write new code that is not deprecated; the hard
part, decoding, has already been done. The <codecvt> header also
requires locale support while the new code works without including
<locale>.

Note the current transcoder implementation can be optimized since it
basically does UTF-8 -> UTF-32 -> UTF-16. The first goal is to have a
working implementation. Since it's not part of the ABI it's possible to
do the optimization later.

Depends on D149672

Reviewed By: ldionne, tahonermann, #libc

Differential Revision: https://reviews.llvm.org/D150031
2023-07-11 20:28:19 +02:00
Mark de Wever
c40595f2ce [libc++][modules] Adds std module cppm files.
This adds the cppm files of D144994. These files by themselves will do
nothing. The goal is to reduce the size of D144994 and making it easier
to review the real changes of the patch.

Implements parts of
- P2465R3 Standard Library Modules std and std.compat

Reviewed By: ldionne, ChuanqiXu, aaronmondal, #libc

Differential Revision: https://reviews.llvm.org/D151030
2023-05-23 18:51:27 +02:00