8 Commits

Author SHA1 Message Date
Fangrui Song
cfdba8b740 Fix some -Wconstant-conversion warnings for future Clang (D139114) 2023-01-13 16:28:44 -08:00
Vitaly Buka
2ca4b4f82e [NFC] Suppress warning after D139114 2023-01-13 14:37:21 -08:00
Fangrui Song
b1df3a2c0b [Support] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-16 08:49:10 +00:00
Kazu Hirata
aadaaface2 [llvm] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:44 -08:00
Corentin Jabot
c932cef32a Update Unicode to 15.0
Unicode 15.0 adds 4,489 characters, for a total of 149,186 characters.
These additions include 2 new scripts along with 20 new emoji characters,
and 4,193 CJK ideographs.

This changes modify most existing tables including
 - XID_Start/XID_Continue in Clang
 - The character name database (used by \N{} in Clang)
 - The list of formattable/printable codepoints
 - The case folding algorithm (which we had not updated since Unicode 9)
 - The list of nonspacing/enclosing marks used by the column width
   computation algorithm. The rest of the column width algorithm
   is not updated.

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D133807
2022-09-22 05:03:01 +02:00
Kazu Hirata
89f1433225 Use llvm::lower_bound (NFC) 2022-09-03 11:17:37 -07:00
Benjamin Kramer
f5cd172e51 [Support] Work around an issue when building with old versions of libstdc++
llvm/lib/Support/UnicodeNameToCodepoint.cpp:189:12: error: chosen constructor is explicit in copy-initialization
    return {N, false, 0};
           ^~~~~~~~~~~~~
/usr/include/c++/5.4.0/tuple:479:19: note: explicit constructor declared here
        constexpr tuple(_UElements&&... __elements)
                  ^
2022-06-26 12:55:51 +02:00
Corentin Jabot
c92056d038 [Clang][C++23] P2071 Named universal character escapes
Implements [[ https://wg21.link/p2071r1  | P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch  not warn in c++23 mode will be done later once this paper is plenary approved (in July).

We add

 * A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` to a space efficient data structure that can be queried in `O(NameLength)`
 * A set of functions in `Unicode.h` to query that data, including

   * A function to find an exact match of a given Unicode character name
   * A function to perform a loose (ignoring case, space, underscore, medial hyphen) matching
   * A function returning the best matching codepoint for a given string per edit distance

 * Support of `\N{}` escape sequences in String and character Literals, with loose and typos diagnostics/fixits
 * Support of `\N{}` as UCN with loose matching diagnostics/fixits.

Loose matching is considered an error to match closely the semantics of P2071.

The generated data contributes to 280kB of data to the binaries.

`UnicodeData.txt` and `NameAliases.txt`  are not committed to the repository in this patch, and regenerating the data is a manual process.

Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D123064
2022-06-25 19:03:33 +02:00