llvm-project

Author	SHA1	Message	Date
cor3ntin	03e43cf1c7	[Clang] Update Unicode version to 15.1 (#77147 ) This update all of our Unicode tables to Unicode 15.1. This is a minor version so only a relatively small numbers of characters are added, mainly ideographs https://www.unicode.org/versions/Unicode15.1.0/#Appendices_nb	2024-01-17 22:55:58 +01:00
Corentin Jabot	2903769bf5	Update the list of double width codepoints All east asian width wide and full-width codepoints are considered double width, as well as emojis and symbols commonely rendered as emoji. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D138518	2022-11-28 15:13:37 +01:00
Corentin Jabot	c932cef32a	Update Unicode to 15.0 Unicode 15.0 adds 4,489 characters, for a total of 149,186 characters. These additions include 2 new scripts along with 20 new emoji characters, and 4,193 CJK ideographs. This changes modify most existing tables including - XID_Start/XID_Continue in Clang - The character name database (used by \N{} in Clang) - The list of formattable/printable codepoints - The case folding algorithm (which we had not updated since Unicode 9) - The list of nonspacing/enclosing marks used by the column width computation algorithm. The rest of the column width algorithm is not updated. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D133807	2022-09-22 05:03:01 +02:00
Corentin Jabot	10f6d61bf1	[Clang][NFC] fix typo	2022-07-06 16:05:23 +02:00
Corentin Jabot	64ab2b1dcc	Improve handling of static assert messages. Instead of dumping the string literal (which quotes it and escape every non-ascii symbol), we can use the content of the string when it is a 8 byte string. Wide, UTF-8/UTF-16/32 strings are still completely escaped, until we clarify how these entities should behave (cf https://wg21.link/p2361). `FormatDiagnostic` is modified to escape non printable characters and invalid UTF-8. This ensures that unicode characters, spaces and new lines are properly rendered in static messages. This make clang more consistent with other implementation and fixes this tweet https://twitter.com/jfbastien/status/1298307325443231744 :) Of note, `PaddingChecker` did print out new lines that were later removed by the diagnostic printing code. To be consistent with its tests, the new lines are removed from the diagnostic. Unicode tables updated to both use the Unicode definitions and the Unicode 14.0 data. U+00AD SOFT HYPHEN is still considered a print character to match existing practices in terminals, in addition of being considered a formatting character as per Unicode. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D108469	2022-06-29 14:57:35 +02:00
Corentin Jabot	a774ba7f60	Revert "Improve handling of static assert messages." This reverts commit 870b6d21839707a3e4c40a29b526995f065a220f. This seems to break some libc++ tests, reverting while investigating	2022-06-29 00:03:23 +02:00
Corentin Jabot	870b6d2183	Improve handling of static assert messages. Instead of dumping the string literal (which quotes it and escape every non-ascii symbol), we can use the content of the string when it is a 8 byte string. Wide, UTF-8/UTF-16/32 strings are still completely escaped, until we clarify how these entities should behave (cf https://wg21.link/p2361). `FormatDiagnostic` is modified to escape non printable characters and invalid UTF-8. This ensures that unicode characters, spaces and new lines are properly rendered in static messages. This make clang more consistent with other implementation and fixes this tweet https://twitter.com/jfbastien/status/1298307325443231744 :) Of note, `PaddingChecker` did print out new lines that were later removed by the diagnostic printing code. To be consistent with its tests, the new lines are removed from the diagnostic. Unicode tables updated to both use the Unicode definitions and the Unicode 14.0 data. U+00AD SOFT HYPHEN is still considered a print character to match existing practices in terminals, in addition of being considered a formatting character as per Unicode. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D108469	2022-06-28 22:26:00 +02:00
serge-sans-paille	9501419e87	Speedup some unicode rendering Use a fast path for column width computation for ascii characters. Especially relevant for llvm-objdump. before: % time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null ./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.75s user 0.01s system 99% cpu 0.757 total after: % time ./bin/llvm-objdump -D -j .text /lib/libc.so.6 >/dev/null ./bin/llvm-objdump -D -j .text /lib/libc.so.6 > /dev/null 0.37s user 0.01s system 99% cpu 0.378 total Differential Revision: https://reviews.llvm.org/D92180	2020-12-03 20:11:11 +01:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Alexander Kornienko	9aa60fd6f8	Move generic isPrint and columnWidth implementations to a separate header/source to allow using both generic and system-dependent versions on win32. Summary: This is needed so we can use generic columnWidthUTF8 in clang-format on win32 simultaneously with a separate system-dependent implementations of isPrint/columnWidth in TextDiagnostic.cpp to avoid attempts to print Unicode characters using narrow-character interfaces (which is not supported on Windows, and we'll have to figure out how to handle this). Reviewers: jordan_rose Reviewed By: jordan_rose CC: llvm-commits, klimek Differential Revision: http://llvm-reviews.chandlerc.com/D1559 llvm-svn: 189952	2013-09-04 16:00:12 +00:00

10 Commits