16 Commits

Author SHA1 Message Date
James Y Knight
20451cb06b Update license on Unicode.org's ConvertUTF code.
The code was relicensed by its owner (Unicode.org) a long time back,
but we still had the old (problematic) license in our fork.

Note that the source files have not been distributed from unicode.org
since 2009 (due to being buggy and unmaintained upstream), but they
were given this license before that.

Fixes https://github.com/llvm/llvm-project/issues/32309

Differential Revision: https://reviews.llvm.org/D66390
2022-08-12 16:51:08 +00:00
Corentin Jabot
d4892a168f [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-13 10:19:26 +02:00
Jonas Devlieghere
a262f4dbd7 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit cc309721d20c8e544ae7a10a66735ccf4981a11c because it
breaks the following tests on GreenDragon:

  TestDataFormatterObjCCF.py
  TestDataFormatterObjCExpr.py
  TestDataFormatterObjCKVO.py
  TestDataFormatterObjCNSBundle.py
  TestDataFormatterObjCNSData.py
  TestDataFormatterObjCNSError.py
  TestDataFormatterObjCNSNumber.py
  TestDataFormatterObjCNSURL.py
  TestDataFormatterObjCPlain.py
  TestDataFormatterObjNSException.py

https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/
2022-07-12 15:22:29 -07:00
Corentin Jabot
cc309721d2 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-12 14:34:30 +02:00
Corentin Jabot
50416e5454 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
It is probable thart this change crashes on the powerpc bots.

This reverts commit 355532a1499aa9b13a89fb5b5caaba2344d57cd7.
2022-07-09 17:18:35 +02:00
Corentin Jabot
355532a149 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-09 11:26:45 +02:00
Nico Weber
e9fe20dab3 Revert "[Clang] Add a warning on invalid UTF-8 in comments."
This reverts commit 4174f0ca618b467571b43cff12cbe4c4239670f8.

Also revert follow-up "[Clang] Fix invalid utf-8 detection"
This reverts commit bf45e27a676d87944f1f13d5f0d0f39935fc4010.

The second commit broke tests, see comments on
https://reviews.llvm.org/D129223, and it sounds like the first
commit isn't valid without the second one. So reverting both for now.
2022-07-06 22:51:52 +02:00
Corentin Jabot
bf45e27a67 [Clang] Fix invalid utf-8 detection
The length of valid codepoints was incorrectly
calculated which was not caught before because the
absence of tests for the valid codepoints scenario.

Differential Revision: https://reviews.llvm.org/D129223
2022-07-06 22:20:04 +02:00
Corentin Jabot
4174f0ca61 [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-06 21:18:29 +02:00
Corentin Jabot
fb06dd3e8c Revert "[Clang] Add a warning on invalid UTF-8 in comments."
Reverting while I investigate build failures

This reverts commit e3dc56805f1029dd5959e4c69196a287961afb8d.
2022-07-06 19:45:12 +02:00
Corentin Jabot
e3dc56805f [Clang] Add a warning on invalid UTF-8 in comments.
Introduce an off-by default `-Winvalid-utf8` warning
that detects invalid UTF-8 code units sequences in comments.

Invalid UTF-8 in other places is already diagnosed,
as that cannot appear in identifiers and other grammar constructs.

The warning is off by default as its likely to be somewhat disruptive
otherwise.

This warning allows clang to conform to the yet-to be approved WG21
"P2295R5 Support for UTF-8 as a portable source file encoding"
paper.

Reviewed By: aaron.ballman, #clang-language-wg

Differential Revision: https://reviews.llvm.org/D128059
2022-07-06 17:59:44 +02:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Fangrui Song
f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Chandler Carruth
6bda14b313 Sort the remaining #include lines in include/... and lib/....
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.

I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.

This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.

Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).

llvm-svn: 304787
2017-06-06 11:49:48 +00:00
Galina Kistanova
229c9c1159 Disabled implicit-fallthrough warnings for ConvertUTF.cpp.
ConvertUTF.cpp has a little dependency on LLVM, and since the code extensively uses fall-through switches,
I prefer disabling the warning for the whole file, rather than adding attributes for each case.

llvm-svn: 304120
2017-05-29 01:34:26 +00:00
Justin Lebar
9091055efa Move UTF functions into namespace llvm.
Summary:
This lets people link against LLVM and their own version of the UTF
library.

I determined this only affects llvm, clang, lld, and lldb by running

$ git grep -wl 'UTF[0-9]\+\|\bConvertUTF\bisLegalUTF\|getNumBytesFor' | cut -f 1 -d '/' | sort | uniq
  clang
  lld
  lldb
  llvm

Tested with

  ninja lldb
  ninja check-clang check-llvm check-lld

(ninja check-lldb doesn't complete for me with or without this patch.)

Reviewers: rnk

Subscribers: klimek, beanz, mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D24996

llvm-svn: 282822
2016-09-30 00:38:45 +00:00