There is some code to make sure that C++ keywords that are identifiers
in the other languages are not treated as keywords. Right now, the kind
is set to identifier, and the identifier info is cleared. The latter is
probably so that the code for identifying C++ structures does not
recognize those structures by mistake when formatting a language that
does not have those structures. But we did not find an instance where
the language can have the sequence of tokens, the code tries to parse
the structure as if it is C++ using the identifier info instead of the
token kind, but without checking for the language setting. However,
there are places where the code checks whether the identifier info field
is null or not. They are places where an identifier and a keyword are
treated the same way. For example, the name of a function in
JavaScript. This patch removes the lines that clear the identifier
info. This way, a C++ keyword gets treated in the same way as an
identifier in those places.
JavaScript
New
```JavaScript
async function
union(
myparamnameiswaytooloooong) {
}
```
Old
```JavaScript
async function
union(
myparamnameiswaytooloooong) {
}
```
Java
New
```Java
enum union { ABC, CDE }
```
Old
```Java
enum
union { ABC, CDE }
```
This reverts commit 97dcbdef6089175c45e14fcbcf5c88b10233a79a.
There is some code to make sure that C++ keywords that are identifiers
in the other languages are not treated as keywords. Right now, the kind
is set to identifier, and the identifier info is cleared. The latter is
probably so that the code for identifying C++ structures does not
recognize those structures by mistake when formatting a language that
does not have those structures. But we did not find an instance where
the language can have the sequence of tokens, the code tries to parse
the structure as if it is C++ using the identifier info instead of the
token kind, but without checking for the language setting. However,
there are places where the code checks whether the identifier info field
is null or not. They are places where an identifier and a keyword are
treated the same way. For example, the name of a function in
JavaScript. This patch removes the lines that clear the identifier
info. This way, a C++ keyword gets treated in the same way as an
identifier in those places.
JavaScript
New
```JavaScript
async function
union(
myparamnameiswaytooloooong) {
}
```
Old
```JavaScript
async function
union(
myparamnameiswaytooloooong) {
}
```
Java
New
```Java
enum union { ABC, CDE }
```
Old
```Java
enum
union { ABC, CDE }
```
This regressions was introduced in
70d7ea0cebcf363cd0ddcfb76375fb5fada87dd5.
The commit moved some code and correctly picked up an explicit check for
not running on Verilog.
However, the moved code also never ran for JavaScript and after the
commit we run it there and
this causes the wrong formatting of:
```js
export type Params = Config&{
columns: Column[];
};
```
into
```js
export type Params = Config&{
columns:
Column[];
};
```
Now. string literals in lines beginning with `export type` will not be
broken.
The case was missed in 5db201fb75e6. I don't know TypeScript. And
merging GitHub pull requests seems to be a little too easy. So it got
committed before the reviewers had a chance to find edge cases.
Dictionary literal keys and strings in TypeScript type declarations can
not be broken.
The problem was pointed out by @alexfh and @e-kud here:
https://reviews.llvm.org/D154093#4644512
Now strings that are too long for one line in C#, Java, JavaScript, and
Verilog get broken into several lines. C# and JavaScript interpolated
strings are not broken.
A new subclass BreakableStringLiteralUsingOperators is used to handle
the logic for adding plus signs and commas. The updateAfterBroken
method was added because now parentheses or braces may be required after
the parentheses or commas are added. In order to decide whether the
added plus sign should be unindented in the BreakableToken object, the
logic for it is taken out into a separate function
shouldUnindentNextOperator.
The logic for finding the continuation indentation when the option
AlignAfterOpenBracket is set to DontAlign is not implemented yet. So in
that case the new line may have the wrong indentation, and the parts may
have the wrong length if the string needs to be broken more than once
because finding where to break the string depends on where the string
starts.
The preambles for the C# and Java unit tests are changed to the newer
style in order to allow the 3-argument verifyFormat macro. Some cases
are changed from verifyFormat to verifyImcompleteFormat because those
use incomplete code and the new verifyFormat function checks that the
code is complete.
The line in the doc was changed to being indented by 4 spaces, that is,
the default continuation indentation. It has always been the case. It
was probably a mistake that the doc showed 2 spaces previously.
This commit was fist committed as 16ccba51072b. The tests caused
assertion failures. Then it was reverted in 547bce36132a.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D154093
Now strings that are too long for one line in C#, Java, JavaScript, and
Verilog get broken into several lines. C# and JavaScript interpolated
strings are not broken.
A new subclass BreakableStringLiteralUsingOperators is used to handle
the logic for adding plus signs and commas. The updateAfterBroken
method was added because now parentheses or braces may be required after
the parentheses or commas are added. In order to decide whether the
added plus sign should be unindented in the BreakableToken object, the
logic for it is taken out into a separate function
shouldUnindentNextOperator.
The logic for finding the continuation indentation when the option
AlignAfterOpenBracket is set to DontAlign is not implemented yet. So in
that case the new line may have the wrong indentation, and the parts may
have the wrong length if the string needs to be broken more than once
because finding where to break the string depends on where the string
starts.
The preambles for the C# and Java unit tests are changed to the newer
style in order to allow the 3-argument verifyFormat macro. Some cases
are changed from verifyFormat to verifyImcompleteFormat because those
use incomplete code and the new verifyFormat function checks that the
code is complete.
The line in the doc was changed to being indented by 4 spaces, that is,
the default continuation indentation. It has always been the case. It
was probably a mistake that the doc showed 2 spaces previously.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D154093
This patch adds a verifyNoChange macro to verify code that won't
change after being formatted. (The code will not be messed up before
being formatted.) It then replaces EXPECT_EQ with verifyFormat
wherever applicable so that the code will be messed up before being
formatted. When the replacement fails the unit test, verifyFormat is
replaced with verifyNoChange.
Differential Revision: https://reviews.llvm.org/D153109
Contributed by @jankuehle!
Users can choose to only import/export the type of the symbol (not value nor namespace) by adding a `type` keyword, e.g.:
```
import type {x} from 'y';
import {type x} from 'y';
export type {x};
export {type x};
```
Previously, this was not handled and would:
- Terminate import sorting
- Remove the space before the curly bracket in `export type {`
With this change, both formatting and import sorting work as expected.
Reviewed By: MyDeveloperDay, krasimir
Differential Revision: https://reviews.llvm.org/D150116
Different loop termination conditions resulted in confusion of whether
*Offset was intended to be inside or outside the token.
This ultimately led to constructing an out-of-range SourceLocation.
Fix by making Offset consistently point *after* the token.
Differential Revision: https://reviews.llvm.org/D135356
[clang-format] Indent import statements in JavaScript.
Take for example this piece of code found at
<https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import>.
```
for (const link of document.querySelectorAll("nav > a")) {
link.addEventListener("click", e => {
e.preventDefault();
import('/modules/my-module.js')
.then(module => {
module.loadPageInto(main);
})
.catch(err => {
main.textContent = err.message;
});
});
}
```
Previously the import line would be unindented, looking like this.
```
for (const link of document.querySelectorAll("nav > a")) {
link.addEventListener("click", e => {
e.preventDefault();
import('/modules/my-module.js')
.then(module => {
module.loadPageInto(main);
})
.catch(err => {
main.textContent = err.message;
});
});
}
```
Actually we were going to fix this along with fixing Verilog import
statements. But the patch got too big.
Reviewed By: MyDeveloperDay, curdeius
Differential Revision: https://reviews.llvm.org/D121906
TypeScript 4.3 added a new "override" keyword for class members. This
lets clang-format know about it, so it can format code using it
properly.
Reviewed By: krasimir
Differential Revision: https://reviews.llvm.org/D108692
This fixes up a regression we found from
https://reviews.llvm.org/D107267: in specific contexts, clang-format
stopped breaking after the `)` in TypeScript decorations. There were no test cases covering this, so I added one.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D108538
The patch https://reviews.llvm.org/D105964 (58494c856a)
updated detection of function declaration names. It had the unfortunate
consequence that it started breaking between `function` and the function
name in some cases in JavaScript code.
This patch addresses this.
Reviewed By: MyDeveloperDay, owenpan
Differential Revision: https://reviews.llvm.org/D107267
`asserts` is a pseudo keyword in TypeScript used in return types.
Wrapping after it triggers automatic semicolon insertion, which
breaks the code semantics/syntax.
`asserts` is different from other pseudo keywords in that it is
specific to TS and only carries meaning in a very specific location.
Thus introducing a token type is probably overkill.
Differential Revision: https://reviews.llvm.org/D100953
In JavaScript, `- -1;` is legal syntax, the language allows unary minus.
However the two tokens must not collapse together: `--1` is prefix
decrement, i.e. different syntax.
Before:
- -1; ==> --1;
After:
- -1; ==> - -1;
This change makes no attempt to format this "nicely", given by all
likelihood this represents a programming mistake by the user, or odd
generated code.
The check is not guarded by language: this appears to be a problem in
Java as well, and will also be beneficial when formatting syntactically
incorrect C++ (e.g. during editing).
Differential Revision: https://reviews.llvm.org/D99495
In JavaScript breaking before a `@tag` in a comment puts it on a new line, and
machinery that parses these comments will fail to understand such comments.
This adapts clang-format to not break before `@`. Similar functionality exists
for not breaking before `{`.
Reviewed By: mprobst
Differential Revision: https://reviews.llvm.org/D91078
In JavaScript some @tags can be followed by `{`, and machinery that parses
these comments will fail to understand the comment if followed by a line break.
clang-format already handles this case by not breaking before `{` in comments.
However this was not working in cases when the column limit falls within `@tag`
or between `@tag` and `{`. This adapts clang-format for this case.
Reviewed By: mprobst
Differential Revision: https://reviews.llvm.org/D90908
Summary:
Even when BreakBeforeBinaryOperators is set, AlignOperands kept
aligning the beginning of the line, even when it could align the
actual operands (e.g. after an assignment).
With this patch, the operands are actually aligned, and the operator
gets aligned with the equal sign:
int aaaaa = bbbbbb
+ cccccc;
This not happen in tests, to avoid 'breaking' the indentation:
if (aaaaa
&& bbbbb)
return;
Reviewers: krasimir, djasper, klimek, MyDeveloperDay
Reviewed By: MyDeveloperDay
Subscribers: MyDeveloperDay, acoomans, cfe-commits, klimek
Tags: #clang, #clang-format
Differential Revision: https://reviews.llvm.org/D32478
Summary:
Ensure the clang-format unit tests are themselves clang-formatted
Having areas of the llvm code which are clang-format clean, give us more areas to run new clang-format binaries on ensuring we haven't broken anything.
It seems to me we SHOULD have this clang-formatted at a minimum, otherwise how can we expect others to use clang-format if we "don't eat our own dogfood", also if the tests are dependent on the formatting of the code then that would also be bad!
Reviewed By: sammccall
Subscribers: cfe-commits
Tags: #clang, #clang-format
Differential Revision: https://reviews.llvm.org/D79204
Summary:
Even when BreakBeforeBinaryOperators is set, AlignOperands kept
aligning the beginning of the line, even when it could align the
actual operands (e.g. after an assignment).
With this patch, there is an option to actually align the operands, so
that the operator gets right-aligned with the equal sign or return
operator:
int aaaaa = bbbbbb
+ cccccc;
return aaaaaaa
&& bbbbbbb;
This not happen in parentheses, to avoid 'breaking' the indentation:
if (aaaaa
&& bbbbb)
return;
Reviewers: krasimir, djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D32478
Summary:
The previous change in https://reviews.llvm.org/D77311 attempted to
detect more C++ keywords. However it also precisely detected all
JavaScript keywords. That's generally correct, but many JavaScripy
keywords, e.g. `get`, are so-called pseudo-keywords. They can be used in
positions where a keyword would never be legal, e.g. in a dotted
expression:
x.type; // type is a pseudo-keyword, but can be used here.
x.get; // same for get etc.
This change introduces an additional parameter to
`IsJavaScriptIdentifier`, allowing clients to toggle whether they want
to allow `IdentifierName` tokens, i.e. pseudo-keywords.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77548
Summary:
C++ defines a number of keywords that are regular identifiers in
JavaScript, e.g. `concept`:
const concept = 1; // legit JS
This change expands the existing `IsJavaScriptIdentifier(Tok)` function
to return false for C++ keywords that aren't keywords in JS.
Reviewers: krasimir
Subscribers: jfb, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77311
Summary:
This change adds an option to insert trailing commas into container
literals. For example, in JavaScript:
const x = [
a,
b,
^~~~~ inserted if missing.
]
This is implemented as a seperate post-processing pass after formatting
(because formatting might change whether the container literal does or
does not wrap). This keeps the code relatively simple and orthogonal,
though it has the notable drawback that the newly inserted comma is not
taken into account for formatting decisions (e.g. it might exceed the 80
char limit). To avoid exceeding the ColumnLimit, a comma is only
inserted if it fits into the limit.
Trailing comma insertion conceptually conflicts with argument
bin-packing: inserting a comma disables bin-packing, so we cannot do
both. clang-format rejects FormatStyle configurations that do both with
this change.
Reviewers: krasimir, MyDeveloperDay
Subscribers: cfe-commits
Tags: #clang
Summary:
clang-format currently always wraps the body of non-empty arrow
functions:
const x = () => {
z();
};
This change implements support for the `AllowShortLambdasOnASingleLine`
style options, controlling the indent style for arrow function bodies
that have one or fewer statements. SLS_All puts all on a single line,
SLS_Inline only arrow functions used in an inline position.
const x = () => { z(); };
Multi-statement arrow functions continue to be wrapped. Function
expressions (`a = function() {}`) and function/method declarations are
unaffected as well.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73335
Summary:
clang-format currently treats the nullish coalescing operator `??` like
the ternary operator. That causes multiple nullish terms to be each
indented relative to the last `??`, as they would in a ternary.
The `??` operator is often used in chains though, and as such more
similar to other binary operators, such as `||`. So to fix the indent,
set its token type to `||`, so it inherits the same treatment.
This opens up the question of operator precedence. However, `??` is
required to be parenthesized when mixed with `||` and `&&`, so this is
not a problem that can come up in syntactically legal code.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D73026
Summary:
tslint and tsc (the TypeScript compiler itself) use comment pragmas of
the style:
// tslint:disable-next-line:foo
// @ts-ignore
These must not be wrapped and must stay on their own line, in isolation.
For tslint, this required adding it to the pragma regexp. The comments
starting with `@` are already left alone, but this change adds test
coverage for them.
Reviewers: krasimir
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72907