629 Commits

Author SHA1 Message Date
Emilia Kond
3e87829a8a
[clang-format] Fix requires misannotation with comma (#65908)
clang-format uses a heuristic to determine if a requires() is either a
requires clause or requires expression, based on what is in the
parentheses. Part of this heuristic assumed that a requires clause can
never contain a comma, however this is not the case if said comma is in
the template argument of a type.

This patch allows commas to appear in a requires clause if an angle
bracket `<` has been opened.
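
As a rough illustration of the idea, here is a toy model of such a check. It is a sketch under assumptions, not clang-format's actual code: the function name, the string-based token representation, and the depth counter are all hypothetical, and every `<` is naively treated as opening a template argument list.

```cpp
#include <string>
#include <vector>

// Toy model (hypothetical, not clang-format's real heuristic): given the
// tokens between the parentheses that follow `requires`, report whether they
// look like the parameter list of a requires expression.
bool looksLikeRequiresExpressionParams(const std::vector<std::string> &Tokens) {
  int AngleDepth = 0; // Number of currently open '<'.
  for (const std::string &Tok : Tokens) {
    if (Tok == "<") {
      ++AngleDepth;
    } else if (Tok == ">" && AngleDepth > 0) {
      --AngleDepth;
    } else if (Tok == "," && AngleDepth == 0) {
      // A comma outside any angle brackets separates parameters.
      return true;
    }
  }
  // No top-level comma: commas inside <...> do not count.
  return false;
}
```

With this model, the tokens of `requires(T a, T b)` are reported as a parameter list, while those of `requires(std::is_same_v<T, int>)` are not, because the only comma sits inside the angle brackets.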

Fixes https://github.com/llvm/llvm-project/issues/65904
2023-09-28 01:34:30 +03:00
Owen Pan
38dd67c8b3 [clang-format][NFC] Minor cleanup of the parser and annotator 2023-09-25 22:20:02 -07:00
Owen Pan
a2046ca9af
[clang-format][NFC] Clean up signatures of some parser functions (#66569)
Removed TT_CompoundRequirementLBrace and parameters
CanContainBracedList and NextLBracesType.
2023-09-20 01:13:05 -07:00
Emilia Kond
e9ed1aa9cd
[clang-format] Correctly annotate designated initializer with PP if (#65409)
When encountering braces, such as those of a designated initializer,
clang-format scans ahead to see what is contained within the braces. If
it found a statement, like an if-statement or for-loop, it would deem
the braces as not an initializer, but as a block instead.

However, this heuristic incorrectly included a preprocessor `#if` line
as an if-statement. This manifested in strange results and discrepancies
between `#ifdef` and `#if defined`.

With this patch, `if` is now ignored if it is preceded by `#`.
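
The rule can be sketched with a small stand-alone scan. This is only an illustration under assumptions: `containsIfStatement` and the string-based tokens are hypothetical and do not reflect how clang-format represents or scans tokens.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: report whether the tokens found between a pair of
// braces contain an `if` statement. An `if` directly preceded by `#` is a
// preprocessor directive and is skipped.
bool containsIfStatement(const std::vector<std::string> &Tokens) {
  for (std::size_t I = 0; I < Tokens.size(); ++I) {
    if (Tokens[I] != "if")
      continue;
    const bool PrecededByHash = I > 0 && Tokens[I - 1] == "#";
    if (!PrecededByHash)
      return true; // A real if-statement: the braces form a block.
  }
  return false; // Only `#if` (or no `if` at all) was seen.
}
```

Under this sketch, the contents of a designated initializer wrapped in `#if X ... #endif` no longer count as containing a statement.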

Fixes most of https://github.com/llvm/llvm-project/issues/56685
2023-09-07 22:23:05 +03:00
Owen Pan
c22c0c4769 [clang-format][NFC] Change conjunction of isNot() with one !isOneOf() 2023-09-07 00:20:02 -07:00
Owen Pan
91c4db0061 [clang-format][NFC] Replace !is() with isNot()
Differential Revision: https://reviews.llvm.org/D158571
2023-08-24 01:27:24 -07:00
Owen Pan
e3a79503a3 [clang-format] Exclude kw_decltype in RemoveParentheses
From https://en.cppreference.com/w/cpp/language/decltype:
Note that if the name of an object is parenthesized, it is treated as an
ordinary lvalue expression, thus decltype(x) and decltype((x)) are often
different types.
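
A minimal check of the quoted rule, using only standard C++ (the variable `x` is just an example name):

```cpp
#include <type_traits>

int x = 0;

// decltype of an unparenthesized name yields the declared type of the entity.
static_assert(std::is_same<decltype(x), int>::value,
              "decltype(x) is int");

// With parentheses, (x) is an lvalue expression, so decltype yields int&.
static_assert(std::is_same<decltype((x)), int &>::value,
              "decltype((x)) is int&");
```

Because the two spellings name different types, dropping the inner parentheses would change the meaning of the code.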

Fixes #64786.

Differential Revision: https://reviews.llvm.org/D158155
2023-08-17 16:04:44 -07:00
Owen Pan
70d7ea0ceb [clang-format] Handle goto labels preceded by C++11 attributes
Fixes #64229.

Differential Revision: https://reviews.llvm.org/D156655
2023-08-02 22:48:48 -07:00
Björn Schäpers
77a38f43b1 [clang-format] Suppress aligning of trailing namespace comments
Fixes https://github.com/llvm/llvm-project/issues/57504.

Differential Revision: https://reviews.llvm.org/D138263
2023-08-02 11:50:14 +02:00
Galen Elias
400da115c5 [clang-format] Fix braced initializer with templated base class
Fixes #64134.

Differential Revision: https://reviews.llvm.org/D156705
2023-08-01 13:53:22 -07:00
Owen Pan
adece4e452 [clang-format][NFC] Remove superfluous code in UnwrappedLineParser
Differential Revision: https://reviews.llvm.org/D156643
2023-08-01 13:31:24 -07:00
Jared Grubb
63d6659a04 [clang-format] Fix support for ObjC blocks with pointer return types
The ObjC-block detection code only supports a single token as the return type. Add support to detect pointers, too (ObjC has lots of object-pointers).

For example, using `BasedOnStyle: WebKit`, the following was stable output before the patch:

```
int* p = ^int*(void)
{ //
    return nullptr;
}
();
```

After the patch, this is stable:

```
int* p = ^int*(void) { //
    return nullptr;
}();
```

Differential Revision: https://reviews.llvm.org/D146434
2023-07-17 14:47:49 +01:00
Björn Schäpers
ce7356f081 [clang-format] Don't eat two semicolons after namespace
Remove the double check, move the comment.

This changes behavior, but I think for the better. Despite the comment,
my personal opinion is that we should not even handle the single
semicolon gracefully, since it shouldn't be there.

Differential Revision: https://reviews.llvm.org/D138373
2023-07-12 12:23:20 +02:00
Owen Pan
3a6a0702c2 [clang-format] Add an option to remove redundant parentheses
Differential Revision: https://reviews.llvm.org/D154484
2023-07-11 16:33:19 -07:00
Emilia Kond
15e14f129f
[clang-format] Preserve AmpAmpTokenType in nested parentheses
When parsing a requires clause, the UnwrappedLineParser would delegate to
parseParens with an AmpAmpTokenType set to BinaryOperator. However,
parseParens would not carry this over into any nested parens, meaning it
could assign a different token type to an && in a requires clause.

This patch makes sure parseParens inherits its parameter when performing
a recursive call.
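
The shape of the fix can be sketched with a toy recursive parser. Everything here is an assumption made for illustration: the `Token` struct, the `TokenType` enum, and `makeTokens` are hypothetical, and this `parseParens` is a stand-in that shares only its name with the real function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class TokenType { Unknown, BinaryOperator, PointerOrReference };

struct Token {
  std::string Text;
  TokenType Type = TokenType::Unknown;
};

// Hypothetical convenience for building a token list from plain strings.
std::vector<Token> makeTokens(const std::vector<std::string> &Texts) {
  std::vector<Token> Tokens;
  for (const std::string &Text : Texts) {
    Token Tok;
    Tok.Text = Text;
    Tokens.push_back(Tok);
  }
  return Tokens;
}

// Toy stand-in for parseParens. On entry Pos points at a "("; on exit it
// points just past the matching ")". Every "&&" seen, at any nesting depth,
// is annotated with AmpAmpType.
void parseParens(std::vector<Token> &Tokens, std::size_t &Pos,
                 TokenType AmpAmpType) {
  ++Pos; // Consume "(".
  while (Pos < Tokens.size()) {
    Token &Tok = Tokens[Pos];
    if (Tok.Text == "(") {
      // Forward the caller's choice instead of falling back to a default.
      parseParens(Tokens, Pos, AmpAmpType);
      continue;
    }
    ++Pos;
    if (Tok.Text == ")")
      return;
    if (Tok.Text == "&&")
      Tok.Type = AmpAmpType;
  }
}
```

Calling the sketch on the tokens of `(a && (b && c))` with `TokenType::BinaryOperator` annotates both `&&` tokens, including the nested one.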

Fixes https://github.com/llvm/llvm-project/issues/63251

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D153641
2023-06-26 12:39:16 +03:00
Galen Elias
d39929b925 [clang-format] Adjust braced list detection (reland 6dcde65)
This is a retry of https://reviews.llvm.org/D114583, which was backed
out for regressions.

Clang Format is detecting a nested scope followed by another open brace
as a braced initializer list due to incorrectly thinking it's matching a
braced initializer at the end of a constructor initializer list which is
followed by the body open brace.

Unfortunately, UnwrappedLineParser isn't doing a very detailed parse, so
it's not super straightforward to distinguish these cases given the
current structure of calculateBraceTypes. My current hypothesis is that
these can be disambiguated by looking at the token preceding the
l_brace, as initializer list parameters will be preceded by an
identifier, but a scope block generally will not (barring the MACRO
wildcard).

To this end, I am adding tracking of the previous token to the LBraceStack
to help scope this particular case.
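
A much simplified sketch of that idea follows. It is not the logic of calculateBraceTypes: the `StackEntry` struct, `classifyBraces`, and the string tokens are hypothetical, and the identifier test is deliberately naive (it would also accept keywords such as `else`).

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class BraceKind { Block, BracedInit };

struct StackEntry {
  std::size_t BraceIndex; // Position of the "{" in the token stream.
  std::string PrevText;   // Text of the token right before that "{".
};

// Naive identifier test used only by this sketch.
bool startsLikeIdentifier(const std::string &Text) {
  if (Text.empty())
    return false;
  const unsigned char First = static_cast<unsigned char>(Text[0]);
  return std::isalpha(First) || First == '_';
}

// For every "{ ... }" that is directly followed by another "{", guess its
// kind from the token that preceded the opening brace.
std::map<std::size_t, BraceKind>
classifyBraces(const std::vector<std::string> &Tokens) {
  std::map<std::size_t, BraceKind> Kinds;
  std::vector<StackEntry> LBraceStack;
  for (std::size_t I = 0; I < Tokens.size(); ++I) {
    if (Tokens[I] == "{") {
      LBraceStack.push_back({I, I > 0 ? Tokens[I - 1] : std::string()});
    } else if (Tokens[I] == "}" && !LBraceStack.empty()) {
      const StackEntry Entry = LBraceStack.back();
      LBraceStack.pop_back();
      const bool FollowedByLBrace =
          I + 1 < Tokens.size() && Tokens[I + 1] == "{";
      if (FollowedByLBrace) {
        Kinds[Entry.BraceIndex] = startsLikeIdentifier(Entry.PrevText)
                                      ? BraceKind::BracedInit
                                      : BraceKind::Block;
      }
    }
  }
  return Kinds;
}
```

In `Foo() : x{1} {}` the brace after `x` is classified as a braced initializer, while in `{ {} {} }` the first nested brace is classified as a block because it is preceded by `{` rather than an identifier.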

TokenAnnotatorTests cherry picked from https://reviews.llvm.org/D150452.

Fixes #33891.
Fixes #52911.

Differential Revision: https://reviews.llvm.org/D150403
2023-05-23 18:50:41 -07:00
Owen Pan
dc4ab97085 [clang-format] Revert 6dcde65 due to missing commit message title
This reverts commit 6dcde658b2380d7ca1451ea5d1099af3e294ea16.
2023-05-23 18:48:52 -07:00
Galen Elias
6dcde658b2 This is a retry of https://reviews.llvm.org/D114583, which was backed
out for regressions.

Clang Format is detecting a nested scope followed by another open brace
as a braced initializer list due to incorrectly thinking it's matching a
braced initializer at the end of a constructor initializer list which is
followed by the body open brace.

Unfortunately, UnwrappedLineParser isn't doing a very detailed parse, so
it's not super straightforward to distinguish these cases given the
current structure of calculateBraceTypes. My current hypothesis is that
these can be disambiguated by looking at the token preceding the
l_brace, as initializer list parameters will be preceded by an
identifier, but a scope block generally will not (barring the MACRO
wildcard).

To this end, I am adding tracking of the previous token to the LBraceStack
to help scope this particular case.

TokenAnnotatorTests cherry picked from https://reviews.llvm.org/D150452.

Fixes #33891.
Fixes #52911.

Differential Revision: https://reviews.llvm.org/D150403
2023-05-22 20:25:55 -07:00
sstwcw
369e8762b4 [clang-format] Stop comment disrupting indentation of Verilog ports
Before:

```
module x
    #( //
        parameter x)
    ( //
        input y);
endmodule
```

After:

```
module x
    #(//
      parameter x)
    (//
     input y);
endmodule
```

If the first line in a port or parameter list is not a comment, the
following lines will be aligned to the first line as intended:

```
module x
    #(parameter x1,
      parameter x2)
    (input y,
     input y2);
endmodule
```

Previously, the indentation would be changed to an extra continuation
indentation relative to the start of the parenthesis or the hash if
the first token inside the parentheses was a comment.  It is a feature
introduced in ddaa9be97839.  The feature enabled one to insert a `//`
comment right after an opening parenthesis to put the function
arguments on a new line with a small indentation regardless of how
long the function name is, like this:

```
someFunction(anotherFunction( // Force break.
    parameter));
```

People are unlikely to use this feature in a Verilog port list because
the formatter already puts the port list on its own lines.  A comment
at the start of a port list is probably a comment for the port on the
next line.

We also removed the space before the comment so that its indentation
would be the same as that for a line comment anywhere else in the port
list.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D149562
2023-05-16 02:56:58 +00:00
Jan Kuhle
34b422bafb clang-format: [JS] support import/export type
Contributed by @jankuehle!

Users can choose to only import/export the type of the symbol (not value nor namespace) by adding a `type` keyword, e.g.:

```
import type {x} from 'y';
import {type x} from 'y';
export type {x};
export {type x};
```

Previously, this was not handled and would:
- Terminate import sorting
- Remove the space before the curly bracket in `export type {`

With this change, both formatting and import sorting work as expected.

Reviewed By: MyDeveloperDay, krasimir

Differential Revision: https://reviews.llvm.org/D150116
2023-05-10 15:27:03 +02:00
sstwcw
82a90caa88 [clang-format] Correctly format goto labels followed by blocks
There doesn't seem to be an issue on GitHub.  But previously, a space
would be inserted before the goto colon in the code below.

    switch (x) {
    case 0:
    goto_0: {
      action();
      break;
    }
    }

Previously, the colon following a goto label would be annotated as
`TT_InheritanceColon`.  A goto label followed by an opening brace
wasn't recognized.  It is easy to add another line to have
`spaceRequiredBefore` function recognize the case, but I believed it
is more proper to avoid doing the same thing in `UnwrappedLineParser`
and `TokenAnnotator`.  So now the label colons would be labeled in
`UnwrappedLineParser`, and `spaceRequiredBefore` would rely on that.

Previously we had the type `TT_GotoLabelColon` intended for both goto
labels and case labels.  But since handling of goto labels and case
labels differ somewhat, I split it into separate types for goto and
case labels.

This patch doesn't change the behavior for case labels.  I added the
lines annotating case labels because they would previously be
mistakenly annotated as `TT_InheritanceColon` just like goto labels.
And since I added the annotations, the checks for the `case` and
`default` keywords in `spaceRequiredBefore` are not necessary anymore.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D148484
2023-04-30 22:25:48 +00:00
sstwcw
0571ba8d1b [clang-format] Handle Verilog assertions and loops
Assert statements in Verilog can optionally have an else part.  We
handle them like `if` statements, except that an `if` statement in
the else part of an `assert` statement doesn't get merged with the
`else` keyword.  Like this:

    assert (x)
      $info();
    else
      if (y)
        $info();
      else if (z)
        $info();
      else
        $info();

`foreach` and `repeat` are now handled like `for` or `while` loops.

We used the type `TT_ConditionLParen` to mark the condition part so
they are handled in the same way as the condition part of an `if`
statement.  When the code being formatted is not in Verilog, it is
only set for `if` statements, not loops.  It's because loop conditions
are currently handled slightly differently, and existing behavior is
not supposed to change.  We formatted all files ending in `.cpp` and
`.h` in the repository with and without this change.  It showed that
setting the type for `if` statements doesn't change existing behavior.

And we noticed that we forgot to make the program print the list of
tokens when the number is not correct in `TokenAnnotatorTest`.  It's
fixed now.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D147895
2023-04-16 21:55:50 +00:00
Jorge Pinto Sousa
9db2a04548 [clang-format] Don't interpret variable named interface as keyword for C++
Fixes #53173.

Differential Revision: https://reviews.llvm.org/D148437
2023-04-16 03:23:53 -07:00
sstwcw
92b2be3965 [clang-format] Handle enum in Verilog
Verilog has enum just like C.

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D147328
2023-04-01 17:09:44 +00:00
Owen Pan
2a42a7b4e8 [clang-format] Don't misannotate left squares as lambda introducers
A left square can start a lambda only if it's not preceded by an
identifier other than return and co_await/co_yield/co_return.
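
The rule can be written down as a small predicate. This is a sketch under assumptions, not the check clang-format performs: `canStartLambda`, `looksLikeIdentifier`, and the use of plain strings for tokens are hypothetical.

```cpp
#include <cctype>
#include <set>
#include <string>

// Naive identifier test used only by this sketch; keywords match it too.
bool looksLikeIdentifier(const std::string &Text) {
  if (Text.empty())
    return false;
  const unsigned char First = static_cast<unsigned char>(Text[0]);
  return std::isalpha(First) || First == '_';
}

// May a "[" that follows PrevText start a lambda? An empty PrevText stands
// for "no previous token".
bool canStartLambda(const std::string &PrevText) {
  static const std::set<std::string> AllowedKeywords = {
      "return", "co_await", "co_yield", "co_return"};
  if (AllowedKeywords.count(PrevText) > 0)
    return true;
  // Any other identifier-like token means the "[" is a subscript or similar.
  return !looksLikeIdentifier(PrevText);
}
```

For example, the `[` in `return [] { ... };` or `x = [] { ... };` may start a lambda, while the `[` in `array[i]` may not.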

Fixes #54245.
Fixes #61786.

Differential Revision: https://reviews.llvm.org/D147295
2023-03-31 16:13:03 -07:00
Owen Pan
767aee1de9 [clang-format] Don't annotate left brace of struct as FunctionLBrace
Related to a02c3af9f19d. Fixes #61700.

Differential Revision: https://reviews.llvm.org/D146895
2023-03-27 14:07:15 -07:00
Emilia Dreamer
5409fb3837
[clang-format] Annotate lambdas with requires clauses.
The C++ grammar allows lambdas to have a *requires-clause* in two
places, either directly after the *template-parameter-list*, such as:

`[] <typename T> requires foo<T> (T t) { ... };`

Or, at the end of the *lambda-declarator* (before the lambda's body):

`[] <typename T> (T t) requires foo<T> { ... };`

Previously, these cases weren't handled at all, resulting in weird
results.

Note that this commit only handles token annotation, so the actual
formatting still ends up suboptimal. This is mostly because I do not yet
know how to approach making the requires clause formatting of lambdas
match the formatting for functions.

Fixes https://github.com/llvm/llvm-project/issues/61269

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D145642
2023-03-26 04:38:26 +03:00
sstwcw
a1f8bab9ba [clang-format] Recognize Verilog always blocks
The small `Coverage` test was added because we added the space rule
about two at signs along with the rule about a single one. We have not
fully covered covergroup yet.

Reviewed By: MyDeveloperDay, owenpan

Differential Revision: https://reviews.llvm.org/D145794
2023-03-14 03:49:56 +00:00
Owen Pan
bb70dacd60 [clang-format][NFC] Remove an obsolete case in parsing concepts
See https://reviews.llvm.org/D142412#4078127.
2023-03-07 14:42:22 -08:00
Owen Pan
a02c3af9f1 [clang-format] Don't annotate left brace of class as FunctionLBrace
The l_brace of class/struct/union was incorrectly annotated as
TT_FunctionLBrace in the presence of attributes. This in turn
would cause the RemoveSemicolon option to remove the semicolon
at the end of the declaration, resulting in invalid code being
generated.

Fixes #61188.

Differential Revision: https://reviews.llvm.org/D145344
2023-03-06 13:07:23 -08:00
Manuel Klimek
01402831aa [clang-format] Add simple macro replacements in formatting.
Add configuration to specify macros.
Macros will be expanded, and the code will be parsed and annotated
in the expanded state. In a second step, the formatting decisions
in the annotated expanded code will be reconstructed onto the
original unexpanded macro call.

Eventually, this will allow us to remove special-case code for
various macro options we accumulated over the years in favor of
one principled mechanism.

Differential Revision: https://reviews.llvm.org/D144170
2023-02-24 15:44:24 +00:00
Owen Pan
0ef289e5b2 [clang-format][NFC] Clean up nullptr comparison style
For example, use 'Next' instead of 'Next != nullptr',
and '!Next' instead of 'Next == nullptr'.
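
A small self-contained illustration of the two spellings (the `Token` struct and both functions are made up for this example):

```cpp
// Hypothetical singly linked token, just enough to show the style.
struct Token {
  Token *Next = nullptr;
};

// Counts how many tokens follow Tok.
int countFollowing(const Token *Tok) {
  int Count = 0;
  // Verbose form: while (Tok != nullptr && Tok->Next != nullptr)
  while (Tok && Tok->Next) {
    ++Count;
    Tok = Tok->Next;
  }
  return Count;
}

bool isLast(const Token &Tok) {
  // Verbose form: return Tok.Next == nullptr;
  return !Tok.Next;
}
```

Both forms behave identically; the shorter one relies on the contextual conversion of a pointer to `bool`.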

Differential Revision: https://reviews.llvm.org/D144355
2023-02-21 02:56:27 -08:00
Manuel Klimek
be31f2c11d [clang-format][NFC] Move IndexedTokenSource to FormatTokenSource header.
Finish refactoring of the token sources towards a single location.
2023-01-31 15:06:20 +00:00
Manuel Klimek
c3bc61d72f [clang-format][NFC] Pull FormatTokenSource into its own header.
Prepare getting FormatTokenSource under unit testing.
2023-01-31 14:32:31 +00:00
Owen Pan
56313f65cc [clang-format] Put peekNextToken(/*SkipComment=*/true) to good use
To prevent potential bugs in situations where we want to peek the next
non-comment token.

Differential Revision: https://reviews.llvm.org/D142412
2023-01-24 18:40:14 -08:00
Owen Pan
02fd0020e5 [clang-format] Fix bugs in parsing C++20 module import statements
Also fixes #60145.

Differential Revision: https://reviews.llvm.org/D142296
2023-01-23 14:35:15 -08:00
Matt Kulukundis
caf393da18 Fix format for case in .proto files
Fix format for `case` in .proto files

Reviewed By: krasimir, echristo

Differential Revision: https://reviews.llvm.org/D141547
2023-01-16 17:43:50 +00:00
Emilia Dreamer
d989950157
[clang-format] Disallow decltype in the middle of constraints
If a function with a `requires` clause as a constraint has a decltype
return type, such as `decltype(auto)`, the decltype was seen to be part
of the constraint clause, rather than as part of the function
declaration, causing it to be placed on the wrong line.

This patch disallows decltype from being part of these clauses.

Fixes https://github.com/llvm/llvm-project/issues/59578

Depends on D140339

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D140312
2023-01-06 05:18:28 +02:00
Emilia Dreamer
b1eeec6177
[clang-format] Remove special logic for parsing concept definitions.
Previously, clang-format relied on a special method to parse concept
definitions, `UnwrappedLineParser::parseConcept()`, which deferred to
`UnwrappedLineParser::parseConstraintExpression()`. This is problematic,
because the C++ grammar treats concepts and requires clauses
differently, causing issues such as https://github.com/llvm/llvm-project/issues/55898 and https://github.com/llvm/llvm-project/issues/58130.

This patch removes `parseConcept`, letting the formatter parse concept
definitions as more like what they actually are, fancy bool definitions.

NOTE that because of this, some long concept definitions change in their
formatting, as can be seen in the changed tests. This is because of a
change in split penalties, caused by a change in MightBeFunctionDecl on
the concept definition line, which was previously `true` but with this
patch is now `false`.

One might argue that `false` is a more "correct" value for concept
definitions, but I'd be fine with setting it to `true` again to maintain
compatibility with previous versions.

Fixes https://github.com/llvm/llvm-project/issues/58130

Depends on D140330

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D140339
2023-01-06 05:17:58 +02:00
Owen Pan
5751c439be [clang-format] Improve UnwrappedLineParser::mightFitOnOneLine()
Account for an r_brace that precedes an "else if" statement when
calculating whether the line might fit on one line if the r_brace
is removed.

Fixes #59778.

Differential Revision: https://reviews.llvm.org/D140835
2023-01-05 13:23:23 -08:00
Owen Pan
91b5d508e3 Revert "[clang-format] Link the braces of a block in UnwrappedLineParser"
This reverts commit e33243c950ac40d027ad8facbf7ccf0624604a16 but
keeps the added test case and also adds another test case.

Fixes #59417.

Differential Revision: https://reviews.llvm.org/D139760
2022-12-10 02:31:53 -08:00
Owen Pan
15f121e853 [clang-format] Fix an assertion failure in block parsing
This assertion failure was introduced in 9ed2e68c9ae5 and is
manifested when both RemoveBracesLLVM and MacroBlockBegin are set.

Fixes #59335.

Differential Revision: https://reviews.llvm.org/D139281
2022-12-06 14:14:20 -08:00
Owen Pan
e33243c950 [clang-format] Link the braces of a block in UnwrappedLineParser
This includes TT_MacroBlockBegin and TT_MacroBlockEnd as well.

We can no longer use MatchingParen of FormatToken as an indicator
to mark optional braces. Instead, we directly set Optional of an
l_brace first and reset it later if it turns out that the braces
are not optional.

Also added a test case for deeply nested loops.

Differential Revision: https://reviews.llvm.org/D139257
2022-12-04 12:01:26 -08:00
Manuel Klimek
49aca00d63 [NFC] Remove peekNextToken(int).
Arbitrary lookahead restricts the implementation of our TokenSource,
specifically getting in the way of changes to handle macros better.

Instead, use getNextToken to parse lookahead linearly, and
getPosition/setPosition to unwind our lookahead.
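
The pattern can be sketched as follows. Only the method names getNextToken, getPosition, and setPosition come from the message above; the `ToyTokenSource` class, its string tokens, and `statementContains` are assumptions made for illustration and do not mirror the real token source interface.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for a token source with linear access only.
class ToyTokenSource {
public:
  explicit ToyTokenSource(std::vector<std::string> Toks)
      : Tokens(std::move(Toks)) {}

  // Returns the next token and advances; returns "" once exhausted.
  std::string getNextToken() {
    return Position < Tokens.size() ? Tokens[Position++] : std::string();
  }

  std::size_t getPosition() const { return Position; }
  void setPosition(std::size_t NewPosition) { Position = NewPosition; }

private:
  std::vector<std::string> Tokens;
  std::size_t Position = 0;
};

// Looks ahead token by token, up to the next ";", for Wanted, then rewinds
// so the caller observes an unchanged source.
bool statementContains(ToyTokenSource &Source, const std::string &Wanted) {
  const std::size_t Saved = Source.getPosition();
  bool Found = false;
  for (std::string Tok = Source.getNextToken(); !Tok.empty() && Tok != ";";
       Tok = Source.getNextToken()) {
    if (Tok == Wanted) {
      Found = true;
      break;
    }
  }
  Source.setPosition(Saved); // Unwind the lookahead.
  return Found;
}
```

The source never has to answer "what is the token N positions ahead", which is the kind of arbitrary lookahead the commit removes.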
2022-11-26 18:23:42 +00:00
Manuel Klimek
d65019bbcb [NFC] Clean up printing of UnwrappedLines.
Move print functions to start of UnwrappedLineParser so they can be
used from everywhere in the file.
Pull out function that doesn't hard-code the stream.
2022-11-25 14:26:47 +00:00
Noah Goldstein
92bccf5d3d [clang-format] Don't use PPIndentWidth inside multi-line macros
Differential Revision: https://reviews.llvm.org/D137181
2022-11-19 23:53:48 -08:00
Owen Pan
e787708bcf [clang-format][NFC] Remove parsePPElIf()
Differential Revision: https://reviews.llvm.org/D137308
2022-11-04 00:38:40 -07:00
Owen Pan
117d792f35 [clang-format] Don't skip #else/#elif of #if 0
Fixes #58188.

Differential Revision: https://reviews.llvm.org/D137052
2022-11-02 13:32:08 -07:00
sstwcw
d5be1550f1 [clang-format] Don't crash on malformed preprocessor conditions
Previously the program would crash on this input:

```
#else
#if X
```

The problem was that in `parsePPElse`, `PPBranchLevel` would be
incremented when the first `#else` gets parsed, but `PPLevelBranchCount`
and `PPLevelBranchIndex` would not be updated together.

I found the problem when working on D135740.

Differential Revision: https://reviews.llvm.org/D135972
2022-10-30 02:18:58 +00:00
Joseph Huber
037669de8b [clang-format] Do not parse certain characters in pragma directives
Currently, we parse lines inside of a compiler `#pragma` the same way we
parse any other line. This is fine for some cases, like separating
expressions and adding proper spacing, but in others it causes some poor
results from miscategorizing some tokens.

For example, the OpenMP offloading uses certain clauses that contain
special characters like `map(tofrom : A[0:N])`. This will be formatted
poorly as it will be split between lines on the first colon.
Additionally the subscript notation will lead to poor spacing. This can
be seen in the OpenMP tests as the automatic clang formatting will
inevitably ruin the formatting.

For example, the following contrived example will be formatted poorly.
```
#pragma omp target teams distribute collapse(2) map(to: A[0 : M * K])  \
    map(to: B[0:K * N]) map(tofrom:C[0:M*N]) firstprivate(Alpha) \
    firstprivate(Beta) firstprivate(X) firstprivate(D) firstprivate(Y) \
    firstprivate(E) firstprivate(Z) firstprivate(F)
```
This results in this when formatted, which is far from ideal.
```
#pragma omp target teams distribute collapse(2) map(to                         \
                                                    : A [0:M * K])             \
    map(to                                                                     \
        : B [0:K * N]) map(tofrom                                              \
                           : C [0:M * N]) firstprivate(Alpha)                  \
        firstprivate(Beta) firstprivate(X) firstprivate(D) firstprivate(Y)     \
            firstprivate(E) firstprivate(Z) firstprivate(F)
```

This patch seeks to improve this by adding extra logic where the parsing goes
awry. This is primarily caused by the colon being parsed as an inline-asm
directive and the brackets as Objective-C expressions. Also the line gets
indented every single time the line is dropped.

This doesn't implement true parsing handling for OpenMP statements.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D136100
2022-10-18 16:38:19 -05:00