298 Commits

Author SHA1 Message Date
Owen Pan
441108ccba Reland [clang-format] Fix overlapping whitespace replacements before PPDirective
If the first token of an annotated line already has a computed Newlines,
reuse it to avoid potential overlapping whitespace replacements before
preprocessor branching directives.

Fixes #62892.

Differential Revision: https://reviews.llvm.org/D151954
2023-06-16 17:00:12 -07:00
sstwcw
369e8762b4 [clang-format] Stop comment disrupting indentation of Verilog ports
Before:

```
module x
    #( //
        parameter x)
    ( //
        input y);
endmodule
```

After:

```
module x
    #(//
      parameter x)
    (//
     input y);
endmodule
```

If the first line in a port or parameter list is not a comment, the
following lines will be aligned to the first line as intended:

```
module x
    #(parameter x1,
      parameter x2)
    (input y,
     input y2);
endmodule
```

Previously, the indentation would be changed to an extra continuation
indentation relative to the start of the parenthesis or the hash if
the first token inside the parentheses was a comment.  It is a feature
introduced in ddaa9be97839.  The feature enabled one to insert a `//`
comment right after an opening parentheses to put the function
arguments on a new line with a small indentation regardless of how
long the function name is, like this:

```
someFunction(anotherFunction( // Force break.
    parameter));
```

People are unlikely to use this feature in a Verilog port list because
the formatter already puts the port list on its own lines.  A comment
at the start of a port list is probably a comment for the port on the
next line.

We also removed the space before the comment so that its indentation
would be same as that for a line comment anywhere else in the port
list.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D149562
2023-05-16 02:56:58 +00:00
sstwcw
e12428557a [clang-format] Recognize Verilog edge identifiers
Previously the event expression would be misidentified as a port list.
A line break would be added after the comma.  The events can be
separated with either a comma or the `or` keyword, and a line break
would not be inserted if the `or` keyword was used.  We changed the
behavior of the comma to match the `or` keyword.

Before:
```
always @(posedge x,
         posedge y)
  x <= x;
always @(posedge x or posedge y)
  x <= x;
```

After:
```
always @(posedge x, posedge y)
  x <= x;
always @(posedge x or posedge y)
  x <= x;
```

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D149561
2023-05-07 05:13:04 +00:00
sstwcw
82a90caa88 [clang-format] Correctly format goto labels followed by blocks
There doesn't seem to be an issue on GitHub.  But previously, a space
would be inserted before the goto colon in the code below.

    switch (x) {
    case 0:
    goto_0: {
      action();
      break;
    }
    }

Previously, the colon following a goto label would be annotated as
`TT_InheritanceColon`.  A goto label followed by an opening brace
wasn't recognized.  It is easy to add another line to have
`spaceRequiredBefore` function recognize the case, but I believed it
is more proper to avoid doing the same thing in `UnwrappedLineParser`
and `TokenAnnotator`.  So now the label colons would be labeled in
`UnwrappedLineParser`, and `spaceRequiredBefore` would rely on that.

Previously we had the type `TT_GotoLabelColon` intended for both goto
labels and case labels.  But since handling of goto labels and case
labels differ somewhat, I split it into separate types for goto and
case labels.

This patch doesn't change the behavior for case labels.  I added the
lines annotating case labels because they would previously be
mistakenly annotated as `TT_InheritanceColon` just like goto labels.
And since I added the annotations, the checks for the `case` and
`default` keywords in `spaceRequiredBefore` are not necessary anymore.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D148484
2023-04-30 22:25:48 +00:00
Ben Hamilton
00ea679895 [Format/ObjC] Support NS_ERROR_ENUM in ObjC language guesser
Apple added a new NS_ERROR_ENUM macro to help define enums for
NSError codes.

This updates libformat's Objective-C language-guessing heuristic
to detect the new macro as well as related NSError types.

Tested: New tests added.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D147577
2023-04-07 16:10:52 -06:00
sstwcw
74cc4389f3 [clang-format] Add option for having one port per line in Verilog
We added the option `VerilogBreakBetweenInstancePorts` to put ports on
separate lines in module instantiations.  We made it default to true
because style guides mostly recommend it that way for example:

https://github.com/lowRISC/style-guides/blob/master/VerilogCodingStyle.md#module-instantiation

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D147327
2023-04-04 14:51:22 +00:00
sstwcw
feb585e7d6 [clang-format] Handle Verilog struct literals
Previously `isVerilogIdentifier` was mistaking the apostrophe used in
struct literals as an identifier.  It is fixed.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D147329
2023-04-01 17:10:31 +00:00
sstwcw
f90668c8cc [clang-format] Handle Verilog assign statements
Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D146402
2023-03-25 21:13:15 +00:00
Paulo Matos
8d0c889752 [clang][WebAssembly] Initial support for reference type funcref in clang
This is the funcref counterpart to 890146b. We introduce a new attribute
that marks a function pointer as a funcref. It also implements builtin
__builtin_wasm_ref_null_func(), that returns a null funcref value.

Differential Revision: https://reviews.llvm.org/D128440
2023-03-17 18:31:44 +01:00
sstwcw
a1f8bab9ba [clang-format] Recognize Verilog always blocks
The small `Coverage` test was added because we added the space rule
about 2 at signs along with the rule about only 1 of it. We have not
fully covered covergroup yet.

Reviewed By: MyDeveloperDay, owenpan

Differential Revision: https://reviews.llvm.org/D145794
2023-03-14 03:49:56 +00:00
Manuel Klimek
01402831aa [clang-format] Add simple macro replacements in formatting.
Add configuration to specify macros.
Macros will be expanded, and the code will be parsed and annotated
in the expanded state. In a second step, the formatting decisions
in the annotated expanded code will be reconstructed onto the
original unexpanded macro call.

Eventually, this will allow to remove special-case code for
various macro options we accumulated over the years in favor of
one principled mechanism.

Differential Revision: https://reviews.llvm.org/D144170
2023-02-24 15:44:24 +00:00
Owen Pan
0ef289e5b2 [clang-format][NFC] Clean up nullptr comparison style
For example, use 'Next' instead of 'Next != nullptr',
and '!Next' instead of 'Next == nullptr'.

Differential Revision: https://reviews.llvm.org/D144355
2023-02-21 02:56:27 -08:00
sstwcw
6e473aeffd [clang-format] Put ports on separate lines in Verilog module headers
New:
```
module mh1
    (input var int in1,
     input var in2, in3,
     output tagged_st out);
endmodule
```

Old:
```
module mh1
    (input var int in1, input var in2, in3, output tagged_st out);
endmodule
```

`getNextNonComment` was modified to return a non-const pointer because
we needed to use it that way in `verilogGroupDecl`.

The comment on line 2626 was a typo.  We corrected it while modifying
the function.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D143825
2023-02-20 03:24:13 +00:00
Owen Pan
b05dc1b876 [clang-format] Add a space between an overloaded operator and '>'
The token annotator doesn't annotate the template opener and closer
as such if they enclose an overloaded operator. This causes the
space between the operator and the closer to be removed, resulting
in invalid C++ code.

Fixes #58602.

Differential Revision: https://reviews.llvm.org/D143755
2023-02-16 20:25:39 -08:00
Manuel Klimek
c3bc61d72f [clang-format][NFC] Pull FormatTokenSource into its own header.
Prepare getting FormatTokenSource under unit testing.
2023-01-31 14:32:31 +00:00
Kazu Hirata
6ad0788c33 [clang] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 12:31:01 -08:00
Kazu Hirata
a1580d7b59 [clang] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 11:07:21 -08:00
Krasimir Georgiev
922c8891d9 Revert "Revert "[clang-format] Add an option for breaking after C++11 attributes""
This reverts commit 879bfe6a979295f834b76df66b19a203b93eed0f.

owenpan@ pointed out on https://reviews.llvm.org/D140956 that this
actually makes the formatting more consistent, so it's not a regression.
2023-01-11 11:30:30 +00:00
Emilia Dreamer
0904e0bac8
[clang-format] Properly handle the C11 _Generic keyword.
This patch properly recognizes the generic selection expression
introduced in C11, by adding an additional token type for the colons
present in such expressions.

Previously, they would be recognized as
"inline ASM colons" purely by the fact that those are the last thing
checked for.

I tried to avoid adding an addition token type, but since colons by
default like having spaces around them, I chose to add a new type so
that no space is added after the type selector.

Currently, no aspect of the formatting of these expressions in able to
be configured, as I'm not sure what could even be configured here.

One notable thing is that association list is always formatted as
either entirely on one line, if it can fit, or with line breaks
after every comma in the expression (also after the controlling expr.)

This visually makes them more similar to switch statements when long,
matching the behaviour of the selection expression, being that of a sort
of switch on types, but also allows for terseness when only selecting
for a few things.

Fixes https://github.com/llvm/llvm-project/issues/18080

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D139211
2023-01-11 06:00:16 +02:00
Krasimir Georgiev
879bfe6a97 Revert "[clang-format] Add an option for breaking after C++11 attributes"
This reverts commit a28f0747c2f3728bd8a6f64f7c8ba80b4e0cda9f.

It appears that this regresses some function definitions, added an
example as a comment over at https://reviews.llvm.org/D140956.
2023-01-10 09:23:44 +00:00
Owen Pan
a28f0747c2 [clang-format] Add an option for breaking after C++11 attributes
Fixes #45968.
Fixes #54265.
Fixes #58102.

Differential Revision: https://reviews.llvm.org/D140956
2023-01-05 04:08:58 -08:00
Owen Pan
15f121e853 [clang-format] Fix an assertion failure in block parsing
This assertion failure was introduced in 9ed2e68c9ae5 and is
manifested when both RemoveBracesLLVM and MacroBlockBegin are set.

Fixes #59335.

Differential Revision: https://reviews.llvm.org/D139281
2022-12-06 14:14:20 -08:00
Christopher Di Bella
e9ef45635b [clang] adds unary type transformations as compiler built-ins
Adds

* `__add_lvalue_reference`
* `__add_pointer`
* `__add_rvalue_reference`
* `__decay`
* `__make_signed`
* `__make_unsigned`
* `__remove_all_extents`
* `__remove_extent`
* `__remove_const`
* `__remove_volatile`
* `__remove_cv`
* `__remove_pointer`
* `__remove_reference`
* `__remove_cvref`

These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.

[1]: http://wg21.link/meta.trans

Co-authored-by: zoecarver

Reviewed By: aaron.ballman, rsmith

Differential Revision: https://reviews.llvm.org/D116203
2022-08-22 03:03:32 +00:00
owenca
2185f64771 [clang-format] Handle comments between access specifier and colon
Fixes #56740.

Differential Revision: https://reviews.llvm.org/D131940
2022-08-16 20:18:21 -07:00
Nico Weber
aacf1a9742 Revert "[clang] adds unary type transformations as compiler built-ins"
This reverts commit bc60cf2368de90918719dc7e3d7c63a72cc007ad.
Doesn't build on Windows and breaks gcc 9 build, see
https://reviews.llvm.org/D116203#3722094 and
https://reviews.llvm.org/D116203#3722128

Also revert two follow-ups. One fixed a warning added in
bc60cf2368de90918719dc7e3d7c63a72cc007ad, the other
makes use of the feature added in bc60cf2368de90918719dc7e3d7c63a72cc007ad
in libc++:

Revert "[libcxx][NFC] utilises compiler builtins for unary transform type-traits"
This reverts commit 06a1d917ef1f507aaa2f6891bb654696c866ea3a.

Revert "[Sema] Fix a warning"
This reverts commit c85abbe879ef3257de4db862ce249b060cc3d2a4.
2022-08-14 15:58:21 -04:00
Christopher Di Bella
bc60cf2368 [clang] adds unary type transformations as compiler built-ins
Adds

* `__add_lvalue_reference`
* `__add_pointer`
* `__add_rvalue_reference`
* `__decay`
* `__make_signed`
* `__make_unsigned`
* `__remove_all_extents`
* `__remove_extent`
* `__remove_const`
* `__remove_volatile`
* `__remove_cv`
* `__remove_pointer`
* `__remove_reference`
* `__remove_cvref`

These are all compiler built-in equivalents of the unary type traits
found in [[meta.trans]][1]. The compiler already has all of the
information it needs to answer these transformations, so we can skip
needing to make partial specialisations in standard library
implementations (we already do this for a lot of the query traits). This
will hopefully improve compile times, as we won't need use as much
memory in such a base part of the standard library.

[1]: http://wg21.link/meta.trans

Co-authored-by: zoecarver

Reviewed By: aaron.ballman, rsmith

Differential Revision: https://reviews.llvm.org/D116203
2022-08-14 17:12:15 +00:00
Fangrui Song
32197830ef [clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC 2022-08-09 07:11:18 +00:00
sstwcw
c88719483c [clang-format] Handle Verilog case statements
These statements are like switch statements in C, but without the 'case'
keyword in labels.

How labels are parsed.  In UnwrappedLineParser, the program tries to
parse a statement every time it sees a colon.  In TokenAnnotator, a
colon that isn't part of an expression is annotated as a label.

The token type `TT_GotoLabelColon` is added.  We did not include Verilog
in the name because we thought we would eventually have to fix the
problem that case labels in C can't contain ternary conditional
expressions and we would use that token type.

The style is like below.  Labels are on separate lines and indented by
default.  The linked style guide also has examples where labels and the
corresponding statements are on the same lines.  They are not supported
for now.

https://github.com/lowRISC/style-guides/blob/master/VerilogCodingStyle.md

```
case (state_q)
  StIdle:
    state_d = StA;
  StA: begin
    state_d = StB;
  end
endcase
```

Differential Revision: https://reviews.llvm.org/D128714
2022-07-29 00:38:30 +00:00
sstwcw
b67ee18e85 [clang-format] Handle Verilog user-defined primitives
Differential Revision: https://reviews.llvm.org/D128713
2022-07-29 00:38:30 +00:00
sstwcw
6db0c18b1a [clang-format] Handle Verilog modules
Now things inside hierarchies like modules and interfaces are
indented.  When the module header spans multiple lines, all except the
first line are indented as continuations.  We added the property
`IsContinuation` to mark lines that should be indented this way.

In order that the colons inside square brackets don't get labeled as
`TT_ObjCMethodExpr`, we added a check to only use this type when the
language is not Verilog.

Differential Revision: https://reviews.llvm.org/D128712
2022-07-29 00:38:30 +00:00
sstwcw
67480b360c [clang-format] Handle Verilog blocks
Now stuff inside begin-end blocks get indented.

Some tests are moved into FormatTestVerilog.Block from
FormatTestVerilog.If because they have nothing to do with if statements.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D128711
2022-07-29 00:38:30 +00:00
sstwcw
f93182a887 [clang-format] Handle Verilog numbers and operators
Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D126845
2022-07-29 00:38:29 +00:00
Manuel Klimek
d6d0dc1f45 [clang-format] Add MacroUnexpander.
MacroUnexpander applies the structural formatting of expanded lines into
UnwrappedLines to the corresponding unexpanded macro calls, resulting in
UnwrappedLines for the macro calls the user typed.

Differential Revision: https://reviews.llvm.org/D88299
2022-07-12 07:11:46 +00:00
sstwcw
2e32ff106e [clang-format] Handle Verilog preprocessor directives
Verilog uses the backtick instead of the hash.  In this revision
backticks are lexed manually and then get labeled as hashes so the logic
for handling C preprocessor stuff don't have to change.  Hashes get
labeled as identifiers for Verilog-specific stuff like delays.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D124749
2022-06-26 02:02:29 +00:00
sstwcw
9ed2e68c9a [clang-format] Parse Verilog if statements
This patch mainly handles treating `begin` as block openers.

While and for statements will be handled in another patch.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D123450
2022-06-26 01:52:15 +00:00
sstwcw
1f69f7ea9a [clang-format] NFC Sort names of format token types
Suggested by HazardyKnusperkeks in D126845.

Reviewed By: HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D126934
2022-06-25 12:09:49 +00:00
Tobias Hieta
749fb33e82 [clang-format] Don't break lines after pragma region
We have autogenerated pragma regions in our code
which where awkwardly broken up like this:

```
#pragma region foo(bar : hello)
```
becomes

```
#pragma region foo(bar \
                   : hello)
```

This fixes the problem by adding region as a keyword
and handling it the same way as pragma mark

Reviewed By: curdeius

Differential Revision: https://reviews.llvm.org/D125961
2022-05-20 16:11:20 +02:00
owenca
9dffab9d52 [clang-format][NFC] Don't call mightFitOnOneLine() unnecessarily
Clean up UnwrappedLineParser for RemoveBracesLLVM to avoid calling
mightFitOnOneLine() as much as possible.

Differential Revision: https://reviews.llvm.org/D125626
2022-05-16 02:43:35 -07:00
Marek Kurdej
7a54fceb25 [clang-format] Handle C# 9 init accessor specifier.
Before, the code:
```
int Value { get; } = 0;
int Value { init; } = 0;
```
was formatted incoherently:
```
int Value { get; } = 0;
int Value { init; }
= 0;
```
because `init` was not recognised as an accessor specifier.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D121132
2022-03-08 13:33:36 +01:00
Benjamin Kramer
a887b95edf [clang-format] Turn global COperatorsFollowingVar vector into a switch
LLVM optimizes this into a bit test. NFCI.
2022-03-05 18:00:16 +01:00
Björn Schäpers
1e7cc72ac9 [clang-format] Allow to set token types final
We have a little problem. TokenAnnotator::resetTokenMetadata() resets
the type, except for a (growing) whitelist. This is because the
TokenAnnotator visits some tokens multiple times. E.g. trying to
identify if a < is an operator less or a template opener. And in some
runs, which are bascially "reverted" the types are reset.

On the other hand, if the parser does already know the type, it should
be able to set it, without it being reset. So we introduce the ability
to set a type and make that final.

Differential Revision: https://reviews.llvm.org/D120511
2022-03-01 21:55:31 +01:00
Marek Kurdej
bfb4afee74 [clang-format] Avoid inserting space after C++ casts.
Fixes https://github.com/llvm/llvm-project/issues/53876.

This is a solution for standard C++ casts: const_cast, dynamic_cast, reinterpret_cast, static_cast.

A general approach handling all possible casts is not possible without semantic information.
Consider the code:
```
static_cast<T>(*function_pointer_variable)(arguments);
```
vs.
```
some_return_type<T> (*function_pointer_variable)(parameters);
// Later used as:
function_pointer_variable = &some_function;
return function_pointer_variable(args);
```
In the latter case, it's not a cast but a variable declaration of a pointer to function.
Without knowing what `some_return_type<T>` is (and clang-format does not know it), it's hard to distinguish between the two cases. Theoretically, one could check whether "parameters" are types (not a cast) and "arguments" are value/expressions (a cast), but that might be inefficient (needs lots of lookahead).

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D120140
2022-02-24 10:21:02 +01:00
Krasimir Georgiev
c9592ae49b [clang-format] Fix preprocessor nesting after commit 529aa4b011c4ae808d658022ef643c44dd9b2c9c
In 529aa4b011
by setting the identifier info to nullptr, we started to subtly
interfere with the parts in the beginning of the function,
529aa4b011/clang/lib/Format/UnwrappedLineParser.cpp (L991)
causing the preprocessor nesting to change in some cases. E.g., for the
added regression test, clang-format started incorrectly guessing the
language as C++.

This tries to address this by introducing an internal identifier info
element to use instead.

Reviewed By: curdeius, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D120315
2022-02-22 15:43:18 +01:00
owenca
77e60bc42c [clang-format] Add option to insert braces after control statements
Adds a new option InsertBraces to insert the optional braces after
if, else, for, while, and do in C++.

Differential Revision: https://reviews.llvm.org/D120217
2022-02-21 20:16:25 -08:00
Marek Kurdej
48a31c8f42 [clang-format] Mark FormatToken::getPreviousNonComment() nodiscard. NFC. 2022-02-16 22:56:32 +01:00
Marek Kurdej
d81f003ce1 [clang-format] Fix formatting of struct-like records followed by variable declaration.
Fixes https://github.com/llvm/llvm-project/issues/24781.
Fixes https://github.com/llvm/llvm-project/issues/38160.

This patch splits `TT_RecordLBrace` for classes/enums/structs/unions (and other records, e.g. interfaces) and uses the brace type to avoid the error-prone scanning for record token.

The mentioned bugs were provoked by the scanning being too limited (and so not considering `const` or `constexpr`, or other qualifiers, on an anonymous struct variable declaration).

Moreover, the proposed solution is more efficient as we parse tokens once only (scanning being parsing too).

Reviewed By: MyDeveloperDay, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D119785
2022-02-16 22:37:32 +01:00
Björn Schäpers
9aab0db13f [clang-format] Improve require and concept handling
- Added an option where to put the requires clauses.
- Renamed IndentRequires to IndentRequiresClause.
- Changed BreakBeforeConceptDeclaration from bool to an enum.

Fixes https://llvm.org/PR32165, and https://llvm.org/PR52401.

Differential Revision: https://reviews.llvm.org/D113319
2022-02-11 22:42:37 +01:00
Marek Kurdej
0104f5efed [clang-format] Mark FormatToken::getNextNonComment() nodiscard. NFC. 2022-02-11 15:20:11 +01:00
Marek Kurdej
4efde1e554 [clang-format] Move FormatToken::opensBlockOrBlockTypeList to source file. NFC. 2022-02-10 10:51:03 +01:00
Philip Sigillito
d1aed486ef [clang-format] Handle C variables with name that matches c++ access specifier
Reviewed By: MyDeveloperDay, curdeius, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D117416
2022-01-30 20:56:50 +01:00