37 Commits

Author SHA1 Message Date
Aaron Ballman
072e81db7a Revert "[Clang][Comments] Attach comments to decl even if preproc directives are in between (#88367)"
This reverts commit 9f04d75b2bd8ba83863db74ebe1a5c08cfc5815c.

There was post-commit feedback on the direction this PR took.
2024-07-02 14:45:52 -04:00
hdoc
9f04d75b2b
[Clang][Comments] Attach comments to decl even if preproc directives are in between (#88367)
### Background

It's surprisingly common for C++ code in the wild to conditionally
show/hide declarations to Doxygen through the use of preprocessor
directives. One especially common version of this pattern is
demonstrated below:

```cpp
/// @brief Test comment
#ifdef DOXYGEN_BUILD_ENABLED
template<typename T>
#else
template <typename T>
typename std::enable_if<std::is_integral<T>::value>::type
#endif
void f() {}
```

There are more examples I've collected below to demonstrate usage of
this pattern:
- Example 1:
[Magnum](8538610fa2/src/Magnum/Resource.h (L117-L127))
- Example 2:
[libcds](9985d2a87f/cds/container/michael_list_nogc.h (L36-L54))
- Example 3:
[rocPRIM](609ae19565/rocprim/include/rocprim/block/detail/block_reduce_raking_reduce.hpp (L60-L65))
 
From my research, it seems like the most common rationale for this
functionality is hiding difficult-to-parse code from Doxygen, especially
where template metaprogramming is concerned.

Currently, Clang does not support attaching comments to decls if there
are preprocessor directives between the comment and the decl. This is
enforced here:
b6ebea7972/clang/lib/AST/ASTContext.cpp (L284-L287)

Alongside preprocessor directives, any instance of `;{}#@` between a
comment and decl will cause the comment to not be attached to the decl.
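
As a rough illustration of the rule above, the check can be sketched as a
standalone function. This is a simplified sketch, not Clang's actual code: the
names `blocksAttachment` and `checkExamples` and the use of `std::string` are
assumptions made for the example; only the character set `;{}#@` comes from
the description above.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Clang's API): returns true if the source text
// between the end of a comment and the start of a declaration contains any
// character that prevents the comment from being attached.
bool blocksAttachment(const std::string &textBetween) {
  return textBetween.find_first_of(";{}#@") != std::string::npos;
}

// Small usage example for the helper above.
void checkExamples() {
  // Whitespace and a macro call containing commas do not block attachment.
  assert(!blocksAttachment("\nAVAILABLE(10, 9) "));
  // A preprocessor directive introduces '#', which blocks attachment.
  assert(blocksAttachment("\n#ifdef DOXYGEN_BUILD_ENABLED\n"));
  // An intervening declaration ends in ';', which blocks attachment.
  assert(blocksAttachment("\nint x;\n"));
}
```

Under this sketch, the `#ifdef` in the Background example is what keeps the
comment from reaching `f()`.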

#### Rationale

It would be nice for Clang-based documentation tools, such as
[hdoc](https://hdoc.io), to support code using this pattern. Users
expect to see comments attached to the relevant decl — even if there is
an `#ifdef` in the way — which Clang does not currently do.

#### History

Originally, commas were also in the list of "banned" characters, but
were removed in `b534d3a0ef69`
([link](b534d3a0ef))
because availability macros often have commas in them. From my reading
of the code, it appears that the original intent of the code was to
exclude macros and decorators between comments and decls, possibly in an
attempt to properly attribute comments to macros (discussed further in
"Complications", below). There's some more discussion here:
https://reviews.llvm.org/D125061.

### Change

This modifies Clang comment parsing so that comments are attached to
subsequent declarations even if there are preprocessor directives
between the end of the comment and the start of the decl. Furthermore,
this change:

- Adds tests to verify that comments are attached to their associated
decls even if there are preprocessor directives in between
- Adds tests to verify that current behavior has not changed (i.e. use
of the other characters between comment and decl will result in the
comment not being attached to the decl)
- Updates existing `lit` tests which would otherwise break.

#### Complications

Clang [does not yet
support](https://github.com/llvm/llvm-project/issues/38206) attaching
doc comments to macros. Consequently, the change proposed in this RFC
affects cases where a doc comment attached to a macro is followed
immediately by a normal declaration. In these cases, the macro's doc
comments will be attached to the subsequent decl. Previously they would
be ignored because any preprocessor directives between a comment and a
decl would result in the comment not being attached to the decl. An
example of this is shown below.

```cpp
/// Doc comment for a function-like macro
/// @param n
///    A macro argument
#define custom_sqrt(n) __internal_sqrt(n)

int __internal_sqrt(int n) { return __builtin_sqrt(n); }

// NB: the doc comment for the custom_sqrt macro will actually be attached to __internal_sqrt!
```

There is a real instance of this problem in the Clang codebase, namely
here:
be10070f91/clang/lib/Headers/amxcomplexintrin.h (L65-L114)

As part of this RFC, I've added a semicolon to break up the Clang
comment parsing so that the `-Wdocumentation` errors go away, but this
is a hack. The real solution is to fix Clang comment parsing so that doc
comments are properly attached to macros, but that would be a large
change that is outside the scope of this RFC.
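
For illustration, the semicolon workaround looks roughly like the sketch
below. It is a hypothetical reduction, not the actual change made to
`amxcomplexintrin.h`: the names `internal_sqrt` and `useMacro` are invented
for the example, and the exact placement of the semicolon in the real header
is not shown here.

```cpp
#include <cassert>

/// Doc comment for a function-like macro
/// @param n
///    A macro argument
#define custom_sqrt(n) internal_sqrt(n)
; // Empty declaration: this ';' now sits between the doc comment and the next decl.

// Because of the ';' above, the macro's doc comment is not attached here.
int internal_sqrt(int n) {
  int r = 0;
  while ((r + 1) * (r + 1) <= n)
    ++r; // integer square root, rounded down
  return r;
}

// Small usage example: the macro still expands to the function call.
void useMacro() { assert(custom_sqrt(9) == 3); }
```

An empty declaration at namespace scope is accepted in C++11 and later, though
some warning flags may report it as an extra semicolon.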
2024-07-01 08:47:26 -04:00
Argyrios Kyrtzidis
819f9ffe85 [test/Index] Update libclang tests to use libclang for creating PCH files.
This is consistent and tests the primary configuration we want to test, libclang
creating and consuming PCH files.

llvm-svn: 244066
2015-08-05 17:23:59 +00:00
Argyrios Kyrtzidis
b534d3a0ef [libclang] Remove comma from the blacklist of characters that prevent a comment from being attached to a decl.
It's common to use an availability function macro at the start of a decl.
rdar://13965065

llvm-svn: 187230
2013-07-26 18:38:12 +00:00
Dmitri Gribenko
d947a66c13 Split annotate-comments.cpp into a fragile part (that uses hardcoded line numbers)
and a non-fragile part (that uses [[@LINE]]).

llvm-svn: 168098
2012-11-15 22:03:13 +00:00
NAKAMURA Takumi
4edb74cf81 clang/test/Index/annotate-comments.cpp: Relax the expression so that it also matches with -fms-compatibility. XFAIL can then be removed.
FYI, it can be reproduced with "c-index-test -std=c++11 -fms-compatibility".

llvm-svn: 166261
2012-10-19 03:27:50 +00:00
NAKAMURA Takumi
a9eebc3153 clang/test/Index/annotate-comments.cpp: Mark this as XFAIL on msvc. Investigating.
llvm-svn: 166250
2012-10-19 00:22:54 +00:00
Fariborz Jahanian
673c5215e1 Fix this test to match recent addition of declaration tag.
llvm-svn: 166190
2012-10-18 17:19:41 +00:00
Fariborz Jahanian
a7d76d2672 [Doc parsing]: This patch adds a <Declaration> tag to the
XML comment for declarations, which pretty-prints the
declaration. I had to XFAIL one test, annotate-comments.cpp.
This test is currently unmaintainable as written.
Dmitri G., can you see what we can do about this test?
We should change this test so that adding a new tag does not wreak
havoc on it.

llvm-svn: 166130
2012-10-17 21:58:03 +00:00
Dmitri Gribenko
7acbf00f96 Comment AST: TableGen'ize all command lists in CommentCommandTraits.cpp.
Now we have a list of all commands.  This is a good thing in itself, but it
also enables us to easily implement typo correction for command names.

With this change we have objects that contain information about each command,
so it makes sense to resolve command name just once during lexing (currently we
store command names as strings and do a linear search every time some property
value is needed).  Thus comment tokens and AST nodes were changed to contain a
command ID -- an index into the tables of builtin and registered commands.  Unknown
commands are registered during parsing and thus are also uniformly assigned an
ID.  Using an ID instead of a StringRef is also a nice memory optimization
since an ID is a small integer that fits into a common bitfield in the Comment class.

This change implies that to get any information about a command (even a command
name) we need a CommandTraits object to resolve the command ID to CommandInfo*.
Currently a fresh temporary CommandTraits object is created whenever it is
needed since it does not have any state.  But with this change it has state --
new commands can be registered, so a CommandTraits object was added to
ASTContext.
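
The ID scheme described above can be sketched roughly as follows. This is a
simplified illustration, not the real CommandTraits interface: the class name
`CommandRegistry`, its methods, the `exampleUsage` function, and the three
builtin commands listed are assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for per-command information.
struct CommandInfo {
  std::string Name;
  bool IsBuiltin;
};

// Hypothetical registry: resolves a command name to a small integer ID once,
// and maps the ID back to its CommandInfo on demand.
class CommandRegistry {
public:
  CommandRegistry() {
    for (const char *Name : {"brief", "param", "returns"})
      add(Name, /*IsBuiltin=*/true);
  }

  // Returns the ID for Name, registering it as an unknown command if needed.
  unsigned getOrRegister(const std::string &Name) {
    auto It = IDs.find(Name);
    if (It != IDs.end())
      return It->second;
    return add(Name, /*IsBuiltin=*/false);
  }

  // Resolves an ID back to the stored information.
  const CommandInfo &getInfo(unsigned ID) const { return Infos.at(ID); }

private:
  unsigned add(const std::string &Name, bool IsBuiltin) {
    unsigned ID = static_cast<unsigned>(Infos.size());
    Infos.push_back({Name, IsBuiltin});
    IDs.emplace(Name, ID);
    return ID;
  }

  std::vector<CommandInfo> Infos;                // indexed by command ID
  std::unordered_map<std::string, unsigned> IDs; // name -> command ID
};

// Small usage example: an unknown command gets the next free ID.
void exampleUsage() {
  CommandRegistry Registry;
  unsigned ID = Registry.getOrRegister("mycommand");
  assert(!Registry.getInfo(ID).IsBuiltin);
  assert(Registry.getOrRegister("mycommand") == ID);
}
```

Because the registry holds state, every consumer that needs a command's name
or properties must reach the same registry object, which is the reason given
above for storing it in ASTContext.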

Also, in libclang CXComment has to be expanded to include a CXTranslationUnit
so that all functions working on comment AST nodes can get a CommandTraits
object.  This breaks binary compatibility of CXComment APIs.

Now clang_FullComment_getAsXML(CXTranslationUnit TU, CXComment CXC) doesn't
need the TU parameter anymore, so it was removed.  This is a source-incompatible
change for this C API.

llvm-svn: 163540
2012-09-10 20:32:42 +00:00
Dmitri Gribenko
557a8d568b Merging consecutive comments: be more conservative.
Should fix part 2 of PR13374.

llvm-svn: 162723
2012-08-28 01:20:53 +00:00
Dmitri Gribenko
75eea89920 CommentBriefParser: allow paragraphs to be separated by a line of whitespace.
Skip paragraphs that contain only whitespace.
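
A minimal sketch of the behavior described above, assuming a plain-string
input. The helper names `isBlank`, `firstParagraph`, and `exampleBrief` are
hypothetical and this is not the CommentBriefParser implementation.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: true if Line is empty or contains only whitespace.
bool isBlank(const std::string &Line) {
  return Line.find_first_not_of(" \t\r\n\v\f") == std::string::npos;
}

// Hypothetical helper: returns the first paragraph of Text that is not
// whitespace-only, with its lines trimmed and joined by single spaces.
// A whitespace-only line is treated as a paragraph separator.
std::string firstParagraph(const std::string &Text) {
  std::istringstream In(Text);
  std::string Line, Result;
  while (std::getline(In, Line)) {
    if (isBlank(Line)) {
      if (!Result.empty())
        break;  // a blank line ends the paragraph collected so far
      continue; // skip whitespace-only paragraphs before the first real one
    }
    std::string::size_type Begin = Line.find_first_not_of(" \t");
    std::string::size_type End = Line.find_last_not_of(" \t\r");
    if (!Result.empty())
      Result += ' ';
    Result += Line.substr(Begin, End - Begin + 1);
  }
  return Result;
}

// Small usage example: the line of spaces separates the two paragraphs.
void exampleBrief() {
  assert(firstParagraph(" Aaa\n bbb\n   \n ccc") == "Aaa bbb");
}
```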

llvm-svn: 162315
2012-08-21 21:15:34 +00:00
Dmitri Gribenko
bacbc08b98 Remove absolute file path in test.
llvm-svn: 161602
2012-08-09 18:35:49 +00:00
Dmitri Gribenko
ba7aca3b38 Comment to HTML and XML conversion: ignore commands that contain a declaration
as their argument.  For example, \fn, \function, \typedef, \method, \class etc.

llvm-svn: 161601
2012-08-09 18:20:29 +00:00
Dmitri Gribenko
dcbc8ce2b5 Comment to HTML and XML conversion: use CommandTraits to classify commands.
This also fixes a bug in comment to XML conversion: \result was just an
ordinary paragraph, not an alias for \returns.

llvm-svn: 161596
2012-08-09 17:33:20 +00:00
Dmitri Gribenko
6cffc1928a Comment XML: use xml:space="preserve" in Verbatim tags, so that XML tidy does
not compress spaces in verbatim content.

llvm-svn: 161531
2012-08-08 22:10:24 +00:00
Dmitri Gribenko
168d23414a Comment AST: DeclInfo: add a special kind for enums.
Comment XML: add a root node kind for enums.

llvm-svn: 161442
2012-08-07 18:59:04 +00:00
Dmitri Gribenko
94ef6357ca Comment AST: treat enumerators as "variables" in DeclInfo.
llvm-svn: 161435
2012-08-07 18:12:22 +00:00
Dmitri Gribenko
740c0fbe0e libclang API for comment-to-xml conversion.
The implementation also includes a Relax NG schema and tests for the schema
itself.  The schema is used in c-index-test to verify that XML documents we
produce are valid.  In order to do the validation, we add an optional libxml2
dependency for c-index-test.

Credits for CMake part go to Doug Gregor.  Credits for Autoconf part go to Eric
Christopher.  Thanks!

llvm-svn: 161431
2012-08-07 17:54:38 +00:00
Dmitri Gribenko
58e4131995 Comment to HTML conversion: correct typo in CSS class name: taram -> tparam
llvm-svn: 161145
2012-08-01 23:47:30 +00:00
Dmitri Gribenko
307cf89b19 Comment to HTML conversion: skip \tparam commands with whitespace paragraphs
llvm-svn: 161096
2012-08-01 00:48:00 +00:00
Dmitri Gribenko
7c0456f91b Comment to HTML conversion: escape HTML special characters in command arguments
llvm-svn: 161094
2012-08-01 00:21:12 +00:00
Dmitri Gribenko
34df220410 Comment parsing: add support for \tparam command on all levels.
The only caveat is renumbering the CXCommentKind enum for aesthetic reasons -- this
breaks libclang binary compatibility, but should not be a problem since the API is
so new.

This also fixes PR13372 as a side-effect.

llvm-svn: 161087
2012-07-31 22:37:06 +00:00
Dmitri Gribenko
4586df765e Implement resolving of HTML character references (named: &amp;, decimal: &#42;,
hex: &#x1a;) during comment parsing.

Now internal representation of plain text in comment AST does not contain
character references, but the characters themselves.
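
The three reference forms mentioned above (named, decimal, hex) can be
resolved along the lines of the sketch below. It is an illustration only: the
helper names `resolveCharRef` and `exampleRefs` are hypothetical, just five
named references are handled, and overflow of very long numeric references is
not checked.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: resolves one character reference such as "&amp;",
// "&#42;" or "&#x1a;" to a code point. Returns false if Ref is not a
// reference this sketch recognizes.
bool resolveCharRef(const std::string &Ref, unsigned long &CodePoint) {
  if (Ref.size() < 3 || Ref.front() != '&' || Ref.back() != ';')
    return false;
  const std::string Body = Ref.substr(1, Ref.size() - 2);

  if (Body[0] != '#') { // named reference, e.g. "amp"
    static const struct {
      const char *Name;
      unsigned long Value;
    } Named[] = {
        {"amp", '&'}, {"lt", '<'}, {"gt", '>'}, {"quot", '"'}, {"apos", '\''}};
    for (const auto &N : Named) {
      if (Body == N.Name) {
        CodePoint = N.Value;
        return true;
      }
    }
    return false;
  }

  // Numeric reference: "#" followed by decimal digits, or "#x" and hex digits.
  const bool IsHex = Body.size() > 1 && (Body[1] == 'x' || Body[1] == 'X');
  const std::string Digits = Body.substr(IsHex ? 2 : 1);
  if (Digits.empty())
    return false;
  unsigned long Value = 0;
  for (char C : Digits) {
    unsigned Digit;
    if (C >= '0' && C <= '9')
      Digit = C - '0';
    else if (IsHex && C >= 'a' && C <= 'f')
      Digit = C - 'a' + 10;
    else if (IsHex && C >= 'A' && C <= 'F')
      Digit = C - 'A' + 10;
    else
      return false;
    Value = Value * (IsHex ? 16 : 10) + Digit;
  }
  CodePoint = Value;
  return true;
}

// Small usage example covering the three forms.
void exampleRefs() {
  unsigned long CP = 0;
  assert(resolveCharRef("&amp;", CP) && CP == '&');
  assert(resolveCharRef("&#42;", CP) && CP == 42);
  assert(resolveCharRef("&#x1a;", CP) && CP == 0x1a);
}
```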

llvm-svn: 160891
2012-07-27 20:37:06 +00:00
Dmitri Gribenko
6b375193a2 libclang comment to HTML rendering: \result is the same as \returns
llvm-svn: 160738
2012-07-25 17:14:58 +00:00
Dmitri Gribenko
378458d597 libclang comments AST: clang_ParamCommandComment_getParamName: don't assert
when a \param command does not have a parameter name, just return an empty
string instead.

llvm-svn: 160638
2012-07-23 19:41:49 +00:00
Dmitri Gribenko
d73e4ce992 Comment AST: add InlineContentComment::RenderKind to specify a default
rendering mode for clients that don't want to interpret Doxygen commands.

Also add a libclang API to query this information.

llvm-svn: 160633
2012-07-23 16:43:01 +00:00
Dmitri Gribenko
4c6d7a2ed2 Comment to HTML conversion: add more CSS classes to identify function arguments
by index.  This is useful if the user does not document all arguments, and we
can't find a particular argument by index via :nth-of-type() CSS selector.

llvm-svn: 160595
2012-07-21 01:47:43 +00:00
Dmitri Gribenko
5e4fe00e64 Add libclang APIs to walk comments ASTs and an API to convert a comment to an
HTML fragment.

For testing, c-index-test now has even more output:
* HTML rendering of a comment
* comment AST tree dump in S-expressions like Comment::dump(), but implemented
  with libclang APIs.

llvm-svn: 160577
2012-07-20 21:34:34 +00:00
Dmitri Gribenko
77369eead6 CommentBriefParser: use \returns if we can't find the \brief or just a plain
paragraph.

llvm-svn: 160550
2012-07-20 17:01:34 +00:00
Dmitri Gribenko
3e242d6d3c CommentBriefParser: make \short equivalent to \brief, per Doxygen manual.
llvm-svn: 160383
2012-07-17 18:35:14 +00:00
Dmitri Gribenko
44cd7e6746 Restrict the set of declaration kinds for which we allow trailing comments.
llvm-svn: 159878
2012-07-06 23:27:33 +00:00
Dmitri Gribenko
0743f94671 Cleanup \brief comment. Since it is a single paragraph, no need to save newlines there.
llvm-svn: 159325
2012-06-28 01:38:21 +00:00
Dmitri Gribenko
a1e9c8e783 Teach \brief parser about commands that start a new paragraph implicitly
llvm-svn: 159309
2012-06-28 00:01:41 +00:00
Dmitri Gribenko
7e8729b904 Attaching documentation comments to declarations: don't attach a comment to a declaration if there is a preprocessor directive between them.
llvm-svn: 159305
2012-06-27 23:43:37 +00:00
Dmitri Gribenko
5188c4b9cc Implement a lexer for structured comments.
llvm-svn: 159223
2012-06-26 20:39:18 +00:00
Dmitri Gribenko
aab8383a2b Structured comment parsing, first step.
* Retain comments in the AST
* Serialize/deserialize comments
* Find comments attached to a certain Decl
* Expose raw comment text and SourceRange via libclang

llvm-svn: 158771
2012-06-20 00:34:58 +00:00