…ecord level.
This fixes the incorrect diagnostic emitted when compiling the following
snippet
```
// string_view.h
template<class _CharT>
class basic_string_view;
typedef basic_string_view<char> string_view;
template<class _CharT>
class
__attribute__((__preferred_name__(string_view)))
basic_string_view {
public:
basic_string_view()
{
}
};
inline basic_string_view<char> foo()
{
return basic_string_view<char>();
}
// A.cppm
module;
#include "string_view.h"
export module A;
// Use.cppm
module;
#include "string_view.h"
export module Use;
import A;
```
The diagnostic is
```
string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier
```
The underlying issue is that deserialization of the `preferred_name`
attribute triggers deserialization of `basic_string_view<char>`, which
triggers the deserialization of the `preferred_name` attribute again
(since it's attached to the `basic_string_view` template).
The deserialization logic is implemented in a way that prevents it from
going on a loop in a literal sense (it detects early on that it has
already seen the `string_view` typedef when trying to start its
deserialization for the second time), but leaves the typedef
deserialization in an unfinished state. Subsequently, the `string_view`
typedef from the deserialized module cannot be merged with the same
typedef from `string_view.h`, resulting in the above diagnostic.
This PR resolves the problem by delaying the deserialization of the
`preferred_name` attribute until the deserialization of the
`basic_string_view` template is completed. As a result of deferring, the
deserialization of the `preferred_name` attribute doesn't need to go on
a loop since the type of the `string_view` typedef is already known when
it's deserialized.
This causes us to generate an enum to go along with the select
diagnostic, which allows for clearer diagnostic error emit lines.
The syntax for this is:
%enum_select<EnumerationName>{%OptionalEnumeratorName{Text}|{Text2}}0
Where the curley brackets around the select-text are only required if an
Enumerator name is provided.
The TableGen here emits this as a normal 'select' to the frontend, which
permits us to reuse all of the existing 'select' infrastructure.
Documentation is the same as well.
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This was an especially challenging escape hatch because it directly
forced the use of a specific X-macro structure and prevented any other
form of TableGen emission.
The problematic feature that motivated this is a case where a builtin's
prototype can't be represented in the mini-language used by TableGen.
Instead of adding a complete custom entry for this, this PR just teaches
the prototype handling to do the same thing the X-macros did in this
case: emit an empty string and let the Clang builtin handling respond
appropriately.
This should produce identical results while preserving all the rest of
the structured representation in the builtin TableGen code.
Replacing the extant streaming mode function call with an intrinsic
allows us to make further optimisations around it. For example, if it's
called within a function that has a known streaming mode, we can remove
the dead code, and avoid the redundant conditional branch.
The goal is to make incremental (if small) progress towards fully
TableGen'ed builtins, and to unblock #120534 by gaining access to more
powerful TableGen-based representations.
The bulk `.td` file addition was generated with the help of a very rough
Python script. That script made no attempt to be robust or reusable, it
specifically handled only the cases in the X86 `.def` file.
Four entries from the `.def` file were not handled automatically as they
used `BUILTIN` rather than `TARGET_BUILTIN`. These were ported by hand
to an empty-feature `TargetBuiltin` entry, which seems like a better
match.
For all the automatically ported entries, the results were compared by
sorting and diffing the `.def` file and the generated `.inc` file. The
only differences were:
- Different horizontal whitespace
- Additional entries that had already been ported to the `.td` file.
- More systematically using `Oi` instead of `LLi` for the type `long
long int` in the fully general `__builtin_ia32_...` builtins for OpenCL
support. The `.def` file was only partially moved to this it seems, and
the systematic migration has updated a few missed builtins.
This fixes the llvm-support build that generates the profile data, and
wraps the whole `cmake --build` command with perf instead of wrapping
each individual clang invocation. This limits the number of profile
files generated and reduces the time spent running perf2bolt.
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
A few recent changes are causing build breaks when
`-DLLVM_ENABLE_MODULES=ON` (such as
834dfd23155351c9885eddf7b9664f7697326946 and
7dfdca1961aadc75ca397818bfb9bd32f1879248).
This PR makes the required updates so that clang/llvm builds when
`-DLLVM_ENABLE_MODULES=ON`.
rdar://140803058
The first path argument was always being ignored, and since most calls
to this command only passed one path, it wasn't actually doing anything
in most cases.
This bug was introduced by dd0356d741aefa25ece973d6cc4b55dcb73b84b4.
The existing mechanism being used is to manually add a reference in the
documentation. These references link to the beginning of the text rather
than the heading for the attribute which is what this PR allows.
---------
Co-authored-by: Sirraide <aeternalmail@gmail.com>
- Switch to an enumerated type approach, which is less error-prone as we
continue to add new types. This is similar to NeonEmitter.
- Fix existing faulty typespec modifiers
This is unneeded in almost all circumstances. We only return an APValue
back to clang when the evaluation is finished, and that is always done
by an EvalEmitter - which has its own implementation of the Ret
instructions.
This patch implements the following intrinsics:
8-bit floating-point convert to deinterleaved half-precision or
BFloat16.
``` c
// Variant is also available for: _bf16[_mf8]_x2
svfloat16x2_t svcvtl1_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
svfloat16x2_t svcvtl2_f16[_mf8]_x2_fpm(svmfloat8_t zn, fpm_t fpm) __arm_streaming;
```
Defined in https://github.com/ARM-software/acle/pull/323
Co-authored-by: Caroline Concatto caroline.concatto@arm.com
Co-authored-by: Marian Lukac marian.lukac@arm.com
…x8 and MFloat8x16
This patch adds MFloat8 as a TypeFlag and Kind on Neon to generate the
typedefs using emitNeonTypeDefs.
It does not need any change in Clang, because SEMA and CodeGen use the
Builtins defined in AArch64SVEACLETypes.def
## Problem Statement
Previously, the examples in the AST matcher reference, which gets
generated by the Doxygen comments in `ASTMatchers.h`, were untested and
best effort.
Some of the matchers had no or wrong examples of how to use the matcher.
## Solution
This patch introduces a simple DSL around Doxygen commands to enable
testing the AST matcher documentation in a way that should be relatively
easy to use.
In `ASTMatchers.h`, most matchers are documented with a Doxygen comment.
Most of these also have a code example that aims to show what the
matcher will match, given a matcher somewhere in the documentation text.
The way that the documentation is tested, is by using Doxygen's alias
feature to declare custom aliases. These aliases forward to
`<tt>text</tt>` (which is what Doxygen's `\c` does, but for multiple
words). Using the Doxygen aliases is the obvious choice, because there
are (now) four consumers:
- people reading the header/using signature help
- the Doxygen generated documentation
- the generated HTML AST matcher reference
- (new) the generated matcher tests
This patch rewrites/extends the documentation such that all matchers
have a documented example.
The new `generate_ast_matcher_doc_tests.py` script will warn on any
undocumented matchers (but not on matchers without a Doxygen comment)
and provides diagnostics and statistics about the matchers.
The current statistics emitted by the parser are:
```text
Statistics:
doxygen_blocks : 519
missing_tests : 10
skipped_objc : 42
code_snippets : 503
matches : 820
matchers : 580
tested_matchers : 574
none_type_matchers : 6
```
The tests are generated during building, and the script will only print
something if it found an issue with the specified tests (e.g., missing
tests).
## Description
DSL for generating the tests from documentation.
TLDR:
```
\header{a.h}
\endheader <- zero or more header
\code
int a = 42;
\endcode
\compile_args{-std=c++,c23-or-later} <- optional, the std flag supports std ranges and
whole languages
\matcher{expr()} <- one or more matchers in succession
\match{42} <- one or more matches in succession
\matcher{varDecl()} <- new matcher resets the context, the above
\match will not count for this new
matcher(-group)
\match{int a = 42} <- only applies to the previous matcher (not to the
previous case)
```
The above block can be repeated inside a Doxygen command for multiple
code examples for a single matcher.
The test generation script will only look for these annotations and
ignore anything else like `\c` or the sentences where these annotations
are embedded into: `The matcher \matcher{expr()} matches the number
\match{42}.`.
### Language Grammar
[] denotes an optional, and <> denotes user-input
```
compile_args j:= \compile_args{[<compile_arg>;]<compile_arg>}
matcher_tag_key ::= type
match_tag_key ::= type || std || count || sub
matcher_tags ::= [matcher_tag_key=<value>;]matcher_tag_key=<value>
match_tags ::= [match_tag_key=<value>;]match_tag_key=<value>
matcher ::= \matcher{[matcher_tags$]<matcher>}
matchers ::= [matcher] matcher
match ::= \match{[match_tags$]<match>}
matches ::= [match] match
case ::= matchers matches
cases ::= [case] case
header-block ::= \header{<name>} <code> \endheader
code-block ::= \code <code> \endcode
testcase ::= code-block [compile_args] cases
```
### Language Standard Versions
The 'std' tag and '\compile_args' support specifying a specific language
version, a whole language and all of its versions, and thresholds
(implies ranges). Multiple arguments are passed with a ',' separator.
For a language and version to execute a tested matcher, it has to match
the specified '\compile_args' for the code, and the 'std' tag for the
matcher. Predicates for the 'std' compiler flag are used with
disjunction between languages (e.g. 'c || c++') and conjunction for all
predicates specific to each language (e.g. 'c++11-or-later &&
c++23-or-earlier').
Examples:
- `c` all available versions of C
- `c++11` only C++11
- `c++11-or-later` C++11 or later
- `c++11-or-earlier` C++11 or earlier
- `c++11-or-later,c++23-or-earlier,c` all of C and C++ between 11 and
23 (inclusive)
- `c++11-23,c` same as above
### Tags
#### `type`:
**Match types** are used to select where the string that is used to
check if a node matches comes from.
Available: `code`, `name`, `typestr`, `typeofstr`. The default is
`code`.
- `code`: Forwards to `tooling::fixit::getText(...)` and should be the
preferred way to show what matches.
- `name`: Casts the match to a `NamedDecl` and returns the result of
`getNameAsString`. Useful when the matched AST node is not easy to spell
out (`code` type), e.g., namespaces or classes with many members.
- `typestr`: Returns the result of `QualType::getAsString` for the type
derived from `Type` (otherwise, if it is derived from `Decl`, recurses
with `Node->getTypeForDecl()`)
**Matcher types** are used to mark matchers as sub-matcher with 'sub' or
as deactivated using 'none'. Testing sub-matcher is not implemented.
#### `count`:
Specifying a 'count=n' on a match will result in a test that requires
that the specified match will be matched n times. Default is 1.
#### `std`:
A match allows specifying if it matches only in specific language
versions. This may be needed when the AST differs between language
versions.
#### `sub`:
The `sub` tag on a `\match` will indicate that the match is for a node
of a bound sub-matcher.
E.g., `\matcher{expr(expr().bind("inner"))}` has a sub-matcher that
binds to `inner`, which is the value for the `sub` tag of the expected
match for the sub-matcher `\match{sub=inner$...}`. Currently,
sub-matchers are not tested in any way.
### What if ...?
#### ... I want to add a matcher?
Add a Doxygen comment to the matcher with a code example, corresponding
matchers and matches, that shows what the matcher is supposed to do.
Specify the compile arguments/supported languages if required, and run
`ninja check-clang-unit` to test the documentation.
#### ... the example I wrote is wrong?
The test-failure output of the generated test file will provide
information about
- where the generated test file is located
- which line in `ASTMatcher.h` the example is from
- which matches were: found, not-(yet)-found, expected
- in case of an unexpected match: what the node looks like using the
different `type`s
- the language version and if the test ran with a windows `-target` flag
(also in failure summary)
#### ... I don't adhere to the required order of the syntax?
The script will diagnose any found issues, such as `matcher is missing
an example` with a `file:line:` prefix,
which should provide enough information about the issue.
#### ... the script diagnoses a false-positive issue with a Doxygen
comment?
It hopefully shouldn't, but if you, e.g., added some non-matcher code
and documented it with Doxygen, then the script will consider that as a
matcher documentation. As a result, the script will print that it
detected a mismatch between the actual and the expected number of
failures. If the diagnostic truly is a false-positive, change the
`expected_failure_statistics` at the top of the
`generate_ast_matcher_doc_tests.py` file.
Fixes#57607Fixes#63748
The scalar __mfp8 type has the wrong name and mangle name in
AArch64SVEACLETypes.def
According to the ACLE[1] the name should be __mfp8
This patch fixes this problem by replacing
the Name __MFloat8_t by __mfp8
and
the Mangle Name __MFloat8_t by u6__mfp8
And we revert the incorrect typedef in NeonEmitter.
[1]https://github.com/ARM-software/acle
Simplify `EmitClangDiagsIndexName` to directly sort records instead of
creating an array of `RecordIndexElement` containing record name and
sorting it.
---------
Co-authored-by: Kazu Hirata <kazu@google.com>
This allows formatting large integers in a human friendly way. Example:
"5321584" -> "5.32M".
Use it where such human numbers are generated manually today.
Heterogenous lookups allow us to call find with StringRef, avoiding a
temporary heap allocation of std::string.
This patch introduces alias:
using DiagsInGroup = std::map<std::string, GroupInfo, std::less<>>;
because the raw type is a bit mouthful.
This adds a build of the libLLVMSupport to the lit suite that is used
for generating profile data. This helps to improve both PGO and BOLT
optimization of clang over the existing hello world training program.
I considered building all of LLVM instead of just libLLVMSupport, but
there is only a marginal increase in performance for PGO only builds
when training with a build of all of LLVM, and I didn't think it was
enough to justify the increased build times given that it is the default
configuration.
The benchmark[1] I did showed that using libLLVMSupport for training
gives a 1.35 +- 0.02 speed up for clang optimized with PGO + BOLT vs
just 1.05 +- 0.01 speed up when training with hello world.
For comparison, training with a full LLVM build gave a speed up of 1.35
+- 0.1.
Raw data:
| PGO Training | BOLT Training | Speed Up | Error Range |
| ------------ | ------------- | -------- | ----------- |
| LLVM Support | LLVM Support | 1.35 | 0.02 |
| LLVM All | LLVM All | 1.34 | 0.01 |
| LLVM Support | Hello World | 1.29 | 0.02 |
| LLVM All | PGO-ONLY | 1.27 | 0.02 |
| LLVM Support | PGO-ONLY | 1.22 | 0.02 |
| Hello World | Hello World | 1.05 | 0.01 |
| Hello World | PGO-ONLY | 1.03 | 0.01 |
Time it takes to generate profile data (on a 64-core system):
| Training Data | PGO | BOLT |
| ------------- | ----- | ----- |
| LLVM All | 1090s | 3239s |
| LLVM Support | 91s | 655s |
| Hello World | 2s | 9s |
[1] Benchmark was compiling SemaDecl.cpp
This patch fixes:
clang/utils/TableGen/ClangAttrEmitter.cpp:3869:51: error: captured
structured bindings are a C++20 extension
[-Werror,-Wc++20-extensions]
`EmitClangAttrSpellingListIndex()` performs a lot of unnecessary string
comparisons which is wasteful in time and stack space. This commit
attempts to refactor this method to be more performant.
Update ClangAttrEmitter TableGen to add explicit symbol visibility
macros to attribute class declarations it creates.
Both AnnotateFunctions and Attribute example plugins require
clang::AnnotateAttr TableGen created functions to be exported from the
Clang shared library.
This depends on macros to be added in
https://github.com/llvm/llvm-project/pull/108276
This starts moving `X86Builtins.def` to be a tablegen file. It's quite
large, so I think it'd be good to move things in multiple steps to avoid
a bunch of merge conflicts due to the amount of time this takes to
complete.
- Change FlattenedSpelling to use StringRef instead of std::String.
- Use range for loops and enumerate().
- Use ArrayRef<> instead of std::vector reference as function arguments.
- Use {} for all if/else branch bodies if one of them uses it.
ARM ACLE PR#323[1] adds new modal types for 8-bit floating point
intrinsic.
From the PR#323:
```
ACLE defines the `__mfp8` type, which can be used for the E5M2 and E4M3
8-bit floating-point formats. It is a storage and interchange only type
with no arithmetic operations other than intrinsic calls.
````
The type should be an opaque type and its format in undefined in Clang.
Only defined in the backend by a status/format register, for AArch64 the
FPMR.
This patch is an attempt to the add the mfloat8_t scalar type. It has a
parser and codegen for the new scalar type.
The patch it is lowering to and 8bit unsigned as it has no format. But
maybe we should add another opaque type.
[1] https://github.com/ARM-software/acle/pull/323