This option is confusingly named. What it actually controls is whether,
under the default of `-ffloat16-excess-precision=standard`, it is
beneficial for performance to perform calculations on float (without
intermediate rounding) or not. For `-ffloat16-excess-precision=none` the
LLVM `half` type will always be used, and all backends are expected to
legalize it correctly.
Mips requires fp128 args/returns to be passed differently than i128. It
handles this by inspecting the pre-legalization type. However, for soft
float libcalls, the original type is currently not provided (it will
look like a i128 call). To work around that, MIPS maintains a list of
libcalls working on fp128.
This patch removes that list by providing the original, pre-softening
type to calling convention lowering. This is done by carrying additional
information in CallLoweringInfo, as we unfortunately do need both types
(we want the un-softened type for OrigTy, but we need the softened type
for the actual register assignment etc.)
This is in preparation for completely removing all the custom
pre-analysis code in the Mips backend and replacing it with use of
OrigTy.
The vector combiner will process all instructions as it first loops
through the function, adding any newly added and deleted instructions to
a worklist which is then processed when all nodes are done. These leaves
extra uses in the graph as the initial processing is performed, leading
to sub-optimal decisions being made for other combines. This changes it
so that trivially dead instructions are removed immediately. The main
changes that this requires is to make sure iterator invalidation does not
occur.
According to the suggestions in
https://github.com/llvm/llvm-project/pull/150816, this commit refine the
conditions for emitting R_LARCH_ALIGN relocations.
Some existing tests are updated to avoid being affected by this
optimization. New tests are added to verify: removal of redundant ALIGN
relocations, ALIGN emitted after the first linker-relaxable instruction,
and conservatively emitted ALIGN in lower-numbered subsections.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Consider the following code:
```cpp
# 1 __FILE__ 1 3
export module a;
```
According to the wording in
[P1857R3](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1857r3.html):
```
A module directive may only appear as the first preprocessing tokens in a file (excluding the global module fragment.)
```
and the wording in
[[cpp.pre]](https://eel.is/c++draft/cpp.pre#nt:module-file)
```
module-file:
pp-global-module-fragment[opt] pp-module group[opt] pp-private-module-fragment[opt]
```
`#` is the first pp-token in the translation unit, and it was rejected
by clang, but they really should be exempted from this rule. The goal is
to not allow any preprocessor conditionals or most state changes, but
these don't fit that.
State change would mean most semantically observable preprocessor state,
particularly anything that is order dependent. Global flags like being a
system header/module shouldn't matter.
We should exempt a brunch of directives, even though it violates the
current standard wording.
In this patch, we introduce a `TrivialDirectiveTracer` to trace the
**State change** that described above and propose to exempt the
following kind of directive: `#line`, GNU line marker, `#ident`,
`#pragma comment`, `#pragma mark`, `#pragma detect_mismatch`, `#pragma
clang __debug`, `#pragma message`, `#pragma GCC warning`, `#pragma GCC
error`, `#pragma gcc diagnostic`, `#pragma OPENCL EXTENSION`, `#pragma
warning`, `#pragma execution_character_set`, `#pragma clang
assume_nonnull` and builtin macro expansion.
Fixes https://github.com/llvm/llvm-project/issues/145274
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Per [range.access.general]/1, these CPOs are also provided in
`<iterator>`. Currently only some of them are provided via transitive
inclusion when only `<iterator>` is included.
Drive-by: Add an entry for `ranges::reserve_hint` in the general test
file for CPOs.
`NO_FIXED_SEGMENTS_SENTINEL` has a value that is actually a valid field
encoding and so it cannot be used as a sentinel.
Replace the sentinel with a new member variable, `VariableFC`, that
contains the value previously stored in `FilterChooserMap` with
`NO_FIXED_SEGMENTS_SENTINEL` key.
pli.h and pli.w both accept signed immediates, so pli.b should too. But
unlike those instructions, pli.b doesn't do any extension so its ok to
accept an unsigned immediate as well.
Specifically in the context of the once-stored transformation, GlobalOpt
would strip
all pointer casts unconditionally, even though addrspacecasts might be
runtime operations.
This manifested particularly on CHERI targets.
This patch was inspired by an existing change in CHERI LLVM
(91afa60f17),
but has been reimplemented with updated conventions, and a testcase
constructed from scratch.
Allow assignment of float to DocType and support output of float in
writeToBlob method.
Expand tests coverage to various missing basic I/O operations.
Co-authored-by: Xavi Zhang <Xavi.Zhang@amd.com>
The name is misleading, as setting Fragment to nullptr does not
necessarily make it undefined - common and equated symbols have
a nullptr fragment as well.
This patch introduces libomptarget support for the ATTACH map-type,
which can be used to implement OpenMP conditional compliant pointer
attachment, based on whether the pointer/pointee is newly mapped on a
given construct.
For example, for the following:
```c
int *p;
#pragma omp target enter data map(p[1:10])
```
The following maps can be emitted by clang:
```
(A)
&p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
&p, &p[1], sizeof(p), ATTACH
```
Without this map-type, these two possible maps could be emitted by
clang:
```
(B)
&p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
(C)
&p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ
````
(B) does not perform any pointer attachment, while (C) also maps the
pointer p, which are both incorrect.
In terms of implementation, maps with the ATTACH map-type are handled
after all other maps have been processed, as it requires knowledge of
which new allocations happened as part of the construct. As per OpenMP
5.0, an attachment should happen only when either the pointer or the
pointee was newly mapped while handling the construct.
Maps with ATTACH map-type-bit do not increase/decrease the ref-count.
With OpenMP 6.1, `attach(always/never)` can be used to force/prevent
attachment. For `attach(always)`, the compiler will insert the ALWAYS
map-type, which would let libomptarget bypass the check about one of the
pointer/pointee being new. With `attach(never)`, the ATTACH map will not
be emitted at all.
The size argument of the ATTACH map-type can specify values greater than
`sizeof(void*)` which can be used to support pointer attachment on
Fortran descriptors. Note that this also requires shadow-pointer
tracking to also support them. That has not been implemented in this
patch.
This was worked upon in coordination with Ravi Narayanaswamy, who has
since retired. Happy retirement, Ravi!
---------
Co-authored-by: Alex Duran <alejandro.duran@intel.com>
This helps better distinguish warnings that could be disabled via
`.clang-tidy` config (like `clang-diagnostic-literal-conversion`) from
errors that could not be suppressed at all (like
`clang-diagnostic-error`) because it's a hard compiler error.
Retry landing https://github.com/llvm/llvm-project/pull/153373
## Major changes from previous attempt
- remove the test in CAPI because no existing tests in CAPI deal with
sanitizer exemptions
- update `mlir/docs/Dialects/GPU.md` to reflect the new behavior: load
GPU binary in global ctors, instead of loading them at call site.
- skip the test on Aarch64 since we have an issue with initialization there
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Most of the time we don't need instruction opcode. There is no need to
carry it around all the time, we can easily get it by other means.
Rename affected variables accordingly.
Part of an effort to simplify DecoderEmitter code.
This patch makes the current behavior explicit to prepare for adding VTs
for v[567]f16.
Right now these types are EVTs and hence don't fall under
getPreferredVectorAction and are simply widened to the next legal
power-of-two vector type. For SSE2 this is v8f16.
Without the preparatory patch however, the behavior would change after
adding these types. getPreferredVectorAction would try to split them
because this is the current behavior for any f16 vector type that is not
legal.
There is a lot more detail at
https://github.com/llvm/llvm-project/issues/152150 in particular how
splitting these new types leads to an inconsistency between
NumRegistersForVT and getTypeAction.
The patch ensures that after the new types are added they would continue
to be widened rather than split. Once the patch to enable v[567]f16
lands, it will be an NFC for x86.
This patch adds some hdrgen yaml for ioctl(). Otherwise the function
never actually ends up being available in a full build. This is the last
thing that is needed to enable turning on LIBCXX_ENABLE_RANDOM_DEVICE.
Simple fix for this particular html tag. A more complete solution should
be implemented.
1. Add all html tags to table so they are recognized. Some input on what
is desirable/safe would be appreciated
2. Change the lex strategy to deal with this in a different manner
Fixes#32680
---------
Co-authored-by: Brock Denson <brock.denson@virscient.com>
```cpp
struct Base {
int m;
};
template <class T>
struct Derived : Base {
Derived() { m = 0; }
};
```
would previously generate the following output:
```
<source>:7:15: warning: 'm' should be initialized in a member initializer of the constructor [cppcoreguidelines-prefer-member-initializer]
7 | Derived() { m = 0; }
| ^~~~~~
| : m(0)
```
This patch fixes this false positive.
Note that before this patch the checker won't give false positive for
```cpp
struct Derived : Base {
Derived() { m = 0; }
};
```
and the constructor's AST is
```
`-CXXConstructorDecl 0x557df03d1fb0 <line:7:3, col:22> col:3 Derived 'void ()' implicit-inline
|-CXXCtorInitializer 'Base'
| `-CXXConstructExpr 0x557df03d2748 <col:3> 'Base' 'void () noexcept'
`-CompoundStmt 0x557df03d2898 <col:13, col:22>
`-BinaryOperator 0x557df03d2878 <col:15, col:19> 'int' lvalue '='
|-MemberExpr 0x557df03d2828 <col:15> 'int' lvalue ->m 0x557df03d1c40
| `-ImplicitCastExpr 0x557df03d2808 <col:15> 'Base *' <UncheckedDerivedToBase (Base)>
| `-CXXThisExpr 0x557df03d27f8 <col:15> 'Derived *' implicit this
`-IntegerLiteral 0x557df03d2858 <col:19> 'int' 0
```
so `isAssignmentToMemberOf` would return empty due to
f0967fca04/clang-tools-extra/clang-tidy/cppcoreguidelines/PreferMemberInitializerCheck.cpp (L118-L119)Fixes#104400
This commit is a re-do of e4a8969e56572371201863594b3a549de2e23f32,
which got reverted, with the same goal: dramatically speed-up clang-tidy
by avoiding doing work in system headers (which is wasteful as warnings
are later discarded). This proposal was already discussed here with
favorable feedback: https://github.com/llvm/llvm-project/pull/132725
The novelty of this patch is:
- It's less aggressive: it does not fiddle with AST traversal. This
solves the issue with the previous patch, which impacted the ability to
inspect parents of a given node.
- Instead, what we optimize for is exitting early in each `Traverse*`
function of `MatchASTVisitor` if the node is in a system header, thus
avoiding calling the `match()` function with its corresponding callback
(when there is a match).
- It does not cause any failing tests.
- It does not move `MatchFinderOptions` - instead we add a user-defined
default constructor which solves the same problem.
- It introduces a function `shouldSkipNode` which can be extended for
adding more conditions. For example there's a PR open about skipping
modules in clang-tidy where this could come handy:
https://github.com/llvm/llvm-project/pull/145630
As a benchmark, I ran clang-tidy with all checks activated, on a single
.cpp file which #includes all the standard C++ headers, then measure the
time as well as found warnings.
On trunk:
```
Suppressed 75413 warnings (75413 in non-user code).
real 0m12.418s
user 0m12.270s
sys 0m0.129s
```
With this patch:
```
Suppressed 11448 warnings (11448 in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
real 0m1.666s
user 0m1.538s
sys 0m0.129s
```
With the original patch that got reverted:
```
Suppressed 11428 warnings (11428 in non-user code).
real 0m1.193s
user 0m1.096s
sys 0m0.096s
```
We therefore get a dramatic reduction in number of warnings and runtime,
with no change in functionality.
The remaining warnings are due to `PPCallbacks` - implementing a similar
system-header exclusion mechanism there can lead to almost no warnings
left in system headers. This does not bring the runtime down as much,
though, so it's probably not worth the effort.
Fixes#52959
Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
* Print filter stack in non-reversed order.
* Print encoding name to the right of encoding bits to deal with
alignment issues.
* Use the correct bit width when printing encoding bits.
Example of old output:
```
01000100........
01000...........
0100............
................
tADDhirr 000000000000000001000100________
tADDrSP 000000000000000001000100_1101___
tADDspr 0000000000000000010001001____101
```
New output:
```
................
0100............
01000...........
01000100........
01000100________ tADDhirr
01000100_1101___ tADDrSP
010001001____101 tADDspr
```