Replace the switch in `getOpenMPCaptureRegions` with a loop collecting
capture regions based on the constituent directives.
---------
Co-authored-by: Alexey Bataev <a.bataev@outlook.com>
This adds a language standard mode for the latest C standard. While
WG14 is hoping for a three-year cycle, it is not clear that the next
revision of C will be in 2026 and so a flag was not created for c26
specifically.
This change supports the 128-bit operands for inline ptx asm, both input
and output.\
\
The major changes are:
- Tablegen:\
Define Int128Regs in NVPTXRegisterInfo.td. But this register does
not set as general register type in NVPTX backend so that this change
will not influence the codegen without inline asm.\
Define three NVPTX intrinsics, IMOV128rr, V2I64toI128 and
I128toV2I64. The first one moves a register, the second one moves two
64-bit registers into one 128-bit register, and the third one just does
the opposite.
- NVPTXISelLowering & NVPTXISelDAGToDAG:\
Custom lowering CopyToReg and CopyFromReg with 128-bit operands.
CopyToReg deals with the inputs of the inline asm and the CopyFromReg
deals with the outputs.\
CopyToReg is custom lowered into a V2I64toI128, which takes in the
expanded values(Lo and Hi) of the input, and moves into a 128-bit reg.\
CopyFromReg is custom lowered by adding a I128toV2I64, which breaks
down the 128-bit outputs of inline asm into the expanded values.
What is considered "executable" in clang differs slightly from the
OpenMP's "executable" category. In addition to the executable category,
subsidiary directives, and OMPD_error are considered executable.
Implement a function that performs that check.
This patch augments the HIPAMD driver to allow it to target AMDGCN
flavoured SPIR-V compilation. It's mostly straightforward, as we re-use
some of the existing SPIRV infra, however there are a few notable
additions:
- we introduce an `amdgcnspirv` offload arch, rather than relying on
using `generic` (this is already fairly overloaded) or simply using
`spirv` or `spirv64` (we'll want to use these to denote unflavoured
SPIRV, once we bring up that capability)
- initially it is won't be possible to mix-in SPIR-V and concrete AMDGPU
targets, as it would require some relatively intrusive surgery in the
HIPAMD Toolchain and the Driver to deal with two triples
(`spirv64-amd-amdhsa` and `amdgcn-amd-amdhsa`, respectively)
- in order to retain user provided compiler flags and have them
available at JIT time, we rely on embedding the command line via
`-fembed-bitcode=marker`, which the bitcode writer had previously not
implemented for SPIRV; we only allow it conditionally for AMDGCN
flavoured SPIRV, and it is handled correctly by the Translator (it ends
up as a string literal)
Once the SPIRV BE is no longer experimental we'll switch to using that
rather than the translator. There's some additional work that'll come
via a separate PR around correctly piping through AMDGCN's
implementation of `printf`, for now we merely handle its flags
correctly.
'a' is an input/output constraint for restraining assembly variables
to an indexed or indirect address operand. It previously was marked
as supported but would throw an assertion for unknown constraint type
in the back-end when this test case was compiled. This change marks it
as unsupported until we can add full support for address operands
constraining to the compiler code generation.
We have been running into source location exhaustion recently and want
to use the statistics to monitor the usage in various files to be able
to anticipate where the next problem will happen.
I picked `Statistic` because it can be written into a structured JSON
file and is easier to consume by further automation.
This commit does not change any existing per-source-manager metrics
exposed via `SourceManager::PrintStats()`. This does create some
redundancy, but I also expect to be non-controversial because it aligns
with the intended use of `Statistic`.
This commit implements the entirety of the now-accepted [N3017
-Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this are used under the "as-if" rule, and in general when the
system cannot detect it is initializing an array object in a variable
declaration, will generate EmbedExpr AST node which will be expanded by
AST consumers (CodeGen or constant expression evaluators) or expand
embed directive as a comma expression.
This reverts commit
682d461d5a.
---------
Co-authored-by: The Phantom Derpstorm <phdofthehouse@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
This is the second attempt. When parsing the target attribute
we should be letting cc1 features which don't correspond to
Extensions pass through to avoid errors like the following:
% cat neon.c
__attribute__((target("arch=armv8-a")))
uint64x2_t foo(uint64x2_t a, uint64x2_t b) { return veorq_u64(a, b); }
% clang --target=aarch64-linux-gnu -c neon.c
error: always_inline function 'veorq_u64' requires target feature
'outline-atomics', but would be inlined into function 'foo'
that is compiled without support for 'outline-atomics'
Co-authored-by: Tomas Matheson <Tomas.Matheson@arm.com>
This is a follow up to #92037, which moved the architecture information.
Generate the AArch64TargetParser CPUInfo from tablegen Processor defs using a
new tablegen emitter. Some basic error checking is added in the emitter to
ensure that duplicate features are not added to the Processor defs.
The generic CPU becomes an entry in tablegen.
Some CPU features which were present in the CPUInfo but absent from the tablegen
defs have been added to tablegen. FeatureCrypto is replaced with FeatureSHA2 and
FeatureAES. This changes a few of the tests.
In GCC, `#pragma GCC diagnostic warning "-Wfoo"` overrides command-line
`-Werror=foo` and errors that can become warnings (pedwarn with
-pedantic-errors and permerror).
```
#pragma GCC diagnostic warning "-Wnarrowing"
int x = {4.2};
#pragma GCC diagnostic warning "-Wundef"
#if FOO
#endif
// gcc -c -Werror=undef -Werror=narrowing => two warnings
```
These diagnostics are similar to our Warning/ExtWarn/Extension
diagnostics with DefaultError. This patch ports the behavior to Clang.
Fix#93474
Pull Request: https://github.com/llvm/llvm-project/pull/93647
This aligns Fuchsia targets with other similar OS targets such as
Linux. Fuchsia's libc already uses unsigned rather than the
compiler-provided __WINT_TYPE__ macro for its wint_t typedef, so
this just makes the compiler consistent with the OS's actual ABI.
The only known manifestation of the mismatch is -Wformat warnings
for %lc no matching wint_t arguments.
The closest thing I could see to existing tests for each target's
wint_t type setting was the predefine tests that check various
macros including __WINT_TYPE__ on a per-machine and/or per-OS
basis. While the setting is done per-OS in most of the target
implementations rather than actually varying by machine, the only
existing tests for __WINT_TYPE__ are in per-machine checks that
are also wholly or partly tagged as per-OS. x86_64 and riscv64
tests for respective *-linux-gnu targets now check for the same
definitions in the respective *-fuchsia targets. __WINT_TYPE__
is not among the type checked in the aarch64 tests and those lack
a section that's specifically tested for aarch64-linux-gnu; if
such is added then it can similarly be made to check for most or
all of the same value on aarch64-fuchsia as aarch64-linux-gnu.
But since the actual implementation of choosing the type is done
per-OS and not per-machine for the three machines with Fuchsia
target support, the x86 and riscv64 tests are already redundantly
testing that same code and seem sufficient.
The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.
(Background if helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)
The reported issue at upstream: https://github.com/llvm/llvm-project/issues/90501
rdar://124035402
This reverts commit 70510733af33c70ff7877eaf30d7718b9358a725.
The following code is now incorrectly rejected.
```
% cat neon.c
#include <arm_neon.h>
__attribute__((target("arch=armv8-a")))
uint64x2_t foo(uint64x2_t a, uint64x2_t b) {
return veorq_u64(a, b);
}
% newclang --target=aarch64-linux-gnu -c neon.c
neon.c:5:10: error: always_inline function 'veorq_u64' requires target feature 'outline-atomics', but would be inlined into function 'foo' that is compiled without support for 'outline-atomics'
5 | return veorq_u64(a, b);
| ^
1 error generated.
```
"+outline-atomics" seems misleading here.
My reverted attempt to decouple feature dependency expansion (see
#95056) made it evident that some features are still using the FMV
dependencies in the target attribute.
The original commit broke the llvm test suite. This was addressed here:
https://github.com/llvm/llvm-test-suite/pull/133. I am now relanding it.
This commit implements the entirety of the now-accepted [N3017 -
Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this are used under the "as-if" rule, and in general when the
system cannot detect it is initializing an array object in a variable
declaration, will generate EmbedExpr AST node which will be expanded
by AST consumers (CodeGen or constant expression evaluators) or
expand embed directive as a comma expression.
---------
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
Co-authored-by: Podchishchaeva, Mariya <mariya.podchishchaeva@intel.com>
This reverts commit 2cf14398c9341feddb419e7ff9c8c5623a3da3db since it
broke the llvm test suite:
SingleSource/UnitTests/AArch64/acle-fmv-features.c:59:9:
error: instruction requires: altnzcv
SingleSource/UnitTests/AArch64/acle-fmv-features.c:117:10:
error: instruction requires: aes
...
Looks like the FMV dependencies were used in the target attribute and
now features that are FMVOnly (have AEK_NONE) cannot be expanded in
parseTargetAttr using the ExtensionSet.
This suggests that either the tests are wrong (they are using an FMVOnly
feature in a target attribute), or that we need to turn the FMVOnly
features into Extensions (these two are tablegen classes).
The dependency expansion step which was introduced by FMV has been
erroneously used for non-FMV features, for example when parsing the
target attribute. The PR #93695 has rectified most of the tests which
were relying on dependency expansion of target features specified on the
-cc1 command line. In this patch I am decoupling the dependency
expansion of features specified on the target attribute from FMV.
To do that first I am expanding FMV dependencies before passing the list
of target features to initFeatureMap(). Similarly when parsing the
target attribute I am reconstructing an ExtensionSet from the list of
target features which was created during the command line option
parsing. The attribute parsing may toggle bits of that ExtensionSet and
at the end it is converted to a list of target features. Those are
passed to initFeatureMap(), which no longer requires an override.
A side effect of this refactoring is that features specified on the
target_version attribute now supersede the command line options, which
is what should be happening in the first place.
This change seeks to add support for vendor flavoured SPIRV - more
specifically, AMDGCN flavoured SPIRV. The aim is to generate SPIRV that
carries some extra bits of information that are only usable by AMDGCN
targets, forfeiting absolute genericity to obtain greater expressiveness
for target features:
- AMDGCN inline ASM is allowed/supported, under the assumption that the
[SPV_INTEL_inline_assembly](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_inline_assembly.asciidoc)
extension is enabled/used
- AMDGCN target specific builtins are allowed/supported, under the
assumption that e.g. the `--spirv-allow-unknown-intrinsics` option is
enabled when using the downstream translator
- the featureset matches the union of AMDGCN targets' features
- the datalayout string is overspecified to affix both the program
address space and the alloca address space, the latter under the
assumption that the
[SPV_INTEL_function_pointers](https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_function_pointers.asciidoc)
extension is enabled/used, case in which the extant SPIRV datalayout
string would lead to pointers to function pointing to the private
address space, which would be wrong.
Existing AMDGCN tests are extended to cover this new target. It is
currently dormant / will require some additional changes, but I thought
I'd rather put it up for review to get feedback as early as possible. I
will note that an alternative option is to place this under AMDGPU, but
that seems slightly less natural, since this is still SPIRV, albeit
relaxed in terms of preconditions & constrained in terms of
postconditions, and only guaranteed to be usable on AMDGCN targets (it
is still possible to obtain pristine portable SPIRV through usage of the
flavoured target, though).
Although i32 type is illegal in the backend, LA64 has pretty good
support for i32 types by using W instructions.
By adding n32 to the DataLayout string, middle end optimizations will
consider i32 to be a native type. One known effect of this is enabling
LoopStrengthReduce on loops with i32 induction variables. This can be
beneficial because C/C++ code often has loops with i32 induction
variables due to the use of `int` or `unsigned int`.
If this patch exposes performance issues, those are better addressed by
tuning LSR or other passes.
Adds AArch64 support for the `preserve_none` calling convention.
Registers X0-X7, X9-X15 and X19-X28 are caller save, and can be used to
pass arguments. Delegates to AAPCS for all other registers.
Closes#87423
Adds a builtin and intrinsic for the f16x8.splat instruction.
Specified at:
29a9b9462c/proposals/half-precision/Overview.md
Note: the current spec has f16x8.splat as opcode 0x123, but this is
incorrect and will be changed to 0x120 soon.
The `FileManager` might create a virtual directory that can be used
later as a search path. This is the case when we use remapping, as
demonstrated in the suggested LIT test.
We might encounter a 'false cache miss' and add the same directory
twice into `FileManager::SeenDirEntries` if the added record is a real
directory that is present on a disk:
- Once as a virtual directory
- And once as a real one
This isn't a problem if the added directories have the same name, as in
this case, we will get a cache hit. However, it could lead to
compilation errors if the directory names are different but point to the
same folder. For example, one might use an absolute name and another a
relative one. For instance, the **implicit-module-remap.cpp** LIT test
will fail with the following message:
```
/.../implicit-module-remap.cpp.tmp/test.cpp:1:2: fatal error: module 'a'
was built in directory '/.../implicit-module-remap.cpp.tmp' but now
resides in directory '.'
1 | #include "a.h"
| ^
1 error generated.
```
The suggested fix checks if the added virtual directory is present on
the disk and handles it as a real one if that is the case.
Commit: d59bc6b5c75384aa0b1e78cc85e17e8acaccebaf
Clang/MIPS: Add +fp64 if MSA and no explicit -mfp option (#91949)
added +fp64 for `clang`, while not for `clang -cc1`. So
clang -cc1 -triple=mips -target-feature +msa -S
will emit an asm source file without ".module fp=64".
This patch adds several const-qualified variants of existing member
functions to `SourceManager`.
I started with removing const qualification from
`setNumCreatedFIDsForFileID`, and removing `const_cast` in the body of
this function, as I think it doesn't make sense to const-qualify
setters.