This patch is part of the upstreaming effort for supporting SYCL
language front end.
It makes the following changes:
1. Adds sycl_external attribute for functions with external linkage,
which is intended for use to implement the SYCL_EXTERNAL macro as
specified by the SYCL 2020 specification
2. Adds checks to avoid emitting device code when sycl_external and
sycl_kernel_entry_point attributes are not enabled
3. Fixes test failures caused by the above changes
This patch is missing diagnostics for the following diagnostics listed
in the SYCL 2020 specification's section 5.10.1, which will be addressed
in a subsequent PR:
Functions that are declared using SYCL_EXTERNAL have the following
additional restrictions beyond those imposed on other device functions:
1. If the SYCL backend does not support the generic address space then
the function cannot use raw pointers as parameter or return types.
Explicit pointer classes must be used instead;
2. The function cannot call group::parallel_for_work_item;
3. The function cannot be called from a parallel_for_work_group scope.
In addition to that, the subsequent PR will also implement diagnostics
for inline functions including virtual functions defined as inline.
---------
Co-authored-by: Mariya Podchishchaeva <mariya.podchishchaeva@intel.com>
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
The vk::binding attribute allows users to explicitly set the set and
binding for a resource in SPIR-V without chaning the "register"
attribute, which will be used when targeting DXIL.
Fixes https://github.com/llvm/llvm-project/issues/136894
Clang's current implementation only works on array types, but GCC (which
is where we got this attribute) supports it on pointers as well as
arrays.
Fixes#150951
This patch fixes incorrect behavior in Clang where [[noreturn]] (either
spelled or inferred) was being inherited by explicit specializations of
function templates or member function templates, even when those
specializations returned normally.
Follow up on https://github.com/llvm/llvm-project/pull/145166
Sema currently has checkTargetVersionAttr and
checkTargetClonesAttrString to diagnose the said attributes. However the
code tries to handle all of AArch64, RISC-V and X86 targets at once
which is hard to maintain, therefore I am splitting these functions.
Unfortunately I could not use polymorphism because all of Sema, SemaARM,
SemaRISCV and SemaX86 inherit from SemaBase. The Sema instance itself
contains instances of every other target specific Sema.
Preserve the argument-clause for `warn-unused-result` when under clang::
scope.
We are not touching gnu:: scope for now as it's an error for GCC to have
that string. Personally I think it would be ok to relax it here too as
we are not introducing breakage to currently passing code, but feedback
is to go slowly about it.
Noticed these while doing a review on changes in the area, but C23
added support for nodiscard with a message, so it's not an extension
we need to diagnose.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This commit is using the same mechanism as vk::ext_builtin_input to
implement the SV_Position semantic input.
The HLSL signature is not yet ready for DXIL, hence this commit only
implements the SPIR-V side.
This is incomplete as it doesn't allow the semantic on hull/domain and
other shaders, but it's a first step to validate the overall
input/output
semantic logic.
Fixes https://github.com/llvm/llvm-project/issues/136969
* Translate the following versions to 26.
* watchOS 12 -> 26
* visionOS 3 -> 26
* macos 16 -> 26
* iOS 19 -> 26
* tvOS 19 -> 26
* Emit diagnostics, but allow conversion when clients attempt to use
invalid gaps in OS versioning in availability.
* For target-triples, only allow "valid" versions for implicit
conversions.
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
This introduces the attribute discussed in
https://discourse.llvm.org/t/rfc-function-type-attribute-to-prevent-cfi-instrumentation/85458.
The proposed name has been changed from `no_cfi` to
`cfi_unchecked_callee` to help differentiate from `no_sanitize("cfi")`
more easily. The proposed attribute has the following semantics:
1. Indirect calls to a function type with this attribute will not be
instrumented with CFI. That is, the indirect call will not be checked.
Note that this only changes the behavior for indirect calls on pointers
to function types having this attribute. It does not prevent all
indirect function calls for a given type from being checked.
2. All direct references to a function whose type has this attribute
will always reference the true function definition rather than an entry
in the CFI jump table.
3. When a pointer to a function with this attribute is implicitly cast
to a pointer to a function without this attribute, the compiler will
give a warning saying this attribute is discarded. This warning can be
silenced with an explicit C-style cast or C++ static_cast.
This variable attribute is used in HLSL to add Vulkan specific builtins
in a shader.
The attribute is documented here:
17727e88fd/proposals/0011-inline-spirv.md
Those variable, even if marked as `static` are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a specific
address space `hlsl_input`, also added by this commit.
The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md
Co-authored-by: Justin Bogner <mail@justinbogner.com>
With the current behavior the following example yields a linker error:
"multiple definition of `foo.default'"
// Translation Unit 1
__attribute__((target_clones("dotprod, sve"))) int foo(void) { return 1; }
// Translation Unit 2
int foo(void) { return 0; }
__attribute__((target_version("dotprod"))) int foo(void);
__attribute__((target_version("sve"))) int foo(void);
int bar(void) { return foo(); }
That is because foo.default is generated twice. As a user I don't find
this particularly intuitive. If I wanted the default to be generated in
TU1 I'd rather write target_clones("dotprod, sve", "default")
explicitly.
When changing the code I noticed that the RISC-V target defers the
resolver emission when encountering a target_version definition. This
seems accidental since it only makes sense for AArch64, where we only
emit a resolver once we've processed the entire TU, and only if the
default version is present. I've changed this so that RISC-V immediately
emmits the resolver. I adjusted the codegen tests since the functions
now appear in a different order.
Implements https://github.com/ARM-software/acle/pull/377
Microsoft helpfully defines `THIS` to `void` in two different platform
SDK headers, at least one of which is reachable via <Windows.h>. We have
a user who ran into a build because of `THIS` unfortunate macro name
collision.
Rename the members to better match our naming conventions.
Fixes#142186
The patch introduce __builtin_spirv_generic_cast_to_ptr_explicit which
is lowered to the llvm.spv.generic.cast.to.ptr.explicit intrinsic.
The SPIR-V builtins are now split into 3 differents file:
BuiltinsSPIRVCore.td,
BuiltinsSPIRVVK.td for Vulkan specific builtins, BuiltinsSPIRVCL.td for
OpenCL specific builtins
and BuiltinsSPIRVCommon.td for common ones.
The patch also introduces a new header defining its SPIR-V friendly
equivalent (__spirv_GenericCastToPtrExplicit_ToGlobal,
__spirv_GenericCastToPtrExplicit_ToLocal and
__spirv_GenericCastToPtrExplicit_ToPrivate). The functions are declared
as aliases to the new builtin allowing C-like languages to have a
definition to rely on as well as gaining proper front-end diagnostics.
The motivation for the header is to provide a stable binding for
applications or library (such as SYCL) and allows non SPIR-V targets to
provide an implementation (via libclc or similar to how it is done for
gpuintrin.h).
This adds support under LoongArch for the target("..") attributes.
The supported formats are:
- "arch=<arch>" strings, that specify the architecture features for a
function as per the -march=arch option.
- "tune=<cpu>" strings, that specify the tune-cpu cpu for a function as
per -mtune.
- "<feature>", "no-<feature>" enabled/disables the specific feature.
Introduce the `reentrant_capability` attribute, which may be specified
alongside the `capability(..)` attribute to denote that the defined
capability type is reentrant. Marking a capability as reentrant means
that acquiring the same capability multiple times is safe, and does not
produce warnings on attempted re-acquisition.
The most significant changes required are plumbing to propagate if the
attribute is present to a CapabilityExpr, and introducing
ReentrancyDepth to the LockableFactEntry class.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This patch enhances Clang's diagnosis for unknown attributes by
providing typo correction suggestions for known attributes.
```cpp
[[gmu::deprected]] // expected-warning {{unknown attribute 'gmu::deprected' ignored; did you mean 'gnu::deprecated'?}}
int f1(void) {
return 0;
}
[[deprected]] // expected-warning {{unknown attribute 'deprected' ignored; did you mean 'deprecated'?}}
int f2(void) {
return 0;
}
```
Instead of reporting the location of the attribute, let's report the
location of the function reference that's passed to the cleanup
attribute as the first argument. This is required as the attribute might
be coming from a macro which means clang-include-cleaner skips the use
as it gets attributed to the header file declaringt the macro and not to
the main file.
To make this work, we have to add a fake argument to the CleanupAttr
constructor so we can pass in the original Expr alongside the function
declaration.
Fixes#140212
- Defines a new declaration node `HLSLRootSignature` in `DeclNodes.td`
that will consist of a `TrailingObjects` of the in-memory construction
of the root signature, namely an array of `hlsl::rootsig::RootElement`s
- Defines a new clang attr `RootSignature` which simply holds an
identifier to a corresponding root signature declaration as above
- Integrate the `HLSLRootSignatureParser` to construct the decl node in
`ParseMicrosoftAttributes` and then attach the parsed attr with an
identifier to the entry point function declaration.
- Defines the various required declaration methods
- Add testing that the declaration and reference attr are created
correctly, and some syntactical error tests.
It was previously proposed that we could have the root elements
reference be stored directly as an additional member of the attribute
and to not have a separate root signature decl. In contrast, by defining
them separately as this change proposes, we will allow a unique root
signature to have its own declaration in the AST tree. This allows us to
only construct a single root signature for all duplicate root signature
attributes. Having it located directly as a declaration might also prove
advantageous when we consider root signature libraries.
Resolves https://github.com/llvm/llvm-project/issues/119011
This patch enhances Clang's diagnosis of an unknown attribute by
printing the attribute's namespace in the diagnostic text. e.g.,
```cpp
[[foo::nodiscard]] int f(); // warning: unknown attribute 'foo::nodiscard' ignored
```
This patch adds templated `operator<<` for diagnostics that pass scoped
enums, saving people from `llvm::to_underlying()` clutter on the side of
emitting the diagnostic. This eliminates 80 out of 220 usages of
`llvm::to_underlying()` in Clang.
I also backported `std::is_scoped_enum_v` from C++23.
This introduces three things related to intialization like:
char buf[3] = "foo";
where the array does not declare enough space for the null terminator
but otherwise can represent the array contents exactly.
1) An attribute named 'nonstring' which can be used to mark that a
field or variable is not intended to hold string data.
2) -Wunterminated-string-initialization, which is grouped under
-Wextra, and diagnoses the above construct unless the declaration
uses the 'nonstring' attribute.
3) -Wc++-unterminated-string-initialization, which is grouped under
-Wc++-compat, and diagnoses the above construct even if the
declaration uses the 'nonstring' attribute.
Fixes#137705
This introduces a new diagnostic group (-Wimplicit-void-ptr-cast),
grouped under -Wc++-compat, which diagnoses implicit conversions from
void * to another pointer type in C. It's a common source of
incompatibility with C++ and is something GCC diagnoses (though GCC does
not have a specific warning group for this).
Fixes#17792
FPSCR and FPEXC will be stored in FPStatusRegs, after GPRCS2 has been
saved.
- GPRCS1
- GPRCS2
- FPStatusRegs (new)
- DPRCS
- GPRCS3
- DPRCS2
FPSCR is present on all targets with a VFP, but the FPEXC register is
not present on Cortex-M devices, so different amounts of bytes are
being pushed onto the stack depending on our target, which would
affect alignment for subsequent saves.
DPRCS1 will sum up all previous bytes that were saved, and will emit
extra instructions to ensure that its alignment is correct. My
assumption is that if DPRCS1 is able to correct its alignment to be
correct, then all subsequent saves will also have correct alignment.
Avoid annotating the saving of FPSCR and FPEXC for functions marked
with the interrupt_save_fp attribute, even though this is done as part
of frame setup. Since these are status registers, there really is no
viable way of annotating this. Since these aren't GPRs or DPRs, they
can't be used with .save or .vsave directives. Instead, just record
that the intermediate registers r4 and r5 are saved to the stack
again.
Co-authored-by: Jake Vossen <jake@vossen.dev>
Co-authored-by: Alan Phipps <a-phipps@ti.com>
The MSVC STL includes specializations of `_Is_memfunptr` for every
function pointer type, including every calling convention.
The problem is the AMDGPU target doesn't support the x86 `vectorcall`
calling convention so clang sets it to the default CC. This ends up
clashing with the already-existing overload for the default CC, so we
get a duplicate definition error when including `type_traits` (which we
heavily use in the SYCL STL) and compiling for AMDGPU on Windows.
This doesn't happen for pure AMDGPU non-SYCL because it doesn't include
the C++ STL, and it doesn't happen for CUDA/HIP because a similar
workaround was done
[here](fa49c3a888).
I am not an expert in Sema, so I did a kinda of hardcoded fix, please
let me know if there is a better way to fix this.
As far as I can tell we can't do exactly the same fix that was done for
CUDA because we can't differentiate between device and host code so
easily.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>