Calling `DeclMustBeEmitted` should not lead to more deserialization, as
it may occur before previous deserialization has finished.
When passed a `VarDecl` with an initializer however, `DeclMustBeEmitted`
needs to know whether that initializer contains side effects. When the
`VarDecl` is deserialized but the initializer is not, this triggers
deserialization of the initializer. To avoid this we add a bit to the
serialization format for `VarDecl`s, indicating whether its initializer
contains side effects or not, so that the `ASTReader` can query this
information directly without deserializing the initializer.
rdar://153085264
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
- Defines a new declaration node `HLSLRootSignature` in `DeclNodes.td`
that will consist of a `TrailingObjects` of the in-memory construction
of the root signature, namely an array of `hlsl::rootsig::RootElement`s
- Defines a new clang attr `RootSignature` which simply holds an
identifier to a corresponding root signature declaration as above
- Integrate the `HLSLRootSignatureParser` to construct the decl node in
`ParseMicrosoftAttributes` and then attach the parsed attr with an
identifier to the entry point function declaration.
- Defines the various required declaration methods
- Add testing that the declaration and reference attr are created
correctly, and some syntactical error tests.
It was previously proposed that we could have the root elements
reference be stored directly as an additional member of the attribute
and to not have a separate root signature decl. In contrast, by defining
them separately as this change proposes, we will allow a unique root
signature to have its own declaration in the AST tree. This allows us to
only construct a single root signature for all duplicate root signature
attributes. Having it located directly as a declaration might also prove
advantageous when we consider root signature libraries.
Resolves https://github.com/llvm/llvm-project/issues/119011
This adds
- The parsing of `trivially_relocatable_if_eligible`,
`replaceable_if_eligible` keywords
- `__builtin_trivially_relocate`, implemented in terms of memmove. In
the future this should
- Add the appropriate start/end lifetime markers that llvm does not have
(`start_lifetime_as`)
- Add support for ptrauth when that's upstreamed
- the `__builtin_is_cpp_trivially_relocatable` and
`__builtin_is_replaceable` traits
Fixes#127609
As reported in issue #103477, visibility of instantiated member
functions used to be ignored when calculating visibility of a
specialization.
This patch modifies `getLVForClassMember` to look up for a source
template for an instantiated member, and changes `mergeTemplateLV` to
apply it.
A similar issue was reported in #31462, but it seems that `extern`
declaration with visibility prevents the function from being emitted
as hidden. This behavior seems correct, even though GCC emits it as
with default visibility instead.
Both tests from #103477 and #31462 are added as LIT tests `test72` and
`test73` respectively.
Avoid using PreservesMost in the testcase as it is not supported on all
targets.
Original PR #118290.
Co-authored-by: Robert Dazi <14996868+v01dXYZ@users.noreply.github.com>
Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>
As reported in issue #103477, visibility of instantiated member
functions used to be ignored when calculating visibility of a
specialization.
This patch modifies `getLVForClassMember` to look up for a source
template for an instantiated member, and changes `mergeTemplateLV` to
apply it.
A similar issue was reported in #31462, but it seems that `extern`
declaration with visibility prevents the function from being emitted as
hidden. This behavior seems correct, even though GCC emits it as with
default visibility instead.
Both tests from #103477 and #31462 are added as LIT tests `test72` and
`test73` respectively.
This is a basic implementation of P2719: "Type-aware allocation and
deallocation functions" described at http://wg21.link/P2719
The proposal includes some more details but the basic change in
functionality is the addition of support for an additional implicit
parameter in operators `new` and `delete` to act as a type tag. Tag is
of type `std::type_identity<T>` where T is the concrete type being
allocated. So for example, a custom type specific allocator for `int`
say can be provided by the declaration of
void *operator new(std::type_identity<int>, size_t, std::align_val_t);
void operator delete(std::type_identity<int>, void*, size_t, std::align_val_t);
However this becomes more powerful by specifying templated declarations,
for example
template <typename T> void *operator new(std::type_identity<T>, size_t, std::align_val_t);
template <typename T> void operator delete(std::type_identity<T>, void*, size_t, std::align_val_t););
Where the operators being resolved will be the concrete type being
operated over (NB. A completely unconstrained global definition as above
is not recommended as it triggers many problems similar to a general
override of the global operators).
These type aware operators can be declared as either free functions or
in class, and can be specified with or without the other implicit
parameters, with overload resolution performed according to the existing
standard parameter prioritisation, only with type parameterised
operators having higher precedence than non-type aware operators. The
only exception is destroying_delete which for reasons discussed in the
paper we do not support type-aware variants by default.
This feature is currently not supported in the compiler.
To facilitate this we emit a stub version of each kernel
function body with different name mangling scheme, and
replaces the respective kernel call-sites appropriately.
Fixes https://github.com/llvm/llvm-project/issues/60313
D120566 was an earlier attempt made to upstream a solution
for this issue.
---------
Co-authored-by: anikelal <anikelal@amd.com>
This introduces a new class 'UnsignedOrNone', which models a lite
version of `std::optional<unsigned>`, but has the same size as
'unsigned'.
This replaces most uses of `std::optional<unsigned>`, and similar
schemes utilizing 'int' and '-1' as sentinel.
Besides the smaller size advantage, this is simpler to serialize, as its
internal representation is a single unsigned int as well.
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.
Fixes#123801
This is a second attempt to implement this feature. The first attempt
had to be reverted because of memory leaks. The problem was adding a
`SmallVector` member on `HLSLBufferDecl` node to represent a list of
default buffer declarations. When this vector needed to grow, it
allocated memory that was never released, because all memory used by AST
nodes must be allocated by `ASTContext` allocator and is released all at
once. Destructors on AST nodes are never called.
It this change the list of default buffer declarations is collected in a
`SmallVector` instance on `SemaHLSL`. The `HLSLBufDecl` representing
`$Globals` is created at the end of the translation unit when the number
of declarations is known, and the list is copied into an array allocated
by the `ASTContext` allocator.
All variable declarations in the global scope that are not resources,
static or empty are implicitly added to implicit constant buffer
`$Globals`. They are created in `hlsl_constant` address space and
collected in an implicit `HLSLBufferDecl` node that is added to the AST
at the end of the translation unit. Codegen is the same as for explicit
constant buffers.
Fixes#123801
Translates `cbuffer` declaration blocks to `target("dx.CBuffer")` type. Creates global variables in `hlsl_constant` address space for all `cbuffer` constant and adds metadata describing which global constant belongs to which constant buffer. For explicit constant buffer layout information an explicit layout type `target("dx.Layout")` is used. This might change in the future.
The constant globals are temporary and will be removed in upcoming pass that will translate `load` instructions in the `hlsl_constant` address space to constant buffer load intrinsics calls off a CBV handle (#124630, #112992).
See [Constant buffer design
doc](https://github.com/llvm/wg-hlsl/pull/94) for more details.
Fixes#113514, #106596
This is an implementation of P1061 Structure Bindings Introduce a Pack
without the ability to use packs outside of templates. There is a couple
of ways the AST could have been sliced so let me know what you think.
The only part of this change that I am unsure of is the
serialization/deserialization stuff. I followed the implementation of
other Exprs, but I do not really know how it is tested. Thank you for
your time considering this.
---------
Co-authored-by: Yanzuo Liu <zwuis@outlook.com>
This fixes instantiation of definition for friend function templates,
when the declaration found and the one containing the definition have
different template contexts.
In these cases, the the function declaration corresponding to the
definition is not available; it may not even be instantiated at all.
So this patch adds a bit which tracks which function template
declaration was instantiated from the member template. It's used to find
which primary template serves as a context for the purpose of
obtainining the template arguments needed to instantiate the definition.
Fixes#55509
Relanding patch, with no changes, after it was reverted due to revert of
commit this patch depended on.
Currently, these generate incorrect code, as streaming/SME attributes
are not propagated to the outlined function. As we've yet to work on
mixing OpenMP and streaming functions (and determine how they should
interact with OpenMP's runtime), we think it is best to disallow this
for now.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect TemplateOrSpecialization to be nonnull.
We record whether an expression is immediate escalating in the
FunctionScope.
However, that only happen when parsing or transforming an expression.
This might not happen when transforming a non dependent expression.
This patch fixes that by considering a function immediate when
instantiated from an immediate function.
Fixes#123405
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
This patch migrates uses of PointerUnion::dyn_cast to
dyn_cast_if_present (see the definition of PointerUnion::dyn_cast).
Note that we cannot use dyn_cast in any of the migrations in this
patch; placing
assert(!X.isNull());
just before any of dyn_cast_if_present in this patch triggers some
failure in check-clang.
A SYCL kernel entry point function is a non-member function or a static
member function declared with the `sycl_kernel_entry_point` attribute.
Such functions define a pattern for an offload kernel entry point
function to be generated to enable execution of a SYCL kernel on a
device. A SYCL library implementation orchestrates the invocation of
these functions with corresponding SYCL kernel arguments in response to
calls to SYCL kernel invocation functions specified by the SYCL 2020
specification.
The offload kernel entry point function (sometimes referred to as the
SYCL kernel caller function) is generated from the SYCL kernel entry
point function by a transformation of the function parameters followed
by a transformation of the function body to replace references to the
original parameters with references to the transformed ones. Exactly how
parameters are transformed will be explained in a future change that
implements non-trivial transformations. For now, it suffices to state
that a given parameter of the SYCL kernel entry point function may be
transformed to multiple parameters of the offload kernel entry point as
needed to satisfy offload kernel argument passing requirements.
Parameters that are decomposed in this way are reconstituted as local
variables in the body of the generated offload kernel entry point
function.
For example, given the following SYCL kernel entry point function
definition:
```
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void sycl_kernel_entry_point(KernelType kernel) {
kernel();
}
```
and the following call:
```
struct Kernel {
int dm1;
int dm2;
void operator()() const;
};
Kernel k;
sycl_kernel_entry_point<class kernel_name>(k);
```
the corresponding offload kernel entry point function that is generated
might look as follows (assuming `Kernel` is a type that requires
decomposition):
```
void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) {
Kernel kernel{dm1, dm2};
kernel();
}
```
Other details of the generated offload kernel entry point function, such
as its name and calling convention, are implementation details that need
not be reflected in the AST and may differ across target devices. For
that reason, only the transformation described above is represented in
the AST; other details will be filled in during code generation.
These transformations are represented using new AST nodes introduced
with this change. `OutlinedFunctionDecl` holds a sequence of
`ImplicitParamDecl` nodes and a sequence of statement nodes that
correspond to the transformed parameters and function body.
`SYCLKernelCallStmt` wraps the original function body and associates it
with an `OutlinedFunctionDecl` instance. For the example above, the AST
generated for the `sycl_kernel_entry_point<kernel_name>` specialization
would look as follows:
```
FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)'
TemplateArgument type 'kernel_name'
TemplateArgument type 'Kernel'
ParmVarDecl kernel 'Kernel'
SYCLKernelCallStmt
CompoundStmt
<original statements>
OutlinedFunctionDecl
ImplicitParamDecl 'dm1' 'int'
ImplicitParamDecl 'dm2' 'int'
CompoundStmt
VarDecl 'kernel' 'Kernel'
<initialization of 'kernel' with 'dm1' and 'dm2'>
<transformed statements with redirected references of 'kernel'>
```
Any ODR-use of the SYCL kernel entry point function will (with future
changes) suffice for the offload kernel entry point to be emitted. An
actual call to the SYCL kernel entry point function will result in a
call to the function. However, evaluation of a `SYCLKernelCallStmt`
statement is a no-op, so such calls will have no effect other than to
trigger emission of the offload kernel entry point.
Additionally, as a related change inspired by code review feedback,
these changes disallow use of the `sycl_kernel_entry_point` attribute
with functions defined with a _function-try-block_. The SYCL 2020
specification prohibits the use of C++ exceptions in device functions.
Even if exceptions were not prohibited, it is unclear what the semantics
would be for an exception that escapes the SYCL kernel entry point
function; the boundary between host and device code could be an implicit
noexcept boundary that results in program termination if violated, or
the exception could perhaps be propagated to host code via the SYCL
library. Pending support for C++ exceptions in device code and clear
semantics for handling them at the host-device boundary, this change
makes use of the `sycl_kernel_entry_point` attribute with a function
defined with a _function-try-block_ an error.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect TemplateOrSpecialization to be nonnull.
Note that PointerUnion::dyn_cast has been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
Literal migration would result in dyn_cast_if_present (see the
definition of PointerUnion::dyn_cast), but this patch uses dyn_cast
because we expect Init to be nonnull. Note that hasInit returns true
only if Init is nonnull among other conditions.
This is a new Clang-specific attribute to ensure that field
initializations are performed explicitly.
For example, if we have
```
struct B {
[[clang::explicit]] int f1;
};
```
then the diagnostic would trigger if we do `B b{};`:
```
field 'f1' is left uninitialized, but was marked as requiring initialization
```
This prevents callers from accidentally forgetting to initialize fields,
particularly when new fields are added to the class.
Move the common case of FieldDecl::getFieldIndex() inline to mitigate
the cost of removing the extra `FieldNo` induction variable.
Also rename isNoUniqueAddress parameter to isNonVirtualBaseType, which
appears to be more accurate. I think the current name is just a
consequence of autocomplete gone wrong.
This reverts commit 81fc3add1e627c23b7270fe2739cdacc09063e54.
This breaks some LLDB tests, e.g.
SymbolFile/DWARF/x86/no_unique_address-with-bitfields.cpp:
lldb: ../llvm-project/clang/lib/AST/Decl.cpp:4604: unsigned int clang::FieldDecl::getBitWidthValue() const: Assertion `isa<ConstantExpr>(getBitWidth())' failed.
Save the bitwidth value as a `ConstantExpr` with the value set. Remove
the `ASTContext` parameter from `getBitWidthValue()`, so the latter
simply returns the value from the `ConstantExpr` instead of
constant-evaluating the bitwidth expression every time it is called.
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
We observed 2X slowdown in lldb's expression evaluation with
https://github.com/llvm/llvm-project/pull/109147 in some cases. It turns
out that calling `isRedundantInlineQualifierFor` is quite expensive.
Using short-circuit evaluation in the if statement to avoid unnecessary
calls to that function.
The __builtin_counted_by_ref builtin is used on a flexible array
pointer and returns a pointer to the "counted_by" attribute's COUNT
argument, which is a field in the same non-anonymous struct as the
flexible array member. This is useful for automatically setting the
count field without needing the programmer's intervention. Otherwise
it's possible to get this anti-pattern:
ptr = alloc(<ty>, ..., COUNT);
ptr->FAM[9] = 42; /* <<< Sanitizer will complain */
ptr->count = COUNT;
To prevent this anti-pattern, the user can create an allocator that
automatically performs the assignment:
#define alloc(TY, FAM, COUNT) ({ \
TY __p = alloc(get_size(TY, COUNT)); \
if (__builtin_counted_by_ref(__p->FAM)) \
*__builtin_counted_by_ref(__p->FAM) = COUNT; \
__p; \
})
The builtin's behavior is heavily dependent upon the "counted_by"
attribute existing. It's main utility is during allocation to avoid
the above anti-pattern. If the flexible array member doesn't have that
attribute, the builtin becomes a no-op. Therefore, if the flexible
array member has a "count" field not referenced by "counted_by", it
must be set explicitly after the allocation as this builtin will
return a "nullptr" and the assignment will most likely be elided.
---------
Co-authored-by: Bill Wendling <isanbard@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This patch reapplies #114258, fixing an infinite recursion bug in
`ASTImporter` that occurs when importing the primary template of a class
template specialization when the latest redeclaration of that template
is a friend declaration in the primary template.