This patch implements the 'only X is allowed after' rule for combined
constructs on a device-type clause. This was left as a set of 'TODO' in
the previous patch, plus more issues were found with the TODO list,
which are fixed here.
This clause is pretty small/doesn't do much semantic-analysis-wise, , other than
have two spellings and disallow certain clauses after it. However, as
most of those aren't implemented yet, the diagnostic is left as a TODO.
These three are identical to the version on compute constructs, so this
patch implements the tests for it, and ensures that we properly validate
it against all the other clauses we're supposed to. The test is mostly
a mock-up at the moment, since most other clauses aren't implemented
yet for 'loop'.
Structural equivalence check uses a cache to store already found
non-equivalent values. This cache can be reused for calls (ASTImporter
does this). Value of "IgnoreTemplateParmDepth" can have an effect on the
structural equivalence therefore it is wrong to reuse the same cache for
checks with different values of 'IgnoreTemplateParmDepth'. The current
change adds the 'IgnoreTemplateParmDepth' to the cache key to fix the
problem.
Combined constructs (OpenACC 3.3 section 2.11) are a short-cut for
writing a `loop` construct immediately inside of a `compute` construct.
However, this interaction requires we do additional work to ensure that
we get the semantics between the two correct, as well as diagnostics.
This patch adds the semantic analysis for the constructs (but no
clauses), as well as the AST nodes.
The function definition instantiation assumes any declarations used
inside are already transformed before transforming the body, so we need
to preserve the transformed expression of CXXAssumeAttr even if it is
not a constant expression. Moreover, the full expression of the
assumption should also entail a potential lambda capture transformation,
hence the call to ActOnFinishFullExpr() after TransformExpr().
Fixes#114787
This implements the RFC
https://discourse.llvm.org/t/rfc-introduce-clang-lifetime-capture-by-x/81371
In this PR, we introduce `[[clang::lifetime_capture_by(X)]]` attribute
as discussed in the RFC.
As an implementation detail of this attribute, we store and use param
indices instead of raw param expressions. The parameter indices are
computed lazily at the end of function declaration since the function
decl (and therefore the subsequent parameters) are not visible yet while
parsing a parameter annotation.
In subsequent PR, we will infer this attribute for STL containers and
perform lifetime analysis to detect dangling cases.
Adds `AppendStructuredBuffer` and `ConsumeStructuredBuffer` definition
to HLSLExternalSemaSource. Adds separate tests for the AST shape and
element types, and adds constructor/handle.fromBinding test case to
shared test file for structured buffers.
These buffers do not have any subscript operators. Append and Consume
methods will be added later in llvm/llvm-project#112968.
Fixes#112777
After implementing 'loop', we determined that the link to its parent
only ever uses the type, not the construct itself. This patch removes
it, as it is both a waste and causes problems with serialization.
This paper made empty structures and unions implementation-defined. We
have always supported this as a GNU extension, so now we're documenting
our behavior and removing the extension warning in C2y mode.
This paper made qualified function types implementation-defined. We have
always supported this as an extension, so now we're documenting our
behavior.
Note, we still warn about this by default even in C2y mode because a
qualified function type is a sign of programmer confusion.
OpenACC restricts the contents of a 'for' loop affected by a 'loop'
construct without a 'seq'. The loop variable must be integer, pointer,
or random-access-iterator, it must monotonically increase/decrease, and
the trip count must be computable at runtime before the function.
This patch tries to implement some of these limitations to the best of
our ability, though it causes us to be perhaps overly restrictive at the
moment. I expect we'll revisit some of these rules/add additional
supported forms of loop-variable and 'monotonically increasing' here,
but the currently enforced rules are heavily inspired by the OMP
implementation here.
In 463a4f150, we assumed that all the template argument packs are of
size 1 when normalizing a constraint expression because I mistakenly
thought those packs were obtained from their injected template
parameters. This was wrong because we might be checking constraints when
instantiating a friend declaration within a class template
specialization, where the parent class template is specialized with
non-dependent template arguments.
In that sense, we shouldn't assume any pack size nor expand anything in
such a scenario. Moreover, there are no intermediate (substituted but
unexpanded) AST nodes for template template parameters, so we have to
special-case their transformations by looking into the instantiation
scope instead of extracting anything from template arguments.
Fixes#115098
The __builtin_counted_by_ref builtin is used on a flexible array
pointer and returns a pointer to the "counted_by" attribute's COUNT
argument, which is a field in the same non-anonymous struct as the
flexible array member. This is useful for automatically setting the
count field without needing the programmer's intervention. Otherwise
it's possible to get this anti-pattern:
ptr = alloc(<ty>, ..., COUNT);
ptr->FAM[9] = 42; /* <<< Sanitizer will complain */
ptr->count = COUNT;
To prevent this anti-pattern, the user can create an allocator that
automatically performs the assignment:
#define alloc(TY, FAM, COUNT) ({ \
TY __p = alloc(get_size(TY, COUNT)); \
if (__builtin_counted_by_ref(__p->FAM)) \
*__builtin_counted_by_ref(__p->FAM) = COUNT; \
__p; \
})
The builtin's behavior is heavily dependent upon the "counted_by"
attribute existing. It's main utility is during allocation to avoid
the above anti-pattern. If the flexible array member doesn't have that
attribute, the builtin becomes a no-op. Therefore, if the flexible
array member has a "count" field not referenced by "counted_by", it
must be set explicitly after the allocation as this builtin will
return a "nullptr" and the assignment will most likely be elided.
---------
Co-authored-by: Bill Wendling <isanbard@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
This fixes a crash when instantiating default arguments for templated
friend function declarations which lack a definition.
There are implementation limits which prevents us from finding the
pattern for such functions, and this causes difficulties
setting up the instantiation scope for the function parameters.
This patch skips instantiating the default argument in these cases,
which causes a minor regression in error recovery, but otherwise avoids
the crash.
Fixes#113324
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3344.pdf
This paper disallows a single `void` parameter from having qualifiers or
storage class specifiers. Clang has diagnosed most of these as an error
for a long time, but `register void` was previously accepted in all C
language modes and is now being rejected in all C language modes.
In 50e5411e4, we preserved the pack substitution index within
SubstTemplateTypeParmType nodes and performed in-place expansions of
packs such that type constraints on a lambda that serve as a pattern of
a fold expression could be evaluated if the type constraints contain any
packs that are expanded by the fold expression.
However, we made an incorrect assumption of the condition under which
in-place expansion should occur. For example, a SizeOfPackExpr case
relies on SubstTemplateTypeParmType nodes being transformed to
SubstTemplateTypeParmPackTypes rather than expanding them immediately in
place.
This fixes that by adding a flag to SubstTemplateTypeParmType to
discriminate such in-place expansion situations.
Fixes https://github.com/llvm/llvm-project/issues/113518
Adds `RasterizerOrderedStructuredBuffer` definition to
HLSLExternalSemaSource. Adds separate tests for the AST shape and
element types. Adds constructor/handle.fromBinding and subscript test
cases to shared test file for structured buffers. Additional methods
will be added later.
Fixes#112776
Implements elementwise firstbithigh hlsl builtin.
Implements firstbituhigh intrinsic for spirv and directx, which handles
unsigned integers
Implements firstbitshigh intrinsic for spirv and directx, which handles
signed integers.
Fixes#113486Closes#99115
We made the incorrect assumption that names of fields are unique when
creating their default initializers.
We fix that by keeping track of the instantiaation pattern for field
decls that are placeholder vars,
like we already do for unamed fields.
Fixes#114069
The 'allocator' modifier is now accepted in the 'allocate' clause. Added
LIT tests covering codegen, PCH, template handling, and serialization
for 'allocator' modifier.
Added support for allocator-modifier to release notes.
Testing
- New allocate modifier LIT tests.
- OpenMP LIT tests.
- check-all
- relevant sollve_vv test cases
tests/5.2/scope/test_scope_allocate_construct.c
LLVM support for the attribute has been implemented already, so it just
plumbs it through to the CUDA front-end.
One notable difference from NVCC is that the attribute can be used
regardless of the targeted GPU. On the older GPUs it will just be
ignored. The attribute is a performance hint, and does not warrant a
hard error if compiler can't benefit from it on a particular GPU
variant.
This PR implements a new type trait as a builtin,
__builtin_hlsl_is_typed_resource_element_compatible
This type traits verifies that the given input type is suitable as a
typed resource element type.
It checks that the given input type is homogeneous, has no more than 4
sub elements, does not exceed 16 bytes, and does not contain any arrays,
booleans, or enums.
Fixes an issue in https://github.com/llvm/llvm-project/pull/113730 that
needed to cause that PR to be reverted.
Fixes https://github.com/llvm/llvm-project/issues/113223
The `sycl_kernel_entry_point` attribute is used to declare a function that
defines a pattern for an offload kernel to be emitted. The attribute requires
a single type argument that specifies the type used as a SYCL kernel name as
described in section 5.2, "Naming of kernels", of the SYCL 2020 specification.
Properties of the offload kernel are collected when a function declared with
the `sycl_kernel_entry_point` attribute is parsed or instantiated. These
properties, such as the kernel name type, are stored in the AST context where
they are (or will be) used for diagnostic purposes and to facilitate reflection
to a SYCL run-time library. These properties are not serialized with the AST
but are recreated upon deserialization.
The `sycl_kernel_entry_point` attribute is intended to replace the existing
`sycl_kernel` attribute which is intended to be deprecated in a future change
and removed following an appropriate deprecation period. The new attribute
differs in that it is enabled for both SYCL host and device compilation, may
be used with non-template functions, explicitly indicates the type used as
the kernel name type, and will impact AST generation.
This change adds the basic infrastructure for the new attribute. Future
changes will add diagnostics and new AST support that will be used to drive
generation of the corresponding offload kernel.
Reverts llvm/llvm-project#113730
Reverting because test compiler-rt default failed, with:
error: comparison of different enumeration types ('Kind' and
'clang::Type::TypeClass')
This PR implements a new type trait as a builtin,
`__builtin_hlsl_is_typed_resource_element_compatible`
This type traits verifies that the given input type is suitable as a
typed resource element type.
It checks that the given input type is homogeneous, has no more than 4
sub elements, does not exceed 16 bytes, and does not contain any arrays,
booleans, or enums.
Fixes#113223
We need to compare constraint expressions when instantiating a friend
declaration that is lexically defined within a class template. Since the
evaluation is deferred, the expression might refer to untransformed
function parameters such that the substitution needs the mapping of
instantiation.
These mappings are maintained by the function declaration instantiation,
so we need to establish a "transparent" LocalInstantiationScope before
substituting into the constraint.
No release note as this fixes a regression in 19.
Fixes https://github.com/llvm/llvm-project/issues/114685