Clang supports using a type as the predicate for generic selection
expressions but lacked support for serializing/deserializing these
extended generic selection expressions.
Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
When targeting runtimes that support constant literal classes, emit ObjC
literal expressions @(number), @[], and @{} as compile-time constant
data structures rather than runtime msgSend calls. This reduces code
size and runtime overhead at the cost of increased data segment size,
and avoids repeated heap allocation of equivalent literal objects.
The feature is not supported with the fragile ABI or GNU runtimes, where
it is automatically disabled.
The feature can be disabled altogether with -fno-objc-constant-literals,
or individually per literal kind:
-fno-constant-nsnumber-literals
-fno-constant-nsarray-literals
-fno-constant-nsdictionary-literals
Custom backing class names can be specified via:
-fconstant-array-class=<name>
-fconstant-dictionary-class=<name>
-fconstant-integer-number-class=<name>
-fconstant-float-number-class=<name>
-fconstant-double-number-class=<name>
rdar://45380392
rdar://168106035
---------
Co-authored-by: Ben D. Jones <bendjones@apple.com>
The `sycl_kernel_entry_point` attribute facilitates the generation of an
offload kernel entry point function based on the parameters and body
of the attributed function. This change extends the behavior of that
attribute to support integration with a SYCL runtime library through
an interface that communicates symbol names and kernel arguments
for the generated offload kernel entry point functions.
Consider the following function declared with the
`sycl_kernel_entry_point` attribute with a call to this function
occurring in the implementation of a SYCL kernel invocation function
such as `sycl::handler::single_task()`.
```c++
template<typename KernelName, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelName)]]
void kernel_entry_point(KernelType kernel) {
kernel();
}
```
The body of the above function specifies the parameters and body of the
generated offload kernel entry point. Clearly, a call to the above
function by a SYCL kernel invocation function is not intended to execute
the body as written. Previously, code generation emitted an empty
function body so that calls to the function had no effect other than to
trigger the generation of the offload kernel entry point. The function
body is therefore available to hook for SYCL library support and is now
substituted with a call to a (SYCL library provided) function template
or variable template named `sycl_kernel_launch()` with the kernel
name type passed as the first template argument, the symbol name
of the offload kernel entry point passed as a string literal for the first
function argument, and the function parameters passed as the
remaining explicit function arguments. Given a call like this:
```c++
kernel_entry_point<struct KN>([]{})
```
the body of the instantiated `kernel_entry_point()` specialization would
be substituted as follows with "kernel-symbol-name" substituted for the
generated symbol name and `kernel` forwarded.
```c++
sycl_kernel_launch<KN>("kernel-symbol-name", kernel)
```
Name lookup and overload resolution for the `sycl_kernel_launch()`
function is performed at the point of definition of the
`sycl_kernel_entry_point` attributed function (or the point of
instantiation for an instantiated function template specialization). If
overload resolution fails, the program is ill-formed.
Implementation of the `sycl_kernel_launch()` function might require
additional information provided by the SYCL library. This is facilitated
by removing the previous prohibition against use of the
`sycl_kernel_entry_point` attribute with a non-static member function.
If the `sycl_kernel_entry_point` attributed function is a non-static
member function, then overload resolution for the `sycl_kernel_launch()`
function template may select a non-static member function in which case,
`this` will be implicitly passed as the implicit object argument.
If a `sycl_kernel_entry_point` attributed function is a non-static
member function, use of `this` in a potentially evaluated expression is
prohibited in the definition since `this` is not a kernel argument and
will not be available within the generated offload kernel entry point
function. The attribute cannot be applied to a function with an
explicit object parameter.
---------
Co-authored-by: Mariya Podchishchaeva <mariya.podchishchaeva@intel.com>
fixes#159438
This patch adds `MatrixElementExpr`, a new AST node for HLSL matrix
element and swizzle access (e.g. M._m00, M._11_22_33).
It introduces a shared `ElementAccessExprBase` used by both matrix and
vector swizzle expressions, updates Sema to parse and validate
zero-based and one-based accessors, detects duplicates for l-value
checks, and emits improved diagnostics. CodeGen is updated to lower
scalar and multi-element accesses consistently, and full AST
serialization, dumping, and tooling support is included. This
implementation reflects the updated
[RFC](https://github.com/llvm/wg-hlsl/pull/357/files) for HLSL matrix
accessor semantics.
(After changing the scope) This PR implements parsing the reflection
operator (^^) for primitive types. The goal is to keep the first PR
simple. In subsequent PRs, parsing for the remaining requirements will
be introduced.
This implementation is based on the fork of @katzdm.
Class `CXXReflectExpr` is introduced to represent the operand of the
reflection operator. For now, in this PR, the type std::meta::info is
not implemented yet, so when we construct an AST node CXXReflectExpr,
`VoidTy` is used as placeholder type for now.
The file `ParseReflect.cpp` is introduced, which for now only has the
function `ParseCXXReflectExpression`. It parses the operand of the
reflection operator.
---------
Co-authored-by: Shafik Yaghmour <shafik.yaghmour@intel.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Sirraide <aeternalmail@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: Erich Keane <ekeane@nvidia.com>
fixes#166206
- Add swizzle support if row index is constant
- Add test cases
- Add new AST type
- Add new LValue for Matrix Row Type
- TODO: Make the new LValue a dynamic index version of ExtVectorElt
In the standard, constraint satisfaction checking is done on the
normalized form of a constraint.
Clang instead substitutes on the non-normalized form, which causes us to
report substitution failures in template arguments or concept ids, which
is non-conforming but unavoidable without a parameter mapping
This patch normalizes before satisfaction checking. However, we preserve
concept-id nodes in the normalized form, solely for diagnostics
purposes.
This addresses https://github.com/llvm/llvm-project/issues/61811 and
related concepts conformance bugs, ideally to make the remaining
implementation of concept template parameters easier
Fixes https://github.com/llvm/llvm-project/issues/135190
Fixes https://github.com/llvm/llvm-project/issues/61811
Co-authored-by: Younan Zhang
[zyn7109@gmail.com](mailto:zyn7109@gmail.com)
---------
Co-authored-by: Younan Zhang <zyn7109@gmail.com>
In the standard, constraint satisfaction checking is done on the
normalized form of a constraint.
Clang instead substitutes on the non-normalized form, which causes us to
report substitution failures in template arguments or concept ids, which
is non-conforming but unavoidable without a parameter mapping
This patch normalizes before satisfaction checking. However, we preserve
concept-id nodes in the normalized form, solely for diagnostics
purposes.
This addresses #61811 and related concepts conformance bugs, ideally to
make the remaining implementation of concept template parameters easier
Fixes#135190
Fixes #61811
Co-authored-by: Younan Zhang <zyn7109@gmail.com>
This change implements the fuse directive, `#pragma omp fuse`, as specified in the OpenMP 6.0, along with the `looprange` clause in clang.
This change also adds minimal stubs so flang keeps compiling (a full implementation in flang of this directive is still pending).
---------
Co-authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es>
This is preparatory work for the implementation of `#pragma omp fuse` in
https://github.com/llvm/llvm-project/pull/139293
**Note**: this change builds on top of
https://github.com/llvm/llvm-project/pull/155848
This change adds an additional class to hold data that will be shared
between all loop transformations: those that apply to canonical loop
nests (the majority) and those that apply to canonical loop sequences
(`fuse` in OpenMP 6.0).
This class is not a statement by itself and its goal is to avoid having
to replicate information between classes.
Also simplfiy the way we handle the "generated loops" information as we
currently only need to know if it is zero or non-zero.
This is preparatory work for the implementation of `#pragma omp fuse` in
https://github.com/llvm/llvm-project/pull/139293
Not all OpenMP loop transformations makes sense to make them inherit
from `OMPLoopBasedDirective`, in particular in OpenMP 6.0 'fuse' (to be
implemented later) is a transformation of a canonical loop sequence.
This change renames class `OMPLoopTransformationDirective` to
`OMPCanonicalLoopNestTransformationDirective` so we can reclaim that
name in a later change.
This implements support for [named
loops](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3355.htm) for
C2y.
When parsing a `LabelStmt`, we create the `LabeDecl` early before we parse
the substatement; this label is then passed down to `ParseWhileStatement()`
and friends, which then store it in the loop’s (or switch statement’s) `Scope`;
when we encounter a `break/continue` statement, we perform a lookup for
the label (and error if it doesn’t exist), and then walk the scope stack and
check if there is a scope whose preceding label is the target label, which
identifies the jump target.
The feature is only supported in C2y mode, though a cc1-only option
exists for testing (`-fnamed-loops`), which is mostly intended to try
and make sure that we don’t have to refactor this entire implementation
when/if we start supporting it in C++.
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
This reintroduces `Type.h`, having earlier been renamed to `TypeBase.h`,
as a redirection to `TypeBase.h`, and redirects most users to include
the former instead.
This is a preparatory patch for being able to provide inline definitions
for `Type` methods which would otherwise cause a circular dependency
with `Decl{,CXX}.h`.
Doing these operations into their own NFC patch helps the git rename
detection logic work, preserving the history.
This patch makes clang just a little slower to build (~0.17%), just
because it makes more code indirectly include `DeclCXX.h`.
This is a preparatory patch, to be able to provide inline definitions
for `Type` functions which depend on `Decl{,CXX}.h`. As the latter also
depends on `Type.h`, this would not be possible without some
reorganizing.
Splitting this rename into its own patch allows git to track this as a
rename, and preserve all git history, and not force any code
reformatting.
A later NFC patch will reintroduce `Type.h` as redirection to
`TypeBase.h`, rewriting most places back to directly including `Type.h`
instead of `TypeBase.h`, leaving only a handful of places where this is
necessary.
Then yet a later patch will exploit this by making more stuff inline.
This is a first pass at implementing
[P2841R7](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2841r7.pdf).
The implementation is far from complete; however, I'm aiming to do that
in chunks, to make our lives easier.
In particular, this does not implement
- Subsumption
- Mangling
- Satisfaction checking is minimal as we should focus on #141776 first
(note that I'm currently very stuck)
FTM, release notes, status page, etc, will be updated once the feature
is more mature. Given the state of the feature, it is not yet allowed in
older language modes.
Of note:
- Mismatches between template template arguments and template template
parameters are a bit wonky. This is addressed by #130603
- We use `UnresolvedLookupExpr` to model template-id. While this is
pre-existing, I have been wondering if we want to introduce a different
OverloadExpr subclass for that. I did not make the change in this patch.
This patch is closely related to #139293 and addresses an existing issue
in the loop transformation codebase. Specifically, it corrects the
handling of the `NumGeneratedLoops` variable in
`OMPLoopTransformationDirective` AST nodes and its inheritors (such as
OMPUnrollDirective, OMPTileDirective, etc.).
Previously, this variable was inaccurately set for certain
transformations like reverse or tile. While this did not lead to
functional bugs, since the value was only checked to determine whether
it was greater than zero or equal to zero, the inconsistency could
introduce problems when supporting more complex directives in the
future.
This patch reduces the size of several AST nodes by moving some fields
into the free bitfield space in the base `Stmt` class:
* `CXXForRangeStmt`: 96 → 88 bytes
* `ChooseExpr`: 56 → 48 bytes
* `ArrayTypeTraitExpr`: 56 → 48 bytes
* `ExpressionTraitExpr`: 40 → 32 bytes
* `CXXFoldExpr`: 64 → 56 bytes
* `ShuffleExpr`: 40 → 32 bytes
* `PackIndexingExpr`: 48 → 40 bytes
There are no noticeable memory savings (`Expr/Stmt` memory usage
125,824,496 vs 125,826,336 bytes for `SemaExpr.cpp`) in my testing,
likely because these node types are not among the most common in typical
ASTs.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
The bulk of the changes are in `CallExpr`
We cache Begin/End source locs in the trailing objects, in the space
left by making the offset to the trailing objects static.
We also set a flag to indicate that we are calling an explicit object
member function, further reducing the cost of getBeginLoc.
Fixes#140876
Adopt non-templated and array-ref returning forms of
`getTrailingObjects` in Expr.cpp/.h.
Use ArrayRef forms to eliminate manual asserting for OOB index. Use
llvm::copy() instead of std::copy() in some instances.
Adopt non-templated and array-ref returning forms of
`getTrailingObjects` in DeclOpenACC and StmtOpenACC. Also use
std::uninitialized_contruct_n to make the code a little concise.
This fixes a PCM non-determinism regression reported here:
https://github.com/llvm/llvm-project/pull/134560#issuecomment-2797744370
There was a bit in `SubstNonTypeTemplateParmPackExpr` which we missed to
serialize, and that bit eventually propagates to
`SubstNonTypeTemplateParmExpr`.
As a drive by, improve serialization for PackIndex on
SubstNonTypeTemplateParmExpr by using the newly introduced
UnsignedOrNone helpers.
There are no release notes since this regression was never released.
This is a basic implementation of P2719: "Type-aware allocation and
deallocation functions" described at http://wg21.link/P2719
The proposal includes some more details but the basic change in
functionality is the addition of support for an additional implicit
parameter in operators `new` and `delete` to act as a type tag. Tag is
of type `std::type_identity<T>` where T is the concrete type being
allocated. So for example, a custom type specific allocator for `int`
say can be provided by the declaration of
void *operator new(std::type_identity<int>, size_t, std::align_val_t);
void operator delete(std::type_identity<int>, void*, size_t, std::align_val_t);
However this becomes more powerful by specifying templated declarations,
for example
template <typename T> void *operator new(std::type_identity<T>, size_t, std::align_val_t);
template <typename T> void operator delete(std::type_identity<T>, void*, size_t, std::align_val_t););
Where the operators being resolved will be the concrete type being
operated over (NB. A completely unconstrained global definition as above
is not recommended as it triggers many problems similar to a general
override of the global operators).
These type aware operators can be declared as either free functions or
in class, and can be specified with or without the other implicit
parameters, with overload resolution performed according to the existing
standard parameter prioritisation, only with type parameterised
operators having higher precedence than non-type aware operators. The
only exception is destroying_delete which for reasons discussed in the
paper we do not support type-aware variants by default.
This introduces a new class 'UnsignedOrNone', which models a lite
version of `std::optional<unsigned>`, but has the same size as
'unsigned'.
This replaces most uses of `std::optional<unsigned>`, and similar
schemes utilizing 'int' and '-1' as sentinel.
Besides the smaller size advantage, this is simpler to serialize, as its
internal representation is a single unsigned int as well.
This was added in OpenACC PR #511 in the 3.4 branch. From an AST/Sema
perspective this is pretty trivial as the infrastructure for 'if'
already exists, however the atomic construct needed to be taught to take
clauses. This patch does that and adds some testing to do so.
Introduce a trait to determine the number of bindings that would be
produced by
```cpp
auto [...p] = expr;
```
This is necessary to implement P2300
(https://eel.is/c++draft/exec#snd.concepts-5), but can also be used to
implement a general get<N> function that supports aggregates
`__builtin_structured_binding_size` is a unary type trait that evaluates
to the number of bindings in a decomposition
If the argument cannot be decomposed, a sfinae-friendly error is
produced.
A type is considered a valid tuple if `std::tuple_size_v<T>` is a valid
expression, even if there is no valid `std::tuple_element`
specialization or suitable `get` function for that type.
Fixes#46049
This statement level construct takes no clauses and has no associated
statement, and simply labels a number of array elements as valid for
caching. The implementation here is pretty simple, but it is a touch of
a special case for parsing, so the parsing code reflects that.
This patch allows using fpfeatures pragmas with __builtin_convertvector:
- added TrailingObjects with FPOptionsOverride and methods for handling
it to ConvertVectorExpr
- added support for codegen, node dumping, and serialization of
fpfeatures contained in ConvertVectorExpr
This merges the functionality of ResolvedUnexpandedPackExpr into
FunctionParmPackExpr. I also added a test to show that
https://github.com/llvm/llvm-project/issues/125103 should be fixed with
this. I put the removal of ResolvedUnexpandedPackExpr in its own commit.
Let me know what you think.
Fixes#125103
The atomic construct is a particularly complicated one. The directive
itself is pretty simple, it has 5 options for the 'atomic-clause'.
However, the associated statement is fairly complicated.
'read' accepts:
v = x;
'write' accepts:
x = expr;
'update' (or no clause) accepts:
x++;
x--;
++x;
--x;
x binop= expr;
x = x binop expr;
x = expr binop x;
'capture' accepts either a compound statement, or:
v = x++;
v = x--;
v = ++x;
v = --x;
v = x binop= expr;
v = x = x binop expr;
v = x = expr binop x;
IF 'capture' has a compound statement, it accepts:
{v = x; x binop= expr; }
{x binop= expr; v = x; }
{v = x; x = x binop expr; }
{v = x; x = expr binop x; }
{x = x binop expr ;v = x; }
{x = expr binop x; v = x; }
{v = x; x = expr; }
{v = x; x++; }
{v = x; ++x; }
{x++; v = x; }
{++x; v = x; }
{v = x; x--; }
{v = x; --x; }
{x--; v = x; }
{--x; v = x; }
While these are all quite complicated, there is a significant amount
of similarity between the 'capture' and 'update' lists, so this patch
reuses a lot of the same functions.
This patch implements the entirety of 'atomic', creating a new Sema file
for the sema for it, as it is fairly sizable.
This is an implementation of P1061 Structure Bindings Introduce a Pack
without the ability to use packs outside of templates. There is a couple
of ways the AST could have been sliced so let me know what you think.
The only part of this change that I am unsure of is the
serialization/deserialization stuff. I followed the implementation of
other Exprs, but I do not really know how it is tested. Thank you for
your time considering this.
---------
Co-authored-by: Yanzuo Liu <zwuis@outlook.com>
A SYCL kernel entry point function is a non-member function or a static
member function declared with the `sycl_kernel_entry_point` attribute.
Such functions define a pattern for an offload kernel entry point
function to be generated to enable execution of a SYCL kernel on a
device. A SYCL library implementation orchestrates the invocation of
these functions with corresponding SYCL kernel arguments in response to
calls to SYCL kernel invocation functions specified by the SYCL 2020
specification.
The offload kernel entry point function (sometimes referred to as the
SYCL kernel caller function) is generated from the SYCL kernel entry
point function by a transformation of the function parameters followed
by a transformation of the function body to replace references to the
original parameters with references to the transformed ones. Exactly how
parameters are transformed will be explained in a future change that
implements non-trivial transformations. For now, it suffices to state
that a given parameter of the SYCL kernel entry point function may be
transformed to multiple parameters of the offload kernel entry point as
needed to satisfy offload kernel argument passing requirements.
Parameters that are decomposed in this way are reconstituted as local
variables in the body of the generated offload kernel entry point
function.
For example, given the following SYCL kernel entry point function
definition:
```
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void sycl_kernel_entry_point(KernelType kernel) {
kernel();
}
```
and the following call:
```
struct Kernel {
int dm1;
int dm2;
void operator()() const;
};
Kernel k;
sycl_kernel_entry_point<class kernel_name>(k);
```
the corresponding offload kernel entry point function that is generated
might look as follows (assuming `Kernel` is a type that requires
decomposition):
```
void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) {
Kernel kernel{dm1, dm2};
kernel();
}
```
Other details of the generated offload kernel entry point function, such
as its name and calling convention, are implementation details that need
not be reflected in the AST and may differ across target devices. For
that reason, only the transformation described above is represented in
the AST; other details will be filled in during code generation.
These transformations are represented using new AST nodes introduced
with this change. `OutlinedFunctionDecl` holds a sequence of
`ImplicitParamDecl` nodes and a sequence of statement nodes that
correspond to the transformed parameters and function body.
`SYCLKernelCallStmt` wraps the original function body and associates it
with an `OutlinedFunctionDecl` instance. For the example above, the AST
generated for the `sycl_kernel_entry_point<kernel_name>` specialization
would look as follows:
```
FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)'
TemplateArgument type 'kernel_name'
TemplateArgument type 'Kernel'
ParmVarDecl kernel 'Kernel'
SYCLKernelCallStmt
CompoundStmt
<original statements>
OutlinedFunctionDecl
ImplicitParamDecl 'dm1' 'int'
ImplicitParamDecl 'dm2' 'int'
CompoundStmt
VarDecl 'kernel' 'Kernel'
<initialization of 'kernel' with 'dm1' and 'dm2'>
<transformed statements with redirected references of 'kernel'>
```
Any ODR-use of the SYCL kernel entry point function will (with future
changes) suffice for the offload kernel entry point to be emitted. An
actual call to the SYCL kernel entry point function will result in a
call to the function. However, evaluation of a `SYCLKernelCallStmt`
statement is a no-op, so such calls will have no effect other than to
trigger emission of the offload kernel entry point.
Additionally, as a related change inspired by code review feedback,
these changes disallow use of the `sycl_kernel_entry_point` attribute
with functions defined with a _function-try-block_. The SYCL 2020
specification prohibits the use of C++ exceptions in device functions.
Even if exceptions were not prohibited, it is unclear what the semantics
would be for an exception that escapes the SYCL kernel entry point
function; the boundary between host and device code could be an implicit
noexcept boundary that results in program termination if violated, or
the exception could perhaps be propagated to host code via the SYCL
library. Pending support for C++ exceptions in device code and clear
semantics for handling them at the host-device boundary, this change
makes use of the `sycl_kernel_entry_point` attribute with a function
defined with a _function-try-block_ an error.
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions. This patch implements
the AST node, plus the 'cannot be the body of a if, while, do, switch,
or label' statement restriction. Future patches will handle the
rest of the restrictions, which are based on clauses.
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its for clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented. This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
The arguments to this are the same as for the 'wait' clause, so this
reuses all of that infrastructure. So all this has to do is support a
pair of clauses that are already implemented (if and async), plus create
an AST node. This patch does so, and adds proper testing.
These constructs are all very similar and closely related, so this patch
creates the AST nodes for them, serialization, printing/etc.
Additionally the restrictions are all added as tests/todos in the tests,
as those will have to be implemented once we get those clauses implemented.
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.