535 Commits

Author SHA1 Message Date
erichkeane
be32621ce8 [OpenACC] Implement 'device' and 'host' clauses for 'update'
These two clauses just take a 'var-list' and specify where the variables
should be copied from/to.  This patch implements the AST nodes for them
and ensures they properly take a var-list.
2025-01-09 09:28:58 -08:00
erichkeane
2c2accbcc6 [OpenACC] Enable 'self' sema for 'update' construct
The 'self' clause is an unfortunately difficult one, as it has a
significantly different meaning between 'update' and the other
constructs.  This patch introduces a way for the 'self' clause to work
as both.  I considered making this two separate AST nodes (one for
'self' on 'update' and one for the others), however this makes the
automated macros/etc for supporting a clause break.

Instead, 'self' has the ability to act as either a condition or as a
var-list clause.  As this is the only one of its kind, it is implemented
all within it.  If in the future we have more that work like this, we
should consider rewriting a lot of the macros that we use to make
clauses work, and make them separate ast nodes.
2025-01-08 13:19:33 -08:00
erichkeane
db81e8c42e [OpenACC] Initial sema implementation of 'update' construct
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions.  This patch implements
the AST node, plus the 'cannot be the body of a if, while, do, switch,
    or label' statement restriction.  Future patches will handle the
    rest of the restrictions, which are based on clauses.
2025-01-07 08:20:20 -08:00
erichkeane
ff24e9a19e [OpenACC] Implement 'default_async' sema
A fairly simple one, only valid on the 'set' construct, this clause
takes an int expression.  Most of the work was already done as a part of
parsing, so this patch ends up being a lot of infrastructure.
2025-01-06 11:03:18 -08:00
erichkeane
21c785d7bd [OpenACC] Implement 'set' construct sema
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its for clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
2025-01-06 11:03:18 -08:00
erichkeane
bdf2555308 [OpenACC] Implement 'device_num' clause sema for 'init'/'shutdown'
This is a very simple sema implementation, and just required AST node
plus the existing diagnostics.  This patch adds tests and adds the AST
node required, plus enables it for 'init' and 'shutdown' (only!)
2024-12-19 12:21:51 -08:00
erichkeane
4bbdb018a6 [OpenACC] Implement 'init' and 'shutdown' constructs
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented.  This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
2024-12-19 12:21:50 -08:00
erichkeane
e34cc7c993 [OpenACC] Implement 'wait' construct
The arguments to this are the same as for the 'wait' clause, so this
reuses all of that infrastructure. So all this has to do is support a
pair of clauses that are already implemented (if and async), plus create
an AST node.  This patch does so, and adds proper testing.
2024-12-18 15:06:01 -08:00
erichkeane
fbb14dd977 [OpenACC] Implement 'use_device' clause AST/Sema
This is a clause that is only valid on 'host_data' constructs, and
identifies variables which it should use the current device address.
From a Sema perspective, the only thing novel here is mild changes to
how ActOnVar works for this clause, else this is very much like the rest
of the 'var-list' clauses.
2024-12-16 09:35:57 -08:00
erichkeane
1ab81f8e7f [OpenACC] Implement 'delete' AST/Sema for 'exit data' construct
'delete' is another clause that has very little compile-time
implication, but needs a full AST that takes a var list.  This patch
ipmlements it fully, plus adds sufficient test coverage.
2024-12-16 06:44:53 -08:00
erichkeane
3351b3bf8d [OpenACC] implement 'detach' clause sema
This is another new clause specific to 'exit data' that takes a pointer
argument. This patch implements this the same way we do a few other
clauses (like attach) that have the same restrictions.
2024-12-13 13:51:41 -08:00
erichkeane
2244d2e75c [OpenACC] Implement 'if_present' clause sema
The 'if_present' clause controls the replacement of addresses in the
var-list in current device memory.  This clause can only go on
'host_device'.  From a Sema perspective, there isn't anything to do
beyond add this to AST and pass it on.
2024-12-13 13:04:57 -08:00
erichkeane
003eb5e80d [OpenACC] Implement 'finalize' clause sema
This is a very simple clause as far as sema is concerned.  It is only
valid on 'exit data', and doesn't have any rules involving it, so it is
simply applied and passed onto the MLIR.
2024-12-13 10:41:02 -08:00
erichkeane
010d0115fc [OpenACC] Create AST nodes for 'data' constructs
These constructs are all very similar and closely related, so this patch
creates the AST nodes for them, serialization, printing/etc.
Additionally the restrictions are all added as tests/todos in the tests,
as those will have to be implemented once we get those clauses implemented.
2024-12-12 07:28:30 -08:00
erichkeane
39351f8e46 [OpenACC] Implement AST/Sema for combined constructs
Combined constructs (OpenACC 3.3 section 2.11) are a short-cut for
writing a `loop` construct immediately inside of a `compute` construct.
However, this interaction requires we do additional work to ensure that
we get the semantics between the two correct, as well as diagnostics.

This patch adds the semantic analysis for the constructs (but no
    clauses), as well as the AST nodes.
2024-11-12 09:26:25 -08:00
Erich Keane
c8cbdc659c
[OpenACC] Implement 'loop' 'vector' clause (#112259)
The 'vector' clause specifies the iterations to be executed in vector or
SIMD mode. There are some limitations on which associated compute
contexts may be associated with this and have arguments, but otherwise
this is a fairly unrestricted clause.

It DOES have region limits like 'gang' and 'worker'.
2024-10-15 06:12:19 -07:00
Erich Keane
cf456ed2a4
[OpenACC] implement loop 'worker' clause. (#112206)
The worker clause specifies iterations of the loop/ that are executed in
parallel by distributing the iterations among the multiple works within
a single gang.

The sema rules for this type are simply that it cannot be combined with
a `kernel` construct with a `num_workers` clause, child `loop` clauses
cannot contain a `gang` or `worker` clause, and that the argument is oly
allowed when associated with a `kernel`.
2024-10-14 09:08:24 -07:00
Erich Keane
5b25c31351
[OpenACC] Implement loop 'gang' clause. (#112006)
The 'gang' clause is used to specify parallel execution of loops, thus
has some complicated rules depending on the 'loop's associated compute
construct. This patch implements all of those.
2024-10-11 09:05:19 -07:00
Michael Kruse
5b03efb85d
[Clang][OpenMP] Add permutation clause (#92030)
Add the permutation clause for the interchange directive which will be
introduced in the upcoming OpenMP 6.0 specification. A preview has been
published in
[Technical Report12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).
2024-10-09 14:56:43 +02:00
Erich Keane
d412cea8c4
[OpenACC] Implement 'tile' attribute AST (#110999)
The 'tile' clause shares quite a bit of the rules with 'collapse', so a
followup patch will add those tests/behaviors. This patch deals with
adding the AST node.

The 'tile' clause takes a series of integer constant expressions, or *.
The asterisk is now represented by a new OpenACCAsteriskSizeExpr node,
else this clause is very similar to others.
2024-10-03 08:34:43 -07:00
Erich Keane
97da34e015
[OpenACC] Add 'collapse' clause AST/basic Sema implementation (#109461)
The 'collapse' clause on a 'loop' construct is used to specify how many
nested loops are associated with the 'loop' construct. It takes an
optional 'force' tag, and an integer constant expression as arguments.

There are many other restrictions based on the contents of the loop/etc,
but those are implemented in followup patches, for now, this patch just
adds the AST node and does basic argument checking on the loop-count.
2024-10-01 06:40:21 -07:00
Chris B
89fb8490a9
[HLSL] Implement output parameter (#101083)
HLSL output parameters are denoted with the `inout` and `out` keywords
in the function declaration. When an argument to an output parameter is
constructed a temporary value is constructed for the argument.

For `inout` pamameters the argument is initialized via copy-initialization
from the argument lvalue expression to the parameter type. For `out`
parameters the argument is not initialized before the call.

In both cases on return of the function the temporary value is written
back to the argument lvalue expression through an implicit assignment
binary operator with casting as required.

This change introduces a new HLSLOutArgExpr ast node which represents
the output argument behavior. The OutArgExpr has three defined children:
- An OpaqueValueExpr of the argument lvalue expression.
- An OpaqueValueExpr of the copy-initialized parameter.
- A BinaryOpExpr assigning the first with the value of the second.

Fixes #87526

---------

Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: John McCall <rjmccall@gmail.com>
2024-08-31 10:59:08 -05:00
Shilei Tian
1c269929d0
[Clang][Sema][OpenMP] Allow thread_limit to accept multiple expressions (#102715) 2024-08-10 09:54:58 -04:00
Shilei Tian
cee594cf36
[Clang][Sema][OpenMP] Allow num_teams to accept multiple expressions (#99732)
By the OpenMP standard, `num_teams` clause can only accept one
expression (for now). In this patch, we extend it to allow to accept
multiple expressions when it is used with `target teams ompx_bare`
construct. This will allow to launch a multi-dim grid, same as CUDA/HIP.
2024-08-06 10:55:15 -04:00
Julian Brown
a42e515e3a
[OpenMP] OpenMP 5.1 "assume" directive parsing support (#92731)
This is a minimal patch to support parsing for "omp assume" directives.
These are meant to be hints to a compiler's optimisers: as such, it is
legitimate (if not very useful) to ignore them. The patch builds on top
of the existing support for "omp assumes" directives (note spelling!).

Unlike the "omp [begin/end] assumes" directives, "omp assume" is
associated with a compound statement, i.e. it can appear within a
function. The "holds" assumption could (theoretically) be mapped onto
the existing builtin "__builtin_assume", though the latter applies to a
single point in the program, and the former to a range (i.e. the whole
of the associated compound statement).

This patch fixes sollve's OpenMP 5.1 "omp assume"-based tests.
2024-08-05 07:37:07 -04:00
Krzysztof Parzyszek
243b27f7e1
[clang][OpenMP] Rename varlists to varlist, NFC (#101058)
It returns a range of variables (via Expr*), not a range of lists.
2024-07-30 08:11:09 -05:00
Michael Kruse
5c93a94f5a
[Clang][OpenMP] Add interchange directive (#93022)
Add the interchange directive which will be introduced in the upcoming
OpenMP 6.0 specification. A preview has been published in [Technical
Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).
2024-07-19 09:24:40 +02:00
Michael Kruse
80865c01e1
[Clang][OpenMP] Add reverse directive (#92916)
Add the reverse directive which will be introduced in the upcoming
OpenMP 6.0 specification. A preview has been published in [Technical
Report 12](https://www.openmp.org/wp-content/uploads/openmp-TR12.pdf).

---------

Co-authored-by: Alexey Bataev <a.bataev@outlook.com>
2024-07-18 10:35:32 +02:00
Mariya Podchishchaeva
41c6e43792
Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (#95802)
This commit implements the entirety of the now-accepted [N3017
-Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this are used under the "as-if" rule, and in general when the
system cannot detect it is initializing an array object in a variable
declaration, will generate EmbedExpr AST node which will be expanded by
AST consumers (CodeGen or constant expression evaluators) or expand
embed directive as a comma expression.

This reverts commit
682d461d5a.

---------

Co-authored-by: The Phantom Derpstorm <phdofthehouse@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
2024-06-20 14:38:46 +02:00
Vitaly Buka
682d461d5a
Revert " [Sema, Lex, Parse] Preprocessor embed in C and C++ (and Obj-C and Obj-C++ by-proxy)" (#95299)
Reverts llvm/llvm-project#68620

Introduce or expose a memory leak and UB, see llvm/llvm-project#68620
2024-06-12 13:14:26 -07:00
The Phantom Derpstorm
5989450e00
[clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (and Obj-C and Obj-C++ by-proxy) (#68620)
This commit implements the entirety of the now-accepted [N3017 -
Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this are used under the "as-if" rule, and in general when the
system cannot detect it is initializing an array object in a variable
declaration, will generate EmbedExpr AST node which will be expanded
by AST consumers (CodeGen or constant expression evaluators) or
expand embed directive as a comma expression.

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
Co-authored-by: Podchishchaeva, Mariya <mariya.podchishchaeva@intel.com>
2024-06-12 09:16:02 +02:00
erichkeane
2b939e182d [OpenACC] Implement auto/seq/independent clause Sema for 'loop'
These three clauses are all quite trivial, as they take no parameters.
They are mutually exclusive, and 'seq' has some other exclusives that
are implemented here.

The ONE thing that isn't implemented is 2.9's restriction (line 2010):
  'A loop associated with a 'loop' construct that does not have a 'seq'
   clause must be written to meet all the following conditions'.

Future clauses will require similar work, so it'll be done as a
followup.
2024-06-05 10:17:21 -07:00
Erich Keane
42f4e505a3
[OpenACC] Loop construct basic Sema and AST work (#93742)
This patch implements the 'loop' construct AST, as well as the basic
appertainment rule. Additionally, it sets up the 'parent' compute
construct, which is necessary for codegen/other diagnostics.

A 'loop' can apply to a for or range-for loop, otherwise it has no other
restrictions (though some of its clauses do).
2024-06-05 06:21:48 -07:00
Erich Keane
a15b685c2d
[OpenACC] Implement 'reduction' sema for compute constructs (#92808)
'reduction' has a few restrictions over normal 'var-list' clauses:

1- On parallel, a num_gangs can only have 1 argument when combined with
reduction. These two aren't able to be combined on any other of the
compute constructs however.

2- The vars all must be 'numerical data types' types of some sort, or a
'composite of numerical data types'. A list of types is given in the
standard as a minimum, so we choose 'isScalar', which covers all of
these types and keeps types that are actually numeric. Other compilers
don't seem to implement the 'composite of numerical data types', though
we do.

3- Because of the above restrictions, member-of-composite is not
allowed, so any access via a memberexpr is disallowed. Array-element and
sub-arrays (aka array sections) are both permitted, so long as they meet
the requirements of #2.

This patch implements all of these for compute constructs.
2024-05-21 06:51:25 -07:00
erichkeane
8ef2011b2c Reapply "[OpenACC] device_type clause Sema for Compute constructs"
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to.  As we don't have a source for the valid
architectures yet, this patch just accepts all.

Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.

This reverts commit 06f04b2e27f2586d3db2204ed4e54f8b78fea74e.
This reapplies commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.
The build failures were caused by the patch depending on the order of
evaluation of arguments to a function. This reapplication separates out
the capture of one of the values.
2024-05-13 10:29:43 -07:00
erichkeane
06f04b2e27 Revert "[OpenACC] device_type clause Sema for Compute constructs"
This reverts commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.

This and the followup patch keep hitting an assert I wrote on the build
bots in a way that isn't clear.  Reverting so I can fix it without a
rush.
2024-05-13 08:40:43 -07:00
erichkeane
c4a9a37474 [OpenACC] device_type clause Sema for Compute constructs
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to.  As we don't have a source for the valid
architectures yet, this patch just accepts all.

Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.
2024-05-13 07:50:19 -07:00
erichkeane
b1b465218d [OpenACC] 'wait' clause for compute construct sema
'wait' takes a few int-exprs (well, a series of async-arguments, but
    those are effectively just an int-expr), plus a pair of tags. This
patch adds the support for this to the AST, and does the appropriate
semantic analysis for them.
2024-05-09 06:44:12 -07:00
erichkeane
30cfe2b2ac [OpenACC] Implement 'async' clause sema for compute constructs
This is a pretty simple clause, it takes an 'async-argument', which
effectively needs to be just parsed as an 'int' argument, since it can
be an arbitrarly integer at runtime (and negative values are legal for
implementation defined values).

This patch also cleans up the async-argument parsing, so 'wait' got some
minor quality-of-life improvements for parsing (both clause and
    construct).
2024-05-07 07:14:14 -07:00
erichkeane
48c8a5791a [OpenACC] Implement 'deviceptr' and 'attach' sema for compute constructs
These two are very similar to the other 'var-list' variants, except they
require that the type of the variable be a pointer.  This patch
implements that restriction.
2024-05-06 09:29:04 -07:00
erichkeane
01e91a2dde [OpenACC] Implement copyin, copyout, create clauses for compute construct
Like 'copy', these also have alternate names, so this implements that as
well.  Additionally, these have an optional tag of either 'readonly' or
'zero' depending on the clause.

Otherwise, this is a pretty rote implementation of the clause, as there
aren't any special rules for it.
2024-05-03 07:51:25 -07:00
erichkeane
054f7c0565 [OpenACC] Implement copy clause for compute constructs.
Like present, no_create, and first_private, copy is a clause that takes
just a var-list, and follows the same rules as the others.

The one unique part of this clause is that it ALSO supports two
deprecated/backwards-compatibility spellings, so this patch adds them
and implements them.
2024-05-03 07:20:41 -07:00
erichkeane
bd909d2e6f [OpenACC] Implement no_create and present clauses on compute constructs
These two are, from a semantic checking perspective, identical to
first-private/private/etc, other than appertainment. This patch
implements both.
2024-05-03 06:51:54 -07:00
erichkeane
a13c5140a2 [OpenACC] Implement firstprivate clause for compute constructs
This clause is pretty nearly copy/paste from private, except that it
doesn't support 'loop', and thus 'kernelsloop' for appertainment.
2024-05-03 06:33:35 -07:00
Erich Keane
fa67986d5b
[OpenACC] Private Clause on Compute Constructs (#90521)
The private clause is the first that takes a 'var-list', thus this has a
lot of additional work to enable the var-list type. A 'var' is a
traditional variable reference, subscript, member-expression, or
array-section, so checking of these is pretty minor.

Note: This ran into some issues with array-sections (aka sub-arrays)
that will be fixed in a follow-up patch.
2024-04-30 11:28:37 -07:00
Erich Keane
39adc8f423
[NFC] Generalize ArraySections to work for OpenACC in the future (#89639)
OpenACC is going to need an array sections implementation that is a
simpler version/more restrictive version of the OpenMP version. 

This patch moves `OMPArraySectionExpr` to `Expr.h` and renames it `ArraySectionExpr`,
 then adds an enum to choose between the two.

This also fixes a couple of 'drive-by' issues that I discovered on the way,
but leaves the OpenACC Sema parts reasonably unimplemented (no semantic
analysis implementation), as that will be a followup patch.
2024-04-25 10:22:03 -07:00
Erich Keane
dc20a0ea1f
[OpenACC] Implement 'num_gangs' sema for compute constructs (#89460)
num_gangs takes an 'int-expr-list', for 'parallel', and an 'int-expr'
for 'kernels'. This patch changes the parsing to always parse it as an
'int-expr-list', then correct the expression count during Sema. It also
implements the rest of the semantic analysis changes for this clause.
2024-04-22 08:57:25 -07:00
Erich Keane
b8adf169bb [OpenACC] Implement 'vector_length' clause On compute constructs
The 'vector_length' clause is semantically identical to the
'num_workers' clause, in that it takes a mandatory single int-expr. This
is implemented identically to it.
2024-04-18 13:27:42 -07:00
Erich Keane
76600aee9d
[OpenACC] Implement 'num_workers' clause for compute constructs (#89151)
This clause just takes an 'int expr', which is not optional. This patch
implements the clause on compute constructs.
2024-04-18 12:42:22 -07:00
Erich Keane
6133878227
[OpenACC] Implement self clause for compute constructs (#88760)
`self` clauses on compute constructs take an optional condition
expression. We again limit the implementation to ONLY compute constructs
to ensure we get all the rules correct for others. However, this one
will be particularly complicated, as it takes a `var-list` for `update`,
so when we get to that construct/clause combination, we need to do that
as well.

This patch also furthers uses of the `OpenACCClauses.def` as it became
useful while implementing this (as well as some other minor refactors as
I went through).

Finally, `self` and `if` clauses have an interaction with each other, if
an `if` clause evaluates to `true`, the `self` clause has no effect.
While this is intended and can be used 'meaningfully', we are warning on
this with a very granular warning, so that this edge case will be
noticed by newer users, but can be disabled trivially.
2024-04-16 06:57:36 -07:00