141 Commits

Author SHA1 Message Date
erichkeane
30fcf69845 [OpenACC] Fixup rules for reduction clause variable refererence type
The standard is ambiguous, but we can only support
arrays/array-sections/etc of the composite type, so make sure we enforce
the rule that way. This will better support  how we need to do lowering.
2025-08-20 07:55:16 -07:00
erichkeane
8fc80519cd [OpenACC] Fix crash on error recovery of variable in OpenACC mode
As reported, OpenACC's variable declaration handling was assuming some
semblence of legality in the example, so it didn't properly handle an
error case.  This patch fixes its assumptions so that we don't crash.

Fixes #154008
2025-08-18 07:37:45 -07:00
erichkeane
0dbcdf33b8 [OpenACC] Fix racing commit test failures for firstprivate lowering
The original patch to implement basic lowering for firstprivate didn't
have the Sema work to change the name of the variable being generated
from openacc.private.init to openacc.firstprivate.init. I forgot about
that when I merged the Sema changes this morning, so the tests now
failed.  This patch fixes those up.

Additionally, Suggested on #153622 post-commit, it seems like a good idea to
use a size of APInt that matches the size-type, so this changes us to use that
instead.
2025-08-18 07:26:50 -07:00
Erich Keane
dcdbd5b55d
[OpenACC][NFCI] Implement 'recipe' generation for firstprivate copy (#153622)
The 'firstprivate' clause requires that we do a 'copy' operation, so
this patch creates some AST nodes from which we can generate the copy
operation, including a 'temporary' and array init. For the most part
this is pretty similar to what 'private' does other than the fact that
the source is copy (and not default init!), and that there is a
temporary from which to copy.

---------

Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
2025-08-15 18:42:40 +00:00
erichkeane
26dde15ed4 [OpenACC] Add warning for VLAs in a private/firstprivate clause
private/firstprivate typically do copy operations, however copying a VLA
isn't really possible.  This patch introduces a warning to alert the
person that this copy isn't happening correctly.

As a future direction, we MIGHT consider doing additional work to make
sure they are initialized/copied/deleted/etc correctly.
2025-08-06 13:14:20 -07:00
erichkeane
b291d02a93 [OpenACC][NFCI] Add extra data to firstprivate recipe AST node
During implementation I found that I need some additional data in the
AST node for codegen, so this patch adds the second declaration
reference.
2025-08-06 10:18:34 -07:00
erichkeane
258997c16e [OpenACC][NFCI] Add 'InitRecipes' to 'firstprivate' AST node
This patch adds the 'init recipes' to firstprivate like I did for
'private', so that we can properly init these types.  At the moment,
the recipe init isn't generated (just the VarDecl), and this isn't
really used anywhere as it will be used exclusively in Codegen.
2025-08-05 09:26:47 -07:00
erichkeane
74af2cec7b [OpenACC] Fix 'type' checks in private/firstprivate for array types
These would not give a correct initializer, but they are not possible
to generate correctly anyway, so this patch makes sure we look through
the array type to correctly diagnose these.
2025-08-05 08:25:34 -07:00
erichkeane
4e0b68cef0 [OpenACC] Implement warning restrictions for 'firstprivate'
'firstprivate' can't be generated unless we have a copy constructor, so
this patch implements the restriction as a warning, and prevents the
item from being added to the AST.
2025-08-04 11:42:33 -07:00
Erich Keane
66eadbb235
[OpenACC][CIR] Implement 'init' lowering for private clause vars (#151781)
Previously, #151360 implemented 'private' clause lowering, but didn't
properly initialize the variables. This patch adds that behavior to make
sure we correctly get the constructor or other init called.
2025-08-04 11:14:58 -07:00
erichkeane
3e1392fb4b [OpenACC] Allow sub-arrays in declare/use_device as an extension
These two both allow arrays as their variable references, but it is a
common thing to use sub-arrays as a way to get a pointer to act as an
array with other compilers. This patch adds these, with an
extension-warning.
2025-07-25 13:16:34 -07:00
Erich Keane
22994edb5f
[OpenACC][Sema] Implement warning for non-effective 'private' (#149004)
A 'private' variable reference needs to have a default constructor and a
destructor, else we cannot properly emit them in codegen. This patch
adds a warning-as-default-error to diagnose this.

We'll have to do something similar for firstprivate/reduction, however
it isn't clear whether we could skip the check for default-constructor
for those two (they still need a destructor!). Depending on how we
intend to create them (and we probably have to figure this out?), we
could either require JUST a copy-constructor (then make the init section
    just the alloca, and the copy-ctor be the 'copy' section), OR they
require a default-constructor + copy-assignment.
2025-07-16 11:05:41 -07:00
erichkeane
438863a09e [OpenACC][Sema] Implement warning for 'cache' invalid var ref
The 'cache' construct is lowered as marking the acc.loop in ACC MLIR.
This results in any variable references that are not inside of the
acc.loop being invalid.  This patch adds a warning to that effect, and
ensures that the variable references won't be added to the AST during
parsing so we don't try to lower them.

This results in loss of instantiation-diagnostics for these, however
that seems like an acceptable consequence to ignoring it.
2025-07-03 07:13:30 -07:00
erichkeane
f65b35d89f [OpenACC] Stop trying to analyze invalid Var-Decls.
The code to analyze VarDecls for the purpose of ensuring a magic-static
isn't present in a 'routine' was getting confused/crashed because we
create something that looks like a magic-static during error-recovery,
but it is still an invalid decl.

This patch causes us to just 'give up' in the case where the vardecl is
already invalid.

Fixes: #140920
2025-05-21 10:31:21 -07:00
erichkeane
e8dff7bea4 [OpenACC] Fix location of array-section diagnostic.
In a sub-subscript of an array-section, it is actually an array section.
So make sure we get the location correct when there isn't a 'colon' to
look at.
2025-05-20 09:04:32 -07:00
erichkeane
8313d2a8db [OpenACC] Fixup previous-clause diagnostics
Brought up in a previous review as a TODO, we could be better about how
we highlight what hte previous clause was, and how to show that the
'device_type' is the one being targetted.  This patch rewords the
diagnostics and updates a massive number of tests.
2025-05-02 09:35:32 -07:00
Erich Keane
e5f09aac48
[OpenACC][CIR] Start work to lower 'loop' (#137972)
As can be seen by the comment, this ends up being a construct that is
going to be quite a lot of work in the future to make sure we properly
identify the upperbound, lowerbound, and step. For now, we just treat
the 'loop' as container so that we can put the 'for' loop into it.

In the future, we'll have to teach the OpenACC dialect how to derive the
upperbound, lowerbound, and step from the cir.for loop. Additionally,
we'll probably have to add a few more options to it so that we can give
it the recipes it needs to determine these for random access iterators.
For Integer and Pointer values, these should already be known.
2025-05-01 08:42:04 -07:00
erichkeane
ea0e6e3383 [OpenACC] Improve 'loop' content restrictions
The 'loop' construct has some limitations that are not particularly
clear on what can be in the for-loop. This patch adds some restriction
so that the increment can only be a addition or subtraction operation,
   BUT starts allowing Itr = Itr +/- N forms.
2025-04-30 12:02:52 -07:00
erichkeane
fec003a18a [OpenACC]Reimplement 'for' loop checks for a loop construct
The 'loop' construct has some pretty strict checks as to what the for
loop associated with it has to look like. The previous implementation
was a little convoluted and missed implementing the 'condition'
checking.  Additionally, it did a bad job double-diagnosing with
templates.

This patch rewrites it in a way that does a much better job with the
double-diagnosing, and proeprly implements some checks for the
'condition' of the for loop.
2025-04-28 15:22:47 -07:00
Erich Keane
d1cce66469
[OpenACC] Switch Clang to use the Flang 'appertainment' rules for cla… (#135372)
…uses

The Flang implemenation of OpenACC uses a .td file in the llvm/Frontend
directory to determine appertainment in 4 categories:

-Required: If this list has items in it, the directive requires at least
1 of these be present.

-AllowedExclusive: Items on this list are all allowed, but only 1 from
the list may be here (That is, they are exclusive of eachother).

-AllowedOnce: Items on this list are all allowed, but may not be
duplicated.

Allowed: Items on this list are allowed. Note th at the actual list of
'allowed' is all 4 of these lists together.

This is a draft patch to swtich Clang over to use those tables. Surgery
to get this to happen in Clang Sema was somewhat reasonable. However,
some gaps in the implementations are obvious, the existing clang
implementation disagrees with the Flang interpretation of it. SO, we're
keeping a task list here based on what gets discovered.

Changes to Clang:
- [x] Switch 'directive-kind' enum conversions to use tablegen See
ff1a7bddd9435b6ae2890c07eae60bb07898bbf5
- [x] Switch 'clause-kind' enum conversions to use tablegen See
ff1a7bddd9435b6ae2890c07eae60bb07898bbf5
- [x] Investigate 'parse' test differences to see if any new
disagreements arise.
- [x] Clang/Flang disagree as to whether 'collapse' can be multiple
times on a loop. Further research showed no prose to limit this, and the
comment on the clang implementation said "no good reason to allow", so
no standards justification.
- [x] Clang/Flang disagree whether 'num_gangs' can appear >1 on a
compute/combined construct. This ended up being an unjustified
restriction.
- [x] Clang/Flang disagree as to the list of required clauses on a 'set'
construct. My research shows that Clang mistakenly included 'if' in the
list, and that it should be just 'default_async', 'device_num', and
'device_type'.
- [x] Order of 'at least one of' diagnostic has changed. Tests were
updated.
- [x] Ensure we are properly 'de-aliasing' clause names in appertainment
checks?
- [x] What is 'shortloop'? 'shortloop' seems to be an old non-standard
extension that isn't supported by flang, but is parsed for backward
compat reasons. Clang won't parse, but we at least have a spot for it in
the clause list.
- [x] Implemented proposed change for 'routine' gang/worker/vector/seq.
(see issue 539)
- [x] Implement init/shutdown can only have 1 'if' (see issue 540)
- [x] Clang/Flang disagree as to whether 'tile' is permitted more than
once on a 'loop' or combined constructs (Flang prohibits >1). I see no
justification for this in the standard. EDIT: I found a comment in clang
that I did this to make SOMETHING around duplicate checks easier.
Discussion showed we should actually have a better behavior around
'device_type' and duplicates, so I've since implemented that.
- [x] Clang/Flang disagree whether 'gang', 'worker', or 'vector' may
appear on the same construct as a 'seq' on a 'loop' or 'combined'. There
is prose for this in 2022: (a gang, worker, or vector clause may not
appear if a 'seq' clause appears). EDIT: These don't actually disagree,
but aren't in the .td file, so I restored the existing code to do this.
- [x] Clang/Flang disagree on whether 'bind' can appear >1 on a
'routine'. I believe line 3096 (A bind clause may not bind to a routine
name that has a visible bind clause) makes this limitation (Flang
permits >1 bind). we discussed and decided this should have the same
rules as worker/vector/etc, except without the 'exactly 1 of' rule (so
no dupes in individual sections).
- [x] Clang/Flang disagree on whether 'init'/'shutdown' can have
multiple 'device_num' clauses. I believe there is no supporting prose
for this limitation., We decided that `device_num` should only happen
1x.
- [x] Clang/Flang disagree whether 'num_gangs' can appear >1 on a
'kernels' construct. Line 1173 (On a kernels construct, the num_gangs
clause must have a single argument) justifies limiting on a
per-arguement basis, but doesn't do so for multiple num_gangs clauses.
WE decided to do this with the '1-per-device-type' region for num_gangs,
num_workers, and vector_length, see openacc bug here:
https://github.com/OpenACC/openacc-spec/issues/541

Changes to Flang:
- [x] Clang/Flang disgree on whether 'atomic' can take an 'if' clause.
This was added in OpenACC3.3_Next See #135451
- [x] Clang/Flang disagree on whether 'finalize' can be allowed >1 times
on a 'exit_data' construct. see #135415.
- [x] Clang/Flang disagree whether 'if_present' should be allowed >1
times on a 'host_data'/'update' construct. see #135422
- [x] Clang/Flang disagree on whether 'init'/'shutdown' can have
multiple 'device_type' clauses. I believe there is no supporting prose
for this limitation.
- [ ] SEE change for num_gangs/etc above.


Changes that need discussion/research:
2025-04-18 14:54:21 -07:00
erichkeane
6263de90df [OpenACC] Implement 'modifier-list' sema/AST
OpenACC 3.3-NEXT has changed the way tags for copy, copyin, copyout, and
create clauses are specified, and end up adding a few extras, and
permits them as a list.  This patch encodes these as bitmask enum so
they can be stored succinctly, but still diagnose reasonably.
2025-04-04 12:32:33 -07:00
Matheus Izvekov
cfee056b4e
[clang] NFC: introduce UnsignedOrNone as a replacement for std::optional<unsigned> (#134142)
This introduces a new class 'UnsignedOrNone', which models a lite
version of `std::optional<unsigned>`, but has the same size as
'unsigned'.

This replaces most uses of `std::optional<unsigned>`, and similar
schemes utilizing 'int' and '-1' as sentinel.

Besides the smaller size advantage, this is simpler to serialize, as its
internal representation is a single unsigned int as well.
2025-04-03 14:27:18 -03:00
erichkeane
d7724c8ea3 [OpenACC] allow 'if' clause on 'atomic' construct
This was added in OpenACC PR #511 in the 3.4 branch.  From an AST/Sema
perspective this is pretty trivial as the infrastructure for 'if'
already exists, however the atomic construct needed to be taught to take
clauses.  This patch does that and adds some testing to do so.
2025-04-02 10:03:24 -07:00
erichkeane
79079c9469 [OpenACC] Finish implementing 'routine' AST/Sema.
This is the last item of the OpenACC 3.3 spec. It includes the
implicit-name version of 'routine', plus significant refactorings to
make the two work together.  The implicit name version is represented as
an attribute on the function call. This patch also implements the
clauses for the implicit-name version, as well as the A.3.4 warning.
2025-03-21 08:57:54 -07:00
erichkeane
8a8f1359ee [OpenACC] Implement 'bind' ast/sema for 'routine' directive
The 'bind' clause allows the renaming of a function during code
generation.  There are a few rules about when this can/cannot happen,
and it takes either a string or identifier (previously mis-implemetned
as ID-expression) argument.

Note there are additional rules to this in the implicit-function routine
case, but that isn't implemented in this patch, as implicit-function
routine is not yet implemented either.
2025-03-10 07:49:13 -07:00
erichkeane
1b75b9e665 [OpenACC] Handle sema for gang, worker, vector, seq clauses on routine
These 4 clauses are mutually exclusive, AND require at least one of
them. Additionally, gang has some additional restrictions in that only
the 'dim' specifier is permitted. This patch implements all of this, and
ends up refactoring the handling of each of these clauses for
readabililty.
2025-03-06 11:53:46 -08:00
erichkeane
5bb112feae [NFC][OpenACC] Fix unused variable warning from df1e102e
Looks like I did a dyn_cast when all I needed was an isa! This patch
fixes that up.
2025-03-06 07:08:00 -08:00
erichkeane
df1e102e2a [OpenACC] implement AST/Sema for 'routine' construct with argument
The 'routine' construct has two forms, one which takes the name of a
function that it applies to, and another where it implicitly figures it
out based on the next declaration. This patch implements the former with
the required restrictions on the name and the function-static-variables
as specified.

What has not been implemented is any clauses for this, any of the A.3.4
warnings, or the other form.
2025-03-06 06:42:17 -08:00
erichkeane
d5cec386c1 [OpenACC] Implement 'cache' construct AST/Sema
This statement level construct takes no clauses and has no associated
statement, and simply labels a number of array elements as valid for
caching. The implementation here is pretty simple, but it is a touch of
a special case for parsing, so the parsing code reflects that.
2025-03-03 13:57:23 -08:00
erichkeane
5d7d66ba0d [OpenACC] Implement 'declare' construct AST/Sema
The 'declare' construct is the first of two 'declaration' level
constructs, so it is legal in any place a declaration is, including as a
statement, which this accomplishes by wrapping it in a DeclStmt. All
clauses on this have a 'same scope' requirement, which this enforces as
declaration context instead, which makes it possible to implement these
as a template.

The 'link' and 'device_resident' clauses are also added, which have some
similar/small restrictions, but are otherwise pretty rote.

This patch implements all of the above.
2025-03-03 07:48:29 -08:00
erichkeane
99a9133a68 [OpenACC] Implement Sema/AST for 'atomic' construct
The atomic construct is a particularly complicated one.  The directive
itself is pretty simple, it has 5 options for the 'atomic-clause'.
However, the associated statement is fairly complicated.

'read' accepts:
  v = x;
'write' accepts:
  x = expr;
'update' (or no clause) accepts:
  x++;
  x--;
  ++x;
  --x;
  x binop= expr;
  x = x binop expr;
  x = expr binop x;

'capture' accepts either a compound statement, or:
  v = x++;
  v = x--;
  v = ++x;
  v = --x;
  v = x binop= expr;
  v = x = x binop expr;
  v = x = expr binop x;

IF 'capture' has a compound statement, it accepts:
  {v = x; x binop= expr; }
  {x binop= expr; v = x; }
  {v = x; x = x binop expr; }
  {v = x; x = expr binop x; }
  {x = x binop expr ;v = x; }
  {x = expr binop x; v = x; }
  {v = x; x = expr; }
  {v = x; x++; }
  {v = x; ++x; }
  {x++; v = x; }
  {++x; v = x; }
  {v = x; x--; }
  {v = x; --x; }
  {x--; v = x; }
  {--x; v = x; }

While these are all quite complicated, there is a significant amount
of similarity between the 'capture' and 'update' lists, so this patch
reuses a lot of the same functions.

This patch implements the entirety of 'atomic', creating a new Sema file
for the sema for it, as it is fairly sizable.
2025-02-03 07:22:22 -08:00
erichkeane
2c75bda426 [OpenACC] Split up SemaOpenACC.cpp
This file is getting quite large, so this patch splits the 'clause'
specific parts off into its own file to keep them better organized.
2025-01-15 13:51:05 -08:00
erichkeane
553fa204ed [OpenACC] Implement 'at least one of' restriction for 'update'
This completes the implementation of 'update' by implementing its last
restriction. This restriction requires at least 1 of the 'self', 'host',
  or 'device' clauses.
2025-01-09 09:28:58 -08:00
erichkeane
be32621ce8 [OpenACC] Implement 'device' and 'host' clauses for 'update'
These two clauses just take a 'var-list' and specify where the variables
should be copied from/to.  This patch implements the AST nodes for them
and ensures they properly take a var-list.
2025-01-09 09:28:58 -08:00
erichkeane
2c2accbcc6 [OpenACC] Enable 'self' sema for 'update' construct
The 'self' clause is an unfortunately difficult one, as it has a
significantly different meaning between 'update' and the other
constructs.  This patch introduces a way for the 'self' clause to work
as both.  I considered making this two separate AST nodes (one for
'self' on 'update' and one for the others), however this makes the
automated macros/etc for supporting a clause break.

Instead, 'self' has the ability to act as either a condition or as a
var-list clause.  As this is the only one of its kind, it is implemented
all within it.  If in the future we have more that work like this, we
should consider rewriting a lot of the macros that we use to make
clauses work, and make them separate ast nodes.
2025-01-08 13:19:33 -08:00
erichkeane
666eee0ef8 [OpenACC] enable 'device_type' for the 'update' construct
This has a similar restriction to 'set' in that only 'async' and 'wait'
are disallowed, so this implements that restriction and enables
'device_type'.
2025-01-07 11:25:58 -08:00
erichkeane
e2c1b1fed4 [OpenACC] enable 'async' and 'wait' for 'update' construct
These work the same here as they do for every other construct, so this
is as simple as enabling them and writing tests, which this patch does.
2025-01-07 10:23:50 -08:00
erichkeane
dd1e8aa09c [OpenACC] Enable 'if' and 'if_present' for 'update' construct
The only restriction on 'if' is that only 1 can appear on an update
construct, so this enforces that.  'if_present' has no restrictions.
2025-01-07 08:20:20 -08:00
erichkeane
db81e8c42e [OpenACC] Initial sema implementation of 'update' construct
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions.  This patch implements
the AST node, plus the 'cannot be the body of a if, while, do, switch,
    or label' statement restriction.  Future patches will handle the
    rest of the restrictions, which are based on clauses.
2025-01-07 08:20:20 -08:00
erichkeane
ff24e9a19e [OpenACC] Implement 'default_async' sema
A fairly simple one, only valid on the 'set' construct, this clause
takes an int expression.  Most of the work was already done as a part of
parsing, so this patch ends up being a lot of infrastructure.
2025-01-06 11:03:18 -08:00
erichkeane
21c785d7bd [OpenACC] Implement 'set' construct sema
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its for clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
2025-01-06 11:03:18 -08:00
erichkeane
bdf2555308 [OpenACC] Implement 'device_num' clause sema for 'init'/'shutdown'
This is a very simple sema implementation, and just required AST node
plus the existing diagnostics.  This patch adds tests and adds the AST
node required, plus enables it for 'init' and 'shutdown' (only!)
2024-12-19 12:21:51 -08:00
erichkeane
4bbdb018a6 [OpenACC] Implement 'init' and 'shutdown' constructs
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented.  This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
2024-12-19 12:21:50 -08:00
erichkeane
e34cc7c993 [OpenACC] Implement 'wait' construct
The arguments to this are the same as for the 'wait' clause, so this
reuses all of that infrastructure. So all this has to do is support a
pair of clauses that are already implemented (if and async), plus create
an AST node.  This patch does so, and adds proper testing.
2024-12-18 15:06:01 -08:00
erichkeane
bfc2dbe02e [OpenACC] Implement data construct 'at least 1 of ... clauses' rule
All 4 of the 'data' constructs have a requirement that at least 1 of a
small list of clauses must appear on the construct.  This patch
implements that restriction, and updates all of the tests it takes to
do so.
2024-12-16 11:52:57 -08:00
erichkeane
fbb14dd977 [OpenACC] Implement 'use_device' clause AST/Sema
This is a clause that is only valid on 'host_data' constructs, and
identifies variables which it should use the current device address.
From a Sema perspective, the only thing novel here is mild changes to
how ActOnVar works for this clause, else this is very much like the rest
of the 'var-list' clauses.
2024-12-16 09:35:57 -08:00
erichkeane
1ab81f8e7f [OpenACC] Implement 'delete' AST/Sema for 'exit data' construct
'delete' is another clause that has very little compile-time
implication, but needs a full AST that takes a var list.  This patch
ipmlements it fully, plus adds sufficient test coverage.
2024-12-16 06:44:53 -08:00
erichkeane
3351b3bf8d [OpenACC] implement 'detach' clause sema
This is another new clause specific to 'exit data' that takes a pointer
argument. This patch implements this the same way we do a few other
clauses (like attach) that have the same restrictions.
2024-12-13 13:51:41 -08:00
erichkeane
2244d2e75c [OpenACC] Implement 'if_present' clause sema
The 'if_present' clause controls the replacement of addresses in the
var-list in current device memory.  This clause can only go on
'host_device'.  From a Sema perspective, there isn't anything to do
beyond add this to AST and pass it on.
2024-12-13 13:04:57 -08:00
erichkeane
003eb5e80d [OpenACC] Implement 'finalize' clause sema
This is a very simple clause as far as sema is concerned.  It is only
valid on 'exit data', and doesn't have any rules involving it, so it is
simply applied and passed onto the MLIR.
2024-12-13 10:41:02 -08:00