The 'cache' construct is lowered as marking the acc.loop in ACC MLIR.
This results in any variable references that are not inside of the
acc.loop being invalid. This patch adds a warning to that effect, and
ensures that the variable references won't be added to the AST during
parsing so we don't try to lower them.
This results in loss of instantiation-diagnostics for these, however
that seems like an acceptable consequence to ignoring it.
The 'capture' modifier is an OpenACC 3.3NEXT (AKA 3.4) feature, which
permits a new kind of identifying the memory location of variables in a
data clause. However, it is only valid on data, combined, or compute
constructs.
This patch implements all of the proper restrictions.
OpenACC 3.4 will permit extension clauses, which are clauses that start
with '__', and contain an optional balanced-paren-token-sequence inside
of parens.
This patch ensures we consume these, and emit a warning instead of an
error for unsupported clauses.
Following the steps of #82217, this patch reorganizes declarations in
`Parse.h`. Highlights are:
1) Declarations are grouped in the same fashion as in `Sema.h`. Table of
contents is provided at the beginning of `Parser` class. `public`
declaration go first, then `private` ones, but unlike `Sema`, most of
the stuff in `Parser` is private.
2) Documentation has been moved from `.cpp` files to the header. Grammar
was consistently put in `\verbatim` blocks to render nicely in Doxygen.
3) File has been formatted with clang-format, except for the grammar,
because clang-format butchers it.
I'd misunderstood how the ParseStringLiteralExpression function worked,
so I assumed it would catch non-string literals, however it instead
asserted. This patch now checks for that case and diagnoses.
Fixes: #139346
…uses
The Flang implemenation of OpenACC uses a .td file in the llvm/Frontend
directory to determine appertainment in 4 categories:
-Required: If this list has items in it, the directive requires at least
1 of these be present.
-AllowedExclusive: Items on this list are all allowed, but only 1 from
the list may be here (That is, they are exclusive of eachother).
-AllowedOnce: Items on this list are all allowed, but may not be
duplicated.
Allowed: Items on this list are allowed. Note th at the actual list of
'allowed' is all 4 of these lists together.
This is a draft patch to swtich Clang over to use those tables. Surgery
to get this to happen in Clang Sema was somewhat reasonable. However,
some gaps in the implementations are obvious, the existing clang
implementation disagrees with the Flang interpretation of it. SO, we're
keeping a task list here based on what gets discovered.
Changes to Clang:
- [x] Switch 'directive-kind' enum conversions to use tablegen See
ff1a7bddd9435b6ae2890c07eae60bb07898bbf5
- [x] Switch 'clause-kind' enum conversions to use tablegen See
ff1a7bddd9435b6ae2890c07eae60bb07898bbf5
- [x] Investigate 'parse' test differences to see if any new
disagreements arise.
- [x] Clang/Flang disagree as to whether 'collapse' can be multiple
times on a loop. Further research showed no prose to limit this, and the
comment on the clang implementation said "no good reason to allow", so
no standards justification.
- [x] Clang/Flang disagree whether 'num_gangs' can appear >1 on a
compute/combined construct. This ended up being an unjustified
restriction.
- [x] Clang/Flang disagree as to the list of required clauses on a 'set'
construct. My research shows that Clang mistakenly included 'if' in the
list, and that it should be just 'default_async', 'device_num', and
'device_type'.
- [x] Order of 'at least one of' diagnostic has changed. Tests were
updated.
- [x] Ensure we are properly 'de-aliasing' clause names in appertainment
checks?
- [x] What is 'shortloop'? 'shortloop' seems to be an old non-standard
extension that isn't supported by flang, but is parsed for backward
compat reasons. Clang won't parse, but we at least have a spot for it in
the clause list.
- [x] Implemented proposed change for 'routine' gang/worker/vector/seq.
(see issue 539)
- [x] Implement init/shutdown can only have 1 'if' (see issue 540)
- [x] Clang/Flang disagree as to whether 'tile' is permitted more than
once on a 'loop' or combined constructs (Flang prohibits >1). I see no
justification for this in the standard. EDIT: I found a comment in clang
that I did this to make SOMETHING around duplicate checks easier.
Discussion showed we should actually have a better behavior around
'device_type' and duplicates, so I've since implemented that.
- [x] Clang/Flang disagree whether 'gang', 'worker', or 'vector' may
appear on the same construct as a 'seq' on a 'loop' or 'combined'. There
is prose for this in 2022: (a gang, worker, or vector clause may not
appear if a 'seq' clause appears). EDIT: These don't actually disagree,
but aren't in the .td file, so I restored the existing code to do this.
- [x] Clang/Flang disagree on whether 'bind' can appear >1 on a
'routine'. I believe line 3096 (A bind clause may not bind to a routine
name that has a visible bind clause) makes this limitation (Flang
permits >1 bind). we discussed and decided this should have the same
rules as worker/vector/etc, except without the 'exactly 1 of' rule (so
no dupes in individual sections).
- [x] Clang/Flang disagree on whether 'init'/'shutdown' can have
multiple 'device_num' clauses. I believe there is no supporting prose
for this limitation., We decided that `device_num` should only happen
1x.
- [x] Clang/Flang disagree whether 'num_gangs' can appear >1 on a
'kernels' construct. Line 1173 (On a kernels construct, the num_gangs
clause must have a single argument) justifies limiting on a
per-arguement basis, but doesn't do so for multiple num_gangs clauses.
WE decided to do this with the '1-per-device-type' region for num_gangs,
num_workers, and vector_length, see openacc bug here:
https://github.com/OpenACC/openacc-spec/issues/541
Changes to Flang:
- [x] Clang/Flang disgree on whether 'atomic' can take an 'if' clause.
This was added in OpenACC3.3_Next See #135451
- [x] Clang/Flang disagree on whether 'finalize' can be allowed >1 times
on a 'exit_data' construct. see #135415.
- [x] Clang/Flang disagree whether 'if_present' should be allowed >1
times on a 'host_data'/'update' construct. see #135422
- [x] Clang/Flang disagree on whether 'init'/'shutdown' can have
multiple 'device_type' clauses. I believe there is no supporting prose
for this limitation.
- [ ] SEE change for num_gangs/etc above.
Changes that need discussion/research:
This PR reland https://github.com/llvm/llvm-project/pull/135808, fixed
some missed changes in LLDB.
I found this issue when I working on
https://github.com/llvm/llvm-project/pull/107168.
Currently we have many similiar data structures like:
- std::pair<IdentifierInfo *, SourceLocation>.
- Element type of ModuleIdPath.
- IdentifierLocPair.
- IdentifierLoc.
This PR unify these data structures to IdentifierLoc, moved
IdentifierLoc definition to SourceLocation.h, and deleted other similer
data structures.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
Reverts llvm/llvm-project#135808
Example from the LLDB macOS CI:
https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/as-lldb-cmake/24084/execution/node/54/log/?consoleFull
```
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/source/Plugins/ExpressionParser/Clang/ClangModulesDeclVendor.cpp:360:49: error: no viable conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'clang::ModuleIdPath' (aka 'ArrayRef<IdentifierLoc>')
clang::Module *top_level_module = DoGetModule(clang_path.front(), false);
^~~~~~~~~~~~~~~~~~
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:41:40: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'const llvm::ArrayRef<clang::IdentifierLoc> &' for 1st argument
class LLVM_GSL_POINTER [[nodiscard]] ArrayRef {
^
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:41:40: note: candidate constructor (the implicit move constructor) not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'llvm::ArrayRef<clang::IdentifierLoc> &&' for 1st argument
/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/llvm/include/llvm/ADT/ArrayRef.h:70:18: note: candidate constructor not viable: no known conversion from 'std::pair<clang::IdentifierInfo *, clang::SourceLocation>' to 'std::nullopt_t' for 1st argument
/*implicit*/ ArrayRef(std::nullopt_t) {}
```
I found this issue when I working on
https://github.com/llvm/llvm-project/pull/107168.
Currently we have many similiar data structures like:
- `std::pair<IdentifierInfo *, SourceLocation>`.
- Element type of `ModuleIdPath`.
- `IdentifierLocPair`.
- `IdentifierLoc`.
This PR unify these data structures to `IdentifierLoc`, moved
`IdentifierLoc` definition to SourceLocation.h, and deleted other
similer data structures.
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
OpenACC 3.3-NEXT has changed the way tags for copy, copyin, copyout, and
create clauses are specified, and end up adding a few extras, and
permits them as a list. This patch encodes these as bitmask enum so
they can be stored succinctly, but still diagnose reasonably.
OpenACC Github PR#499 defines the pqr-list as having at least 1 item. We
already handle that for all but 'wait', so this patch just does the work
to add it for 'wait', plus adds tests.
This is the last item of the OpenACC 3.3 spec. It includes the
implicit-name version of 'routine', plus significant refactorings to
make the two work together. The implicit name version is represented as
an attribute on the function call. This patch also implements the
clauses for the implicit-name version, as well as the A.3.4 warning.
The 'bind' clause allows the renaming of a function during code
generation. There are a few rules about when this can/cannot happen,
and it takes either a string or identifier (previously mis-implemetned
as ID-expression) argument.
Note there are additional rules to this in the implicit-function routine
case, but that isn't implemented in this patch, as implicit-function
routine is not yet implemented either.
These 4 clauses are mutually exclusive, AND require at least one of
them. Additionally, gang has some additional restrictions in that only
the 'dim' specifier is permitted. This patch implements all of this, and
ends up refactoring the handling of each of these clauses for
readabililty.
The 'routine' construct has two forms, one which takes the name of a
function that it applies to, and another where it implicitly figures it
out based on the next declaration. This patch implements the former with
the required restrictions on the name and the function-static-variables
as specified.
What has not been implemented is any clauses for this, any of the A.3.4
warnings, or the other form.
This statement level construct takes no clauses and has no associated
statement, and simply labels a number of array elements as valid for
caching. The implementation here is pretty simple, but it is a touch of
a special case for parsing, so the parsing code reflects that.
The 'declare' construct is the first of two 'declaration' level
constructs, so it is legal in any place a declaration is, including as a
statement, which this accomplishes by wrapping it in a DeclStmt. All
clauses on this have a 'same scope' requirement, which this enforces as
declaration context instead, which makes it possible to implement these
as a template.
The 'link' and 'device_resident' clauses are also added, which have some
similar/small restrictions, but are otherwise pretty rote.
This patch implements all of the above.
The atomic construct is a particularly complicated one. The directive
itself is pretty simple, it has 5 options for the 'atomic-clause'.
However, the associated statement is fairly complicated.
'read' accepts:
v = x;
'write' accepts:
x = expr;
'update' (or no clause) accepts:
x++;
x--;
++x;
--x;
x binop= expr;
x = x binop expr;
x = expr binop x;
'capture' accepts either a compound statement, or:
v = x++;
v = x--;
v = ++x;
v = --x;
v = x binop= expr;
v = x = x binop expr;
v = x = expr binop x;
IF 'capture' has a compound statement, it accepts:
{v = x; x binop= expr; }
{x binop= expr; v = x; }
{v = x; x = x binop expr; }
{v = x; x = expr binop x; }
{x = x binop expr ;v = x; }
{x = expr binop x; v = x; }
{v = x; x = expr; }
{v = x; x++; }
{v = x; ++x; }
{x++; v = x; }
{++x; v = x; }
{v = x; x--; }
{v = x; --x; }
{x--; v = x; }
{--x; v = x; }
While these are all quite complicated, there is a significant amount
of similarity between the 'capture' and 'update' lists, so this patch
reuses a lot of the same functions.
This patch implements the entirety of 'atomic', creating a new Sema file
for the sema for it, as it is fairly sizable.
These two clauses just take a 'var-list' and specify where the variables
should be copied from/to. This patch implements the AST nodes for them
and ensures they properly take a var-list.
The 'self' clause is an unfortunately difficult one, as it has a
significantly different meaning between 'update' and the other
constructs. This patch introduces a way for the 'self' clause to work
as both. I considered making this two separate AST nodes (one for
'self' on 'update' and one for the others), however this makes the
automated macros/etc for supporting a clause break.
Instead, 'self' has the ability to act as either a condition or as a
var-list clause. As this is the only one of its kind, it is implemented
all within it. If in the future we have more that work like this, we
should consider rewriting a lot of the macros that we use to make
clauses work, and make them separate ast nodes.
A fairly simple one, only valid on the 'set' construct, this clause
takes an int expression. Most of the work was already done as a part of
parsing, so this patch ends up being a lot of infrastructure.
This is a very simple sema implementation, and just required AST node
plus the existing diagnostics. This patch adds tests and adds the AST
node required, plus enables it for 'init' and 'shutdown' (only!)
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented. This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
The arguments to this are the same as for the 'wait' clause, so this
reuses all of that infrastructure. So all this has to do is support a
pair of clauses that are already implemented (if and async), plus create
an AST node. This patch does so, and adds proper testing.
All 4 of the 'data' constructs have a requirement that at least 1 of a
small list of clauses must appear on the construct. This patch
implements that restriction, and updates all of the tests it takes to
do so.
This is a clause that is only valid on 'host_data' constructs, and
identifies variables which it should use the current device address.
From a Sema perspective, the only thing novel here is mild changes to
how ActOnVar works for this clause, else this is very much like the rest
of the 'var-list' clauses.
'delete' is another clause that has very little compile-time
implication, but needs a full AST that takes a var list. This patch
ipmlements it fully, plus adds sufficient test coverage.
This is another new clause specific to 'exit data' that takes a pointer
argument. This patch implements this the same way we do a few other
clauses (like attach) that have the same restrictions.
These constructs are all very similar and closely related, so this patch
creates the AST nodes for them, serialization, printing/etc.
Additionally the restrictions are all added as tests/todos in the tests,
as those will have to be implemented once we get those clauses implemented.
This didn't end up being properly tested, but 'delete' as a keyword
causes us to not properly recognize it as a clause kind. This patch
correctly adds the work to make sure it is recognized correctly.
Combined constructs (OpenACC 3.3 section 2.11) are a short-cut for
writing a `loop` construct immediately inside of a `compute` construct.
However, this interaction requires we do additional work to ensure that
we get the semantics between the two correct, as well as diagnostics.
This patch adds the semantic analysis for the constructs (but no
clauses), as well as the AST nodes.
OpenACC restricts the contents of a 'for' loop affected by a 'loop'
construct without a 'seq'. The loop variable must be integer, pointer,
or random-access-iterator, it must monotonically increase/decrease, and
the trip count must be computable at runtime before the function.
This patch tries to implement some of these limitations to the best of
our ability, though it causes us to be perhaps overly restrictive at the
moment. I expect we'll revisit some of these rules/add additional
supported forms of loop-variable and 'monotonically increasing' here,
but the currently enforced rules are heavily inspired by the OMP
implementation here.
The worker clause specifies iterations of the loop/ that are executed in
parallel by distributing the iterations among the multiple works within
a single gang.
The sema rules for this type are simply that it cannot be combined with
a `kernel` construct with a `num_workers` clause, child `loop` clauses
cannot contain a `gang` or `worker` clause, and that the argument is oly
allowed when associated with a `kernel`.
The 'gang' clause is used to specify parallel execution of loops, thus
has some complicated rules depending on the 'loop's associated compute
construct. This patch implements all of those.
The 'tile' clause shares quite a bit of the rules with 'collapse', so a
followup patch will add those tests/behaviors. This patch deals with
adding the AST node.
The 'tile' clause takes a series of integer constant expressions, or *.
The asterisk is now represented by a new OpenACCAsteriskSizeExpr node,
else this clause is very similar to others.
'collapse' makes the 'loop' construct apply to the next N nested loops,
and 'loop' requires all associated loops be 'for' loops (or range-for).
This patch adds this restriction, plus adds a number of infrastructure
changes to permit some of the other restrictions in the future to be
implemented.
'collapse' also requires that there is not any calls to other directives
inside of the collapse region, so this also implements that limitation.
The 'collapse' clause on a 'loop' construct is used to specify how many
nested loops are associated with the 'loop' construct. It takes an
optional 'force' tag, and an integer constant expression as arguments.
There are many other restrictions based on the contents of the loop/etc,
but those are implemented in followup patches, for now, this patch just
adds the AST node and does basic argument checking on the loop-count.
These three clauses are all quite trivial, as they take no parameters.
They are mutually exclusive, and 'seq' has some other exclusives that
are implemented here.
The ONE thing that isn't implemented is 2.9's restriction (line 2010):
'A loop associated with a 'loop' construct that does not have a 'seq'
clause must be written to meet all the following conditions'.
Future clauses will require similar work, so it'll be done as a
followup.
This patch implements the 'loop' construct AST, as well as the basic
appertainment rule. Additionally, it sets up the 'parent' compute
construct, which is necessary for codegen/other diagnostics.
A 'loop' can apply to a for or range-for loop, otherwise it has no other
restrictions (though some of its clauses do).
I discovered while working on something else that we were using the
location of the directive name as the 'beginloc' which caused some
problems in a few places. This patch makes it so our beginloc is the
'#' as we originally designed, and then adds a DirectiveLoc concept to a
construct for use diagnosing the name.
'reduction' has a few restrictions over normal 'var-list' clauses:
1- On parallel, a num_gangs can only have 1 argument when combined with
reduction. These two aren't able to be combined on any other of the
compute constructs however.
2- The vars all must be 'numerical data types' types of some sort, or a
'composite of numerical data types'. A list of types is given in the
standard as a minimum, so we choose 'isScalar', which covers all of
these types and keeps types that are actually numeric. Other compilers
don't seem to implement the 'composite of numerical data types', though
we do.
3- Because of the above restrictions, member-of-composite is not
allowed, so any access via a memberexpr is disallowed. Array-element and
sub-arrays (aka array sections) are both permitted, so long as they meet
the requirements of #2.
This patch implements all of these for compute constructs.
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to. As we don't have a source for the valid
architectures yet, this patch just accepts all.
Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.
This reverts commit 06f04b2e27f2586d3db2204ed4e54f8b78fea74e.
This reapplies commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.
The build failures were caused by the patch depending on the order of
evaluation of arguments to a function. This reapplication separates out
the capture of one of the values.
This reverts commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.
This and the followup patch keep hitting an assert I wrote on the build
bots in a way that isn't clear. Reverting so I can fix it without a
rush.
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to. As we don't have a source for the valid
architectures yet, this patch just accepts all.
Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.
'wait' takes a few int-exprs (well, a series of async-arguments, but
those are effectively just an int-expr), plus a pair of tags. This
patch adds the support for this to the AST, and does the appropriate
semantic analysis for them.
It was discovered while writing the 'wait' clause ast tests that the
'endloc' wasn't set correctly when there was no arguments. This patch
ensures we set it right, and adds an assert to prevent us from messing
it up in the future.
This is a pretty simple clause, it takes an 'async-argument', which
effectively needs to be just parsed as an 'int' argument, since it can
be an arbitrarly integer at runtime (and negative values are legal for
implementation defined values).
This patch also cleans up the async-argument parsing, so 'wait' got some
minor quality-of-life improvements for parsing (both clause and
construct).