
…uses

The Flang implementation of OpenACC uses a .td file in the llvm/Frontend directory to determine appertainment in 4 categories:

- Required: If this list has items in it, the directive requires that at least 1 of them be present.
- AllowedExclusive: Items on this list are all allowed, but only 1 from the list may be present (that is, they are exclusive of each other).
- AllowedOnce: Items on this list are all allowed, but may not be duplicated.
- Allowed: Items on this list are allowed. Note that the actual list of 'allowed' clauses is all 4 of these lists together.

This is a draft patch to switch Clang over to use those tables. Surgery to get this to happen in Clang Sema was somewhat reasonable. However, some gaps in the implementations are obvious: in places, the existing Clang implementation disagrees with the Flang interpretation. So, we're keeping a task list here based on what gets discovered.

Changes to Clang:

- [x] Switch 'directive-kind' enum conversions to use tablegen. See ff1a7bddd9435b6ae2890c07eae60bb07898bbf5
- [x] Switch 'clause-kind' enum conversions to use tablegen. See ff1a7bddd9435b6ae2890c07eae60bb07898bbf5
- [x] Investigate 'parse' test differences to see if any new disagreements arise.
- [x] Clang/Flang disagree as to whether 'collapse' can appear multiple times on a loop. Further research showed no prose to limit this, and the comment on the Clang implementation said "no good reason to allow", so there is no standards justification.
- [x] Clang/Flang disagree whether 'num_gangs' can appear >1 time on a compute/combined construct. This ended up being an unjustified restriction.
- [x] Clang/Flang disagree as to the list of required clauses on a 'set' construct. My research shows that Clang mistakenly included 'if' in the list, and that it should be just 'default_async', 'device_num', and 'device_type'.
- [x] Order of 'at least one of' diagnostic has changed. Tests were updated.
- [x] Ensure we are properly 'de-aliasing' clause names in appertainment checks?
- [x] What is 'shortloop'? 'shortloop' seems to be an old non-standard extension that isn't supported by Flang, but is parsed for backward-compatibility reasons. Clang won't parse it, but we at least have a spot for it in the clause list.
- [x] Implemented proposed change for 'routine' gang/worker/vector/seq (see issue 539).
- [x] Implement init/shutdown can only have 1 'if' (see issue 540).
- [x] Clang/Flang disagree as to whether 'tile' is permitted more than once on a 'loop' or combined construct (Flang prohibits >1). I see no justification for this in the standard. EDIT: I found a comment in Clang saying that I did this to make SOMETHING around duplicate checks easier. Discussion showed we should actually have better behavior around 'device_type' and duplicates, so I've since implemented that.
- [x] Clang/Flang disagree whether 'gang', 'worker', or 'vector' may appear on the same construct as a 'seq' on a 'loop' or 'combined' construct. There is prose for this in 2022: (a gang, worker, or vector clause may not appear if a 'seq' clause appears). EDIT: These don't actually disagree, but the rule isn't in the .td file, so I restored the existing code to do this.
- [x] Clang/Flang disagree on whether 'bind' can appear >1 time on a 'routine'. I believe line 3096 (A bind clause may not bind to a routine name that has a visible bind clause) makes this limitation (Flang permits >1 bind). We discussed and decided this should have the same rules as worker/vector/etc., except without the 'exactly 1 of' rule (so no dupes in individual sections).
- [x] Clang/Flang disagree on whether 'init'/'shutdown' can have multiple 'device_num' clauses. I believe there is no supporting prose for this limitation. We decided that `device_num` should only appear once.
- [x] Clang/Flang disagree whether 'num_gangs' can appear >1 time on a 'kernels' construct. Line 1173 (On a kernels construct, the num_gangs clause must have a single argument) justifies limiting on a per-argument basis, but doesn't do so for multiple num_gangs clauses. We decided to do this with the '1-per-device-type' region for num_gangs, num_workers, and vector_length; see the OpenACC bug here: https://github.com/OpenACC/openacc-spec/issues/541

Changes to Flang:

- [x] Clang/Flang disagree on whether 'atomic' can take an 'if' clause. This was added in OpenACC3.3_Next. See #135451
- [x] Clang/Flang disagree on whether 'finalize' can be allowed >1 time on an 'exit_data' construct. See #135415.
- [x] Clang/Flang disagree whether 'if_present' should be allowed >1 time on a 'host_data'/'update' construct. See #135422
- [x] Clang/Flang disagree on whether 'init'/'shutdown' can have multiple 'device_type' clauses. I believe there is no supporting prose for this limitation.
- [ ] See change for num_gangs/etc. above.

Changes that need discussion/research:
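As an illustration of the four appertainment categories described at the top of this description, here is a minimal sketch of how such lists could be checked against the clauses on a directive. It is not the Clang or Flang implementation: `DirectiveRules`, `checkAppertainment`, and `exampleSetRules` are hypothetical names, the real tables are generated by TableGen rather than written by hand, and the sketch assumes that repeating the same clause from an AllowedExclusive list also counts as a conflict. The 'set' example uses the three required clauses named above; placing 'if' in AllowedOnce is an assumption made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using ClauseList = std::vector<std::string>;

// Hypothetical stand-in for one directive's entry in the .td tables.
struct DirectiveRules {
  ClauseList Required;         // if non-empty, at least one must be present
  ClauseList AllowedExclusive; // at most one clause from this list in total
  ClauseList AllowedOnce;      // each clause may appear at most once
  ClauseList Allowed;          // no restriction on repetition
};

static bool contains(const ClauseList &List, const std::string &Name) {
  return std::find(List.begin(), List.end(), Name) != List.end();
}

// Number of clauses on the directive that are drawn from List.
static std::size_t countFrom(const ClauseList &List, const ClauseList &Clauses) {
  return static_cast<std::size_t>(
      std::count_if(Clauses.begin(), Clauses.end(),
                    [&](const std::string &C) { return contains(List, C); }));
}

// Returns an empty string when the clauses satisfy the rules, otherwise a
// short description of the first rule that was violated.
std::string checkAppertainment(const DirectiveRules &R,
                               const ClauseList &Clauses) {
  // The effective 'allowed' set is the union of all four lists.
  for (const std::string &C : Clauses)
    if (!contains(R.Required, C) && !contains(R.AllowedExclusive, C) &&
        !contains(R.AllowedOnce, C) && !contains(R.Allowed, C))
      return "clause '" + C + "' is not allowed on this directive";

  if (!R.Required.empty() && countFrom(R.Required, Clauses) == 0)
    return "at least one required clause must appear";

  // Assumption: a repeated exclusive clause is treated as a conflict too.
  if (countFrom(R.AllowedExclusive, Clauses) > 1)
    return "only one clause from the exclusive list may appear";

  for (const std::string &Name : R.AllowedOnce)
    if (std::count(Clauses.begin(), Clauses.end(), Name) > 1)
      return "clause '" + Name + "' may appear only once";

  return "";
}

// Illustrative rules for 'set'; not copied from the real .td file.
DirectiveRules exampleSetRules() {
  DirectiveRules R;
  R.Required = {"default_async", "device_num", "device_type"};
  R.AllowedOnce = {"if"}; // assumption for illustration
  return R;
}
```

With these example rules, `checkAppertainment(exampleSetRules(), {"if"})` returns a non-empty message because none of the three required clauses is present, while `{"device_num"}` alone passes.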
390 lines
14 KiB
C++
// RUN: %clang_cc1 %s -fopenacc -verify

constexpr int three() { return 3; }
constexpr int one() { return 1; }
constexpr int neg() { return -1; }
constexpr int zero() { return 0; }

struct NotConstexpr {
  constexpr NotConstexpr(){};

  operator int(){ return 1; }
};
struct ConvertsNegative {
  constexpr ConvertsNegative(){};

  constexpr operator int(){ return -1; }
};
struct ConvertsOne{
  constexpr ConvertsOne(){};

  constexpr operator int(){ return 1; }
};

struct ConvertsThree{
  constexpr ConvertsThree(){};

  constexpr operator int(){ return 3; }
};

template<typename T, int Val>
void negative_zero_constexpr_templ() {
  // expected-error@+1 2{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to 0}}
#pragma acc serial loop tile(*, T{})
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to -1}}
#pragma acc parallel loop tile(Val, *)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to 0}}
#pragma acc kernels loop tile(zero(), *)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);
}

void negative_zero_constexpr() {
  negative_zero_constexpr_templ<int, 1>(); // expected-note{{in instantiation of function template specialization}}
  negative_zero_constexpr_templ<int, -1>(); // expected-note{{in instantiation of function template specialization}}

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to 0}}
#pragma acc serial loop tile(0, *)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to 0}}
#pragma acc parallel loop tile(1, 0)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to -1}}
#pragma acc kernels loop tile(1, -1)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to -1}}
#pragma acc parallel loop tile(-1, 0)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to 0}}
#pragma acc serial loop tile(zero(), 0)
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to -1}}
#pragma acc kernels loop tile(1, neg())
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be an asterisk or a constant expression}}
#pragma acc parallel loop tile(NotConstexpr{})
  for(int i = 0; i < 5; ++i);

  // expected-error@+1{{OpenACC 'tile' clause size expression must be positive integer value, evaluated to -1}}
#pragma acc serial loop tile(1, ConvertsNegative{})
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

#pragma acc kernels loop tile(*, ConvertsOne{})
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);
}

template<unsigned One>
void only_for_loops_templ() {
  // expected-note@+1{{'parallel loop' construct is here}}
#pragma acc parallel loop tile(One)
  // expected-error@+1{{OpenACC 'parallel loop' construct can only be applied to a 'for' loop}}
  while(true);

  // expected-note@+1{{'serial loop' construct is here}}
#pragma acc serial loop tile(One)
  // expected-error@+1{{OpenACC 'serial loop' construct can only be applied to a 'for' loop}}
  do {} while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc kernels loop tile(One, 2) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    // expected-error@+1{{while loop cannot appear in intervening code of a 'kernels loop' with a 'tile' clause}}
    while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc serial loop tile(One, 2) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    // expected-error@+1{{do loop cannot appear in intervening code of a 'serial loop' with a 'tile' clause}}
    do{}while(true);
}


void only_for_loops() {
  // expected-note@+1{{'parallel loop' construct is here}}
#pragma acc parallel loop tile(1)
  // expected-error@+1{{OpenACC 'parallel loop' construct can only be applied to a 'for' loop}}
  while(true);

  // expected-note@+1{{'serial loop' construct is here}}
#pragma acc serial loop tile(1)
  // expected-error@+1{{OpenACC 'serial loop' construct can only be applied to a 'for' loop}}
  do {} while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc kernels loop tile(1, 2) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    // expected-error@+1{{while loop cannot appear in intervening code of a 'kernels loop' with a 'tile' clause}}
    while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc parallel loop tile(1, 2) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    // expected-error@+1{{do loop cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
    do{}while(true);
}

template<unsigned Val>
void depth_too_high_templ() {
  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc kernels loop tile (Val, *, Val) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc parallel loop tile (Val, *, Val) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j)
      // expected-error@+1{{while loop cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
      while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc serial loop tile (Val, *, Val) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j)
      // expected-error@+1{{do loop cannot appear in intervening code of a 'serial loop' with a 'tile' clause}}
      do{}while(true);

  int Arr[Val+5];

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc kernels loop tile (Val, *, Val) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(auto x : Arr)
      // expected-error@+1{{while loop cannot appear in intervening code of a 'kernels loop' with a 'tile' clause}}
      while(true)
        for(int j = 0; j < 5; ++j);

#pragma acc parallel loop tile (Val, *, Val)
  for(int i = 0; i < 5; ++i)
    for(auto x : Arr)
      for(int j = 0; j < 5; ++j)
        while(true);
}

void depth_too_high() {
  depth_too_high_templ<3>();

  int Arr[5];

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc serial loop tile (1, *, 3) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc parallel loop tile (1, *, 3) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j)
      // expected-error@+1{{while loop cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
      while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc parallel loop tile (1, *, 3) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j)
      // expected-error@+1{{do loop cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
      do{}while(true);

  // expected-error@+1{{'tile' clause specifies a loop count greater than the number of available loops}}
#pragma acc parallel loop tile (1, *, 3) // expected-note 2{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i)
    for(int j = 0; j < 5; ++j)
      // expected-error@+1{{while loop cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
      while(true)
        for(int j = 0; j < 5; ++j);

#pragma acc parallel loop tile (1, *, 3)
  for(int i = 0; i < 5; ++i)
    for(auto x : Arr)
      for(int j = 0; j < 5; ++j)
        while(true);
}

template<unsigned Val>
void not_single_loop_templ() {

  int Arr[Val];

#pragma acc parallel loop tile (Val, *, 3) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    for (auto x : Arr)
      for(int k = 0; k < 5; ++k);
    // expected-error@+1{{more than one for-loop in a loop associated with OpenACC 'parallel loop' construct with a 'tile' clause}}
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k);
  }
}

void not_single_loop() {
  not_single_loop_templ<3>(); // no diagnostic, was diagnosed in phase 1.

  int Arr[5];

#pragma acc parallel loop tile (1, *, 3)// expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    for (auto x : Arr)
      for(int k = 0; k < 5; ++k);
    // expected-error@+1{{more than one for-loop in a loop associated with OpenACC 'parallel loop' construct with a 'tile' clause}}
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k);
  }
}

template<unsigned Val>
void no_other_directives_templ() {

  int Arr[Val];

#pragma acc parallel loop tile (Val, *, 3) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    for (auto x : Arr) {
      // expected-error@+1{{OpenACC 'serial' construct cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
#pragma acc serial
      ;
      for(int j = 0; j < 5; ++j);
    }
  }

  // OK, in innermost
#pragma acc parallel loop tile (Val, *, 3)
  for(int i = 0; i < 5; ++i) {
    for(int j = 0; j < 5; ++j) {
      for (auto x : Arr) {
#pragma acc serial
        ;
      }
    }
  }
}

void no_other_directives() {
  no_other_directives_templ<3>();
  int Arr[5];

#pragma acc parallel loop tile (1, *, 3) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    for (auto x : Arr) {
      // expected-error@+1{{OpenACC 'serial' construct cannot appear in intervening code of a 'parallel loop' with a 'tile' clause}}
#pragma acc serial
      ;
      for(int j = 0; j < 5; ++j);
    }
  }

  // OK, in innermost
#pragma acc parallel loop tile (3, *, 3)
  for(int i = 0; i < 5; ++i) {
    for(int j = 0; j < 5; ++j) {
      for (auto x : Arr) {
#pragma acc serial
        ;
      }
    }
  }
}

void call();
template<unsigned Val>
void intervening_templ() {
#pragma acc parallel loop tile(1, Val, *) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    //expected-error@+1{{inner loops must be tightly nested inside a 'tile' clause on a 'parallel loop' construct}}
    call();
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k);
  }

#pragma acc parallel loop tile(1, Val, *) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    //expected-error@+1{{inner loops must be tightly nested inside a 'tile' clause on a 'parallel loop' construct}}
    unsigned I;
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k);
  }

#pragma acc parallel loop tile(1, Val, *)
  // expected-error@+2{{OpenACC 'parallel loop' construct must have a terminating condition}}
  // expected-note@-2{{'parallel loop' construct is here}}
  for(int i = 0;;++i) {
    // expected-error@+2{{OpenACC 'parallel loop' construct must have a terminating condition}}
    // expected-note@-5{{'parallel loop' construct is here}}
    for(int j = 0;;++j)
      // expected-error@+2{{OpenACC 'parallel loop' construct must have a terminating condition}}
      // expected-note@-8{{'parallel loop' construct is here}}
      for(int k = 0;;++k)
        call();
  }
}

void intervening() {
  intervening_templ<3>();

#pragma acc parallel loop tile(1, 2, *) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    //expected-error@+1{{inner loops must be tightly nested inside a 'tile' clause on a 'parallel loop' construct}}
    call();
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k);
  }

#pragma acc parallel loop tile(1, 2, *) // expected-note{{active 'tile' clause defined here}}
  for(int i = 0; i < 5; ++i) {
    //expected-error@+1{{inner loops must be tightly nested inside a 'tile' clause on a 'parallel loop' construct}}
    unsigned I;
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k);
  }

#pragma acc parallel loop tile(1, 2, *)
  for(int i = 0; i < 5; ++i) {
    for(int j = 0; j < 5; ++j)
      for(int k = 0; k < 5; ++k)
        call();
  }

#pragma acc parallel loop tile(1, 2, *)
  // expected-error@+2{{OpenACC 'parallel loop' construct must have a terminating condition}}
  // expected-note@-2{{'parallel loop' construct is here}}
  for(int i = 0;;++i) {
    // expected-error@+2{{OpenACC 'parallel loop' construct must have a terminating condition}}
    // expected-note@-5{{'parallel loop' construct is here}}
    for(int j = 0;;++j)
      // expected-error@+2{{OpenACC 'parallel loop' construct must have a terminating condition}}
      // expected-note@-8{{'parallel loop' construct is here}}
      for(int k = 0;;++k)
        for(;;)
          call();
  }
}

void collapse_tile_depth() {
  // expected-error@+4{{'collapse' clause specifies a loop count greater than the number of available loops}}
  // expected-note@+3{{active 'collapse' clause defined here}}
  // expected-error@+2{{'tile' clause specifies a loop count greater than the number of available loops}}
  // expected-note@+1{{active 'tile' clause defined here}}
#pragma acc parallel loop tile(1, 2, 3) collapse (3)
  for(int i = 0; i < 5;++i) {
    for(int j = 0; j < 5; ++j);
  }
}