IR for 'target teams loop' is now dependent on suitability of associated
loop-nest.
If a loop-nest:
- does not contain a function call, or
- the -fopenmp-assume-no-nested-parallelism has been specified,
- or the call is to an OpenMP API AND
- does not contain nested loop bind(parallel) directives
then it can be emitted as 'target teams distribute parallel for', which
is the current default. Otherwise, it is emitted as 'target teams
distribute'.
Added debug output indicating how 'target teams loop' was emitted. Flag
is -mllvm -debug-only=target-teams-loop-codegen
Added LIT tests explicitly verifying 'target teams loop' emitted as a
parallel loop and a distribute loop.
Updated other 'loop' related tests as needed to reflect change in IR.
- These updates account for most of the changed files and
additions/deletions.
In PR #79382, I need to add a new type that derives from
ConstantArrayType. This means that ConstantArrayType can no longer use
`llvm::TrailingObjects` to store the trailing optional Expr*.
This change refactors ConstantArrayType to store a 60-bit integer and
4-bits for the integer size in bytes. This replaces the APInt field
previously in the type but preserves enough information to recreate it
where needed.
To reduce the number of places where the APInt is re-constructed I've
also added some helper methods to the ConstantArrayType to allow some
common use cases that operate on either the stored small integer or the
APInt as appropriate.
Resolves#85124.
…ctive
The function `ActOnOpenMPTargetParallelForSimdDirective` gets the number
of capture levels for OMPD_target_parallel_for, whereas the intended
directive is OMPD_target_parallel_for_simd.
This implements the C++23 `[[assume]]` attribute.
Assumption information is lowered to a call to `@llvm.assume`, unless the expression has side-effects, in which case it is discarded and a warning is issued to tell the user that the assumption doesn’t do anything. A failed assumption at compile time is an error (unless we are in `MSVCCompat` mode, in which case we don’t check assumptions at compile time).
Due to performance regressions in LLVM, assumptions can be disabled with the `-fno-assumptions` flag. With it, assumptions will still be parsed and checked, but no calls to `@llvm.assume` will be emitted and assumptions will not be checked at compile time.
This patch fixes the #67002 ([OpenMP][Clang] Scan Directive not
supported for Generic types). It disables the Sema checks/analysis that
are run on the helper arrays which go into the implementation of the
`omp scan` directive until the template instantiation happens.
Grateful to @alexey-bataev for suggesting these changes.
This is a support for " #pragma omp atomic compare weak". It has Parser
& AST support for now.
---------
Authored-by: Sunil Kuravinakop <kuravina@pe28vega.us.cray.com>
When this option is passed to clang, external (and/or weak) symbols
are not assumed to have the minimum ABI alignment normally required.
Symbols defined locally that are not weak are however still given the
minimum alignment.
This is implemented by passing a new parameter to getMinGlobalAlign()
named HasNonWeakDef that is used to return the right alignment value.
This is needed when external symbols created from a linker script may
not get the ABI minimum alignment and must therefore be treated as
unaligned by the compiler.
Changes uploaded to the phabricator on Dec 16th are lost because the
phabricator is down. Hence re-uploading it to the github.com.
Changes to be committed:
modified: clang/include/clang/Sema/Sema.h
modified: clang/lib/Sema/SemaOpenMP.cpp
modified: clang/test/OpenMP/generic_loop_ast_print.cpp
modified: clang/test/OpenMP/loop_bind_messages.cpp
modified: clang/test/PCH/pragma-loop.cpp
---------
Co-authored-by: Sunil Kuravinakop
This is a continuation of https://reviews.llvm.org/D123235 ([OpenMP]
atomic compare fail : Parser & AST support). In this branch Support for
codegen support for atomic compare fail is being added.
---------
Co-authored-by: Sunil Kuravinakop
This patch makes `num_teams` and `thread_limit` mandatory for bare
kernels,
similar to a reguar kernel language that when launching a kernel, the
grid size
has to be set explicitly.
Currently PresentModifierLocs defined with size DefaultmapKindNum; where
DefaultmapKindNum = OMPC_DEFAULTMAP_pointer + 1
Before 5.0 variable-category can not be omitted. For the test like
\#pragma omp target map(tofrom: errors) defaultmap(present)
error would be mitted.
After 5.0 that is allowd.
When try to:
PresentModifierLocs[DMC->getDefaultmapKind()] =
DMC->getDefaultmapModifierLoc();
It is accessed beyond array end.
To fix this using OMPC_DEFAULTMAP_unknow instead OMPC_DEFAULTMAP_poiner.
This reverts commit edd675ac283909397880f85ba68d0d5f99dc1be2.
This breaks clang build where every component is a shared library.
The file clang/lib/Basic/OpenMPKinds.cpp, which is a part of
libclangBasic.so, uses `getOpenMPClauseName` which isn't:
/usr/bin/ld: CMakeFiles/obj.clangBasic.dir/OpenMPKinds.cpp.o: in functio
n `clang ::getOpenMPSimpleClauseTypeName(llvm::omp::Clause, unsigned int
)':
OpenMPKinds.cpp:(.text._ZN5clang29getOpenMPSimpleClauseTypeNameEN4llvm3o
mp6ClauseEj+0x9b): undefined reference to `llvm::omp::getOpenMPClauseNam
e(llvm::omp::Clause)'
This is a support for " #pragma omp atomic compare fail ". It has Parser & AST support for now.
Reviewed By: tianshilei1992, ABataev
Differential Revision: https://reviews.llvm.org/D123235
This patch moves `OMPDeclareReductionDecl::InitKind` to DeclBase.h, so that it's complete at the point where corresponding bit-field is declared. This patch also converts it to scoped enum named `OMPDeclareReductionInitKind`
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.
Fixes: https://github.com/llvm/llvm-project/issues/70249
This patch moves `ArraySizeModifier` before `Type` declaration so that it's complete at `ArrayTypeBitfields` declaration. It's also converted to scoped enum along the way.
This patch starts the support for OpenMP kernel language, basically to write
OpenMP target region in SIMT style, similar to kernel languages such as CUDA.
What included in this first patch is the `ompx_bare` clause for `target teams`
directive. When `ompx_bare` exists, globalization is disabled such that local
variables will not be globalized. The runtime init/deinit function calls will
not be emitted. That being said, almost all OpenMP executable directives are
not supported in the region, such as parallel, task. This patch doesn't include
the Sema checks for that, so the use of them is UB. Simple directives, such as
atomic, can be used. We provide a set of APIs (for C, they are prefix with
`ompx_`; for C++, they are in `ompx` namespace) to get thread id, block id, etc.
Please refer to
https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf for more details.
This patch starts the support for OpenMP kernel language, basically to
write
OpenMP target region in SIMT style, similar to kernel languages such as
CUDA.
What included in this first patch is the `ompx_bare` clause for `target
teams`
directive. When `ompx_bare` exists, globalization is disabled such that
local
variables will not be globalized. The runtime init/deinit function calls
will
not be emitted. That being said, almost all OpenMP executable directives
are
not supported in the region, such as parallel, task. This patch doesn't
include
the Sema checks for that, so the use of them is UB. Simple directives,
such as
atomic, can be used. We provide a set of APIs (for C, they are prefix
with
`ompx_`; for C++, they are in `ompx` namespace) to get thread id, block
id, etc.
For more details, you can refer to
https://tianshilei.me/wp-content/uploads/llvm-hpc-2023.pdf.
offloading
- This patch adds support for thread_limit clause on target directive according to OpenMP 51 [2.14.5]
- The idea is to create an outer task for target region, when there is a thread_limit clause, and manipulate the thread_limit of task instead. This way, thread_limit will be applied to all the relevant constructs enclosed by the target region.
Differential Revision: https://reviews.llvm.org/D152054
structured-block
where clause is one of the following:
private(list)
reduction([reduction-modifier ,] reduction-identifier : list)
nowait
Differential Revision: https://reviews.llvm.org/D157933
The if-clause on 'target teams loop' should only accept "target" as a
directive name modifier. Any other directive name should generate an
error.
Differential Revision: https://reviews.llvm.org/D156352
This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and
reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an extension to
fix the Flang build.
Differential Revision: https://reviews.llvm.org/D156184
CUDA and HIP have kernel attributes to tune the code generation (in the
backend). To reuse this functionality for OpenMP target regions we
introduce the `ompx_attribute` clause that takes these kernel
attributes and emits code as if they had been attached to the kernel
fuction (which is implicitly generated).
To limit the impact, we only support three kernel attributes:
`amdgpu_waves_per_eu`, for AMDGPU
`amdgpu_flat_work_group_size`, for AMDGPU
`launch_bounds`, for NVPTX
The existing implementations of those attributes are used for error
checking and code generation. `ompx_attribute` can be attached to any
executable target region and it can hold more than one kernel attribute.
Differential Revision: https://reviews.llvm.org/D156184
Currently, clang gives an incorrect reduction identifier error for the PLUS
operator for OpenMP version > 52. But, PLUS operator is allowed in OpenMP
version > 52. This revision fixes this issue and also modified the error
messages to show the correct expected operators in the message based on the OpenMP
version used (prior to OMP 6.0 and since OMP 6.0).
Test Src:
void foo() {
int a = 0 ;
#pragma omp parallel reduction(+:a)
;
#pragma omp parallel reduction(-:a)
;
}
Before this revision:
$ clang -fopenmp -fopenmp-version=60 test.c -c
test.c:3:34: error: incorrect reduction identifier, expected one of '+', '-', '*', '&', '|', '^', '&&', '||', 'min' or 'max' or declare reduction for type 'int'
3 | #pragma omp parallel reduction(+:a)
| ^
test.c:5:34: error: incorrect reduction identifier, expected one of '+', '-', '*', '&', '|', '^', '&&', '||', 'min' or 'max' or declare reduction for type 'int'
5 | #pragma omp parallel reduction(-:a)
| ^
2 errors generated.
Wit this revision:
$ clang -fopenmp -fopenmp-version=60 test.c -c
test.c:5:34: error: incorrect reduction identifier, expected one of '+', '*', '&', '|', '^', '&&', '||', 'min' or 'max' or declare reduction for type 'int'
5 | #pragma omp parallel reduction(-:a)
|
1 error generated.
Differential Revision: https://reviews.llvm.org/D155635
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.
`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.
Differential Revision: https://reviews.llvm.org/D154591
clause.
This is just syntax to make it easier for the user. It doesn't add any
new functionality.
for
doacross(sink: omp_cur_iteration - 1)
Equivalent to
doacross(sink: ConterVar - 1, ...)
doacross(source: omp_cur_iteration)
Equivalent to
doacross(source)
And restriction is:
OMP5.2 p.327
If vector is specified with the omp_cur_iteration keyword and with
sink as the dependence-type then it must be omp_cur_iteration - 1.
If vector is specified with source as the dependence-type then it must be
omp_cur_iteration.
Differential Revision: https://reviews.llvm.org/D154556
The loop directive is a descriptive construct which allows the compiler
flexibility in how it generates code for the directive's associated
loop(s). See OpenMP specification 5.2 [257:8-9].
Codegen added in this patch for the combined 'loop' directives are:
'target teams loop' -> 'target teams distribute parallel for'
'teams loop' -> 'teams distribute parallel for'
'target parallel loop' -> 'target parallel for'
'parallel loop' -> 'parallel for'
NOTE: The implementation of the 'loop' directive itself is unchanged.
Differential Revision: https://reviews.llvm.org/D145823
This was added by 453e02ca0903c9f65529d21c513925ab0fdea1e1.Use
isa instead since we don't use the result.
Fixes:
<..>SemaOpenMP.cpp:23149:13: warning: unused variable ‘TargetVarDecl’ [-Wunused-variable]
23149 | if (auto *TargetVarDecl = dyn_cast_or_null<VarDecl>(TargetDecl))
| ^~~~~~~~~~~~~
Which came up when building with GCC 9.