This was only used for checking if is_shared/is_private were legal,
which we're not bothering to do anymore.
This is apparently visible to more than the target attribute (which
seems to silently ignore unrecognized features), so this has the
potential to break something (e.g. see the OpenMP test change).
This is the default behavior and cuts down on attribute spam.
Probably should also do something to consolidate the option spellings;
printing and parsing it is repeated in at least 3 different places.
In the OpenMP tests, I had to manually delete some metadata check
lines update_cc_test_checks was inserting that included the local
build revision.
Similar to https://reviews.llvm.org/D136111, this time for class
methods.
D136111 summary:
In OpenMP target offloading and in other offloading languages, we
maintain a difference between device functions and kernel functions.
Kernel functions must be visible to the host and act as the entry point
to the target device. Device functions however cannot be called directly
by the host and must be called by a kernel function. Currently, we make
all definitions on the device protected by default. Because device
functions cannot be called or used by the host they should have hidden
visibility. This allows for the definitions to be better optimized via
LTO or other passes.
This patch marks every device class method in the AST as having hidden
visibility. The kernel function is generated later at code-gen and we
set its visibility explicitly so it should not be affected. This
prevents the user from overriding the visibility, but since the user
can't do anything with these symbols anyway there is no point exporting
them right now.
Use explicit _w32/_w64 suffixes for the wave size to be consistent
with the existing other wave dependent intrinsics. Also start
diagnosing trying to use both wave32 and wave64.
I would have preferred to avoid the +wavefrontsize64 spam on targets
where that's the only option, but avoiding this seems to be more work
than I expected.
This is an alternative to D139627, suggested by Craig. Currently only the X86 backend uses this attribute, so let's emit it for X86 only.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D139701
Support for taskwait nowait clause with placeholder for runtime changes.
Reviewed By: cchen, ABataev
Differential Revision: https://reviews.llvm.org/D131830
This patch gives basic parsing and semantic support for "modifiers" of the order clause introduced in OpenMP 5.1 (section 2.11.3).
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D127855
Added codegen for `omp error` directive.
This is to generate IR to call:
void __kmpc_error(ident_t *loc, int severity, const char *message);
Differential Revision: https://reviews.llvm.org/D139166
The map with AssertingVHs has been moved into the OpenMPIRBuilder which extended their lifetime.
On NVIDIA this will cause an assertion. This simply removes the AssertingVH wrapper.
When generating __clang_call_terminate use SetLLVMFunctionAttributes
to set the default function attributes, like we do for all the other
functions generated by clang. This fixes a problem where target
features from the command line weren't being applied to this function.
Differential Revision: https://reviews.llvm.org/D138679
This patch gives basic parsing and semantic analysis support for 'strict'
modifier with 'num_tasks' clause of 'taskloop' construct introduced in
OpenMP 5.1 (section 2.12.2)
Differential Revision: https://reviews.llvm.org/D138328
This patch gives basic parsing and semantic analysis support for 'strict'
modifier with 'grainsize' clause of 'taskloop' construct introduced in
OpenMP 5.1 (section 2.12.2)
Differential Revision: https://reviews.llvm.org/D138217
Previously, Itanium ABI guard variables were set after initialization was
complete for non-block declared variables with static and thread storage
duration. That resulted in initialization of such variables being restarted
in cases where the variable was referenced while it was still under
construction. Per C++20 [class.cdtor]p2, such references are permitted
(though the value obtained by such an access is unspecified). The late
initialization resulted in recursive reinitialization loops for cases like
this:
  template<typename T>
  struct ct {
    struct mc {
      mc() { ct<T>::smf(); }
      void mf() const {}
    };
    thread_local static mc tlsdm;
    static void smf() { tlsdm.mf(); }
  };
  template<typename T>
  thread_local typename ct<T>::mc ct<T>::tlsdm;
  int main() {
    ct<int>::smf();
  }
With this change, guard variables are set before initialization is started
so as to avoid such reinitialization loops.
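
The effect of the guard ordering can be modeled outside the compiler. The
following is only a sketch under stated assumptions: `GuardedInit`,
`access` and `countInitRuns` are hypothetical names, the recursion cap exists
solely so the old ordering terminates in the model, and none of this is the
code Clang actually emits.

```cpp
#include <cassert>

// Hypothetical model of a guarded dynamic initialization. Only the position
// of the guard store relative to the initializer is illustrated.
struct GuardedInit {
  bool guard = false; // stands in for the Itanium ABI guard variable
  int initRuns = 0;   // how many times the initializer body was entered
  int maxDepth;       // safety cap so the old ordering terminates in this model

  explicit GuardedInit(int cap) : maxDepth(cap) {}

  // Every reference to the variable goes through this check.
  void access(bool setGuardFirst) {
    if (guard || initRuns >= maxDepth)
      return;
    if (setGuardFirst)
      guard = true;        // new ordering: set the guard before initializing
    ++initRuns;
    access(setGuardFirst); // the initializer refers back to the variable
    guard = true;          // old ordering: set the guard once initialization ends
  }
};

// Returns how often the initializer ran for the chosen ordering.
int countInitRuns(bool setGuardFirst, int cap) {
  GuardedInit g(cap);
  g.access(setGuardFirst);
  return g.initRuns;
}
```

In this model, `countInitRuns(true, 8)` enters the initializer once, while
`countInitRuns(false, 8)` keeps re-entering it until the cap is reached.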
Fixes https://github.com/llvm/llvm-project/issues/57828
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D135919
The error directive is allowed in both declarative and executable contexts.
The function ActOnOpenMPAtClause is called in both places during
parsing.
This adds a parameter "bool InExContext" that identifies the context, which
is used when emitting the error message.
Differential Revision: https://reviews.llvm.org/D137851
As per the OpenMP spec, "A list item in a use_device_addr clause
must have a corresponding list item in the device data environment".
Therefore a `map` clause is added which will make sure that the
respective list items are mapped to the device data environment
before the `use_device_addr` clause is specified. The CHECK lines
are also modified based on this change.
Differential Revision: https://reviews.llvm.org/D134974
The problem is caused by regenerating the captured variable's value when
processing has_device_addr. The value has already been generated in
GenerateOpenMPCapturedVars and passed as Arg to generateInfoForCapture.
The fix uses Arg instead of regenerating the value, just as is already
done for is_device_ptr. Cases such as those in the associated lit tests can
now be supported.
This adds a 'Count' field to TargetRegionEntryInfo to differentiate
regions with the same source position.
The OffloadEntriesInfoManager routines are updated to maintain a count of
regions seen at a location. The registration of regions proceeds the same as
before, but now the next available count is always determined and used in the
offload entry.
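
As a rough illustration of the counting scheme, the sketch below keeps a
per-location counter and hands out the next free value on registration. All
names here (`RegionKey`, `RegionEntry`, `EntryManager`, `registerRegion`) and
the choice of key fields are hypothetical stand-ins, not the LLVM types or
their actual layout.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key describing where a target region was seen.
struct RegionKey {
  std::string parentName;
  unsigned deviceID;
  unsigned fileID;
  unsigned line;

  bool operator<(const RegionKey &o) const {
    return std::tie(parentName, deviceID, fileID, line) <
           std::tie(o.parentName, o.deviceID, o.fileID, o.line);
  }
};

// Hypothetical entry: the key plus a count that tells apart regions
// sharing the same source position.
struct RegionEntry {
  RegionKey key;
  unsigned count;
};

class EntryManager {
  // Next unused count for each location seen so far.
  std::map<RegionKey, unsigned> nextCount;

public:
  RegionEntry registerRegion(const RegionKey &key) {
    unsigned c = nextCount[key]++; // value-initialized to 0 on first use
    return RegionEntry{key, c};
  }
};
```

Registering the same key twice yields counts 0 and 1, while a key at a
different line starts again at 0.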
Fixes: https://github.com/llvm/llvm-project/issues/52707
Differential Revision: https://reviews.llvm.org/D134816
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.
The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.
High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.
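
To make the mapping from the old attributes concrete, here is a small
standalone model of the upgrade. It is a sketch, not the LLVM API:
`LegacyAttrs`, `MemoryModel`, `upgrade` and `print` are hypothetical names,
the model tracks only three location kinds (argmem, inaccessiblemem, and
everything else), and the printer only approximates the textual form of the
attribute.

```cpp
#include <cassert>
#include <string>

enum class ModRef { None, Read, Write, ReadWrite };

// Hypothetical bag of the old function-level attributes.
struct LegacyAttrs {
  bool readnone = false, readonly = false, writeonly = false;
  bool argmemonly = false, inaccessiblememonly = false,
       inaccessiblememOrArgmemonly = false;
};

// Hypothetical model: one access kind per location.
struct MemoryModel {
  ModRef argMem, inaccessibleMem, other;
};

MemoryModel upgrade(const LegacyAttrs &a) {
  // The access kind comes from readnone/readonly/writeonly.
  ModRef mr = ModRef::ReadWrite;
  if (a.readnone)
    mr = ModRef::None;
  else if (a.readonly)
    mr = ModRef::Read;
  else if (a.writeonly)
    mr = ModRef::Write;

  // The location restriction comes from the *memonly attributes.
  if (a.argmemonly)
    return {mr, ModRef::None, ModRef::None};
  if (a.inaccessiblememonly)
    return {ModRef::None, mr, ModRef::None};
  if (a.inaccessiblememOrArgmemonly)
    return {mr, mr, ModRef::None};
  return {mr, mr, mr};
}

static const char *name(ModRef mr) {
  switch (mr) {
  case ModRef::None: return "none";
  case ModRef::Read: return "read";
  case ModRef::Write: return "write";
  case ModRef::ReadWrite: return "readwrite";
  }
  return "";
}

std::string print(const MemoryModel &m) {
  std::string out = "memory(";
  bool first = true;
  auto emit = [&](const std::string &s) {
    if (!first)
      out += ", ";
    out += s;
    first = false;
  };
  bool allNone = m.argMem == ModRef::None &&
                 m.inaccessibleMem == ModRef::None && m.other == ModRef::None;
  // Print the default first, then only the locations that differ from it.
  if (m.other != ModRef::None || allNone)
    emit(name(m.other));
  if (m.argMem != m.other)
    emit(std::string("argmem: ") + name(m.argMem));
  if (m.inaccessibleMem != m.other)
    emit(std::string("inaccessiblemem: ") + name(m.inaccessibleMem));
  return out + ")";
}
```

Under this model a function that used to be `readonly argmemonly` prints as
`memory(argmem: read)`, and a `readnone` function prints as `memory(none)`.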
Differential Revision: https://reviews.llvm.org/D135780
Re-apply of: 3d0e9edd8e53fb72e85084f4170513159212839a
Reverted in: 0cb65b0a585c8b3d4a8a2aefe994a8fc907934f8
A function parameter was using the wrong type 'llvm::TargetRegion' instead of
'const llvm::TargetRegion&', which caused the error in the address sanitizer.
The correct type is now used.
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.
Reviewed By: jdoerfert, mikerice
Differential Revision: https://reviews.llvm.org/D136601
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.
Reviewed By: jdoerfert, mikerice
Differential Revision: https://reviews.llvm.org/D136601
This patch changes the kernels generated by OpenMP to have protected
visibility. This is unlikely to change anything functionally. However,
protected visibility better matches the behaviour of these GPU kernels.
We do not expect any pending shared library load to preempt these
kernels so we can specify a more restrictive visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136198
In OpenMP target offloading and in other offloading languages, we
maintain a difference between device functions and kernel functions.
Kernel functions must be visible to the host and act as the entry point
to the target device. Device functions however cannot be called directly
by the host and must be called by a kernel function. Currently, we make
all definitions on the device protected by default. Because device
functions cannot be called or used by the host they should have hidden
visibility. This allows for the definitions to be better optimized via
LTO or other passes.
This patch marks every device function in the AST as having `hidden`
visibility. The kernel function is generated later at code-gen and we
set its visibility explicitly so it should not be affected. This
prevents the user from overriding the visibility, but since the user
can't do anything with these symbols anyway there is no point exporting
them right now.
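
For readers less familiar with symbol visibility, the distinction can be
sketched on the host with explicit attributes. This is an analogy only: it
assumes a GCC- or Clang-compatible compiler, `deviceHelper` and `kernelEntry`
are hypothetical names, and in the patch the visibility is applied by the
compiler rather than spelled out by the user.

```cpp
#include <cassert>

// Stands in for a device function: never referenced from outside the image,
// so it can be hidden and optimized more aggressively.
__attribute__((visibility("hidden"))) int deviceHelper(int x) {
  return x * 2;
}

// Stands in for a kernel: it is the entry point, so its symbol stays
// visible, but it is not expected to be preempted by another definition.
__attribute__((visibility("protected"))) int kernelEntry(int x) {
  return deviceHelper(x) + 1;
}
```

Calling `kernelEntry(3)` goes through the hidden helper and returns 7; only
the symbol table of the resulting object differs between the two functions.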
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D136111
Currently, generation of alignment assumptions for the OpenMP simd construct
is done outside OMPIRBuilder for C code, and it is not supported for Fortran.
According to the OpenMP 5.0 standard (2.9.3), only pointers and arrays can be
aligned for C code.
If the aligned variable is a pointer, Clang generates the following
LLVM IR instructions to support the simd aligned clause:
; memory allocation for pointer address:
%A.addr = alloca ptr, align 8
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%0 = load ptr, ptr %A.addr, align 8
call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 32) ]
If the aligned variable is an array, Clang generates the following
LLVM IR instructions to support the simd aligned clause:
; memory allocation for array:
%B = alloca [10 x i32], align 16
; some LLVM IR code
; Alignment instructions (alignment is equal to 32):
%arraydecay = getelementptr inbounds [10 x i32], ptr %B, i64 0, i64 0
call void @llvm.assume(i1 true) [ "align"(ptr %arraydecay, i64 32) ]
OMPIRBuilder was modified to generate alignment assumptions. It generates only
the llvm.assume calls. The frontend is responsible for generating the aligned
pointer and for getting the default alignment value if the user does not
specify it in the aligned clause.
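
For orientation, the C/C++ source construct whose lowering is described above
looks roughly like the sketch below. The function name `scale` and the
alignment of 32 are assumptions chosen to match the IR snippets. When OpenMP
is not enabled the pragma is simply ignored and the loop runs as plain code.

```cpp
#include <cassert>
#include <cstddef>

// Scales a buffer in place. The caller promises that `a` is 32-byte aligned;
// the aligned clause passes that promise on to the compiler.
void scale(float *a, std::size_t n, float factor) {
#pragma omp simd aligned(a : 32)
  for (std::size_t i = 0; i < n; ++i)
    a[i] *= factor;
}
```

A caller would typically declare the storage with a matching alignment, for
example `alignas(32) float buf[8];`, before passing it to `scale`.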
Unit and regression tests were added to check if aligned clause was handled correctly.
Differential Revision: https://reviews.llvm.org/D133578
Reviewed By: jdoerfert
This patch improves the LIT tests in the following ways:
1. The test on `uses_allocators` clause in the `target` region by
adding the respective CHECK lines. Allocator `omp_thread_mem_alloc`
is also added in the test.
2. The `defaultmap` clause wasn't being tested for the variable-
category `scalar` and the implicit-behavior `tofrom` with respect
to the OpenMP default version.
These improvements are inspired by the SOLLVE tests.
SOLLVE repo: https://github.com/SOLLVE/sollve_vv
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D132855
Two more atomic compare capture forms, `{ v = x; expr-stmt }` and `{ expr-stmt; v = x; }`,
where `expr-stmt` could be `cond-expr-stmt`, are missing.
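
In source, the two forms with a `cond-expr-stmt` would look roughly as
follows. This is a sketch: `minCaptureBefore` and `minCaptureAfter` are
hypothetical names, and it assumes a compiler with OpenMP 5.1 `atomic compare`
support enabled. Without OpenMP the pragmas are ignored and the blocks run as
ordinary sequential code.

```cpp
#include <cassert>

// Atomic minimum that captures the value x held before the update.
int minCaptureBefore(int &x, int e) {
  int v;
#pragma omp atomic compare capture
  { v = x; x = x > e ? e : x; }
  return v;
}

// Atomic minimum that captures the value x holds after the update.
int minCaptureAfter(int &x, int e) {
  int v;
#pragma omp atomic compare capture
  { x = x > e ? e : x; v = x; }
  return v;
}
```

With `x` initially 10 and `e` equal to 4, the first function returns 10 and
the second returns 4; in both cases `x` ends up as 4.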
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135236
The `cl-uniform-work-group` attribute asserts that the global work size
is a multiple of the specified work-group size. This should
allow optimizations. It is already present by default in the AMD
compiler and for HIP kernels so it should be safe to allow this for
OpenMP kernels by default.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135374
We use protected visibility for almost everything with offloading. This
is because it provides us with the ability to read things from the host
without the expectation that they will be preempted by a shared library
load. Bugs related to this have happened when offloading to the host.
This patch just makes the `exec_mode` global generated for each plugin
have protected visibility.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D135285
Currently the following case fails:
```
template<typename Ty>
Ty foo(Ty *addr, Ty val) {
  Ty v;
#pragma omp atomic compare capture
  {
    v = *addr;
    if (*addr > val)
      *addr = val;
  }
  return v;
}
```
The compiler complains that `addr` is not an lvalue. That's because when an
expression is instantiation-dependent, we cannot tell whether it is an lvalue or not.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D135224
If the 'order(concurrent)' clause is specified, then the iterations of the SIMD
loop can be executed concurrently.
This patch adds support for LLVM IR codegen via OMPIRBuilder for SIMD loop
with 'order(concurrent)' clause. The functionality added to OMPIRBuilder is
similar to the functionality implemented in 'CodeGenFunction::EmitOMPSimdInit'.
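
For reference, a C/C++ loop carrying the clause looks like the sketch below.
The function name `addArrays` is an assumption for illustration. The clause is
only honored when OpenMP (or OpenMP SIMD) support is enabled; otherwise the
pragma is ignored and the loop runs as written.

```cpp
#include <cassert>
#include <cstddef>

// Element-wise addition. The iterations are independent, so
// order(concurrent) tells the compiler they may execute in any order.
void addArrays(const float *a, const float *b, float *out, std::size_t n) {
#pragma omp simd order(concurrent)
  for (std::size_t i = 0; i < n; ++i)
    out[i] = a[i] + b[i];
}
```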
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D134046
Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>
This is a data mapping ordering problem.
According to the OpenMP spec:
If one or more map clauses are present, the list item conversions that
are performed for any use_device_ptr or use_device_addr clause occur
after all variables are mapped on entry to the region according to those
map clauses.
The change is to put the mapping data for use_device_addr at the end of the
data mapping array.
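
The reordering can be pictured as a stable partition over the mapping
entries. This is a sketch of the idea only: `MapEntry`, `fromUseDeviceAddr`
and `orderForRegionEntry` are hypothetical names, and the real change
operates on Clang's internal mapping arrays rather than on a vector like
this.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mapping record: a name and whether it originates from a
// use_device_addr clause.
struct MapEntry {
  std::string name;
  bool fromUseDeviceAddr;
};

// Moves use_device_addr entries behind all ordinary map entries while
// keeping the relative order within each group.
std::vector<MapEntry> orderForRegionEntry(std::vector<MapEntry> entries) {
  std::stable_partition(
      entries.begin(), entries.end(),
      [](const MapEntry &e) { return !e.fromUseDeviceAddr; });
  return entries;
}
```

Given entries `a` (use_device_addr), `b`, `c` (use_device_addr), `d`, the
result is ordered `b`, `d`, `a`, `c`, so the ordinary maps are processed
first.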
Differential Revision: https://reviews.llvm.org/D134556