The existing implementation has three issues which this patch addresses.
1. The last dimension which represents the bytes in the type, has the
wrong stride and count. For example, for a 4 byte int, count=1 and
stride=4. The correct representation here is count=4 and stride=1
because there are 4 bytes (count=4) that we need to copy and we do not
skip any bytes (stride=1).
2. The size of the data copy was computed using the last dimension.
However, this is incorrect in cases where some of the final dimensions
get merged into one. In this case we need to take the combined size of
the merged dimensions, which is (Count * Stride) of the first merged
dimension.
3. The Offset into a dimension was computed as a multiple of its Stride.
However, this Stride which is in bytes, already includes the stride
multiplier given by the user. This means that when the user specified
1:3:2, i.e. elements 1, 3, 5, the runtime incorrectly copied elements 2,
4, 6. Fix this by precomputing at compile time the Offset to be in bytes
by correctly multiplying the offset by the stride of the dimension
without the user-specified multiplier.
Some tests that were checking for prints inside/outside `target` regions
needed to be updated to work on systems where the ordering wasn't
deterministic.
Reverts llvm/llvm-project#184240
Original description from #165494:
-----
OpenMP allows cases like the following:
```c
int *p1, *p2, x;
p1 = p2 = &x;
...
#pragma omp target_exit_data map(delete: p1[:]) from(p2[0])
```
Which means, when the runtime encounters the `from` entry, the ref-count
may
not be zero, but it will go down to zero at the end of the current
construct,
which should cause the "from" transfer to happen.
Similarly, a user may have:
```c
struct S {
int *p;
};
#pragma omp declare_mapper (id1: S s) map(s.p) map(present, alloc: s.p[0:10])
#pragma omp declare_mapper (id2: S s) map(s.p, s.p[0:10])
S s1;
// present-check should fail here
#pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id1), to: s)
// "to" should be honored here
#pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id2), to: s)
```
Where the allocation happens before the "to" entry is encountered by the
runtime. Or, an allocation happens before a "present" entry is
encountered.
To handle cases like this, we need to use the state information of
previously
seen new allocations, deletions, "from" entries, when honoring
`to`/`from`/`present` map entries.
-----
OpenMP allows cases like the following:
```c
int *p1, *p2, x;
p1 = p2 = &x;
...
#pragma omp target_exit_data map(delete: p1[:]) from(p2[0])
```
Which means, when the runtime encounters the `from` entry, the ref-count
may not be zero, but it will go down to zero at the end of the current
construct, which should cause the "from" transfer to happen.
Similarly, a user may have:
```c
struct S {
int *p;
};
#pragma omp declare_mapper (id1: S s) map(s.p) map(present, alloc: s.p[0:10])
#pragma omp declare_mapper (id2: S s) map(s.p, s.p[0:10])
S s1;
// present-check should fail here
#pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id1), to: s)
// "to" should be honored here
#pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id2), to: s)
```
Where the allocation happens before the "to" entry is encountered by the
runtime. Or, an allocation happens before a "present" entry is
encountered.
To handle cases like this, we need to use the state information of
previously seen new allocations, deletions, "from" entries, when
honoring `to`/`from`/`present` map entries.
This reverts commit 5a457837dd988aa01c65820848381a5b99a74c0a.
Includes the test fix from
https://github.com/llvm/llvm-project/pull/177659.
The test had to be updated to exclude a scenario that was failing
with/without the change (involving mapping a struct with a byref member
with a mapper).
-----
**Original PR's description:**
This is a fix for https://github.com/llvm/llvm-project/issues/61636.
Ravi had this implemented downstream before he retired. This PR is a
chery-pick of that.
The test is taken from @jdoerfert's WIP change in
527bf4b129.
The change partially undoes the changes done in
0caf736d7e1d16d1059553fc28dbac31f0b9f788, so @alexey-bataev might need
to take a look.
This is a fix for https://github.com/llvm/llvm-project/issues/61636.
Ravi had this implemented downstream before he retired. This PR is a
chery-pick of that.
The test is taken from @jdoerfert's WIP change in
527bf4b129.
The change partially undoes the changes done in
0caf736d7e1d16d1059553fc28dbac31f0b9f788, so @alexey-bataev might need
to take a look.
---------
Co-authored-by: Ravi Narayanaswamy <ravi.narayanaswamy@intel.com>
Co-authored-by: Johannes Doerfert <johannes@jdoerfert.de>
There are a few places where data types based on character array or
string are printed in the debug message while they do not represent
strings. Such expressions should be casted to `void *` unless they
represent actual strings. Change also includes casting from integral
type to pointer type when appropriate.
Depends on #174659.
This PR adds a new map-type bit to control the fallback behavior when
when a pointer lookup fails.
For now, this is only meaningful with `RETURN_PARAM`, and can be used
for `need_device_ptr` (for which the default is to use `nullptr` as the
result
when lookup fails), and OpenMP 6.1's `use_device_ptr(fb_nullify)`.
Eventually, this can be extended to work with assumed-size maps on
`target`
constructs, to control what the argument should be set to when lookup
fails (the OpenMP spec does not have a way to control that yet).
Dependent PR: #170578.
As per OpenMP 5.1, we need to assume that when the lookup for
`use_device_ptr/addr` fails, the incoming pointer was already device
accessible.
Prior to 5.1, a lookup-failure meant a user-error (for
`use_device_ptr`),
so we could do anything in that scenario. For `use_device_addr`,
it was always incorrect to set the address to null.
OpenMP 6.1 adds a way to retain the previous behavior of nullifying a
pointer
when the lookup fails. That will be tackled by the PR stack
starting with https://github.com/llvm/llvm-project/pull/169603.
This is needed as a way to support older code that was expecting
unconditional attachment to happen for cases like:
```c
int *p;
int x;
#pragma omp targret enter data map(p) // (A)
#pragma omp target enter data map(x) // (B)
p = &x;
// By default, this does NOT attach p and x
#pragma omp target enter data map(p[0:0]) // (C)
```
When the environment variable is set, such maps, where both the pointer
and the pointee already have corresponding copies on the device, but are
not attached to one another, will be attached as-if OpenMP 6.1 TR14's
`attach(always)` map-type-modifier was specified on `(C)`.
`PRIVATE | ATTACH` maps can be used to represent firstprivate pointers
that should be initialized by doing doing the pointee's device address,
if its lookup succeeds, or retain the original host pointee's address
otherwise.
With this, for a test like the following:
```f90
integer, pointer :: p(:)
!$omp target map(p(1))
... print*, p(1)
!$omp end target
```
The codegen can look like:
```llvm
; maps for p:
; &p(1), &p(1), sizeof(p(1)), TO|FROM //(1)
; &ref_ptr(p), &p(1), sizeof(ref_ptr(p)), ATTACH //(2)
; &ref_ptr(p), &p(1), sizeof(ref_ptr(p)), PRIVATE|ATTACH|PARAM //(3)
call... @__omp_outlined...(ptr %ref_ptr_of_p)
```
* `(1)` maps the pointee `p(1)`.
* `(2)` attaches it to the (previously) mapped `ref_ptr(p)`, if present.
It can be controlled via OpenMP 6.1's `attach(auto/always/never)`
map-type modifiers.
* `(3)` privatizes and initializes the local `ref_ptr(p)`, which gets
passed
in as the kernel argument `%ref_ptr_of_p`. Can be skipped if p is not
referenced directly within the region.
While similar mapping can be used for C/C++, it's more important/useful
for Fortran as we can avoid creating another argument for passing the
descriptor, and use that to initialize the private copy in the body of
the kernel.
This change adds support for saving full contents of attached Fortran
descriptors, and not just their pointee address, in the shadow-pointer
table.
With this, we now support:
* comparing full contents of descriptors to check whether a previous
shadow-pointer entry is stale;
* restoring the full contents of descriptors
And with that, we can now use ATTACH map-types (added in #149036) for
mapping Fortran pointer/allocatable arrays, and array-sections on them.
e.g.:
```f90
integer, allocatable :: x(:)
!$omp target enter data map(to: x(:))
```
as:
```
void* addr_of_pointee = allocated(x) ? &x(1) : nullptr;
int64_t sizeof_pointee = allocated(x) ? sizeof(x(:)) : 0
addr_of_pointee, addr_of_pointee, sizeof_pointee, TO
addr_of_descriptor, addr_of_pointee, size_of_descriptor, ATTACH
```
This patch introduces libomptarget support for the ATTACH map-type,
which can be used to implement OpenMP conditional compliant pointer
attachment, based on whether the pointer/pointee is newly mapped on a
given construct.
For example, for the following:
```c
int *p;
#pragma omp target enter data map(p[1:10])
```
The following maps can be emitted by clang:
```
(A)
&p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
&p, &p[1], sizeof(p), ATTACH
```
Without this map-type, these two possible maps could be emitted by
clang:
```
(B)
&p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
(C)
&p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ
````
(B) does not perform any pointer attachment, while (C) also maps the
pointer p, which are both incorrect.
In terms of implementation, maps with the ATTACH map-type are handled
after all other maps have been processed, as it requires knowledge of
which new allocations happened as part of the construct. As per OpenMP
5.0, an attachment should happen only when either the pointer or the
pointee was newly mapped while handling the construct.
Maps with ATTACH map-type-bit do not increase/decrease the ref-count.
With OpenMP 6.1, `attach(always/never)` can be used to force/prevent
attachment. For `attach(always)`, the compiler will insert the ALWAYS
map-type, which would let libomptarget bypass the check about one of the
pointer/pointee being new. With `attach(never)`, the ATTACH map will not
be emitted at all.
The size argument of the ATTACH map-type can specify values greater than
`sizeof(void*)` which can be used to support pointer attachment on
Fortran descriptors. Note that this also requires shadow-pointer
tracking to also support them. That has not been implemented in this
patch.
This was worked upon in coordination with Ravi Narayanaswamy, who has
since retired. Happy retirement, Ravi!
---------
Co-authored-by: Alex Duran <alejandro.duran@intel.com>