…an device arrays (#185984)"
This reverts commit fb18d570b0466ca2a401aba11d6e58b206aebc1a.
This PR caused compilation failures with allocatable arrays; reverting
for now pending further investigation.
When CUDA Fortran device arrays are listed in an OpenMP private clause,
the compiler previously allocated private copies on the host heap using
fir.allocmem. This caused device-side operations to receive host
pointers instead of device pointers, leading to cudaErrorIllegalAddress
(700).
Fix by detecting symbols with a CUDA data attribute (device, managed,
unified, etc.) during privatization and using cuf.alloc / cuf.free
instead of fir.allocmem / fir.freemem, so the private copies reside in
device memory.
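
A minimal reproducer sketch for the scenario described above, assuming a
CUDA Fortran source built with OpenMP enabled. The program and variable
names are hypothetical and are not taken from the PR's tests:

```fortran
! Hypothetical reproducer; names are illustrative only.
program private_device_array
  implicit none
  integer, parameter :: n = 8
  real, device :: a_d(n)   ! symbol with a CUDA data attribute (device)
  integer :: i

  !$omp parallel do private(a_d)
  do i = 1, n
    ! The private copy of a_d is the target of a device-side fill, so it
    ! has to reside in device memory. If it is allocated on the host heap,
    ! the runtime receives a host pointer where a device pointer is expected.
    a_d = real(i)
  end do
  !$omp end parallel do
end program private_device_array
```

With the fix, the private copy of `a_d` would be expected to be created with
cuf.alloc and released with cuf.free rather than fir.allocmem / fir.freemem.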
After https://github.com/llvm/llvm-project/pull/169740, the cuf allocate
and deallocate operations can be converted later. Update how the
double descriptor case is recognized by recording this information
directly on the operation itself.
When the rhs of the data transfer has a different type than the lhs,
allocate a new temporary on the host and first transfer the rhs to it.
Then use the elemental op that was created to perform the conversion.
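
A sketch of the kind of assignment this covers, assuming the rhs is a device
array and the lhs is a host array of a different type. The names are
hypothetical and the exact cases handled are those in the patch's tests:

```fortran
! Hypothetical example; names are illustrative only.
program transfer_with_conversion
  implicit none
  integer, parameter :: n = 4
  integer, device :: i_d(n)   ! rhs: integer data in device memory
  real :: r_h(n)              ! lhs: real data on the host

  i_d = 1      ! host-to-device fill
  ! The types differ, so the transfer cannot be a plain copy: the device
  ! data is first moved into a host temporary of the rhs type, then
  ! converted element by element into r_h.
  r_h = i_d
  print *, r_h
end program transfer_with_conversion
```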
Reviewed in #152379
- Move the allocator index setup after the allocate statement; otherwise
the derived type descriptor is not allocated.
- Support arrays of derived type with a device component
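
A sketch of source code that exercises both points above, assuming an
allocatable array of a derived type with an allocatable device component.
The module, type, and variable names are hypothetical:

```fortran
! Hypothetical example; names are illustrative only.
module types_m
  implicit none
  type :: t
    real, device, allocatable :: x(:)   ! device component
  end type t
end module types_m

program derived_type_device_component
  use types_m
  implicit none
  type(t), allocatable :: arr(:)
  integer :: i

  ! The parent array has to be allocated before the allocator index of
  ! each component descriptor can be set, since the component descriptors
  ! do not exist until then.
  allocate(arr(2))
  do i = 1, 2
    allocate(arr(i)%x(10))   ! expected to allocate in device memory
  end do
end program derived_type_device_component
```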