7 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
0bbebf6f3a
[flang][cuda] Convert cuf.data_transfer with descriptors (#108890)
Convert cuf.data_transfer operations involving descriptors to the newly
introduced entry points (#108244).
2024-09-17 11:00:31 -07:00
Valentin Clement (バレンタイン クレメン)
dfc21acdfa
[flang][cuda] Convert global allocation for pinned variable (#106807)
ALLOCATE/DEALLOCATE statements for module allocatable variables with the
pinned attribute can be lowered to the standard runtime calls and do not
need further action since these variables will have a unique descriptor
that is on the host.
2024-09-03 14:27:16 -07:00
Valentin Clement (バレンタイン クレメン)
841327db4e
[flang][cuda] Convert cuf.alloc for box to fir.alloca in device context (#102662)
In device context, managed memory is not available, so it makes no sense
to allocate the descriptor using it. Fall back to fir.alloca as it is
handled well in device code.
cuf.free is just dropped.
2024-08-09 13:41:51 -07:00
Valentin Clement (バレンタイン クレメン)
a262ac0c68
[flang][cuda] Make operations dynamically legal in cuf op conversion (#102220)
2024-08-08 09:18:51 -07:00
Valentin Clement (バレンタイン クレメン)
10d7805c4f
[flang][cuda][NFC] Disambiguate namespace with cuf dialect (#102194)
Rename namespace `Fortran::runtime::cuf` to `Fortran::runtime::cuda` to
avoid ambiguity with the namespace `::cuf` that is defined in the CUF
dialect.
2024-08-06 14:04:45 -07:00
Valentin Clement (バレンタイン クレメン)
a3ccaed3b9
[flang][cuda] Allocate local descriptor in managed memory (#102060)
This patch adds entry points in the runtime to be able to allocate
descriptors in managed memory. These entry points currently only call
`CUFAllocManaged` and `CUFFreeManaged` but could be more complicated in
the future.

`cuf.alloc` and `cuf.free` related to local descriptors are converted
into runtime calls.
2024-08-06 11:17:11 -07:00
Valentin Clement (バレンタイン クレメン)
fca5038597
[flang][cuda] Add conversion pass for cuf.allocate and cuf.deallocate (#101563)
The allocator can be specified in the descriptor. For simple local
allocatables, we can simply convert `cuf.allocate`/`cuf.deallocate` to
their corresponding runtime calls in the standard flang runtime. More
specific cases will require dedicated entry points. Global descriptors
will require synchronization between the host and device copies.

This patch adds a pass to perform this conversion.
2024-08-02 16:19:10 -07:00