14 Commits

Author SHA1 Message Date
Valentin Clement (バレンタイン クレメン)
5802367ddb
[flang][cuda] Add support for allocate with source (#117388)
Add support for allocate statement with CUDA device variable and a
source.
2024-11-22 16:55:26 -08:00
Valentin Clement
42be165dde Reland '[flang][cuda] Specialize entry point for scalar to desc data transfer' 2024-11-15 19:13:55 -08:00
Valentin Clement (バレンタイン クレメン)
70b9440c88
Revert "[flang][cuda] Specialize entry point for scalar to desc data transfer" (#116458)
Reverts llvm/llvm-project#116457
2024-11-15 17:44:48 -08:00
Valentin Clement (バレンタイン クレメン)
43cb424a54
[flang][cuda] Specialize entry point for scalar to desc data transfer (#116457)
The runtime Assign function is not meant to initialize an array from a
scalar. For that we need to use DoAssignFromSource. Update the data
transfer from scalar to descriptor to use a new entry point that uses
this function underneath.
2024-11-15 17:41:23 -08:00
Valentin Clement (バレンタイン クレメン)
db69d6939a
[flang][cuda] Support data transfer from descriptor to a pointer (#115023)
Data transfer from a variable with a descriptor to a pointer. We create
a descriptor for the pointer so we can use the flang runtime to perform
the transfer. The Assign function handles all corner cases. We add a new
entry point `CUFDataTransferDescDescNoRealloc` to avoid reallocation
since the variable on the LHS is not an allocatable.
2024-11-05 11:59:08 -08:00
Valentin Clement (バレンタイン クレメン)
652db7e4ff
[flang][cuda] Support data transfer from pointer to a descriptor (#114892)
When source is a pointer to an array or a scalar, embox it and use the
`CUFDataTransferDescDesc` or `CUFDataTransferGlobalDescDesc` entry
points. The runtime already handles the corner cases, such as
non-contiguous arrays, so we rely on it.

Memset might still be used for simple cases, for example when we want to
initialize to 0. This will come in a follow-up patch.
2024-11-05 08:56:19 -08:00
Valentin Clement (バレンタイン クレメン)
9d09c6fd9c
[flang][cuda] Update device descriptor on data transfer (#114838)
When the destination of the data transfer is a global we might need to
sync the descriptor after the data transfer is done. This is the case
when the data transfer is from host/device to device as reallocation
might have happened and the descriptor on the device needs to take the
new values written on the host.

A new entry point, `CUFDataTransferGlobalDescDesc`, is added. It performs
the sync when needed.
2024-11-04 13:22:06 -08:00
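The descriptor sync described in #114838 can be pictured with a small host-only sketch. Everything in it is an assumption made for illustration: `ToyDescriptor`, `ToyGlobal`, `ToyTransfer` and `ToyTransferGlobal` are hypothetical names, a `std::vector` stands in for device memory, and none of this reflects the actual flang runtime code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a descriptor: a base address plus an extent.
struct ToyDescriptor {
  const int *base = nullptr;
  std::size_t extent = 0;
};

// Hypothetical global variable with two views of its descriptor.
struct ToyGlobal {
  std::vector<int> storage;  // stands in for the variable's data
  ToyDescriptor hostDesc;    // descriptor updated by the host-side transfer
  ToyDescriptor deviceDesc;  // copy that device code would read
};

// Transfer that may reallocate the destination. It only updates hostDesc,
// so deviceDesc can become stale.
void ToyTransfer(ToyGlobal &g, const std::vector<int> &src) {
  g.storage = src;
  g.hostDesc.base = g.storage.data();
  g.hostDesc.extent = g.storage.size();
}

// Transfer followed by a sync of the device-side copy of the descriptor.
void ToyTransferGlobal(ToyGlobal &g, const std::vector<int> &src) {
  ToyTransfer(g, src);
  g.deviceDesc = g.hostDesc;
}
```

In this sketch, calling `ToyTransfer` alone after a size change leaves `deviceDesc` with the old extent, while `ToyTransferGlobal` keeps both copies in agreement.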
Valentin Clement (バレンタイン クレメン)
c949500d51
[flang][cuda] Fix not declared terminator (#114866) 2024-11-04 12:38:02 -08:00
Valentin Clement (バレンタイン クレメン)
51f7e98d59
[flang][cuda] Crash if mode is not handled (#114842) 2024-11-04 11:47:19 -08:00
Valentin Clement (バレンタイン クレメン)
32473864cb
[flang][cuda] Data transfer with descriptor (#114598)
Reopen PR #114302 as it was automatically closed. 

Review in #114302
2024-11-01 12:35:48 -07:00
Valentin Clement (バレンタイン クレメン)
e4e9fea71e
[flang][cuda] Pass descriptor by reference for CUFMemsetDescriptor (#114338) 2024-10-31 09:02:59 -07:00
Renaud Kauffmann
bfe486fe76
Passing descriptors by reference to CUDA runtime calls (#114288)
Passing a descriptor as a `const Descriptor &` or a `const Descriptor *`
generates a FIR signature where the box is passed by value.
This is an issue, as it requires a load of the box to be passed. But
since, ultimately, all boxes are passed by reference, a temporary is
generated in LLVM and the reference to the temporary is passed.

The box addresses are registered with the CUDA runtime but the
temporaries are not, which prevents the runtime from properly mapping a
host-side address to its device-side counterpart.

To address this issue, this PR changes the signatures of the transfer
functions to pass a descriptor as a `Descriptor *`, which will in turn
generate a FIR signature that takes a box reference as an argument.
2024-10-30 13:24:47 -07:00
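The address-mapping problem described in #114288 can be sketched in plain C++. The names here (`ToyDescriptor`, `ToyRegistry`, `LookupByValue`, `LookupByPointer`) are hypothetical and not part of flang or the CUDA runtime. The by-value parameter only models the temporary that the generated code creates; in the real case the copy comes from the FIR lowering, not from the C++ signature itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a Fortran descriptor (box).
struct ToyDescriptor {
  void *base;
  std::size_t bytes;
};

// Hypothetical registry that maps the host address of a descriptor to a
// device-side address, keyed purely on the host pointer value.
class ToyRegistry {
public:
  void Register(const ToyDescriptor *host, std::uintptr_t device) {
    assert(host != nullptr);
    table_[host] = device;
  }

  // Returns 0 when the host address was never registered.
  std::uintptr_t Lookup(const ToyDescriptor *host) const {
    auto it = table_.find(host);
    return it == table_.end() ? 0 : it->second;
  }

private:
  std::unordered_map<const ToyDescriptor *, std::uintptr_t> table_;
};

// Models the temporary: the callee sees a copy at a different address,
// so the lookup misses.
std::uintptr_t LookupByValue(const ToyRegistry &registry, ToyDescriptor desc) {
  return registry.Lookup(&desc);
}

// Models the fix: the callee receives the registered address itself.
std::uintptr_t LookupByPointer(const ToyRegistry &registry,
                               const ToyDescriptor *desc) {
  return registry.Lookup(desc);
}
```

With a descriptor registered under its own address, `LookupByPointer` finds the device counterpart, while `LookupByValue` returns 0 because the copy lives at an unregistered address.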
Valentin Clement (バレンタイン クレメン)
fa627d98e8
[flang][cuda] Add entry point for alloc/free and simple copy (#109867)
These will be used to translate simple cuf.alloc/cuf.free and
cuf.data_transfer on scalar and constant size arrays.
2024-09-24 20:00:11 -07:00
Valentin Clement (バレンタイン クレメン)
bc54e5636f
[flang][cuda] Add new entry points function for data transfer (#108244)
Add new entry points for more complex data transfer involving
descriptors. These functions will be called when converting
`cuf.data_transfer` operations.
2024-09-16 09:45:44 -07:00