Mehdi Amini ea8b1608af
[GPUToLLVM] Support multiple async dependencies in gpu.launch_func lowering (#188987)
LegalizeLaunchFuncOpPattern previously rejected gpu.launch_func ops with
more than one async dependency. This change removes that limitation by
synchronizing additional dependencies onto the primary stream using
CUDA/HIP events, following the same approach already used in
ConvertWaitAsyncOpToGpuRuntimeCallPattern for gpu.wait async.

For each additional async dependency beyond the first:
- If it is a stream (produced by mgpuStreamCreate), create an event,
  record it on that stream, wait for it on the primary stream, then
  destroy the event.
- If it is already an event, wait for it directly on the primary stream
  and destroy it.
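
As a rough illustration only (this is not the pattern's actual code), the
per-dependency sequence described above can be modeled as a small
standalone C++ function that lists the runtime wrapper calls in order.
AsyncDep, DepKind, and syncOntoPrimary are hypothetical names invented for
this sketch, and the argument order shown inside the strings is an
assumption; only the mgpu* wrapper names come from the GPU runtime
wrappers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: an async dependency is either a stream or an event.
enum class DepKind { Stream, Event };

struct AsyncDep {
  DepKind kind;
  std::string name;
};

// Lists, in order, the runtime calls that make `primary` wait on every
// additional dependency. Calls are plain strings; no GPU work is done.
std::vector<std::string> syncOntoPrimary(const std::string &primary,
                                         const std::vector<AsyncDep> &extra) {
  std::vector<std::string> calls;
  std::size_t nextEvent = 0;
  for (const AsyncDep &dep : extra) {
    if (dep.kind == DepKind::Stream) {
      // Stream dependency: bridge it to the primary stream with a
      // temporary event, then destroy that event.
      const std::string ev = "ev" + std::to_string(nextEvent++);
      calls.push_back("mgpuEventCreate() -> " + ev);
      calls.push_back("mgpuEventRecord(" + ev + ", " + dep.name + ")");
      calls.push_back("mgpuStreamWaitEvent(" + primary + ", " + ev + ")");
      calls.push_back("mgpuEventDestroy(" + ev + ")");
    } else {
      // Event dependency: wait on it directly, then destroy it.
      calls.push_back("mgpuStreamWaitEvent(" + primary + ", " + dep.name + ")");
      calls.push_back("mgpuEventDestroy(" + dep.name + ")");
    }
  }
  return calls;
}
```

In this model, a launch on primary stream s0 with one extra stream
dependency and one extra event dependency yields six calls: four for the
stream, followed by two for the event.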

Fixes #156984

Assisted-by: Claude Code
2026-03-27 16:09:19 +00:00