llvm-project

Author	SHA1	Message	Date
Valentin Clement (バレンタインクレメン)	4cb2a519db	Revert "Reland '[flang] Allow to pass an async id to allocate the descriptor (#118713)' and #118733" (#121029) This still causes issues for the device runtime build.	2024-12-23 21:27:34 -08:00
Valentin Clement (バレンタインクレメン)	5b74fb75d9	Reland '[flang] Allow to pass an async id to allocate the descriptor (#118713)' and #118733 (#120997) The device runtime build has been fixed. This is an attempt to re-land these patches, which were approved before. https://github.com/llvm/llvm-project/pull/118713 https://github.com/llvm/llvm-project/pull/118733	2024-12-23 12:13:56 -08:00
Valentin Clement (バレンタインクレメン)	16c2a1016e	Revert "[flang] Allow to pass an async id to allocate the descriptor (#118713)" (#119109) This reverts commit 7d1c661381d36018fd105f4ad4c2d6dc45e7288b. This commit breaks some device runtime builds. Need time to investigate.	2024-12-07 19:55:12 -08:00
Valentin Clement (バレンタインクレメン)	83ccaad473	[flang][cuda] Use async id for device stream allocation (#118733) When a stream is specified, use cudaMallocAsync with the specified stream.	2024-12-05 08:57:10 -08:00
Valentin Clement (バレンタインクレメン)	7d1c661381	[flang] Allow to pass an async id to allocate the descriptor (#118713) This patch prepares for supporting the stream-ordered memory allocator in CUDA Fortran. It adds an asynchronous id to the AllocatableAllocate runtime function and to Descriptor::Allocate so that it can be passed down to the registered allocator. It is up to the allocator whether to use this value. A follow-up patch will implement that asynchronous allocator for CUDA Fortran.	2024-12-04 18:24:40 -08:00
Valentin Clement (バレンタインクレメン)	cdf447baa5	[flang][cuda] Add function to allocate and deallocate device module variable (#109213) This patch adds new runtime entry points that perform the simple allocation/deallocation of module allocatable variables with CUDA attributes. When the allocation is initiated on the host, the descriptor on the device is synchronized. Both descriptors point to the same data on the device. This is the first PR of a stack.	2024-09-18 20:22:06 -07:00
Valentin Clement	743e99dcf5	Reland "[flang][cuda] Use cuda runtime API #103488" CUDA Fortran is meant to be an equivalent to the runtime API. Therefore, it makes more sense to use the CUDA runtime API in the allocators for CUF.	2024-08-14 14:56:00 -07:00
Valentin Clement (バレンタインクレメン)	f6e3dbc27d	Revert "[flang][cuda] Use cuda runtime API" (#104232) Reverts llvm/llvm-project#103488	2024-08-14 13:44:49 -07:00
Valentin Clement (バレンタインクレメン)	00ab8a6a4c	[flang][cuda] Use cuda runtime API (#103488) CUDA Fortran is meant to be an equivalent to the runtime API. Therefore, it makes more sense to use the CUDA runtime API in the allocators for CUF. @bdudleback	2024-08-14 12:34:45 -07:00
Valentin Clement (バレンタインクレメン)	4c1dbbe7aa	[flang][cuda] Make CUFRegisterAllocator callable from C/Fortran (#102543)	2024-08-08 17:09:53 -07:00
Valentin Clement (バレンタインクレメン)	388b63243c	[flang][cuda] Defined allocator for unified data (#102189) CUDA unified variables were set to use the same allocator as managed variables. This patch adds a specific allocator for the unified variables. Currently it calls the managed allocator underneath, but we want the flexibility to change that in the future.	2024-08-06 14:30:31 -07:00
Valentin Clement (バレンタインクレメン)	10d7805c4f	[flang][cuda][NFC] Disambiguate namespace with cuf dialect (#102194) Rename namespace `Fortran::runtime::cuf` to `Fortran::runtime::cuda` to avoid ambiguity with the namespace `::cuf` that is defined in the CUF dialect.	2024-08-06 14:04:45 -07:00
Valentin Clement (バレンタインクレメン)	46425b8d0f	[flang][cuda] Fix allocator-registry header path (#101727) File was moved in #101212	2024-08-02 11:19:36 -07:00
Valentin Clement (バレンタインクレメン)	1417633943	[flang][cuda] Add CUF allocator (#101216) Add allocators for CUDA Fortran allocation on the device. Three allocators are added for pinned, device, and managed/unified memory allocation. `CUFRegisterAllocator()` is called to register the allocators in the allocator registry added in #100690. Since this requires CUDA, a CMake option `FLANG_CUF_RUNTIME` is added to conditionally build these.	2024-08-02 10:02:34 -07:00

14 Commits