llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	c63dced93b	[OpenMP][JIT] Introduce support for AMDGPU. To JIT kernels for AMDGPUs we need to provide the architecture, the triple, and a post-link callback. The first two are simple; the last one is a little more complicated since we need to invoke `lld`. There is a library interface, but it requires the lld library, which is not generally available, so we go with the executable for now. Either way, we need to manifest the (amdgcn) object file and read the output from another file. We should try to avoid that in the future. The options for `lld` are copied from the way clang invokes it. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D140720	2023-01-04 10:14:27 -08:00
Johannes Doerfert	93e75714cd	[OpenMP][AMDGPU][NFC] Improve error message for errors	2023-01-03 17:09:32 -08:00
Johannes Doerfert	428bc510bf	[OpenMP] Unify "exec_mode" query code and default to SPMD. Defaulting to Generic mode doesn't make much sense as the kernel needs to be prepared for it. SPMD mode is the "native" execution, e.g., for "bare" kernels. It also is the execution method for constructors and destructors (as we might otherwise throw an extra warp onto them). Differential Revision: https://reviews.llvm.org/D140718	2023-01-03 16:58:13 -08:00
Kevin Sala	e5354a2bfa	[OpenMP][libomptarget] Centralize host pinned buffers map to NextGen's PluginInterface. This patch moves the management/tracking of host pinned buffers to the common PluginInterface in NextGen plugins. For the moment, the management consists of tracking the host pinned allocations into a map in each device. Differential Revision: https://reviews.llvm.org/D140502	2022-12-22 02:11:05 +01:00
Kevin Sala	a487e0ffde	[NFC][OpenMP][libomptarget] Return null if error detected during allocation in NextGen AMDGPU	2022-12-22 01:46:33 +01:00
Johannes Doerfert	e3d9a448c5	[OpenMP] Account for dynamic shared memory in the AMDGPU nextgen plugin	2022-12-19 19:09:44 -08:00
Johannes Doerfert	fb2c42df41	[OpenMP] Improve AMDGPU Plugin. With this patch we: (1) pick more sensible defaults for the number of teams, inspired by the old plugin and configured via LIBOMPTARGET_AMDGPU_TEAMS_PER_CU; (2) check the input signal of a kernel launch late, after the queue lock was taken, to avoid a barrier packet more often; (3) copy the kernel arguments in one swoop into the appropriate memory; (4) manually specialize the callbacks to avoid potential indirect calls.	2022-12-19 19:09:43 -08:00
Kevin Sala	6bbf9c0cca	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior. This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: (1) retrieve the maximum number of threads per group that a kernel can run, which requires reading the image; (2) implement __tgt_rtl_sync_event, which is not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-17 00:01:24 +01:00
Kevin Sala	a66826a233	Revert "[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior" This reverts commit 87e6b96b0009983996bfe0aa27d358008c1d1087.	2022-12-16 11:53:45 +01:00
Kevin Sala	87e6b96b00	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior. This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: (1) retrieve the maximum number of threads per group that a kernel can run, which requires reading the image; (2) implement __tgt_rtl_sync_event, which is not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-16 00:30:43 +01:00
Kevin Sala	39fe657b66	[OpenMP][libomptarget] Add utility header for AMDGPU plugins. This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. The original and the NextGen plugin will share some structures and functionalities. We use this header for defining them and avoiding code duplication. Differential Revision: https://reviews.llvm.org/D139792	2022-12-15 21:06:04 +01:00

11 Commits