llvm-project

Author	SHA1	Message	Date
Joseph Huber	d661aea4c5	[OpenMP] Add support for custom callback in AMDGPUStream (#112785 ) Summary: We have the ability to schedule callbacks after certain events complete. Currently we can register an arbitrary callback in CUDA, but can't in AMDGPU. I am planning on using this support to move the RPC handling to a separate thread, then using these callbacks to suspend / resume it when no kernels are running. This is a preliminary patch to keep this noise out of that one.	2024-10-29 10:18:32 -07:00
Johannes Doerfert	08533a3ee8	[Offload][NFC] Reorganize `utils::` and make Device/Host/Shared clearer (#100280 ) We had three `utils::` namespaces, all with different "meaning" (host, device, hsa_utils). We should, when we can, keep "include/Shared" accessible from host and device, thus RefCountTy has been moved to a separate header. `hsa_utils` was introduced to make `utils::` less overloaded. And common functionality was de-duplicated, e.g., `utils::advance` and `utils::advanceVoidPtr` -> `utils::advancePtr`. Type punning now checks for the size of the result to make sure it matches the source type. No functional change was intended.	2024-09-05 13:36:26 -07:00
Johannes Doerfert	3b7611594f	[Offload] Improve error reporting on memory faults (#104254 ) Since we can already track allocations, we can diagnose memory faults to some degree. If the fault happens in a prior allocation (use after free) or "close but outside" one, we can provide that information to the user. Note that the fault address might be page aligned, and not all accesses trigger a fault, especially for allocations that are backed by a MemoryManager. Still, if people disable the MemoryManager or the allocation is big enough, we can sometimes provide valuable feedback.	2024-08-21 10:01:35 -07:00
Fabian Mora	cfc76b6498	[llvm][offload] Move AMDGPU offload utilities to LLVM (#102487 ) This patch moves utilities from `offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h` to `llvm/Frontend/Offloading/Utility.h` to be reused by other projects. Concretely the following changes were made: - Rename `KernelMetaDataTy` to `AMDGPUKernelMetaData`. - Remove unused fields `KernelObject`, `KernelSegmentSize`, `ExplicitArgumentCount` and `ImplicitArgumentCount` from `AMDGPUKernelMetaData`. - Return the produced error if `ELFObj.sections()` failed instead of using `cantFail`. - Added `AGPRCount` field to `AMDGPUKernelMetaData`. - Added a default invalid value to all the fields in `AMDGPUKernelMetaData`.	2024-08-20 09:03:06 -04:00
Johannes Doerfert	9a1013220b	[Offload] Allow to record kernel launch stack traces (#100472 ) Similar to (de)allocation traces, we can record kernel launch stack traces and display them in case of an error. However, the AMD GPU plugin signal handler, which is invoked on memory faults, cannot pinpoint the offending kernel. Instead, we print `<NUM>` traces, where `<NUM>` is set via `OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=<NUM>`. The recording uses a ring buffer of fixed size (for now 8). For `trap` errors, we print the actual kernel name, and the trace if recorded.	2024-07-31 11:49:50 -07:00
Joseph Huber	8043356380	[Offload] Change HSA header search order (#95769 ) Summary: The HSA headers existed previously in `include/hsa.h` and were moved to `include/hsa/hsa.h` in a later ROCm version. The include headers here were originally designed to favor a newer one. However, this unintentionally prevented the dynamic HSA's `hsa.h` from being used if both were present. This patch changes the order so it will be found first. Related to https://github.com/llvm/llvm-project/pull/95484.	2024-06-17 14:52:50 -05:00
Johannes Doerfert	54b5c76d3b	[Offload] Use flat array for cuLaunchKernel (#95116 ) We already used a flat array of kernel launch parameters for the AMD GPU launch but now we also use this scheme for the NVIDIA GPU launch. The only remaining/required use of the indirection is the host plugin (due to ffi). This allows us to simplify the use for non-OpenMP kernel launches.	2024-06-13 09:43:47 +03:00
Johannes Doerfert	f2120cda7d	[Offload][AMDGPU] Impose more restrictions for implicit kernel arguments (#95211 ) COV3 is not supported anymore, thus we can just use ArgsSize we read from the kernel to determine how many argument bytes we need and if implicit kernel arguments are used.	2024-06-12 16:42:20 +03:00
Joseph Huber	9e209a4a37	[Offload] Use the kernel argument size directly in AMDGPU offloading (#94667 ) Summary: The old COV3 implementation of HSA used to omit the implicit arguments from the kernel argument size. For COV4 and COV5 this is no longer the case so we can simply use the size reported from the symbol information. See https://github.com/ROCm/ROCR-Runtime/issues/117#issuecomment-812758161	2024-06-06 15:19:55 -05:00
Joseph Huber	435aa7663d	[Libomptarget] Rework device initialization and image registration (#93844 ) Summary: Currently, we register images into a linear table according to the logical OpenMP device identifier. We then initialize all of these images as one block. This logic requires that images are compatible with all devices instead of just the one that they can run on. This prevents us from running on systems with heterogeneous devices (i.e. image 1 runs on device 0, image 0 runs on device 1). This patch reworks the logic by instead making the compatibility check a per-device query. We then scan every device to see if it's compatible and do it as they come.	2024-06-06 08:10:56 -05:00
Joseph Huber	e19565c5c4	[Offload][AMDGPU] Only allow memory pool access to valid agents (#93969 ) Summary: Since the next-gen plugins were added, the logic was that every single agent would get access to a memory pool we allocated. This is necessary for things like fine-grained memory and to facilitate d2d copies. However, there are cases where an agent cannot legally access a memory pool. We have a debug check for this, but it would always be triggered in these situations because both uses of the function simply passed every agent. This patch changes the behavior by only enabling memory pool access for agents that can access the memory pool.	2024-05-31 13:34:40 -05:00
Joseph Huber	300e5b9114	[Offload] Fix enabling plugins on unsupported platforms (#93186 ) Summary: Certain plugins can only be built on specific platforms. Previously this didn't cause issues because each one was handled independently. However, now that we link these all directly they need to be in a CMake list. Furthermore we use this list to generate a config file. For this reason these checks are moved to where we normalize the support. Fixes: https://github.com/llvm/llvm-project/issues/93183	2024-05-23 08:06:41 -05:00
Joseph Huber	c618ae1734	[Offload] Rework handling for loading vendor runtimes (#93073 ) Summary: We previously had multiple options for this, this patch replaces them with `LIBOMPTARGET_DLOPEN_PLUGINS=` to be a list of plugins to dynamically use. It defaults to everything right now. This ignores the `host` plugin because the `libffi` dependency is going to be removed soon hopefully in https://github.com/llvm/llvm-project/pull/91264.	2024-05-22 13:04:52 -05:00
Ye Luo	831d143519	[Offload] libomptarget force dlopen vendor libraries by default. (#92788 ) Since #87009, libomptarget directly links all the plugins statically. All the dependencies of the plugins are exposed to libomptarget. The CUDA plugin depends on libcuda and the amdgpu plugin depends on libhsa unless dlopen is forced. On a cluster with different compute node architectures, libomptarget can be built and run on different nodes. In the build stage, if cmake finds libcuda and `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA=OFF`, libomptarget links libcuda.so directly and the resulting libomptarget may not run on a node without an NVIDIA driver, for example a CPU-only or AMD-GPU-only machine, and complains that libcuda.so was not found. The solution is setting `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA` and `LIBOMPTARGET_FORCE_DLOPEN_LIBHSA` to `ON`. Preferably this should be the default to maximize the usability of libomptarget. If cmake detects NVIDIA or AMD software on an OS image building node, the resulting libomptarget may not be able to function on the user side because it requires the vendor runtime libraries to exist.	2024-05-22 09:40:43 -05:00
Joseph Huber	770d928303	[Offload][NFC] Remove 'libomptarget' message helpers (#92581 ) Summary: This isn't `libomptarget` anymore, and these messages were always unnecessary because no other project uses these prefixed messages. The effect of this is that no longer will the logs have `LIBOMPTARGET --` in front of everything. We have a message stating when we start building the offload project so it'll still be trivial to find.	2024-05-17 13:24:32 -05:00
Joseph Huber	16bb7e89a9	[Offload][NFC] Remove all trailing whitespace from offload/ (#92578 ) Summary: This patch cleans up the trailing whitespace in a bunch of tests and CMake files. Most just in preparation for other cleanups.	2024-05-17 13:15:04 -05:00
Joseph Huber	c4017cda00	[Offload][NFC] Remove header license in CMake files (#92544 ) Summary: No other project has these in the CMake itself, and they're wildly inconsistent even within the project. These don't really add anything so I think they should be removed.	2024-05-17 09:05:03 -05:00
Joseph Huber	81d20d861e	[Offload][NFC] Fix warning messages in runtime Summary: These are lots of random warnings due to inconsistent initialization or signedness.	2024-05-15 15:30:38 -05:00
Joseph Huber	c34d1893cb	[Offload] Remove support for old "BUILD_PLUGIN" options. (#91644 ) Summary: Since the move to the statically linked plugins, we added a new way to directly control which plugins will be added. Delete these old ones as they will cause the build to fail and suggest the new format.	2024-05-14 06:00:23 -05:00
Joseph Huber	fa9e90f5d2	[Reland][Libomptarget] Statically link all plugin runtimes (#87009 ) This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_<name>` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of live runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868	2024-05-09 09:38:22 -05:00
Joseph Huber	e5e66073c3	Revert "[Libomptarget] Statically link all plugin runtimes (#87009 )" Caused failures on build-bots, reverting to investigate. This reverts commit 80f9e814ec896fdc57ee84afad8ac4cb1f8e4627.	2024-05-09 07:05:23 -05:00
Joseph Huber	80f9e814ec	[Libomptarget] Statically link all plugin runtimes (#87009 ) This patch overhauls the `libomptarget` and plugin interface. Currently, we define a C API and compile each plugin as a separate shared library. Then, `libomptarget` loads these API functions and forwards its internal calls to them. This was originally designed to allow multiple implementations of a library to be live. However, since then no one has used this functionality and it prevents us from using much nicer interfaces. If the old behavior is desired it should instead be implemented as a separate plugin. This patch replaces the `PluginAdaptorTy` interface with the `GenericPluginTy` that is used by the plugins. Each plugin exports a `createPlugin_<name>` function that is used to get the specific implementation. This code is now shared with `libomptarget`. There are some notable improvements to this. 1. Massively improved lifetimes of live runtime objects 2. The plugins can use a C++ interface 3. Global state does not need to be duplicated for each plugin + libomptarget 4. Easier to use and add features and improve error handling 5. Less function call overhead / Improved LTO performance. Additional changes in this plugin are related to contending with the fact that state is now shared. Initialization and deinitialization is now handled correctly and in phase with the underlying runtime, allowing us to actually know when something is getting deallocated. Depends on https://github.com/llvm/llvm-project/pull/86971 https://github.com/llvm/llvm-project/pull/86875 https://github.com/llvm/llvm-project/pull/86868	2024-05-09 06:35:54 -05:00
Johannes Doerfert	330d8983d2	[Offload] Move `/openmp/libomptarget` to `/offload` (#75125 ) In a nutshell, this moves our libomptarget code to populate the offload subproject. With this commit, users need to enable the new LLVM/Offload subproject as a runtime in their cmake configuration. No further changes are expected for downstream code. Tests and other components still depend on OpenMP and have also not been renamed. The results below are for a build in which OpenMP and Offload are enabled runtimes. In addition to the pure `git mv`, we needed to adjust some CMake files. Nothing is intended to change semantics. ``` ninja check-offload ``` Works with the X86 and AMDGPU offload tests ``` ninja check-openmp ``` Still works but doesn't build offload tests anymore. ``` ls install/lib ``` Shows all expected libraries, incl. - `libomptarget.devicertl.a` - `libomptarget-nvptx-sm_90.bc` - `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git` - `libomptarget.so` -> `libomptarget.so.18git` Fixes: https://github.com/llvm/llvm-project/issues/75124 --------- Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>	2024-04-22 09:51:33 -07:00

23 Commits