llvm-project

Author	SHA1	Message	Date
Joseph Huber	2d588461bc	[Libomptarget] Add more moves to expected conversion Summary: Fixes other instances of the same problem in the previous patch.	2023-01-06 09:09:45 -06:00
Joseph Huber	75c03596b8	[Libomptarget] Add move to expected conversion Summary: These implicit conversions from move-only types to expected seem to only work with newer compilers. This should hopefully fix it.	2023-01-06 09:09:45 -06:00
Johannes Doerfert	ccc1324120	Introduce environment variables to deal with JIT IR We can now dump the IR before and after JIT optimizations into the files passed via `LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE` and `LIBOMPTARGET_JIT_POST_OPT_IR_MODULE`, respectively. Similarly, users can set `LIBOMPTARGET_JIT_REPLACEMENT_MODULE` to replace the IR in the image with a custom IR module in a file. All options take file paths, documentation was added. Reviewed by: tianshilei1992 Differential revision: https://reviews.llvm.org/D140945	2023-01-05 00:17:46 -08:00
Johannes Doerfert	c63dced93b	[OpenMP][JIT] Introduce support for AMDGPU To JIT kernels for AMDGPUs we need to provide the architecture, the triple, and a post-link callback. The first two are simple, the last one is a little more complicated since we need to invoke `lld`. There is some library interface but for that we need the lld library, which is not generally available, thus we go with the executable for now. In either way we need to manifest the (amdgcn) object file and read the output from another file. We should try to avoid that in the future. The options for `lld` are copied from the way clang invokes it. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D140720	2023-01-04 10:14:27 -08:00
Johannes Doerfert	93e75714cd	[OpenMP][AMDGPU][NFC] Improve error message for errors	2023-01-03 17:09:32 -08:00
Johannes Doerfert	5524952c14	[OpenMP][JIT][FIX] Create the default O0 pipeline for -O0	2023-01-03 17:07:52 -08:00
Johannes Doerfert	428bc510bf	[OpenMP] Unify "exec_mode" query code and default to SPMD Defaulting to Generic mode doesn't make much sense as the kernel needs to be prepared for it. SPMD mode is the "native" execution, e.g., for "bare" kernels. It also is the execution method for constructors and destructors (as we might otherwise throw an extra warp onto them). Differential Revision: https://reviews.llvm.org/D140718	2023-01-03 16:58:13 -08:00
Doru Bercea	86dc7de8ff	Fix initializer name.	2023-01-03 12:45:28 -06:00
Ron Lieberman	750e1c8dbd	Revert "[libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support." This reverts commit 135f6a1ee8b20bb392ebad2fa5aef78e3a30ddb4.	2023-01-03 12:26:39 -06:00
Ron Lieberman	135f6a1ee8	[libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support.	2023-01-03 11:04:13 -06:00
Kevin Sala	339d810a0f	[OpenMP][libomptarget] Add TargetParser as dependency in NextGen's JIT This patch fixes an undefined reference to llvm::Triple::Triple(llvm::Twine const&). Differential Revision: https://reviews.llvm.org/D140810	2023-01-01 13:29:30 +01:00
Shilei Tian	75019f18bd	[OpenMP][JIT] Fixed a couple of issues in the initial implementation of JIT This patch fixes a couple of issues: 1. Instead of using `llvm_unreachable` for those base virtual functions, an unknown value is now returned. The previous method could cause a runtime error for targets where the image is not compatible but JIT is not implemented. 2. Fixed the typo in CMake that caused the `Target` CMake variable to be undefined. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D140732	2022-12-28 14:40:59 -05:00
Shilei Tian	5a3a527f8a	[OpenMP] Introduce basic JIT support to OpenMP target offloading This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs. The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. `02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432)` shows how it roughly works. As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later. In order to enable JIT mode, when compiling, `-foffload-lto` is needed, and when linking, `-foffload-lto -Wl,--embed-bitcode` is needed. That implies that, LTO is required to enable JIT mode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D139287	2022-12-27 22:19:05 -05:00
Shilei Tian	95956bd896	Revert "[OpenMP] Introduce basic JIT support to OpenMP target offloading" This reverts commit 58906e4901ec5b7ed230d7fa96123654f6a974af because it breaks AMD's buildbot.	2022-12-27 21:52:07 -05:00
Shilei Tian	58906e4901	[OpenMP] Introduce basic JIT support to OpenMP target offloading This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs. The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. `02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432)` shows how it roughly works. As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later. In order to enable JIT mode, when compiling, `-foffload-lto` is needed, and when linking, `-foffload-lto -Wl,--embed-bitcode` is needed. That implies that, LTO is required to enable JIT mode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D139287	2022-12-27 19:07:32 -05:00
Shilei Tian	a82e5825e0	[NFC][OpenMP] Fix compile warning caused by using `std::move` on a local object on a `return` statement	2022-12-23 10:42:29 -05:00
Kevin Sala	e5354a2bfa	[OpenMP][libomptarget] Centralize host pinned buffers map to NextGen's PluginInterface This patch moves the management/tracking of host pinned buffers to the common PluginInterface in NextGen plugins. For the moment, the management consists of tracking the host pinned allocations into a map in each device. Differential Revision: https://reviews.llvm.org/D140502	2022-12-22 02:11:05 +01:00
Kevin Sala	a487e0ffde	[NFC][OpenMP][libomptarget] Return null if error detected during allocation in NextGen AMDGPU	2022-12-22 01:46:33 +01:00
Johannes Doerfert	e3d9a448c5	[OpenMP] Account for dynamic shared memory in the AMDGPU nextgen plugin	2022-12-19 19:09:44 -08:00
Johannes Doerfert	fb2c42df41	[OpenMP] Improve AMDGPU Plugin With this patch we: - pick more sensible defaults for the number of teams, inspired by the old plugin, and configured via LIBOMPTARGET_AMDGPU_TEAMS_PER_CU. - check the input signal of a kernel launch late, after the queue lock was taken, to avoid a barrier packet more often. - copy the kernel arguments in one swoop into the appropriate memory. - manually specialize the callbacks to avoid potential indirect calls.	2022-12-19 19:09:43 -08:00
Ye Luo	ee3d9ee49c	[OpenMP] Change the nextgen plugin kernel thread count scheme as old plugins' Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D140352	2022-12-19 18:27:02 -06:00
Johannes Doerfert	77197b5651	[OpenMP] Export `ompx::` symbols from the device runtime Differential Revision: https://reviews.llvm.org/D140335	2022-12-19 14:46:54 -08:00
Johannes Doerfert	2b5a99b3d9	[OpenMP] Rename the `_OMP` namespace in the device runtime to `ompx` Differential Revision: https://reviews.llvm.org/D140334	2022-12-19 14:43:59 -08:00
Kevin Sala	6bbf9c0cca	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: - Retrieve the maximum number of threads per group that a kernel can run. This requires reading the image. - Implement __tgt_rtl_sync_event, not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-17 00:01:24 +01:00
Kevin Sala	7b97941721	[OpenMP][libomptarget] Add missing symbols in dynamic_hsa This patch prepares for the new AMDGPU NextGen plugin. Differential Revision: https://reviews.llvm.org/D140213	2022-12-17 00:01:24 +01:00
Joseph Huber	d8b0f007cb	[libomptarget] Add HSA definitions for memory faults to dynamic_hsa Summary: We use the dynamic HSA file to forward declare needed definitions from the HSA runtime if not present at build time. These definitions were not included so using them caused problems on systems without it if used. Just add them.	2022-12-16 07:06:44 -06:00
Kevin Sala	a66826a233	Revert "[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior" This reverts commit 87e6b96b0009983996bfe0aa27d358008c1d1087.	2022-12-16 11:53:45 +01:00
Kevin Sala	87e6b96b00	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: - Retrieve the maximum number of threads per group that a kernel can run. This requires reading the image. - Implement __tgt_rtl_sync_event, not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-16 00:30:43 +01:00
Kevin Sala	39fe657b66	[OpenMP][libomptarget] Add utility header for AMDGPU plugins This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. The original and the NextGen plugin will share some structures and functionalities. We use this header for defining them and avoiding code duplication. Differential Revision: https://reviews.llvm.org/D139792	2022-12-15 21:06:04 +01:00
Guilherme Valarini	89c82c8394	[OpenMP] Add non-blocking support for target nowait regions This patch better integrates the target nowait functions with the tasking runtime. It splits the nowait execution into two stages: a dispatch stage, which triggers all the necessary asynchronous device operations and stores a set of post-processing procedures that must be executed after said ops; and a synchronization stage, responsible for synchronizing the previous operations in a non-blocking manner and running the appropriate post-processing functions. Suppose during the synchronization stage the operations are not completed. In that case, the attached hidden helper task is re-enqueued to any hidden helper thread to be later synchronized, allowing other target nowait regions to be concurrently dispatched. Reviewed By: jdoerfert, tianshilei1992 Differential Revision: https://reviews.llvm.org/D132005	2022-12-14 14:03:32 -03:00
Guilherme Valarini	63efc58c5a	[NFC][OpenMP] Add missing LLVM headers on utility file Differential Revision: https://reviews.llvm.org/D137566	2022-12-14 12:46:00 -03:00
Johannes Doerfert	90609fb68f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime Differential Revision: https://reviews.llvm.org/D136903	2022-12-13 18:44:19 -08:00
Jon Chesterfield	56ec7ce80d	[openmp][amdgpu] Let fine grain and kernarg pools differ	2022-12-14 02:04:21 +00:00
Johannes Doerfert	f9c29878b0	Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime" This reverts commit c1c8cbbf5f29257d084a23a2f6c4236c40b7afb9. One of the tests seems to be flaky/non-deterministic.	2022-12-12 22:08:28 -08:00
Johannes Doerfert	c1c8cbbf5f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime	2022-12-12 20:55:36 -08:00
Ye Luo	d3ebce9362	[OpenMP] add offload tests with reduction on complex data types Differential Revision: https://reviews.llvm.org/D139856	2022-12-12 11:48:35 -06:00
Shilei Tian	3eef428948	Revert "[OpenMP] Add `abort` to `FATAL_MESSAGE`" This reverts commit ac65b3c7a2ad67ce17d31dc14867dc83650f751e.	2022-12-11 22:46:56 -05:00
Shilei Tian	ac65b3c7a2	[OpenMP] Add `abort` to `FATAL_MESSAGE`	2022-12-11 22:41:22 -05:00
Kevin Sala	bbcffb08f0	[OpenMP][libomptarget] Add utility class for reference counting The AMDGPU NextGen plugin will use this class for counting the references of some device resources. Differential Revision: https://reviews.llvm.org/D139787	2022-12-11 21:39:25 +01:00
Dhruva Chakrabarti	aa4c0f116c	[OpenMP] [OMPT] [3/8] Implemented callback registration in libomptarget The purpose of this patch is to have tool-provided callbacks registered in libomptarget. The overall design document is in https://rice.app.box.com/s/pf3gix2hs4d4o1aatwir1set05xmjljc Defined a class OmptDeviceCallbacksTy that will be used by libomptarget and a plugin for callbacks registered by a tool. Once the callbacks are registered in libomp, a lookup function is passed to libomptarget that is used to retrieve the callbacks and register them in libomptarget. Patch from John Mellor-Crummey <johnmc@rice.edu> (With contributions from Dhruva Chakrabarti <Dhruva.Chakrabarti@amd.com>) Reviewed By: jplehr, tianshilei1992 Differential Revision: https://reviews.llvm.org/D123974	2022-12-08 11:43:10 -08:00
Shilei Tian	59ae452983	[OpenMP] Refactor CMake files related to `PluginInterface` in `plugins-nextgen` This patch refactors the CMake files related to `PluginInterface` in `plugins-nextgen` to handle LLVM dependences in a better way. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D139371	2022-12-06 17:39:41 -05:00
Roman Lebedev	aa6ea6009f	Revert "[OpenMP] Use `add_llvm_library` to build the target `PluginInterface` in `plugins-nextgen`" This is still not working for me: ``` -- Configuring done CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.amdgpu" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "PluginInterface" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "PluginInterface" that is not in any export set. -- Generating done ``` This reverts commit e682a76c3bf61c52628d79d6ec4db221430768c0.	2022-12-06 20:47:20 +03:00
Shilei Tian	e682a76c3b	[OpenMP] Use `add_llvm_library` to build the target `PluginInterface` in `plugins-nextgen` This patch uses `add_llvm_library` to build the target `PluginInterface` since it can handle LLVM dependences much better. One temporary drawback of using this is that currently LLVM CMake macro doesn't support object libraries very well (there was a try a couple years ago but it was reverted later `29e5722949`). After switching to that, `CXX_VISIBILITY_PRESET` can not be set correctly, which can cause runtime error that a function call from one plugin could go to another. As a consequence, `PluginInterface` is built as a static library for now. I have asked the question in CMake community (https://discourse.cmake.org/t/set-target-properties-doesnt-work-properly/7016). Once that issue is solved, I'll switch it back to object library. It is not necessarily too bad to use static library, especially `BUILDTREE_ONLY` is already set such that `PluginInterface.a` will not be installed. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D139371	2022-12-06 11:37:37 -05:00
Roman Lebedev	33bcb3dc79	Revert "[OpenMP] Use `add_llvm_library` to build the target `PluginInterface` in `plugins-nextgen`" Breaks cmake regeneration for me: ``` CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "PluginInterface" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "PluginInterface" that is not in any export set. ``` This reverts commit 08c4081bd3605e1b01a7ccd6accc9052c8966250.	2022-12-06 03:50:18 +03:00
Shilei Tian	08c4081bd3	[OpenMP] Use `add_llvm_library` to build the target `PluginInterface` in `plugins-nextgen` This patch uses `add_llvm_library` to build the target `PluginInterface` since it can handle LLVM dependences much better. One temporary drawback of using this is that currently LLVM CMake macro doesn't support object libraries very well (there was a try a couple years ago but it was reverted later `29e5722949`). After switching to that, `CXX_VISIBILITY_PRESET` can not be set correctly, which can cause runtime error that a function call from one plugin could go to another. As a consequence, `PluginInterface` is built as a static library for now. I have asked the question in CMake community (https://discourse.cmake.org/t/set-target-properties-doesnt-work-properly/7016). Once that issue is solved, I'll switch it back to object library. It is not necessarily too bad to use static library, especially `BUILDTREE_ONLY` is already set such that `PluginInterface.a` will not be installed. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D139371	2022-12-05 19:46:12 -05:00
Kevin Sala	5acee7dd47	[OpenMP][libomptarget] Add hasQueue() function in NextGen plugin's AsyncInfoWrapperTy This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. Differential Revision: https://reviews.llvm.org/D139263	2022-12-04 13:24:40 +01:00
Kevin Sala	cea616f847	[OpenMP][libomptarget] Simplify resource managers in NextGen plugins This patch removes the classes GenericStreamManagerTy and GenericEventManagerTy from the PluginInterface header. Differential Revision: https://reviews.llvm.org/D138769	2022-12-03 22:28:34 +01:00
Kevin Sala	2cb83cd288	[OpenMP][libomptarget] Improve NextGen plugin interface for initialization This patch modifies the PluginInterface to define functions for initializing and deinitializing GenericPluginTy instances instead of using the constructor and destructor. This way, we can return errors from these functions. Also, it defines some functions that each plugin should implement for creating plugin-specific objects. This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. Differential Revision: https://reviews.llvm.org/D138625	2022-12-03 22:25:15 +01:00
Kevin Sala	73a6cd23a4	[OpenMP][libomptarget] Add minor fixes to NextGen plugins List of fixes: - omptarget_device_environment symbol is not mandatory in device images - Do not synchronize in ~AsyncInfoWrapperTy() if the async info's queue is null - GenericDeviceResourceRef's create() and destroy() require the device as parameter Differential Revision: https://reviews.llvm.org/D138619	2022-12-03 22:10:31 +01:00
Kevin Sala	4fde81679c	[OpenMP][libomptarget] Allow overriding function that gets ELF symbol info The OpenMP target's NextGen plugins retrieve symbol information in the ELF image (i.e., address and size) through the ELF section and ELF symbol objects. However, the images of CUDA programs compute the address differently from the images of AMDGPU programs: - Address for CUDA symbols: image begin + section's offset + symbol's st_value - Address for AMDGPU symbols: image begin + symbol's st_value Differential Revision: https://reviews.llvm.org/D138604	2022-12-03 21:51:09 +01:00
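
The two address rules quoted in the last entry (4fde81679c) can be sketched as plain pointer arithmetic. This is only an illustration and not the code from the patch: `SectionInfo`, `SymbolInfo`, `cudaSymbolAddress`, and `amdgpuSymbolAddress` are hypothetical names, and the AMDGPU rule is read here as "image begin + symbol's st_value", which is an assumption about the intended wording.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the ELF fields named in the commit message.
// They are NOT the types used by libomptarget.
struct SectionInfo {
  std::uint64_t Offset; // the section's offset within the image
};

struct SymbolInfo {
  std::uint64_t Value; // the symbol's st_value
  std::uint64_t Size;  // the symbol's size
};

// CUDA images: image begin + section's offset + symbol's st_value.
inline const void *cudaSymbolAddress(const void *ImageBegin,
                                     const SectionInfo &Sec,
                                     const SymbolInfo &Sym) {
  assert(ImageBegin && "image must be mapped");
  return static_cast<const std::uint8_t *>(ImageBegin) + Sec.Offset + Sym.Value;
}

// AMDGPU images (assumed reading): image begin + symbol's st_value.
inline const void *amdgpuSymbolAddress(const void *ImageBegin,
                                       const SymbolInfo &Sym) {
  assert(ImageBegin && "image must be mapped");
  return static_cast<const std::uint8_t *>(ImageBegin) + Sym.Value;
}
```

In this sketch the only difference between the two targets is whether the section offset is added, which is presumably why the patch makes the lookup function overridable per plugin rather than fixing one formula in the shared code.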

1122 Commits