33 Commits

Author SHA1 Message Date
Joseph Huber
2d588461bc [Libomptarget] Add more moves to expected conversion
Summary:
Fixes other instances of the same problem in the previous patch.
2023-01-06 09:09:45 -06:00
Joseph Huber
75c03596b8 [Libomptarget] Add move to expected conversion
Summary:
These implicit conversions from move-only types to expected seem to only
work with newer compilers. This should hopefully fix it.
2023-01-06 09:09:45 -06:00
Johannes Doerfert
ccc1324120 Introduce environment variables to deal with JIT IR
We can now dump the IR before and after JIT optimizations into the
files passed via `LIBOMPTARGET_JIT_PRE_OPT_IR_MODULE` and
`LIBOMPTARGET_JIT_POST_OPT_IR_MODULE`, respectively.

Similarly, users can set `LIBOMPTARGET_JIT_REPLACEMENT_MODULE` to
replace the IR in the image with a custom IR module in a file.
All options take file paths, documentation was added.

Reviewed by: tianshilei1992

Differential revision: https://reviews.llvm.org/D140945
2023-01-05 00:17:46 -08:00
Johannes Doerfert
5524952c14 [OpenMP][JIT][FIX] Create the default O0 pipeline for -O0 2023-01-03 17:07:52 -08:00
Johannes Doerfert
428bc510bf [OpenMP] Unify "exec_mode" query code and default to SPMD
Defaulting to Generic mode doesn't make much sense as the kernel needs
to be prepared for it. SPMD mode is the "native" execution, e.g., for
"bare" kernels. It also is the execution method for constructors and
destructors (as we might otherwise throw an extra warp onto them).

Differential Revision: https://reviews.llvm.org/D140718
2023-01-03 16:58:13 -08:00
Doru Bercea
86dc7de8ff Fix initializer name. 2023-01-03 12:45:28 -06:00
Ron Lieberman
750e1c8dbd Revert "[libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support."
This reverts commit 135f6a1ee8b20bb392ebad2fa5aef78e3a30ddb4.
2023-01-03 12:26:39 -06:00
Ron Lieberman
135f6a1ee8 [libomptarget][plugin-nextgen] fix for [TypePromotion] NewPM support. 2023-01-03 11:04:13 -06:00
Kevin Sala
339d810a0f [OpenMP][libomptarget] Add TargetParser as dependency in NextGen's JIT
This patch fixes an undefined reference to llvm::Triple::Triple(llvm::Twine const&).

Differential Revision: https://reviews.llvm.org/D140810
2023-01-01 13:29:30 +01:00
Shilei Tian
75019f18bd [OpenMP][JIT] Fixed a couple of issues in the initial implementation of JIT
This patch fixes a couple of issues:
1. Instead of using `llvm_unreachable` for those base virtual functions, unknown
   value will be returned. The previous method could cause runtime error for those
   targets where the image is not compatible but JIT is not implemented.
2. Fixed the type in CMake that causes the `Target` CMake variable is undefined.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D140732
2022-12-28 14:40:59 -05:00
Shilei Tian
5a3a527f8a [OpenMP] Introduce basic JIT support to OpenMP target offloading
This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs.

The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. 02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432) shows how it roughly works.

As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, when compiling, `-foffload-lto` is needed, and when linking, `-foffload-lto -Wl,--embed-bitcode` is needed. That implies that, LTO is required to enable JIT mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D139287
2022-12-27 22:19:05 -05:00
Shilei Tian
95956bd896 Revert "[OpenMP] Introduce basic JIT support to OpenMP target offloading"
This reverts commit 58906e4901ec5b7ed230d7fa96123654f6a974af because it breaks AMD's buildbot.
2022-12-27 21:52:07 -05:00
Shilei Tian
58906e4901 [OpenMP] Introduce basic JIT support to OpenMP target offloading
This patch adds the basic JIT support for OpenMP. Currently it only works on Nvidia GPUs.

The support for AMDGPU can be extended easily by just implementing three interface functions. However, the infrastructure requires a small extra extension (add a pre process hook) to support portability for AMDGPU because the AMDGPU backend reads target features of functions. 02bc7effcc (diff-321c2038035972ad4994ff9d85b29950ba72c08a79891db5048b8f5d46915314R432) shows how it roughly works.

As for the test, even though I added the corresponding code in CMake files, the test still cannot be triggered because some code is missing in the new plugin CMake file, which has nothing to do with this patch. It will be fixed later.

In order to enable JIT mode, when compiling, `-foffload-lto` is needed, and when linking, `-foffload-lto -Wl,--embed-bitcode` is needed. That implies that, LTO is required to enable JIT mode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D139287
2022-12-27 19:07:32 -05:00
Shilei Tian
a82e5825e0 [NFC][OpenMP] Fix compile warning caused by using std::move on a local object on a return statement 2022-12-23 10:42:29 -05:00
Kevin Sala
e5354a2bfa [OpenMP][libomptarget] Centralize host pinned buffers map to NextGen's PluginInterface
This patch moves the management/tracking of host pinned buffers to the common PluginInterface
in NextGen plugins. For the moment, the management consists of tracking the host pinned
allocations into a map in each device.

Differential Revision: https://reviews.llvm.org/D140502
2022-12-22 02:11:05 +01:00
Johannes Doerfert
fb2c42df41 [OpenMP] Improve AMDGPU Plugin
With this patch we:
- pick more sensible defaults for the number of teams, inspired by the
  old plugin, and configured via LIBOMPTARGET_AMDGPU_TEAMS_PER_CU.
- check the input signal of a kernel launch late, after the queue lock
  was taken, to avoid a barrier packet more often.
- copy the kernel arguments in one swoop into the appropriate memory.
- manually specialize the callbacks to avoid potential indirect calls.
2022-12-19 19:09:43 -08:00
Ye Luo
ee3d9ee49c [OpenMP] Change the nextgen plugin kernel thread count scheme as old plugins'
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D140352
2022-12-19 18:27:02 -06:00
Guilherme Valarini
89c82c8394 [OpenMP] Add non-blocking support for target nowait regions
This patch better integrates the target nowait functions with the tasking runtime. It splits the nowait execution into two stages: a dispatch stage, which triggers all the necessary asynchronous device operations and stores a set of post-processing procedures that must be executed after said ops; and a synchronization stage, responsible for synchronizing the previous operations in a non-blocking manner and running the appropriate post-processing functions. Suppose during the synchronization stage the operations are not completed. In that case, the attached hidden helper task is re-enqueued to any hidden helper thread to be later synchronized, allowing other target nowait regions to be concurrently dispatched.

Reviewed By: jdoerfert, tianshilei1992

Differential Revision: https://reviews.llvm.org/D132005
2022-12-14 14:03:32 -03:00
Shilei Tian
59ae452983 [OpenMP] Refactor CMake files related to PluginInterface in plugins-nextgen
This patch uses refactors CMake files related to `PluginInterface` in `plugins-nextgen` to handle LLVM dependences in a better way.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D139371
2022-12-06 17:39:41 -05:00
Roman Lebedev
aa6ea6009f
Revert "[OpenMP] Use add_llvm_library to build the target PluginInterface in plugins-nextgen"
This is still not working for me:
```
-- Configuring done
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.amdgpu" which requires target "elf_common" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda" which requires target "elf_common" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64" which requires target "elf_common" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "elf_common" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "PluginInterface" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "elf_common" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "PluginInterface" that is not in any export set.
-- Generating done
```

This reverts commit e682a76c3bf61c52628d79d6ec4db221430768c0.
2022-12-06 20:47:20 +03:00
Shilei Tian
e682a76c3b [OpenMP] Use add_llvm_library to build the target PluginInterface in plugins-nextgen
This patch uses `add_llvm_library` to build the target `PluginInterface` since it can handle LLVM dependences much better. One temporary drawback of using this is that currently LLVM CMake macro doesn't support object libraries very well (there was a try a couple years ago but it was reverted later 29e5722949). After switching to that, `CXX_VISIBILITY_PRESET` can not be set correctly, which can cause runtime error that a function call from one plugin could go to another. As a consequence, `PluginInterface` is built as a static library for now. I have asked the question in CMake community (https://discourse.cmake.org/t/set-target-properties-doesnt-work-properly/7016). Once that issue is solved, I'll switch it back to object library. It is not necessarily too bad to use static library, especially `BUILDTREE_ONLY` is already set such that `PluginInterface.a` will not be installed.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D139371
2022-12-06 11:37:37 -05:00
Roman Lebedev
33bcb3dc79
Revert "[OpenMP] Use add_llvm_library to build the target PluginInterface in plugins-nextgen"
Breaks cmake regeneration for me:
```
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "PluginInterface" that is not in any export set.
CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "PluginInterface" that is not in any export set.
```

This reverts commit 08c4081bd3605e1b01a7ccd6accc9052c8966250.
2022-12-06 03:50:18 +03:00
Shilei Tian
08c4081bd3 [OpenMP] Use add_llvm_library to build the target PluginInterface in plugins-nextgen
This patch uses `add_llvm_library` to build the target `PluginInterface` since it can handle LLVM dependences much better. One temporary drawback of using this is that currently LLVM CMake macro doesn't support object libraries very well (there was a try a couple years ago but it was reverted later 29e5722949). After switching to that, `CXX_VISIBILITY_PRESET` can not be set correctly, which can cause runtime error that a function call from one plugin could go to another. As a consequence, `PluginInterface` is built as a static library for now. I have asked the question in CMake community (https://discourse.cmake.org/t/set-target-properties-doesnt-work-properly/7016). Once that issue is solved, I'll switch it back to object library. It is not necessarily too bad to use static library, especially `BUILDTREE_ONLY` is already set such that `PluginInterface.a` will not be installed.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D139371
2022-12-05 19:46:12 -05:00
Kevin Sala
5acee7dd47 [OpenMP][libomptarget] Add hasQueue() function in NextGen plugin's AsyncInfoWrapperTy
This patch prepares the PluginInterface for the new AMDGPU NextGen plugin.

Differential Revision: https://reviews.llvm.org/D139263
2022-12-04 13:24:40 +01:00
Kevin Sala
cea616f847 [OpenMP][libomptarget] Simplify resource managers in NextGen plugins
This patch removes the classes GenericStreamManagerTy and GenericEventManagerTy
from the PluginInterface header.

Differential Revision: https://reviews.llvm.org/D138769
2022-12-03 22:28:34 +01:00
Kevin Sala
2cb83cd288 [OpenMP][libomptarget] Improve NextGen plugin interface for initialization
This patch modifies the PluginInterface to define functions for initializing
and deinitializing GenericPluginTy instances instead of using the constructor
and destructor. This way, we can return errors from these functions. Also, it
defines some functions that each plugin should implement for creating
plugin-specific objects.

This patch prepares the PluginInterface for the new AMDGPU NextGen plugin.

Differential Revision: https://reviews.llvm.org/D138625
2022-12-03 22:25:15 +01:00
Kevin Sala
73a6cd23a4 [OpenMP][libomptarget] Add minor fixes to NextGen plugins
List of fixes:
  - omptarget_device_environment symbol is not mandatory in device images
  - Do not synchronize in ~AsyncInfoWrapperTy() if the async info's queue is null
  - GenericDeviceResourceRef's create() and destroy() require the device as parameter

Differential Revision: https://reviews.llvm.org/D138619
2022-12-03 22:10:31 +01:00
Kevin Sala
4fde81679c [OpenMP][libomptarget] Allow overriding function that gets ELF symbol info
The OpenMP target's NextGen plugins retrieve symbol information in the ELF image
(i.e., address and size) through the ELF section and ELF symbol objects. However,
the images of CUDA programs compute the address differently from the images of
AMDGPU programs:

  - Address for CUDA symbols: image begin + section's offset + symbol's st_value
  - Address for AMDGPU symbols: image + begin + symbol's st_value

Differential Revision: https://reviews.llvm.org/D138604
2022-12-03 21:51:09 +01:00
Vitaly Buka
e42080ae3f [NFC][OpenMP] Remove extra ";" 2022-11-17 23:24:40 -08:00
Vitaly Buka
143050f552 [NFC][OpenMP] Fix warning about non-virtual dtor 2022-11-17 23:24:39 -08:00
Kevin Sala
6bacbea826 [Libomptarget] Build plugins-nextgen/common/PluginInterface with protected visibility
Summary:
This commit sets the default visibility of PluginInterface's symbols (in
nextgen plugins) as protected. This prevents symbols from a plugin
library to be preempted by another plugin library's symbol. It applies
the same fix introduced by D136365.

Issue reported by @ggeorgakoudis.

Differential Revision: https://reviews.llvm.org/D138002
2022-11-16 07:11:38 -06:00
Shilei Tian
4b0c285ef2 [NFC][OpenMP] Fix compile warnings introduced by D134396 2022-10-28 11:22:43 -04:00
Kevin Sala
846904195b [OpenMP][libomptarget] New plugin infrastructure and new CUDA plugin
This patch adds a new infrastructure for OpenMP target plugins. It also implements the CUDA and GenericELF64bit plugins under this new infrastructure. We place the sources in a separate directory named plugins-nextgen, and we build the new plugins as different plugin libraries. The original plugins, which remain untouched, will be used by default. However, the user can change this behavior at run-time through the boolean envar LIBOMPTARGET_NEXTGEN_PLUGINS. If enabled, the libomptarget will try to load the NextGen version of each plugin, falling back to the original if they are not present or valid.

The idea of this new plugin infrastructure is to implement the common parts of target plugins in generic classes (defined in files inside plugins-next/common/PluginInterface folder), and then, each specific plugin defines its own specific classes inheriting from the common ones. In this way, most logic remains on the common interface while reducing the plugin-specific source code. It is also beneficial in the sense that now most code and behavior are the same across the different plugins. As an example, we define classes for a plugin, a device, a device image, a stream manager, etc. The plugin object (a single instance per plugin library) holds different device objects (i.e., one per available device), while these latter are the responsible for managing its own resources.

Most code on this patch is based on the changes made by @jdoerfert (Johannes Doerfert)

Reviewed By: jhuber6, jdoerfert

Differential Revision: https://reviews.llvm.org/D134396
2022-10-27 18:10:14 +00:00