26 Commits

Author SHA1 Message Date
Joseph Huber
e19565c5c4
[Offload][AMDGPU] Only allow memory pool access to valid agents (#93969)
Summary:
The logic since the next-gen plugins was added was that every single
agent would get access to a memory pool we allocated. This is necessary
for things like fine-grained memory and to faciliate d2d copied.
However, there are cases where an agent cannot legally access a memory
pool. We have a debug check for this, but it would always be triggered
in these situations because both uses of the function simply passed
every agent. This patch changes the behavior by only enabling memory
pool access for agents that can access the memory pool.
2024-05-31 13:34:40 -05:00
Joseph Huber
21f3a6091f
[Offload] Only initialize a plugin if it is needed (#92765)
Summary:
Initializing the plugins requires initializing the runtime like CUDA or
HSA. This has a considerable overhead on most platforms, so we should
only actually initialize a plugin if it is needed by any image that is
loaded.
2024-05-23 09:36:47 -05:00
Joseph Huber
300e5b9114
[Offload] Fix enabling plugins on unsupported platforms (#93186)
Summary:
Certain plugins can only be built on specific platforms. Previously this
didn't cause issues becaues each one was handled independently. However,
now that we link these all directly they need to be in a CMake list.
Furthermore we use this list to generate a config file. For this reason
these checks are moved to where we normalize the support.

Fixes: https://github.com/llvm/llvm-project/issues/93183
2024-05-23 08:06:41 -05:00
Joseph Huber
c618ae1734
[Offload] Rework handling for loading vendor runtimes (#93073)
Summary:
We previously had multiple options for this, this patch replaces them
with `LIBOMPTARGET_DLOPEN_PLUGINS=` to be a list of plugins to
dynamically use. It defaults to everything right now. This ignores the
`host` plugin because the `libffi` dependency is going to be removed
soon hopefully in https://github.com/llvm/llvm-project/pull/91264.
2024-05-22 13:04:52 -05:00
Joseph Huber
dbfedc6b27
[Offload] Use newer CUDA API functions when dynamically loaded (#93057)
Summary:
CUDA does its versioning by putting a redirection in the header so the
API functions remain the same while the symbol changes. These weren't
being used for some functions that required it in the dynamic cuda
version.

These functions have newer verisons that should be used. These are
fairly old as far as I'm aware so we should be able to sweep backward
compatibility under the rug.
2024-05-22 10:59:56 -05:00
Ye Luo
831d143519
[Offload] libomptarget force dlopen vendor libraries by default. (#92788)
Since #87009, libomptarget directly links all the plugins statically.
All the dependencies of plugins got exposed to libomptarget. The CUDA
plugin depends on libcuda and the amdgpu plugin depends on libhsa if not
forced using dlopen. On a cluster with different compute node
architectures, libomptarget can be built and run on different nodes. In
the build stage, if cmake founds libcuda and
`LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA=OFF`, libomptarget links libcuda.so
directly and the result libomptarget may not run a node without a NVIDIA
driver for example a CPU or AMD GPU only machine with a complaint that
libcuda.so not found.

The solution is setting `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA` and
`LIBOMPTARGET_FORCE_DLOPEN_LIBHSA` `ON`. Preferably this should be
default to maximize the usability of libomptarget. If cmake detects
NVIDIA or AMD software on an OS imaging building node, the resulted
libomptarget may not be able to function on the user side due to the
requirement the existence of vendor runtime libraries.
2024-05-22 09:40:43 -05:00
Joseph Huber
3df7cb9ab9 [Offload] Remove unused version script for plugins
Summary:
The plugins are no longer linked to a share library, making this unused
and useless.
2024-05-20 10:06:11 -05:00
Joseph Huber
770d928303
[Offload][NFC] Remove 'libomptarget' message helpers (#92581)
Summary:
This isn't `libomptarget` anymore, and these messages were always
unnecessary because no other project uses these prefixed messages. The
effect of this is that no longer will the logs have `LIBOMPTARGET --` in
front of everything. We have a message stating when we start building
the offload project so it'll still be trivial to find.
2024-05-17 13:24:32 -05:00
Joseph Huber
16bb7e89a9
[Offload][NFC] Remove all trailing whitespace from offload/ (#92578)
Summary:
This patch cleans up the training whitespace in a bunch of tests and
CMake files. Most just in preparation for other cleanups.
2024-05-17 13:15:04 -05:00
Joseph Huber
c4017cda00
[Offload][NFC] Remove header license in CMake files (#92544)
Summary:
No other project has these in the CMake itself, and they're wildly
inconsistent even within the project. These don't really add anything so
I think they should be removed.
2024-05-17 09:05:03 -05:00
Joseph Huber
f42f57b52d
[Libomptarget] Rework Record & Replay to be a plugin member (#88928) (#89097)
Summary:
Previously, the R&R support was global state initialized by a global
constructor. This is bad because it prevents us from adequately
constraining the lifetime of the library. Additionally, we want to
minimize the amount of global state floating around.

This patch moves the R&R support into a plugin member like everything
else. This means there will be multiple copies of the R&R implementation
floating around, but this was already the case given the fact that we
currently handle everything with dynamic libraries.
2024-05-16 14:58:46 -05:00
Joseph Huber
3abd3d6e59
[Libomptarget] Remove requires information from plugin (#80345)
Summary:
Currently this is only used for the zero-copy handling. However, this
can easily be moved into `libomptarget` so that we do not need to bother
setting the requires flags in the plugin. The advantage here is that we
no longer need to do this for every device redundently. Additionally,
these requires flags are specifically OpenMP related, so they should
live in `libomptarget`.
2024-05-16 11:13:50 -05:00
Joseph Huber
81d20d861e [Offload][NFC] Fix warning messages in runtime
Summary:
These are lots of random warnings due to inconsistent initialization or
signedness.
2024-05-15 15:30:38 -05:00
Joseph Huber
302db1ab5a
[Offload] Do not link every target for JIT (#92013)
Summary:
The offload library supports basic JIT functionality, however we
currently link against every single target even though only AMDGPU and
NVPTX are supported. This somewhat bloats the dynamic library list, so
we should constrain it to what's actually used.
2024-05-14 13:57:34 -05:00
Joseph Huber
363258a3cc
[Offload] Remove old references to isCtor (#91766)
Summary:
These have long since been removed, support for ctors / dtors now
happens through special kernels the backend creates.
2024-05-14 06:00:34 -05:00
Joseph Huber
c34d1893cb
[Offload] Remove support for old "BUILD_PLUGIN" options. (#91644)
Summary:
Since the move to the statically linked plugins, we added a new way to
directly control which plugins will be added. Delete these old ones as
they will cause the build to fail and suggest the new format.
2024-05-14 06:00:23 -05:00
Tulio Magno Quites Machado Filho
e9be129287
[Offload] Fixes typo in aarch64 triple (#91622)
Use llvm::Triple:aarch64 as the little-endian triple.

Fixes commit 3e54768d7a0e1cfa65e892b6602993192ecad91e.
2024-05-09 11:46:49 -05:00
Joseph Huber
fa9e90f5d2 [Reland][Libomptarget] Statically link all plugin runtimes (#87009)
This patch overhauls the `libomptarget` and plugin interface. Currently,
we define a C API and compile each plugin as a separate shared library.
Then, `libomptarget` loads these API functions and forwards its internal
calls to them. This was originally designed to allow multiple
implementations of a library to be live. However, since then no one has
used this functionality and it prevents us from using much nicer
interfaces. If the old behavior is desired it should instead be
implemented as a separate plugin.

This patch replaces the `PluginAdaptorTy` interface with the
`GenericPluginTy` that is used by the plugins. Each plugin exports a
`createPlugin_<name>` function that is used to get the specific
implementation. This code is now shared with `libomptarget`.

There are some notable improvements to this.
1. Massively improved lifetimes of life runtime objects
2. The plugins can use a C++ interface
3. Global state does not need to be duplicated for each plugin +
   libomptarget
4. Easier to use and add features and improve error handling
5. Less function call overhead / Improved LTO performance.

Additional changes in this plugin are related to contending with the
fact that state is now shared. Initialization and deinitialization is
now handled correctly and in phase with the underlying runtime, allowing
us to actually know when something is getting deallocated.

Depends on https://github.com/llvm/llvm-project/pull/86971
https://github.com/llvm/llvm-project/pull/86875
https://github.com/llvm/llvm-project/pull/86868
2024-05-09 09:38:22 -05:00
Joseph Huber
e5e66073c3 Revert "[Libomptarget] Statically link all plugin runtimes (#87009)"
Caused failures on build-bots, reverting to investigate.

This reverts commit 80f9e814ec896fdc57ee84afad8ac4cb1f8e4627.
2024-05-09 07:05:23 -05:00
Joseph Huber
80f9e814ec
[Libomptarget] Statically link all plugin runtimes (#87009)
This patch overhauls the `libomptarget` and plugin interface. Currently,
we define a C API and compile each plugin as a separate shared library.
Then, `libomptarget` loads these API functions and forwards its internal
calls to them. This was originally designed to allow multiple
implementations of a library to be live. However, since then no one has
used this functionality and it prevents us from using much nicer
interfaces. If the old behavior is desired it should instead be
implemented as a separate plugin.

This patch replaces the `PluginAdaptorTy` interface with the
`GenericPluginTy` that is used by the plugins. Each plugin exports a
`createPlugin_<name>` function that is used to get the specific
implementation. This code is now shared with `libomptarget`.

There are some notable improvements to this.
1. Massively improved lifetimes of life runtime objects
2. The plugins can use a C++ interface
3. Global state does not need to be duplicated for each plugin +
   libomptarget
4. Easier to use and add features and improve error handling
5. Less function call overhead / Improved LTO performance.

Additional changes in this plugin are related to contending with the
fact that state is now shared. Initialization and deinitialization is
now handled correctly and in phase with the underlying runtime, allowing
us to actually know when something is getting deallocated.

Depends on https://github.com/llvm/llvm-project/pull/86971
https://github.com/llvm/llvm-project/pull/86875
https://github.com/llvm/llvm-project/pull/86868
2024-05-09 06:35:54 -05:00
Joseph Huber
e3938f4d71
[Offload] Detect native ELF machine from preprocessor (#91282)
Summary:
This gets the target's corresponding ELF value from the preprocessor.
We use this to detect if a given ELF is compatible with the CPU
offloading impolementation for OpenMP. Previously we used defitions from
CMake, but this is easier for people to understand as there may be new
users of this in the future.
2024-05-08 14:36:58 -05:00
Jhonatan Cléto
b438a817bd
[Offload] Fix dataDelete op for TARGET_ALLOC_HOST memory type (#91134)
Summary:
The `GenericDeviceTy::dataDelete` method doesn't verify the
`TargetAllocTy` of the of the device pointer. Because of this, it can
use the `MemoryManager` to free the ptr. However, the
`TARGET_ALLOC_HOST` and `TARGET_ALLOC_SHARED` types are not allocated
using the `MemoryManager` in the `GenericDeviceTy::dataAlloc` method.
Since the `MemoryManager` uses the `DeviceAllocatorTy::free` operation
without specifying the type of the ptr, some plugins may use incorrect
operations to free ptrs of certain types. In particular, this bug causes
the CUDA plugin to use the `cuMemFree` operation on ptrs of type
`TARGET_ALLOC_HOST`, resulting in an unchecked error, as shown in the
output snippet of the test
`offload/test/api/omp_host_pinned_memory_alloc.c`:

```
omptarget --> Notifying about an unmapping: HstPtr=0x00007c6114200000
omptarget --> Call to llvm_omp_target_free_host for device 0 and address 0x00007c6114200000
omptarget --> Call to omp_get_num_devices returning 1
omptarget --> Call to omp_get_initial_device returning 1
PluginInterface --> MemoryManagerTy:🆓 target memory 0x00007c6114200000.
PluginInterface --> Cannot find its node. Delete it on device directly.
TARGET CUDA RTL --> Failure to free memory: Error in cuMemFree[Host]: invalid argument
omptarget --> omp_target_free deallocated device ptr
```

This patch fixes this by adding the check of the device pointer type
before calling the appropriate operation for each type.
2024-05-07 22:21:32 -05:00
Joseph Huber
3e54768d7a
[Offload] Detect target triple from preprocessor instead of CMake (#91283)
Summary:
This patch removes the special-case handling for the target triple
inside of the CMake. I moved it into the implementation so it's easier
to see and modify.
2024-05-06 21:23:04 -05:00
Joseph Huber
b07177fb68
[Libomptarget] Rework interface for enabling plugins (#86875)
Summary:
Previously we would build all of the plugins by default and then only
load some using the `LIBOMPTARGET_PLUGINS_TO_LOAD` variable. This patch
renamed this to `LIBOMPTARGET_PLUGINS_TO_BUILD` and changes whether or
not it will include the plugin in CMake.

Additionally this patch creates a new `Targets.def` file that allows us
to enumerate all of the enabled plugins. This is somewhat different from
the old method, and it's done this way for future use that will need to
be shared. This follows the same method that LLVM uses for its targets,
however it does require adding an extra include path.

Depends on https://github.com/llvm/llvm-project/pull/86868
2024-04-29 11:18:37 -05:00
Joseph Huber
72b0c11cfd
[Libomptarget] Rename libomptarget.rtl.x86_64 to libomptarget.rtl.host (#86868)
Summary:
All of these are functionally the same code, just compiled for separate
architectures. We currently do not expose a way to execute these on
separate architectures as the host plugin works using `dlopen` into the
same process, and therefore cannot possibly be an incompatible
architecture. (This could work with a remote plugin, but this is not
supported yet).

This patch simply renames all of these to the same thing so we no longer
need to check around for its varying definitions.
2024-04-26 13:58:11 -05:00
Johannes Doerfert
330d8983d2
[Offload] Move /openmp/libomptarget to /offload (#75125)
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -> `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>
2024-04-22 09:51:33 -07:00