48 Commits

Author SHA1 Message Date
Ross Brunton
4c0c295775
[Offload] OL_EVENT_INFO_IS_COMPLETE (#153194)
A simple info query for events that returns whether the event is
complete or not.
2025-08-22 13:40:31 +01:00
Abhinav Gaba
79cf877627
[Offload] Introduce dataFence plugin interface. (#153793)
The purpose of this fence is to ensure that any `dataSubmit`s inserted
into a queue before a `dataFence` finish before any `dataSubmit`s
inserted after it begin.

This is a no-op for most queues, since they are in-order, and by design
any operations inserted into them occur in order.

But the interface is supposed to be functional for out-of-order queues.

The addition of the interface means that any operations that rely on
such ordering (like ATTACH map-type support in #149036) can invoke it,
without worrying about whether the underlying queue is in-order or
out-of-order.

Once a plugin supports out-of-order queues, the plugin can implement
this function, without requiring any change at the libomptarget level.
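
A minimal sketch of the idea, assuming a toy queue model. `ToyQueue`,
`OpKind` and the `bool` return value are hypothetical and are not part of
the plugin interface; the sketch only shows why the fence can be a no-op
for in-order queues while still being callable unconditionally.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model; none of these names come from PluginInterface.
enum class OpKind { Submit, Fence };

struct ToyQueue {
  bool InOrder;            // true: operations already run in insertion order
  std::vector<OpKind> Ops; // operations recorded so far

  void dataSubmit() { Ops.push_back(OpKind::Submit); }

  // Returns true if a barrier had to be recorded, false if it was a no-op.
  bool dataFence() {
    if (InOrder)
      return false; // ordering is already guaranteed by the queue itself
    Ops.push_back(OpKind::Fence);
    return true;
  }
};
```

Callers invoke `dataFence` the same way in both cases; only the queue
decides whether any work is needed.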

---------

Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-08-15 11:49:35 -07:00
Ross Brunton
30c7951136
[Offload] olLaunchHostFunction (#152482)
Add an `olLaunchHostFunction` method that allows enqueueing host work
to the stream.
2025-08-15 09:39:48 +01:00
Ross Brunton
910d7e90bf
[Offload] Make olLaunchKernel test thread safe (#149497)
This sprinkles a few mutexes around the plugin interface so that the
olLaunchKernel CTS test now passes when run on multiple threads.

Part of this also involved changing the interface for device synchronise
so that it can optionally not free the underlying queue (which
introduced a race condition in liboffload).
2025-08-08 10:57:04 +01:00
Ross Brunton
a44532544b
[Offload] Don't create events for empty queues (#152304)
Add a device function to check if a device queue is empty. If liboffload
tries to create an event for an empty queue, we create an "empty" event
that is already complete.

This allows `olCreateEvent`, `olSyncEvent` and `olWaitEvent` to run
quickly for empty queues.
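
A rough sketch of the behaviour described above, assuming a toy queue
that only counts pending operations. `ToyQueue`, `ToyEvent`,
`createEvent` and `mustWait` are hypothetical names, not liboffload API.

```cpp
#include <cassert>

// Hypothetical stand-ins for a device queue and an event.
struct ToyQueue {
  int PendingOps = 0;
  bool isEmpty() const { return PendingOps == 0; }
};

struct ToyEvent {
  bool Complete; // an "empty" event is created already complete
};

// An event created on an empty queue has nothing to wait for.
ToyEvent createEvent(const ToyQueue &Q) { return ToyEvent{Q.isEmpty()}; }

// Returns true only if the caller would actually have to block.
bool mustWait(const ToyEvent &E) { return !E.Complete; }
```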
2025-08-07 10:16:33 +01:00
hidekisaito
83e5a99ff6
[AMDGPU][Offload] Enable memory manager use for up to ~3GB allocation size in omp_target_alloc (#151882)
Enables AMD data center class GPUs to use memory manager memory pooling
up to 3GB allocation by default, up from the "1 << 13" threshold that
all plugin-nextgen devices use.
2025-08-06 14:41:20 -07:00
Alex Duran
f092b820d1
[OFFLOAD] Fix typo in assert (#152316)
Fixes an issue introduced by PR https://github.com/llvm/llvm-project/pull/143491.
2025-08-06 17:01:47 +02:00
Alex Duran
66d1c37eb6
[OFFLOAD][OPENMP] 6.0 compatible interop interface (#143491)
The following patch introduces a new interop interface implementation
with the following characteristics:

* It supports the new 6.0 prefer_type specification
* It supports both explicit objects (from interop constructs) and
implicit objects (from variant calls).
* Implements a per-thread reuse mechanism for implicit objects to reduce
overheads.
* It provides a plugin interface that allows selecting the supported
interop types, and managing all the backend related interop operations
(init, sync, ...).
* It enables cooperation with the OpenMP runtime to allow progress on
OpenMP synchronizations.
* It cleans up some vendor/fr_id mismatches in the current query
routines.
* It supports extension to define interop callbacks for library cleanup.
2025-08-06 16:34:39 +02:00
Ross Brunton
ae44418f28
[Offload] Erase entries from JIT cache when program is destroyed (#148847)
When `unloadBinary` is called, any entries in the JITEngine's cache
for that binary will be cleared. This fixes a nasty issue with
liboffload program handles. If two handles happen to have had the same
address (after one was free'd, for example), the cache would be hit and
return the wrong program.
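
The failure mode can be illustrated with a toy cache keyed by address.
This is only a sketch under that assumption; `ToyJITCache` and its
methods are hypothetical and do not mirror the real JITEngine.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical cache keyed by the address of the binary it was built from.
class ToyJITCache {
  std::map<std::uintptr_t, std::string> Cache;

public:
  // Returns the cached result for Key, "compiling" Source on a miss.
  const std::string &getOrCompile(std::uintptr_t Key,
                                  const std::string &Source) {
    auto It = Cache.find(Key);
    if (It == Cache.end())
      It = Cache.emplace(Key, "compiled:" + Source).first;
    return It->second;
  }

  // Called when the binary at Key is unloaded.
  void erase(std::uintptr_t Key) { Cache.erase(Key); }
};
```

Without the `erase` call, a second binary that reuses the same address
would get the first binary's cached result.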
2025-07-25 16:11:30 +01:00
Ross Brunton
4f02965ae2
[Offload] Store kernel name in GenericKernelTy (#142799)
GenericKernelTy has a pointer to the name that was used to create it.
However, the name passed in as an argument may not outlive the kernel.
Instead, GenericKernelTy now contains a std::string, and copies the
name into there.
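
A small sketch of the lifetime issue, using a hypothetical `ToyKernel`
rather than the real GenericKernelTy: the constructor copies the name, so
the caller's buffer may be destroyed afterwards.

```cpp
#include <cassert>
#include <string>

// Hypothetical kernel type that owns a copy of its name.
struct ToyKernel {
  std::string Name;

  explicit ToyKernel(const char *N) : Name(N) {} // copies the characters
  const char *getName() const { return Name.c_str(); }
};

ToyKernel makeKernel() {
  std::string Temp = "my_kernel";
  // Temp is destroyed when this function returns; storing Temp.c_str()
  // as a raw pointer would dangle, but the copied std::string does not.
  return ToyKernel(Temp.c_str());
}
```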
2025-07-02 14:11:05 +01:00
Ross Brunton
0870c8838b
[Offload] Add an unloadBinary interface to PluginInterface (#143873)
This allows removal of a specific Image from a Device, rather than
requiring all image data to outlive the device they were created for.

This is required for `ol_program_handle_t`s, which now specify the
lifetime of the buffer used to create the program.
2025-06-25 14:53:18 +01:00
Ross Brunton
4359e55838
[Offload] Properly report errors when jit compiling (#145498)
Previously, if a binary failed to load due to failures when jit
compiling, the function would return success with nullptr. Now it
returns a new plugin error, `COMPILE_FAILURE`.
2025-06-24 16:27:12 +01:00
Ross Brunton
e6a3579653
[Offload] Replace device info queue with a tree (#144050)
Previously, device info was returned as a queue with each element having
a "Level" field indicating its nesting level. This replaces this queue
with a more traditional tree-like structure.

This should not result in a change to the output of
`llvm-offload-device-info`.
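
To illustrate the two representations, here is a sketch that converts a
level-tagged flat list into a tree. `FlatEntry`, `InfoNode` and
`buildTree` are hypothetical names, and the assumption is that a level
never jumps by more than one between consecutive entries.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Old style: a flat list where each entry carries its nesting level.
struct FlatEntry {
  std::string Key;
  std::size_t Level;
};

// New style: each node owns its children.
struct InfoNode {
  std::string Key;
  std::vector<InfoNode> Children;
};

InfoNode buildTree(const std::vector<FlatEntry> &Flat) {
  InfoNode Root{"root", {}};
  // Path[I] is the most recent node at depth I; Path[0] is the root.
  std::vector<InfoNode *> Path{&Root};
  for (const FlatEntry &E : Flat) {
    assert(E.Level < Path.size() && "level may grow by at most one");
    Path.resize(E.Level + 1); // keep only the ancestors of the new node
    InfoNode *Parent = Path.back();
    Parent->Children.push_back(InfoNode{E.Key, {}});
    Path.push_back(&Parent->Children.back());
  }
  return Root;
}
```

Only ancestors of the node being inserted are kept in `Path`, so growing
a `Children` vector never invalidates a pointer that is still in use.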
2025-06-13 09:22:47 -05:00
Ethan Luis McDonough
67ff66e677
[PGO][Offload] Fix offload coverage mapping (#143490)
This pull request fixes coverage mapping on GPU targets. 

- It adds an address space cast to the coverage mapping generation pass.
- It reads the profiled function names from the ELF directly. Reading it
from public globals was causing issues in cases where multiple
device-code object files are linked together.
2025-06-10 20:19:38 -05:00
Ross Brunton
050892d2f8
[Offload] Use new error code handling mechanism and lower-case messages (#139275)
[Offload] Use new error code handling mechanism

This removes the old ErrorCode-less error method and requires
every user to provide a concrete error code. All calls have been
updated.

In addition, for consistency with error messages elsewhere in LLVM, all
messages have been made to start lower case.
2025-05-20 08:50:20 -05:00
Joseph Huber
92bba68634
[Offload] Fix handling of 'bare' mode when environment missing (#136794)
Summary:
We treated the missing kernel environment as a unique mode, but it was
kind of this random bool that was doing the same thing and it explicitly
expects the kernel environment to be zero. It broke after the previous
change since it used to default to SPMD and didn't handle zero in any of
the other cases despite being used. This fixes that and queries for it
without needing to consume an error.
2025-04-23 08:16:39 -05:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
Ethan Luis McDonough
9e5c136d5a
[PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (#93365)
This pull request is the second part of an ongoing effort to extend PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:

- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds `PGODump` debug flag and only displays dump when the
aforementioned flag is set
2025-02-11 23:30:54 -06:00
Joseph Huber
5812d0bf8e
[Offload] Make only a single thread handle the RPC server thread (#126067)
Summary:
This patch just changes the interface to make starting the thread
multiple times permissible since it will only be done the first time.
Note that this does not refcount it or anything, so it's up to the user
to make sure that they don't shut down the thread before everyone is
done using it. That is the case today because the shutDown portion is
run by a single thread in the destructor phase.

Another question is if we should make this thread truly global state,
because currently it will be private to each plugin instance, so if you
have an AMD and NVIDIA image there will be two, similarly if you have
those inside of a shared library.
2025-02-06 11:38:14 -06:00
Joseph Huber
7a8779422d
[Offload] Stop the RPC server failing with more than one GPU (#125982)
Summary:
Pretty dumb mistake of me, forgot that this is run per-device and
per-plugin, which fell through the cracks with my testing because I have
two GPUs that use different plugins.
2025-02-05 20:51:28 -06:00
Joseph Huber
a284a6ed17 [OpenMP] Guard OpenMP specific entry handling 2025-02-03 16:16:18 -06:00
Christian Clauss
1f56bb3137
[Offload][NFC] Fix typos discovered by codespell (#125119)
https://github.com/codespell-project/codespell

% `codespell
--ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes
--write-changes`
2025-01-31 09:35:29 -06:00
Joseph Huber
38b3f45a81 [Offload] Fix offload-info interface
Summary:
The offload info tool doesn't initialize things properly, just check
this first instead.
2025-01-27 10:36:09 -06:00
Joseph Huber
f07505849c [Offload] Fix server thread from being shut down if unused 2025-01-27 08:29:41 -06:00
Joseph Huber
134401deea
[Offload] Move RPC server handling to a dedicated thread (#112988)
Summary:
Handling the RPC server requires running through the list of jobs that the
device has requested to be done. Currently this is handled by the thread
that does the waiting for the kernel to finish. However, this is not
sound on NVIDIA architectures and only works for async launches in the
OpenMP model that uses helper threads.

However, we also don't want to have this thread doing work
unnecessarily. For this reason we track the execution of kernels and
cause the thread to sleep via a condition variable (usually backed by
some kind of futex or other intelligent sleeping mechanism) so that the
thread will be idle while no kernels are running.
2025-01-24 11:36:45 -06:00
Joseph Huber
6518b121f0
[Offload][NFC] Factor out and rename the __tgt_offload_entry struct (#123785)
Summary:
This patch is an NFC renaming to make using the offloading entry type
more portable between other targets. Right now this is just moving its
definition to LLVM so others can use it. Future work will rework the
struct layout.
2025-01-21 12:05:24 -06:00
Shilei Tian
92376c3ff5
[Offload][OMPX] Add the runtime support for multi-dim grid and block (#118042) 2024-12-06 09:07:50 -05:00
Joseph Huber
91f5f974cb
[OpenMP] Unconditionally provide an RPC client interface for OpenMP (#117933)
Summary:
This patch adds an RPC interface that lives directly in the OpenMP
device runtime. This allows OpenMP to implement custom opcodes.
Currently this is only providing the host call interface, which is the
raw version of reverse offloading. Previously this lived in `libc/` as
an extension which is not the correct place.

The interface here uses a weak symbol for the RPC client by the same
name that the `libc` interface uses. This means that it will defer to
the libc one if both are present so we don't need to set up multiple
instances.

The presence of this symbol is what controls whether or not we set up
the RPC server. Because this is an external symbol it normally won't be
optimized out, so there's a special pass in OpenMPOpt that deletes this
symbol if it is unused during linking. That means at `O0` the RPC server
will always be present now, but will be removed trivially if it's not
used at O1 and higher.
2024-12-02 14:31:51 -06:00
Johannes Doerfert
08533a3ee8
[Offload][NFC] Reorganize utils:: and make Device/Host/Shared clearer (#100280)
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -> `utils::advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.
2024-09-05 13:36:26 -07:00
Ethan Luis McDonough
fde2d23ee2
[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587) (#102691)
This pull request is a revised version of #76587. This pull request
fixes some build issues that were present in the previous version of
this change.

> This pull request is the first part of an ongoing effort to extend
PGO instrumentation to GPU device code. This PR makes the following
changes:
>
> - Adds blank registration functions to device RTL
> - Gives PGO globals protected visibility when targeting a supported
GPU
> - Handles any addrspace casts for PGO calls
> - Implements PGO global extraction in GPU plugins (currently only
dumps info)
>
> These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.
2024-08-22 01:10:54 -05:00
Johannes Doerfert
80525dfcde
[Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549)
Through the new `-foffload-via-llvm` flag, CUDA kernels can now be
lowered to the LLVM/Offload API. On the Clang side, this is simply done
by using the OpenMP offload toolchain and emitting calls to `llvm*`
functions to orchestrate the kernel launch rather than `cuda*`
functions. These `llvm*` functions are implemented on top of the
existing LLVM/Offload API.

As we are about to redefine the Offload API, this will help us in the
design process as a second offload language.

We do not support any CUDA APIs yet, however, we could:
  https://www.osti.gov/servlets/purl/1892137

For proper host execution we need to resurrect/rebase
  https://tianshilei.me/wp-content/uploads/2021/12/llpp-2021.pdf
(which was designed for debugging).

```
❯❯❯ cat test.cu
extern "C" {
void *llvm_omp_target_alloc_shared(size_t Size, int DeviceNum);
void llvm_omp_target_free_shared(void *DevicePtr, int DeviceNum);
}

__global__ void square(int *A) { *A = 42; }

int main(int argc, char **argv) {
  int DevNo = 0;
  int *Ptr = reinterpret_cast<int *>(llvm_omp_target_alloc_shared(4, DevNo));
  *Ptr = 7;
  printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr);
  square<<<1, 1>>>(Ptr);
  printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr);
  llvm_omp_target_free_shared(Ptr, DevNo);
}

❯❯❯ clang++ test.cu -O3 -o test123 -foffload-via-llvm --offload-arch=native

❯❯❯ llvm-objdump --offloading test123

test123:        file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind            elf
arch            gfx90a
triple          amdgcn-amd-amdhsa
producer        openmp

❯❯❯ LIBOMPTARGET_INFO=16 ./test123
Ptr 0x155448ac8000, *Ptr 7
Ptr 0x155448ac8000, *Ptr 42
```
2024-08-12 17:44:58 -07:00
Johannes Doerfert
9a1013220b
[Offload] Allow to record kernel launch stack traces (#100472)
Similar to (de)allocation traces, we can record kernel launch stack
traces and display them in case of an error. However, the AMD GPU plugin
signal handler, which is invoked on memory faults, cannot pinpoint the
offending kernel. Instead, we print `<NUM>` many traces, where `<NUM>`
is set via `OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=<NUM>`. The
recording uses a ring buffer of fixed size (for now 8).
For `trap` errors, we print the actual kernel name, and trace if
recorded.
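
A minimal sketch of a fixed-size ring buffer for launch traces, assuming
traces are plain strings. `TraceRing`, `record` and `lastTraces` are
hypothetical names; the real plugin code may store different data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Keeps only the N most recent traces; older ones are overwritten.
template <std::size_t N> class TraceRing {
  std::array<std::string, N> Slots;
  std::size_t Recorded = 0; // total number of traces ever recorded

public:
  void record(std::string Trace) {
    Slots[Recorded % N] = std::move(Trace);
    ++Recorded;
  }

  // Returns up to Num traces, most recent first.
  std::vector<std::string> lastTraces(std::size_t Num) const {
    std::size_t Available = Recorded < N ? Recorded : N;
    if (Num > Available)
      Num = Available;
    std::vector<std::string> Out;
    for (std::size_t I = 1; I <= Num; ++I)
      Out.push_back(Slots[(Recorded - I) % N]);
    return Out;
  }
};
```

With `N` set to 8, requesting more than eight traces simply returns the
eight that are still stored.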
2024-07-31 11:49:50 -07:00
Johannes Doerfert
c95abe94ae
[Offload] Implement double free (and other allocation error) reporting (#100261)
As a first step towards a GPU sanitizer we now can track allocations and
deallocations in order to report double frees, and other problems during
deallocation.
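
The tracking idea can be sketched as a map from pointer to a "still
live" flag. `ToyAllocTracker` and `FreeResult` are hypothetical and much
simpler than the real implementation, which also records stack traces.

```cpp
#include <cassert>
#include <map>

enum class FreeResult { Ok, DoubleFree, UnknownPointer };

// Hypothetical tracker: remembers every allocation and whether it is live.
class ToyAllocTracker {
  std::map<const void *, bool> Live;

public:
  void onAlloc(const void *Ptr) { Live[Ptr] = true; }

  FreeResult onFree(const void *Ptr) {
    auto It = Live.find(Ptr);
    if (It == Live.end())
      return FreeResult::UnknownPointer; // never allocated through us
    if (!It->second)
      return FreeResult::DoubleFree; // already freed once
    It->second = false; // keep the entry so a second free is detected
    return FreeResult::Ok;
  }
};
```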
2024-07-30 10:10:57 -07:00
Ethan Luis McDonough
2c8b912f63
Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587)"
This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.
2024-06-28 12:30:45 -05:00
Ethan Luis McDonough
5fd2af38e4
[PGO][OpenMP] Instrumentation for GPU devices (#76587)
This pull request is the first part of an ongoing effort to extend PGO
instrumentation to GPU device code. This PR makes the following changes:

- Adds blank registration functions to device RTL
- Gives PGO globals protected visibility when targeting a supported GPU
- Handles any addrspace casts for PGO calls
- Implements PGO global extraction in GPU plugins (currently only dumps
info)

These changes can be tested by supplying `-fprofile-instrument=clang`
while targeting a GPU.
2024-06-28 10:42:19 -05:00
Tim Gymnich
597d2f7662
[OpenMP] Add Environment Variable to disable Reuse of Blocks for High Loop Trip Counts (#89239)
Sometimes it might be beneficial to spawn more thread blocks instead of
reusing existing ones for multiple loop iterations.

**Alternatives considered:**

Make `DefaultNumBlocks` settable via an environment variable.

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-06-14 07:35:23 -07:00
Johannes Doerfert
54b5c76d3b
[Offload] Use flat array for cuLaunchKernel (#95116)
We already used a flat array of kernel launch parameters for the AMD GPU
launch but now we also use this scheme for the NVIDIA GPU launch. The
only remaining/required use of the indirection is the host plugin (due
to ffi). This allows us to simplify the use for non-OpenMP kernel
launch.
2024-06-13 09:43:47 +03:00
Joseph Huber
435aa7663d
[Libomptarget] Rework device initialization and image registration (#93844)
Summary:
Currently, we register images into a linear table according to the
logical OpenMP device identifier. We then initialize all of these images
as one block. This logic requires that images are compatible with *all*
devices instead of just the one that it can run on. This prevents us
from running on systems with heterogeneous devices (e.g. image 1 runs on
device 0 and image 0 runs on device 1).

This patch reworks the logic by instead making the compatibility check a
per-device query. We then scan every device to see if it's compatible
and do it as they come.
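
A sketch of the per-device check, assuming compatibility is a simple
architecture string match. `ToyImage`, `ToyDevice`, `isImageCompatible`
and `registerImages` are hypothetical; the real check is plugin-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct ToyImage {
  std::string Arch;
};

struct ToyDevice {
  std::string Arch;
  std::vector<std::size_t> LoadedImages; // indices of compatible images
};

// Assumption: an image fits a device when the architectures match.
bool isImageCompatible(const ToyDevice &D, const ToyImage &I) {
  return D.Arch == I.Arch;
}

// Ask every device about every image instead of assuming image N
// belongs to device N.
void registerImages(std::vector<ToyDevice> &Devices,
                    const std::vector<ToyImage> &Images) {
  for (std::size_t I = 0; I < Images.size(); ++I)
    for (ToyDevice &D : Devices)
      if (isImageCompatible(D, Images[I]))
        D.LoadedImages.push_back(I);
}
```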
2024-06-06 08:10:56 -05:00
Joseph Huber
21f3a6091f
[Offload] Only initialize a plugin if it is needed (#92765)
Summary:
Initializing the plugins requires initializing the runtime like CUDA or
HSA. This has a considerable overhead on most platforms, so we should
only actually initialize a plugin if it is needed by any image that is
loaded.
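
Lazy initialization could look roughly like the following, assuming
plugins are matched to images by triple. `ToyPlugin` and
`pluginForImage` are hypothetical, and `InitCount` only stands in for the
expensive runtime setup.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct ToyPlugin {
  std::string Triple;
  int InitCount = 0; // how many times the expensive setup actually ran

  explicit ToyPlugin(std::string T) : Triple(std::move(T)) {}

  void ensureInitialized() {
    if (InitCount == 0)
      ++InitCount; // stands in for initializing CUDA, HSA, ...
  }
};

// Initializes a plugin only when an image that needs it shows up.
ToyPlugin *pluginForImage(std::vector<ToyPlugin> &Plugins,
                          const std::string &ImageTriple) {
  for (ToyPlugin &P : Plugins) {
    if (P.Triple == ImageTriple) {
      P.ensureInitialized();
      return &P;
    }
  }
  return nullptr; // no plugin handles this image
}
```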
2024-05-23 09:36:47 -05:00
Joseph Huber
f42f57b52d
[Libomptarget] Rework Record & Replay to be a plugin member (#88928) (#89097)
Summary:
Previously, the R&R support was global state initialized by a global
constructor. This is bad because it prevents us from adequately
constraining the lifetime of the library. Additionally, we want to
minimize the amount of global state floating around.

This patch moves the R&R support into a plugin member like everything
else. This means there will be multiple copies of the R&R implementation
floating around, but this was already the case given the fact that we
currently handle everything with dynamic libraries.
2024-05-16 14:58:46 -05:00
Joseph Huber
3abd3d6e59
[Libomptarget] Remove requires information from plugin (#80345)
Summary:
Currently this is only used for the zero-copy handling. However, this
can easily be moved into `libomptarget` so that we do not need to bother
setting the requires flags in the plugin. The advantage here is that we
no longer need to do this for every device redundantly. Additionally,
these requires flags are specifically OpenMP related, so they should
live in `libomptarget`.
2024-05-16 11:13:50 -05:00
Joseph Huber
81d20d861e [Offload][NFC] Fix warning messages in runtime
Summary:
These are lots of random warnings due to inconsistent initialization or
signedness.
2024-05-15 15:30:38 -05:00
Joseph Huber
363258a3cc
[Offload] Remove old references to isCtor (#91766)
Summary:
These have long since been removed, support for ctors / dtors now
happens through special kernels the backend creates.
2024-05-14 06:00:34 -05:00
Joseph Huber
fa9e90f5d2 [Reland][Libomptarget] Statically link all plugin runtimes (#87009)
This patch overhauls the `libomptarget` and plugin interface. Currently,
we define a C API and compile each plugin as a separate shared library.
Then, `libomptarget` loads these API functions and forwards its internal
calls to them. This was originally designed to allow multiple
implementations of a library to be live. However, since then no one has
used this functionality and it prevents us from using much nicer
interfaces. If the old behavior is desired it should instead be
implemented as a separate plugin.

This patch replaces the `PluginAdaptorTy` interface with the
`GenericPluginTy` that is used by the plugins. Each plugin exports a
`createPlugin_<name>` function that is used to get the specific
implementation. This code is now shared with `libomptarget`.

There are some notable improvements to this.
1. Massively improved lifetimes of live runtime objects
2. The plugins can use a C++ interface
3. Global state does not need to be duplicated for each plugin +
   libomptarget
4. Easier to use and add features and improve error handling
5. Less function call overhead / Improved LTO performance.

Additional changes in this plugin are related to contending with the
fact that state is now shared. Initialization and deinitialization is
now handled correctly and in phase with the underlying runtime, allowing
us to actually know when something is getting deallocated.

Depends on https://github.com/llvm/llvm-project/pull/86971
https://github.com/llvm/llvm-project/pull/86875
https://github.com/llvm/llvm-project/pull/86868
2024-05-09 09:38:22 -05:00
Joseph Huber
e5e66073c3 Revert "[Libomptarget] Statically link all plugin runtimes (#87009)"
Caused failures on build-bots, reverting to investigate.

This reverts commit 80f9e814ec896fdc57ee84afad8ac4cb1f8e4627.
2024-05-09 07:05:23 -05:00
Joseph Huber
80f9e814ec
[Libomptarget] Statically link all plugin runtimes (#87009)
This patch overhauls the `libomptarget` and plugin interface. Currently,
we define a C API and compile each plugin as a separate shared library.
Then, `libomptarget` loads these API functions and forwards its internal
calls to them. This was originally designed to allow multiple
implementations of a library to be live. However, since then no one has
used this functionality and it prevents us from using much nicer
interfaces. If the old behavior is desired it should instead be
implemented as a separate plugin.

This patch replaces the `PluginAdaptorTy` interface with the
`GenericPluginTy` that is used by the plugins. Each plugin exports a
`createPlugin_<name>` function that is used to get the specific
implementation. This code is now shared with `libomptarget`.

There are some notable improvements to this.
1. Massively improved lifetimes of live runtime objects
2. The plugins can use a C++ interface
3. Global state does not need to be duplicated for each plugin +
   libomptarget
4. Easier to use and add features and improve error handling
5. Less function call overhead / Improved LTO performance.

Additional changes in this plugin are related to contending with the
fact that state is now shared. Initialization and deinitialization is
now handled correctly and in phase with the underlying runtime, allowing
us to actually know when something is getting deallocated.

Depends on https://github.com/llvm/llvm-project/pull/86971
https://github.com/llvm/llvm-project/pull/86875
https://github.com/llvm/llvm-project/pull/86868
2024-05-09 06:35:54 -05:00
Jhonatan Cléto
b438a817bd
[Offload] Fix dataDelete op for TARGET_ALLOC_HOST memory type (#91134)
Summary:
The `GenericDeviceTy::dataDelete` method doesn't verify the
`TargetAllocTy` of the device pointer. Because of this, it can
use the `MemoryManager` to free the ptr. However, the
`TARGET_ALLOC_HOST` and `TARGET_ALLOC_SHARED` types are not allocated
using the `MemoryManager` in the `GenericDeviceTy::dataAlloc` method.
Since the `MemoryManager` uses the `DeviceAllocatorTy::free` operation
without specifying the type of the ptr, some plugins may use incorrect
operations to free ptrs of certain types. In particular, this bug causes
the CUDA plugin to use the `cuMemFree` operation on ptrs of type
`TARGET_ALLOC_HOST`, resulting in an unchecked error, as shown in the
output snippet of the test
`offload/test/api/omp_host_pinned_memory_alloc.c`:

```
omptarget --> Notifying about an unmapping: HstPtr=0x00007c6114200000
omptarget --> Call to llvm_omp_target_free_host for device 0 and address 0x00007c6114200000
omptarget --> Call to omp_get_num_devices returning 1
omptarget --> Call to omp_get_initial_device returning 1
PluginInterface --> MemoryManagerTy::free: target memory 0x00007c6114200000.
PluginInterface --> Cannot find its node. Delete it on device directly.
TARGET CUDA RTL --> Failure to free memory: Error in cuMemFree[Host]: invalid argument
omptarget --> omp_target_free deallocated device ptr
```

This patch fixes this by adding the check of the device pointer type
before calling the appropriate operation for each type.
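
The shape of the fix can be sketched as a dispatch on the allocation
kind. `ToyAllocKind`, `ToyFreePath` and `chooseFreePath` are hypothetical
names; the real code works with `TargetAllocTy` and plugin allocators.

```cpp
#include <cassert>

enum class ToyAllocKind { Device, Host, Shared };
enum class ToyFreePath { MemoryManager, DeviceFree, HostFree, SharedFree };

// Only device allocations may go through the memory manager, because
// host and shared allocations were never created by it.
ToyFreePath chooseFreePath(ToyAllocKind Kind, bool UseMemoryManager) {
  switch (Kind) {
  case ToyAllocKind::Host:
    return ToyFreePath::HostFree;
  case ToyAllocKind::Shared:
    return ToyFreePath::SharedFree;
  case ToyAllocKind::Device:
    break;
  }
  return UseMemoryManager ? ToyFreePath::MemoryManager
                          : ToyFreePath::DeviceFree;
}
```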
2024-05-07 22:21:32 -05:00
Johannes Doerfert
330d8983d2
[Offload] Move /openmp/libomptarget to /offload (#75125)
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -> `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>
2024-04-22 09:51:33 -07:00