45 Commits

Author SHA1 Message Date
Joseph Huber
d8ba56ce3f
[compiler-rt] Split the GPU.cmake cache file to AMDGPU and NVPTX (#190349)
Summary:
These will have different functionality going forward. They should be
split so we can more easily support things only feasible in AMDGPU.
2026-04-03 10:44:04 -05:00
Joseph Huber
15bfc06b6b
[Offload][NFC] Various minor changes to Offload CMake (#189029)
Summary:
Most of these just remove some redundancy or rename `openmp` ->
`offload` where the variable is purely internal.
2026-03-27 12:06:37 -05:00
Joseph Huber
45d7ef423d
[Offload][NFC] Remove unused testing functions in CMake (#189013)
Summary:
These are called by no one.
2026-03-27 10:04:49 -05:00
Joseph Huber
07896d44a3
[OpenMP] Emit aggregate kernel prototypes and remove libffi dependency (#186261)
Summary:
This PR changes the handling of the emitted kernels when targeting a CPU
to be a pointer struct.

The old handling emitted a standard function prototype, this
necessitated a target specific ABI to call it because the signature
differed with the number of arguments. Instead, this PR emits a void
pointer to a naturally aligned struct, this is what APIs like `pthreads`
assert.

This allows us to remove all the complexity around launching host
kernels and just pass the argument list.
2026-03-20 13:08:23 -05:00
estewart08
0e7262407c
[offload] - Remove standalone build in favor of 'runtimes' (#170693)
Summary:
Follow up on removal of OPENMP_STANDALONE_BUILD in openmp (#149878).
This
build method is redundant and can be accomplished via runtimes.

Removes support for:
`cmake -S <llvm-project>/offload ...`

 Switches over to:
`make -S <llvm-project>/runtimes -DLLVM_ENABLE_RUNTIMES=openmp;offload
...`

Libomptarget has a dependency on libomp.so and requires the omp cmake
target to exist at build time, which is why both runtimes are listed.

Updates cmake compiler logic in offload/CMakeLists.txt to mirror openmp
changes:
 [openmp] Allow testing OpenMP without a full clang build tree (#182470)

User will still need to have a separate invocation to build openmp
DeviceRTL via:
`-DLLVM_ENABLE_RUNTIMES=openmp`
`-DLLVM_DEFAULT_TARGET_TRIPLE=<amdgcn-amd-amdhsa|nvptx64-nvidia-cuda>`
2026-03-19 09:00:40 -05:00
Jan Patrick Lehr
964091a2db
[OpenMP][AMDGPU] Enable omptest build (#161649)
This enables building the omptest library across the AMD buildbots that
rely on this CMake cache.
2026-03-16 15:25:12 +00:00
Joseph Huber
fd069a46bf
[copmiler-rt] Initial support for building profile library on the GPU (#185552)
Summary:
As suggested in https://github.com/llvm/llvm-project/pull/177665, we
should build a GPU version of the compiler-rt profile library instead of
writing it in-line in the lowering. This PR does not define anything GPU
specific, it simply re-uses the baremetal handling. Later PRs will
prevent the GPU specific handling we would want to do to optimize
counter handling on the GPU.

Note that this will require using the cache file, or setting these
options
manually for existing users. Hopefully if people are using the cache
file
as they should it won't break anything.
2026-03-10 13:45:18 -05:00
Joseph Huber
830236204e
[Offload] Provide a cache file for building OpenMP w/ Flang offloading (#178472)
Summary:
This build is more annoying, enables everything required for this
version of the runtime.
2026-01-29 12:35:42 -06:00
Alex Duran
f125c8db5c
[OFFLOAD] Add plugin with support for Intel oneAPI Level Zero (#158900)
Add a new nextgen plugin that supports GPU devices through the Intel oneAPI Level Zero library. The plugin is not enabled by default  and needs to be added to LIBOMPTARGET_PLUGINS_TO_BUILD explicitely.

---------

Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 08:53:03 +01:00
Michael Kruse
c32c1d0d21
[Runtimes] Default build must use its own output dirs (#168266)
Post-commit fix of #164794 reported at
https://github.com/llvm/llvm-project/pull/164794#issuecomment-3536253493

`LLVM_LIBRARY_OUTPUT_INTDIR` and `LLVM_RUNTIME_OUTPUT_INTDIR` is used by
`AddLLVM.cmake` as output directories. Unless we are in a
bootstrapping-build, It must not point to directories found by
`find_package(LLVM)` which may be read-only directories. MLIR for
instance sets thesese variables to its own build output
directory, so should the runtimes.
2025-11-19 13:51:14 +01:00
Joseph Huber
4294907022
[Offload] Build libcxx on the GPU libc bot (#157673) 2025-09-09 09:35:53 -05:00
Joseph Huber
3f3f7d1fd9 [Offload] Build the OpenMP device library with the AMDGPU libc bot
Summary:
This is missing because I forgot to add it.
2025-09-08 08:36:18 -05:00
Joseph Huber
be6f110bc0
[OpenMP] Change build of OpenMP device runtime to be a separate runtime (#136729)
Summary:
Currently we build the OpenMP device runtime as part of the `offload/`
project. This is problematic because it has several restrictions when
compared to the normal offloading runtime. It can only be built with an
up-to-date clang and we need to set the target appropriately. Currently
we hack around this by creating the compiler invocation manually, but
this patch moves it into a separate runtimes build.

This follows the same build we use for libc, libc++, compiler-rt, and
flang-rt. This also moves it from `offload/` into `openmp/` because it
is still the `openmp/` runtime and I feel it is more appropriate. We do
want a generic `offload/` library at some point, but it would be trivial
to then add that as a separate library now that we have the
infrastructure that makes adding these new libraries trivial.

This most importantly will require that users update their build
configs, mostly adding the following lines at a minimum. I was debating
whether or not I should 'auto-upgrade' this, but I just went with a
warning.

```
    -DLLVM_RUNTIME_TARGETS='default;amdgcn-amd-amdhsa;nvptx64-nvidia-cuda'     \
    -DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES=openmp \
    -DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=openmp \
```

This also changed where the `.bc` version of the library lives, but it's
still created.
2025-09-08 07:51:52 -05:00
Jan Patrick Lehr
05aff0eb65
[Offload] Run tests 16-way parallel on AMDGPU (#156627)
Reduce the number of paralell tests run to align with the typical number
of VMIDs provided by the kernel driver.
2025-09-05 22:23:20 +02:00
David Tenty
63195d3d7a
[NFC][CMake] quote ${CMAKE_SYSTEM_NAME} consistently (#154537)
A CMake change included in CMake 4.0 makes `AIX` into a variable
(similar to `APPLE`, etc.)
ff03db6657

However, `${CMAKE_SYSTEM_NAME}` unfortunately also expands exactly to
`AIX` and `if` auto-expands variable names in CMake. That means you get
a double expansion if you write:

`if (${CMAKE_SYSTEM_NAME}  MATCHES "AIX")`
which becomes:
`if (AIX  MATCHES "AIX")`
which is as if you wrote:
`if (ON MATCHES "AIX")`

You can prevent this by quoting the expansion of "${CMAKE_SYSTEM_NAME}",
due to policy
[CMP0054](https://cmake.org/cmake/help/latest/policy/CMP0054.html#policy:CMP0054)
which is on by default in 4.0+. Most of the LLVM CMake already does
this, but this PR fixes the remaining cases where we do not.
2025-08-20 12:45:41 -04:00
Kewen12
2b16af8df2
[Offload][cmake] Add GPU test job limit for AMDGPU buildbot cmake cache (#146611)
Added GPU test job limit to make it consistent with current config
https://github.com/llvm/llvm-zorg/blob/main/buildbot/osuosl/master/config/builders.py#L2027C31-L2027C77
2025-07-01 19:18:28 -05:00
Jan Patrick Lehr
dd65e6e060
[Offload][libc] Add cmake cache AMDGPU buildbot (#144500)
An upcoming libc4GPU buildbot will be using this CMake cache file for
its build configuration.
2025-06-17 20:51:40 +02:00
Joel E. Denny
5709506de0
[offload] Fix finding amdgpu/nvptx-arch to generate tests (#135072)
PR #134713, which landed as 79cb6f05da37, causes this on my test
systems:

```
-- Building AMDGPU plugin for dlopened libhsa
-- Not generating AMDGPU tests, no supported devices detected. Use 'LIBOMPTARGET_FORCE_AMDGPU_TESTS' to override.
-- Building CUDA plugin for dlopened libcuda
-- Not generating NVIDIA tests, no supported devices detected. Use 'LIBOMPTARGET_FORCE_NVIDIA_TESTS' to override.
```

The problem is it cannot locate amdgpu-arch and nvptx-arch. This patch
enables it to.

I suspect there is more cleanup to do here. amdgpu-arch and nvptx-arch
do not appear to exist as cmake targets anymore, but there is still
cmake code here that looks for those targets.
2025-04-09 15:54:29 -04:00
Michael Kruse
d3255474be Reapply "[Offload][AMDGPU] LLVM_ENABLE_RUNTIMES=flang-rt for amdgpu-offload-*" (#130274)
Enable the LLVM_ENABLE_RUNTIMES=flang-rt build of the Fortran runtime
for the amdgpu-offload-* buildbots. This pre-population cmake cache
files is referred to by the llvm-zorg annotated builder factory
[script](872f477610/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py (L26)).

The corresponding change in llvm-zorg is
llvm/llvm-zorg#402

This reverts commit e296fb8ff6255b97db9ff6cd941acc730164b38f.

The worker of amdgpu-offload-rhel-8-cmake-build-only has been updated
with a newer version of Ninja that supports Fortran.
2025-03-13 13:21:36 +01:00
Michael Kruse
e296fb8ff6
Revert "[Offload][AMDGPU] LLVM_ENABLE_RUNTIMES=flang-rt for amdgpu-offload-*" (#130274)
Reverts llvm/llvm-project#129692

The builder amdgpu-offload-rhel-8-cmake-build-only fails because its
version of Ninja is too old. At least Ninja 1.10 is required for its
support for dependencies between Fortran modules.

https://lab.llvm.org/buildbot/#/builders/204/builds/2696
2025-03-07 12:30:18 +01:00
Michael Kruse
68578b38cf
[Offload][AMDGPU] LLVM_ENABLE_RUNTIMES=flang-rt for amdgpu-offload-* (#129692)
Enable the LLVM_ENABLE_RUNTIMES=flang-rt build of the Fortran runtime
for the amdgpu-offload-* buildbots. This pre-population cmake cache
files is referred to by the llvm-zorg annotated builder factory
[script](872f477610/zorg/buildbot/builders/annotated/amdgpu-offload-cmake.py (L26)).

The corresponding change in llvm-zorg is
https://github.com/llvm/llvm-zorg/pull/402
2025-03-07 11:56:00 +01:00
Jan Patrick Lehr
fe18796142
[Offload][AMDGPU] Enable SPIRV target in build conf (#129323)
Enable the SPIRV backend on the CMake-cache file buildbots.
2025-03-01 21:56:28 +01:00
Joseph Huber
feb30f25c0
[Offload] Fix the offload cache file triggering libc++ / libstdc++ mixing (#126313)
Summary:
We originally wanted `-stdlib=libc++` by default so that it could use
offloading support in libc++, however this causes issues with out the
Offloading proejct itself is built. Is the user builds the LLVM libs
with libstdc++ then uses this cache it will enable this option by
default for the ensuing build of the offloading libraries with the newly
build clang. This will cause a lot of linker failured because the C++
library doesn't match.

Long term I think the proper solution to this is to make better use of
clang configuration files, but I don't know a good way to do that by
default. For now just make it build right.
2025-02-10 13:20:35 -06:00
Jan Patrick Lehr
d412fe531d
[Offload] Enable mlir and flang in bot build (#124915)
This enables more projects in the CMake cache to add them to the
buildbot coverage in the AMDGPU buildbots.
2025-01-29 14:13:59 +01:00
Jan Patrick Lehr
4b3c17850b
[Offload] Enable shared-libs; compiler-rt as default RTLIB (#123568)
This is the next step to move the CMake cache file builder closer to the
build configuration we care about downstream.
2025-01-20 10:23:41 +01:00
Jan Patrick Lehr
61f94ebc9e
[NFC][Offload] Structure/Readability of CMake cache (#123328)
Preparing to add more config options and want to group them all from
most-common to project / component specific.
2025-01-17 13:01:25 +01:00
Jan Patrick Lehr
74486dcf41
[Offload] Add CMake cache to be used in AMDGPU bot (#119369)
Adds initial CMake cache definition that is similar to what we use in
one of our production buidlbots. The goal is to consolidate the
configurations and make them accessible.
This cache file is a first step and to prepare for full pipeline testing
once the new bot comes online.
2024-12-10 17:50:02 +01:00
Michał Górny
4a44e4b192
[offload] Remove bogus offload-tblgen check for standalone build (#119004)
fd3907ccb583df99e9c19d2fe84e4e7c52d75de9 introduced a check for system
offload-tblgen executable when doing a standalone build. This check is
bogus, since offload-tblgen is built as part of offload and not some
other preinstalled component. The path is also overwritten below, so the
check only causes tests to be disabled unnecessarily.
2024-12-06 18:21:12 +00:00
Callum Fare
fd3907ccb5
Reland #118503: [Offload] Introduce offload-tblgen and initial new API implementation (#118614)
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON`
(see last commit). Otherwise the changes are identical.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-12-05 09:34:04 +01:00
Michał Górny
04b26f0eb7
[offload] Standalone build fixes (#118173)
A fair number of fixes to get standalone builds of offload working —
mostly copying missing bits from openmp. It's almost ready — I still
need to figure out why some of the tsts aren't linking to the right
libraries.
2024-12-04 11:27:37 +00:00
Jan Patrick Lehr
51553227f0
Revert "Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation" (#118541)
Reverts llvm/llvm-project#118503

Broke bot
https://lab.llvm.org/staging/#/builders/131/builds/9701/steps/5/logs/stdio
2024-12-03 21:42:38 +01:00
Callum Fare
8da490320f
Reland of #108413: [Offload] Introduce offload-tblgen and initial new API implementation (#118503)
This is another attempt to reland the changes from #108413

The previous two attempts introduced regressions and were reverted. This
PR has been more thoroughly tested with various configurations so
shouldn't cause any problems this time. If anyone is aware of any likely
remaining problems then please let me know.

The changes are identical other than the fixes contained in the last 5
commits.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-12-03 16:28:35 +00:00
Jan Patrick Lehr
5208bc3694
Revert "Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413. #117704)" (#117995)
Reverts llvm/llvm-project#117894

Buildbot failures in OpenMP/Offload bots.
https://lab.llvm.org/buildbot/#/builders/30/builds/11193
2024-11-28 12:58:17 +01:00
Callum Fare
992b00020f
Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413. #117704) (#117894)
Relands #117704, which relanded changes from #108413 - this was reverted
due to build issues. The new offload library did not build with
`LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by
pre-merge testing.

The last commit contains the fix; everything else is otherwise identical
to the approved PR.
___

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-11-28 10:19:37 +00:00
Fraser Cormack
0cb5846a68 Revert "Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413) (#117704)"
This reverts commit c979ec05642f292737d250c6682d85ed49bc7b6e.

This showed failures in the post-merge CI.
2024-11-27 10:49:01 +00:00
Callum Fare
c979ec0564
Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413) (#117704)
Relands changes from #108413 - this was reverted due to build issues.
The problem was just that the `offload-tblgen` tool was behind recent
changes to tablegen that ensure `const` records. This has been fixed and
the PR is otherwise identical.

___

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-11-27 10:39:07 +00:00
Joseph Huber
d047bee496 Revert "[Offload] Introduce offload-tblgen and initial new API implementation (#108413)"
This reverts commit 8a2311c4bf9993230e37dc20b57973dc917f2338.
2024-11-25 12:16:46 -06:00
Callum Fare
8a2311c4bf
[Offload] Introduce offload-tblgen and initial new API implementation (#108413)
Introduce `offload-tblgen` and an initial implementation of a subset of
the new API. The tablegen files are intended to be the single source of
truth for the new API, with the header files, documentation, and others
bits of source all automatically generated.

**TODO** (based on review feedback so far):
- [x] Check in the generated headers
- [x] Add an `offload-generate` target to trigger the generation rather
than building them every time
- [x] Decide how error handling should work
  - [x] Finish up new error handling implementation 
- [x] Decide naming convention
- [x] Add testing for the new API
- [x] Add tablegen specific testing
- [x] clang-tidy and use llvm:: types when possible
- [x] Add optional code location arguments
- [x] Avoid multiple returns from one function

### offload-tblgen

See the included
[README](d80db06491/offload/new-api/API/README.md)
for more information on how the API definition and generation works. I'm
happy to answer any questions about it and plan to walk through it in a
future LLVM Offload call.

It should be noted that struct definitions have not been fully
implemented/tested as they aren't used by the initial API definitions,
but finishing that off in the future shouldn't be too much work.

The tablegen tooling has been designed to be easily extended with new
backends, using the classes in `RecordTypes.hpp` to abstract over the
tablegen records.

### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
$ git clone -b offload_adapter https://github.com/callumfare/unified-runtime.git
$ cd unified-runtime
$ mkdir build
$ cd build
$ cmake .. -GNinja -DUR_BUILD_ADAPTER_OFFLOAD=ON \
    -DUR_OFFLOAD_INSTALL_DIR=<offload build dir containing liboffload_new.so> \
    -DUR_OFFLOAD_INCLUDE_DIR=<offload build dir containing 'offload' headers directory>
$ ninja urinfo
export LD_LIBRARY_PATH=<offload build dir containing offload plugin libraries>
$ UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
[cuda:gpu][cuda:0] CUDA, NVIDIA GeForce GT 1030  [12030]
# Demo with tracing
$ OFFLOAD_TRACE=1 UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo
---> offloadPlatformGet(.NumEntries = 0, .phPlatforms = {}, .pNumPlatforms = 0x7ffd05e4d6e0 (2))-> OFFLOAD_RESULT_SUCCESS
---> offloadPlatformGet(.NumEntries = 2, .phPlatforms = {0x564bf4040220, 0x564bf4040240}, .pNumPlatforms = nullptr)-> OFFLOAD_RESULT_SUCCESS
...
```


### Open questions and future work
* The new API is implemented in a separate library
(`liboffload_new.so`). It could just as easily be part of the existing
`libomptarget` library - I have no strong feelings on which is better.
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-11-25 11:34:14 -06:00
Joseph Huber
3a20a5f510 [Offload] Move compiler-rt to runtimes in cache 2024-11-14 10:44:29 -06:00
Joseph Huber
de41b137dd
[Offload] Provide a CMake cache file to easily build offloading (#115074)
Summary:
This patch adds a cache file that will automatically enable openpm,
offload, and all the fancy GPU libraries.
2024-11-07 15:35:29 -06:00
Johannes Doerfert
7102592af7
[Offload] Repair and rename llvm-omp-device-info (to -offload-) (#100309)
The `llvm-omp-device-info` tool is very handy, but broke due to the lazy
evaluation of devices. This repairs the functionality and adds a test.
The tool is also renamed into `llvm-offload-device-info` as `-omp-` is
going away.
2024-07-24 09:35:09 -07:00
Joseph Huber
c618ae1734
[Offload] Rework handling for loading vendor runtimes (#93073)
Summary:
We previously had multiple options for this, this patch replaces them
with `LIBOMPTARGET_DLOPEN_PLUGINS=` to be a list of plugins to
dynamically use. It defaults to everything right now. This ignores the
`host` plugin because the `libffi` dependency is going to be removed
soon hopefully in https://github.com/llvm/llvm-project/pull/91264.
2024-05-22 13:04:52 -05:00
Joseph Huber
770d928303
[Offload][NFC] Remove 'libomptarget' message helpers (#92581)
Summary:
This isn't `libomptarget` anymore, and these messages were always
unnecessary because no other project uses these prefixed messages. The
effect of this is that no longer will the logs have `LIBOMPTARGET --` in
front of everything. We have a message stating when we start building
the offload project so it'll still be trivial to find.
2024-05-17 13:24:32 -05:00
Joseph Huber
c4017cda00
[Offload][NFC] Remove header license in CMake files (#92544)
Summary:
No other project has these in the CMake itself, and they're wildly
inconsistent even within the project. These don't really add anything so
I think they should be removed.
2024-05-17 09:05:03 -05:00
Johannes Doerfert
330d8983d2
[Offload] Move /openmp/libomptarget to /offload (#75125)
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -> `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>
2024-04-22 09:51:33 -07:00