36 Commits

Author SHA1 Message Date
Alex Duran
e17226e20b
[OFFLOAD] Implement excluding filters for debugging (#180538)
Allow a to define a set of Types that are not shown by default when
doing default debug loggin (e.g., LIBOMPTARGET_DEBUG=All).

Users can enable output of those types of messages by explicitly adding
them to LIBOMPTARGET_DEBUG.

Used to implement: #180545

---------

Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2026-02-10 10:01:52 +01:00
Hansang Bae
4aaf7ed74b
[Offload] Allow specifying debug level with all debug type (#178213)
This change allows users to specify debug level with `all` debug type.
The effect of `all:<num>` is equivalent to `<num>`.
2026-01-27 16:02:14 -06:00
Alex Duran
ba3548c8b2
[OpenMP][Offload] Remove old DP macro (#177156)
Old usages have been updated so we can remove it now
2026-01-21 15:13:13 +01:00
Alex Duran
c70ce1128d
[OpenMP][Offload] Translate Info types to Debug types when debug enabled (#175599)
Eventually we might want to rework the INFO macro to work like the new
ODBG macro but in the meantime at least translate the Info type to the
correct Debug type instead of just using DP directly (which uses the
default type).
2026-01-15 22:44:54 +01:00
Alex Duran
2a9b30d228
[OpenMP][Offload] Add a buffer layer to debug messaging (#176153)
To reduce interference between threads, instead of writing the
components of a debug message directly to the underlying stream, write
them to a buffer and flush the buffer to the stream when its completed.
2026-01-15 22:41:12 +01:00
Hansang Bae
dae3b49cba
[Offload] Update debug message printig in the plugins (#175205)
* Prepare a set of debug types in llvm::offload::debug to be used in
plugin code
* Update debug messages in the plugins
2026-01-12 14:26:43 -06:00
Alex Duran
dbd52bd558
[OFFLOAD][OpenMP] Remove old style REPORT support (#175607)
Fix the few remaining usages and remove the support for the old REPORT
macro.
2026-01-12 19:48:40 +01:00
Hansang Bae
101c6ede3b
[Offload] Debug message update part 2 (#171683)
Update debug messages based on the new method from #170425. Added a new
debug type `Tool` and updated the following files.
- include/OffloadPolicy.h
- include/OpenMP/OMPT/Connector.h
- include/Shared/Debug.h
- include/Shared/EnvironmentVar.h
- libomptarget/OpenMP/Mapping.cpp
- libomptarget/OpenMP/OMPT/Callback.cpp
- libomptarget/PluginManager.cpp
2025-12-17 09:04:55 -06:00
Alex Duran
d502ff0949
[OpenMP][Offload] Add support for lambdas with debug conditions (#172573)
This PR adds a new set of debug macros that allow a certain code to be
only executed when certain debug conditions are met. This is useful to
guard things that are not strictly messages but compute and store things
that are related to those messages.

Strictly speaking the existing ODBG_OS could be used as well but that
requires a stream object to be created which is unnecessary in some
cases.

Example of how it works:

```cpp
ODBG_IF("Counters", [&](uint32_t Level) {
    someCounter++;
    if (Level == 2) moreDetailedCounter += f();
});
ODBG("Counters") << "Counter" =  someCounter 
                 << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter;
```
2025-12-17 09:30:55 +01:00
Alex Duran
2d08b0c5f0
Revert "[OpenMP][Offload] Add support for lambdas with debug conditions" (#172570)
Reverts llvm/llvm-project#172107
2025-12-16 23:48:14 +01:00
Alex Duran
73bcc19aae
[OpenMP][Offload] Add support for lambdas with debug conditions (#172107)
This PR adds a new set of debug macros that allow a certain code to be
only executed when certain debug conditions are met. This is useful to
guard things that are not strictly messages but compute and store things
that are related to those messages.

Strictly speaking the existing ODBG_OS could be used as well but that
requires a stream object to be created which is unnecessary in some
cases.

Example of how it works:

```
ODBG_IF("Counters", [&](uint32_t Level) {
    someCounter++;
    if (Level == 2) moreDetailedCounter += f();
});
ODBG("Counters") << "Counter" =  someCounter 
                 << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter;
```
2025-12-16 18:32:58 +01:00
Alex Duran
8dd75fa473
[OpenMP][Offload] Revert format of changed messages (#171995)
Adjust format of some of the updated debug output to match the old
format as there are a number of tests that rely on it.
2025-12-16 18:31:05 +01:00
Alex Duran
02a908c4c9
[OpenMP][Offload] Continue to update libomptarget debug messages (#170425)
* Add support to use lambdas to output debug messages (like LDBG_OS)
* Update messages for interface.cpp and omptarget.cpp
2025-12-10 16:18:01 +01:00
Alex Duran
ec6091f4de
[OFFLOAD][LIBOMPTARGET] Start to update debug messages in libomptarget (#170265)
* Add compatibility support for DP and REPORT macros 
* Define a set of predefined Debug Type for libomptarget
* Start to update libomptarget files (OffloadRTL.cpp, device.cpp)
2025-12-02 23:45:23 +01:00
Alex Duran
66ddc9b3e7
[OFFLOAD] Add support for more fine grained debug messages control (#165416)
This PR introduces new debug macros that allow a more fined control of
which debug message to output and introduce C++ stream style for debug
messages.

Changing existing messages (except a few that I changed for testing)
will come in subsequent PRs.

I also think that we should make debug enabling OpenMP agnostic but, for
now, I prioritized maintaing the current libomptarget behavior for now,
and we might need more changes further down the line as we we decouple
libomptarget.
2025-11-20 18:39:56 +01:00
Joseph Huber
670c453aeb
[Offload] Remove handling for device memory pool (#163629)
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.
2025-11-06 10:15:18 -06:00
Ross Brunton
910d7e90bf
[Offload] Make olLaunchKernel test thread safe (#149497)
This sprinkles a few mutexes around the plugin interface so that the
olLaunchKernel CTS test now passes when ran on multiple threads.

Part of this also involved changing the interface for device synchronise
so that it can optionally not free the underlying queue (which
introduced a race condition in liboffload).
2025-08-08 10:57:04 +01:00
Alex Duran
66d1c37eb6
[OFFLOAD][OPENMP] 6.0 compatible interop interface (#143491)
The following patch introduces a new interop interface implementation
with the following characteristics:

* It supports the new 6.0 prefer_type specification
* It supports both explicit objects (from interop constructs) and
implicit objects (from variant calls).
* Implements a per-thread reuse mechanism for implicit objects to reduce
overheads.
* It provides a plugin interface that allows selecting the supported
interop types, and managing all the backend related interop operations
(init, sync, ...).
* It enables cooperation with the OpenMP runtime to allow progress on
OpenMP synchronizations.
* It cleanups some vendor/fr_id mismatchs from the current query
routines.
* It supports extension to define interop callbacks for library cleanup.
2025-08-06 16:34:39 +02:00
Callum Fare
b78bc35d16
[Offload] Don't check in generated files (#141982)
Previously we decided to check in files that we generate with tablegen.
The justification at the time was that it helped reviewers unfamiliar
with `offload-tblgen` see the actual changes to the headers in PRs.
After trying it for a while, it's ended up causing some headaches and is
also not how tablegen is used elsewhere in LLVM.

This changes our use of tablegen to be more conventional. Where
possible, files are still clang-formatted, but this is no longer a hard
requirement. Because `OffloadErrcodes.inc` is shared with libomptarget
it now gets generated in a more appropriate place.
2025-06-03 10:39:04 -05:00
Ross Brunton
050892d2f8
[Offload] Use new error code handling mechanism and lower-case messages (#139275)
[Offload] Use new error code handling mechanism

This removes the old ErrorCode-less error method and requires
every user to provide a concrete error code. All calls have been
updated.

In addition, for consistency with error messages elsewhere in LLVM, all
messages have been made to start lower case.
2025-05-20 08:50:20 -05:00
Ross Brunton
1532ee6916
[Offload] Add Error Codes to PluginInterface (#138258)
A new ErrorCode enumeration is present in PluginInterface which can
be used when returning an llvm::Error from offload and PluginInterface
functions.

This enum must be kept up to sync with liboffload's ol_errc_t enum, so
both are automatically generated from liboffload's enum definition.

Some error codes have also been shuffled around to allow for future
work. Note that this patch only adds the machinery; actual error codes
will be added in a future patch.

~~Depends on #137339 , please ignore first commit of this MR.~~ This has
been merged.
2025-05-19 09:38:34 -05:00
Joseph Huber
56bf0e7202
[OpenMP] Remove dependency on LLVM include directory from DeviceRTL (#136359)
Summary:
Currently we depend on a single LLVM include directory. This is actually
only required to define one enum, which is highly unlikely to change.
THis patch makes the `Environment.h` include directory more hermetic so
we no long depend on other libraries. In exchange, we get a simpler
dependency list for the price of hard-coding `1` somewhere. I think it's
a valid trade considering that this flag is highly unlikely to change at
this point.

@ronlieb AMD version
https://gist.github.com/jhuber6/3313e6f957be14dc79fe85e5126d2cb3
2025-04-21 15:21:47 -05:00
Jon Chesterfield
deb0f3c09b
[openmp][nfc] Use builtin align in the devicertl (#131918)
Noticed while extracting the smartstack as a test case
2025-03-18 21:31:49 +00:00
Ethan Luis McDonough
9e5c136d5a
[PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (#93365)
This pull request is the second part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:

- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds `PGODump` debug flag and only displays dump when the
aforementioned flag is set
2025-02-11 23:30:54 -06:00
Joseph Huber
6518b121f0
[Offload][NFC] Factor out and rename the __tgt_offload_entry struct (#123785)
Summary:
This patch is an NFC renaming to make using the offloading entry type
more portable between other targets. Right now this is just moving its
definition to LLVM so others can use it. Future work will rework the
struct layout.
2025-01-21 12:05:24 -06:00
Joseph Huber
f53cb84df6
[OpenMP] Use __builtin_bit_cast instead of UB type punning (#122325)
Summary:
Use a normal bitcast, remove from the shared utils since it's not
available in
GCC 7.4
2025-01-09 13:59:21 -06:00
Joseph Huber
91f5f974cb
[OpenMP] Unconditionally provide an RPC client interface for OpenMP (#117933)
Summary:
This patch adds an RPC interface that lives directly in the OpenMP
device runtime. This allows OpenMP to implement custom opcodes.
Currently this is only providing the host call interface, which is the
raw version of reverse offloading. Previously this lived in `libc/` as
an extension which is not the correct place.

The interface here uses a weak symbol for the RPC client by the same
name that the `libc` interface uses. This means that it will defer to
the libc one if both are present so we don't need to set up multiple
instances.

The presense of this symbol is what controls whether or not we set up
the RPC server. Because this is an external symbol it normally won't be
optimized out, so there's a special pass in OpenMPOpt that deletes this
symbol if it is unused during linking. That means at `O0` the RPC server
will always be present now, but will be removed trivially if it's not
used at O1 and higher.
2024-12-02 14:31:51 -06:00
Joseph Huber
c3ac3fe825
[OpenMP] Fix redefining stdint.h types (#108607)
Summary:
We can include `stdint.h` just fine as long as we don't allow it to find
system headers, passing `-nostdlibinc` and `-nogpuinc` suppresses these
extra paths so we will just use the clang resource headers for
`stdint.h` and `stddef.h`.
2024-09-13 13:22:44 -05:00
Johannes Doerfert
08533a3ee8
[Offload][NFC] Reorganize utils:: and make Device/Host/Shared clearer (#100280)
We had three `utils::` namespaces, all with different "meaning" (host,
device, hsa_utils). We should, when we can, keep "include/Shared"
accessible from host and device, thus RefCountTy has been moved to a
separate header. `hsa_utils` was introduced to make `utils::` less
overloaded. And common functionality was de-duplicated, e.g.,
`utils::advance` and `utils::advanceVoidPtr` -> `utils:advancePtr`. Type
punning now checks for the size of the result to make sure it matches
the source type.

No functional change was intended.
2024-09-05 13:36:26 -07:00
Johannes Doerfert
80525dfcde
[Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (#94549)
Through the new `-foffload-via-llvm` flag, CUDA kernels can now be
lowered to the LLVM/Offload API. On the Clang side, this is simply done
by using the OpenMP offload toolchain and emitting calls to `llvm*`
functions to orchestrate the kernel launch rather than `cuda*`
functions. These `llvm*` functions are implemented on top of the
existing LLVM/Offload API.

As we are about to redefine the Offload API, this wil help us in the
design process as a second offload language.

We do not support any CUDA APIs yet, however, we could:
  https://www.osti.gov/servlets/purl/1892137

For proper host execution we need to resurrect/rebase
  https://tianshilei.me/wp-content/uploads/2021/12/llpp-2021.pdf
(which was designed for debugging).

```
❯❯❯ cat test.cu
extern "C" {
void *llvm_omp_target_alloc_shared(size_t Size, int DeviceNum);
void llvm_omp_target_free_shared(void *DevicePtr, int DeviceNum);
}

__global__ void square(int *A) { *A = 42; }

int main(int argc, char **argv) {
  int DevNo = 0;
  int *Ptr = reinterpret_cast<int *>(llvm_omp_target_alloc_shared(4, DevNo));
  *Ptr = 7;
  printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr);
  square<<<1, 1>>>(Ptr);
  printf("Ptr %p, *Ptr %i\n", Ptr, *Ptr);
  llvm_omp_target_free_shared(Ptr, DevNo);
}

❯❯❯ clang++ test.cu -O3 -o test123 -foffload-via-llvm --offload-arch=native

❯❯❯ llvm-objdump --offloading test123

test123:        file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind            elf
arch            gfx90a
triple          amdgcn-amd-amdhsa
producer        openmp

❯❯❯ LIBOMPTARGET_INFO=16 ./test123
Ptr 0x155448ac8000, *Ptr 7
Ptr 0x155448ac8000, *Ptr 42
```
2024-08-12 17:44:58 -07:00
Johannes Doerfert
9a1013220b
[Offload] Allow to record kernel launch stack traces (#100472)
Similar to (de)allocation traces, we can record kernel launch stack
traces and display them in case of an error. However, the AMD GPU plugin
signal handler, which is invoked on memroy faults, cannot pinpoint the
offending kernel. Insteade print `<NUM>`, set via
`OFFLOAD_TRACK_NUM_KERNEL_LAUNCH_TRACES=<NUM>`, many traces. The
recoding/record uses a ring buffer of fixed size (for now 8).
For `trap` errors, we print the actual kernel name, and trace if
recorded.
2024-07-31 11:49:50 -07:00
Johannes Doerfert
54b5c76d3b
[Offload] Use flat array for cuLaunchKernel (#95116)
We already used a flat array of kernel launch parameters for the AMD GPU
launch but now we also use this scheme for the NVIDIA GPU launch. The
only remaining/required use of the indirection is the host plugin (due
ot ffi). This allows to us simplify the use for non-OpenMP kernel
launch.
2024-06-13 09:43:47 +03:00
Johannes Doerfert
2eb60e2de8
[Offload][NFCI] Initialize the KernelArgsTy to default values (#95117)
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2024-06-11 17:05:04 +03:00
Joseph Huber
033fa81480 [Offload][NFC] Remove unused files following static plugins
Summary:
Forgot to remove these when I landed the initial patch, they are no
longer used.
2024-05-16 11:48:34 -05:00
Joseph Huber
b07177fb68
[Libomptarget] Rework interface for enabling plugins (#86875)
Summary:
Previously we would build all of the plugins by default and then only
load some using the `LIBOMPTARGET_PLUGINS_TO_LOAD` variable. This patch
renamed this to `LIBOMPTARGET_PLUGINS_TO_BUILD` and changes whether or
not it will include the plugin in CMake.

Additionally this patch creates a new `Targets.def` file that allows us
to enumerate all of the enabled plugins. This is somewhat different from
the old method, and it's done this way for future use that will need to
be shared. This follows the same method that LLVM uses for its targets,
however it does require adding an extra include path.

Depends on https://github.com/llvm/llvm-project/pull/86868
2024-04-29 11:18:37 -05:00
Johannes Doerfert
330d8983d2
[Offload] Move /openmp/libomptarget to /offload (#75125)
In a nutshell, this moves our libomptarget code to populate the offload
subproject.

With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.

Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.

```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests

```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.

```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -> `libomptarget.so.18git`

Fixes: https://github.com/llvm/llvm-project/issues/75124

---------

Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>
2024-04-22 09:51:33 -07:00