71 Commits

Author SHA1 Message Date
Kevin Sala Penades
1f583c6dee
[OpenMP][Offload] Add offload runtime support for dyn_groupprivate clause (#152831)
Part 3 adding offload runtime support. See
https://github.com/llvm/llvm-project/pull/152651.

---------

Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>
2026-03-12 01:13:06 -07:00
Abhinav Gaba
1d1c83ad73
Reland "[OpenMP][Offload] Handle present/to/from when a different entry did alloc/delete." (#184260)
Some tests that were checking for prints inside/outside `target` regions
needed to be updated to work on systems where the ordering wasn't
deterministic.

Reverts llvm/llvm-project#184240
    
Original description from #165494:

-----

OpenMP allows cases like the following:

```c
  int *p1, *p2, x;
  p1 = p2 = &x;
  ...
  #pragma omp target_exit_data map(delete: p1[:]) from(p2[0])
```

Which means, when the runtime encounters the `from` entry, the ref-count
may
not be zero, but it will go down to zero at the end of the current
construct,
which should cause the "from" transfer to happen.

Similarly, a user may have:

```c
  struct S {
    int *p;
  };

  #pragma omp declare_mapper (id1: S s) map(s.p) map(present, alloc: s.p[0:10])
  #pragma omp declare_mapper (id2: S s) map(s.p, s.p[0:10])

  S s1;

 // present-check should fail here
 #pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id1), to: s)
 // "to" should be honored here
 #pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id2), to: s)
```

Where the allocation happens before the "to" entry is encountered by the
runtime. Or, an allocation happens before a "present" entry is
encountered.

To handle cases like this, we need to use the state information of
previously
seen new allocations, deletions, "from" entries, when honoring
`to`/`from`/`present` map entries.

-----
2026-03-02 15:32:30 -08:00
Abhinav Gaba
42a0fbc2c7
Revert "[OpenMP][Offload] Handle present/to/from when a different entry did alloc/delete." (#184240)
Reverts llvm/llvm-project#165494

Some buildbots are not happy about CHECKs enforcing strict ordering of
prints inside/after target regions.
2026-03-02 21:52:20 +00:00
Abhinav Gaba
1a7060a7b0
[OpenMP][Offload] Handle present/to/from when a different entry did alloc/delete. (#165494)
OpenMP allows cases like the following:

```c
  int *p1, *p2, x;
  p1 = p2 = &x;
  ...
  #pragma omp target_exit_data map(delete: p1[:]) from(p2[0])
```

Which means, when the runtime encounters the `from` entry, the ref-count
may not be zero, but it will go down to zero at the end of the current
construct, which should cause the "from" transfer to happen.

Similarly, a user may have:

```c
  struct S {
    int *p;
  };

  #pragma omp declare_mapper (id1: S s) map(s.p) map(present, alloc: s.p[0:10])
  #pragma omp declare_mapper (id2: S s) map(s.p, s.p[0:10])

  S s1;

 // present-check should fail here
 #pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id1), to: s)
 // "to" should be honored here
 #pragma omp target_enter_data map(alloc: s.p[0:10]) map(mapper(id2), to: s)
```

Where the allocation happens before the "to" entry is encountered by the
runtime. Or, an allocation happens before a "present" entry is
encountered.

To handle cases like this, we need to use the state information of
previously seen new allocations, deletions, "from" entries, when
honoring `to`/`from`/`present` map entries.
2026-03-02 13:13:51 -08:00
Alex Duran
e17226e20b
[OFFLOAD] Implement excluding filters for debugging (#180538)
Allow a to define a set of Types that are not shown by default when
doing default debug loggin (e.g., LIBOMPTARGET_DEBUG=All).

Users can enable output of those types of messages by explicitly adding
them to LIBOMPTARGET_DEBUG.

Used to implement: #180545

---------

Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2026-02-10 10:01:52 +01:00
Joseph Huber
1a86c146ae
[Offload] Add a function to register an RPC Server callback (#178774)
Summary:
We provide an RPC server to manage calls initiated by the device to run
on the host. This is very useful for the built-in handling we have,
however there are cases where we would want to extend this
functionality.

Cases like Fortran or MPI would be useful, but we cannot put references
to these in the core offloading runtime. This way, we can provide this
as a library interface that registers custom handlers for whatever code
people want.
2026-01-30 08:03:13 -06:00
Hansang Bae
4aaf7ed74b
[Offload] Allow specifying debug level with all debug type (#178213)
This change allows users to specify debug level with `all` debug type.
The effect of `all:<num>` is equivalent to `<num>`.
2026-01-27 16:02:14 -06:00
Alex Duran
ba3548c8b2
[OpenMP][Offload] Remove old DP macro (#177156)
Old usages have been updated so we can remove it now
2026-01-21 15:13:13 +01:00
Alex Duran
c70ce1128d
[OpenMP][Offload] Translate Info types to Debug types when debug enabled (#175599)
Eventually we might want to rework the INFO macro to work like the new
ODBG macro but in the meantime at least translate the Info type to the
correct Debug type instead of just using DP directly (which uses the
default type).
2026-01-15 22:44:54 +01:00
Alex Duran
2a9b30d228
[OpenMP][Offload] Add a buffer layer to debug messaging (#176153)
To reduce interference between threads, instead of writing the
components of a debug message directly to the underlying stream, write
them to a buffer and flush the buffer to the stream when its completed.
2026-01-15 22:41:12 +01:00
Abhinav Gaba
3a9e3865b5
[OpenMP][Offload] Add FB_NULLIFY map-type for use_device_ptr(fb_nullify). (1/4) (#169603)
Depends on #174659.

This PR adds a new map-type bit to control the fallback behavior when
when a pointer lookup fails.

For now, this is only meaningful with `RETURN_PARAM`, and can be used
for `need_device_ptr` (for which the default is to use `nullptr` as the
result
when lookup fails), and OpenMP 6.1's `use_device_ptr(fb_nullify)`.

Eventually, this can be extended to work with assumed-size maps on
`target`
constructs, to control what the argument should be set to when lookup
fails (the OpenMP spec does not have a way to control that yet).

Dependent PR: #170578.
2026-01-14 14:39:41 -08:00
Hansang Bae
dae3b49cba
[Offload] Update debug message printig in the plugins (#175205)
* Prepare a set of debug types in llvm::offload::debug to be used in
plugin code
* Update debug messages in the plugins
2026-01-12 14:26:43 -06:00
Alex Duran
dbd52bd558
[OFFLOAD][OpenMP] Remove old style REPORT support (#175607)
Fix the few remaining usages and remove the support for the old REPORT
macro.
2026-01-12 19:48:40 +01:00
Joachim
b036d70eee
Revert "[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget" (#172827)
Reverts llvm/llvm-project#156020

We will need some time for investigating buildbot failures
2025-12-18 09:48:30 +00:00
Kaloyan Ignatov
726452720c
[OMPT][Offload][OpenMP] Fixes for OMPT data used by libomptarget (#156020)
These commits fix issues regarding storage of tool data within
libomptarget. Both libomp and libomptarget have been modified to
accommodate this. We differentiate between two cases depending on the
type of the target region:

- merged target regions (default, without `nowait` clause): behavior
remains unchanged, tool data is stored in the thread local
RegionInterface class within libomptarget.
- deferred target regions (using `nowait` clause): tool data is moved to
`ompt_task_info_t` struct within libomp, as `RegionInterface` is thread
local and its data is lost whenever another task is scheduled on the
thread, which happens with deferred target regions.

In the new implementation, `RegionInterface` receives pointers to
`ompt_task_info_t` within libomp which are handled transparently within
libomptarget. Thus, the problem of tool data getting lost when a thread
receives a new task is resolved: `target_data` and `target_task_data`
remain set.

Another issue was the value of `task_data` which is supposed to belong
to the generating task of the region according to the OpenMP standard,
but instead had been set to the `task_data` of the target task itself
until now.

Test cases have been added which check both of these fixes.

---------

Co-authored-by: Joachim <jenke@itc.rwth-aachen.de>
2025-12-18 10:20:14 +01:00
Alex Duran
f125c8db5c
[OFFLOAD] Add plugin with support for Intel oneAPI Level Zero (#158900)
Add a new nextgen plugin that supports GPU devices through the Intel oneAPI Level Zero library. The plugin is not enabled by default  and needs to be added to LIBOMPTARGET_PLUGINS_TO_BUILD explicitely.

---------

Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com>
Co-authored-by: Nick Sarnie <nick.sarnie@intel.com>
Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-12-18 08:53:03 +01:00
Hansang Bae
101c6ede3b
[Offload] Debug message update part 2 (#171683)
Update debug messages based on the new method from #170425. Added a new
debug type `Tool` and updated the following files.
- include/OffloadPolicy.h
- include/OpenMP/OMPT/Connector.h
- include/Shared/Debug.h
- include/Shared/EnvironmentVar.h
- libomptarget/OpenMP/Mapping.cpp
- libomptarget/OpenMP/OMPT/Callback.cpp
- libomptarget/PluginManager.cpp
2025-12-17 09:04:55 -06:00
Alex Duran
d502ff0949
[OpenMP][Offload] Add support for lambdas with debug conditions (#172573)
This PR adds a new set of debug macros that allow a certain code to be
only executed when certain debug conditions are met. This is useful to
guard things that are not strictly messages but compute and store things
that are related to those messages.

Strictly speaking the existing ODBG_OS could be used as well but that
requires a stream object to be created which is unnecessary in some
cases.

Example of how it works:

```cpp
ODBG_IF("Counters", [&](uint32_t Level) {
    someCounter++;
    if (Level == 2) moreDetailedCounter += f();
});
ODBG("Counters") << "Counter" =  someCounter 
                 << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter;
```
2025-12-17 09:30:55 +01:00
Abhinav Gaba
c62cd2877c
[OpenMP][Offload] Add LIBOMPTARGET_TREAT_ATTACH_AUTO_AS_ALWAYS to treat attach(auto) as attach(always). (#172382)
This is needed as a way to support older code that was expecting
unconditional attachment to happen for cases like:

```c
  int *p;
  int x;

  #pragma omp targret enter data map(p) // (A)
  #pragma omp target enter data map(x)  // (B)
  p = &x;

  // By default, this does NOT attach p and x
  #pragma omp target enter data map(p[0:0]) // (C)
```

When the environment variable is set, such maps, where both the pointer
and the pointee already have corresponding copies on the device, but are
not attached to one another, will be attached as-if OpenMP 6.1 TR14's
`attach(always)` map-type-modifier was specified on `(C)`.
2025-12-16 15:55:27 -08:00
Alex Duran
2d08b0c5f0
Revert "[OpenMP][Offload] Add support for lambdas with debug conditions" (#172570)
Reverts llvm/llvm-project#172107
2025-12-16 23:48:14 +01:00
Alex Duran
73bcc19aae
[OpenMP][Offload] Add support for lambdas with debug conditions (#172107)
This PR adds a new set of debug macros that allow a certain code to be
only executed when certain debug conditions are met. This is useful to
guard things that are not strictly messages but compute and store things
that are related to those messages.

Strictly speaking the existing ODBG_OS could be used as well but that
requires a stream object to be created which is unnecessary in some
cases.

Example of how it works:

```
ODBG_IF("Counters", [&](uint32_t Level) {
    someCounter++;
    if (Level == 2) moreDetailedCounter += f();
});
ODBG("Counters") << "Counter" =  someCounter 
                 << ODBG_IF(2) << "DetailedCounter" << moreDetailedCounter;
```
2025-12-16 18:32:58 +01:00
Alex Duran
8dd75fa473
[OpenMP][Offload] Revert format of changed messages (#171995)
Adjust format of some of the updated debug output to match the old
format as there are a number of tests that rely on it.
2025-12-16 18:31:05 +01:00
Alex Duran
02a908c4c9
[OpenMP][Offload] Continue to update libomptarget debug messages (#170425)
* Add support to use lambdas to output debug messages (like LDBG_OS)
* Update messages for interface.cpp and omptarget.cpp
2025-12-10 16:18:01 +01:00
Alex Duran
ec6091f4de
[OFFLOAD][LIBOMPTARGET] Start to update debug messages in libomptarget (#170265)
* Add compatibility support for DP and REPORT macros 
* Define a set of predefined Debug Type for libomptarget
* Start to update libomptarget files (OffloadRTL.cpp, device.cpp)
2025-12-02 23:45:23 +01:00
Robert Imschweiler
8808beeb1a
Reland: [OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() (#168554)
Reland https://github.com/llvm/llvm-project/pull/164392 with Fortran support moved to follow-up PR
2025-12-01 14:18:31 +01:00
Jason-VanBeusekom
84d511df8d
[OpenMP][clang] Register vtables on device for indirect calls runtime (#167011)
This is a branch off of
https://github.com/llvm/llvm-project/pull/159856, in which consists of
the runtime portion of the changes required to support indirect function
and virtual function calls on an `omp target device` when the virtual
class / indirect function is mapped to the device from the host.

Key Changes

- Introduced a new flag OMP_DECLARE_TARGET_INDIRECT_VTABLE to mark
VTable registrations
- Modified setupIndirectCallTable to support both VTable entries and
indirect function pointers

Details:
The setupIndirectCallTable implementation was modified to support this
registration type by retrieving the first address of the VTable and
inferring the remaining data needed to build the indirect call table.
Since the Vtables / Classes registered as indirect can be larger than 8
bytes, and the vtables may not be at the first address we either need to
pass the size to __llvm_omp_indirect_call_lookup and have a check at
each step of the binary search, or add multiple entries to the indirect
table for each address registered. The latter was chosen.

Commit: a00def3f20e166d4fb9328e6f0bc0742cd0afa31 is not a part of this
PR and is handled / reviewed in:
https://github.com/llvm/llvm-project/pull/159856,

This is PR (2/3) 
Register Vtable PR (1/3):
https://github.com/llvm/llvm-project/pull/159856,
Codegen / _llvm_omp_indirect_call_lookup PR (3/3):
https://github.com/llvm/llvm-project/pull/159857
2025-11-26 17:33:26 +00:00
Alex Duran
3f22ed1152
[OFFLOAD] Add support for indexed per-thread containers (#164263)
Split from #158900 it adds a PerThreadContainer that can use STL-like
indexed containers based on a slightly refactored PerThreadTable.

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-11-26 02:21:09 +01:00
Alex Duran
66ddc9b3e7
[OFFLOAD] Add support for more fine grained debug messages control (#165416)
This PR introduces new debug macros that allow a more fined control of
which debug message to output and introduce C++ stream style for debug
messages.

Changing existing messages (except a few that I changed for testing)
will come in subsequent PRs.

I also think that we should make debug enabling OpenMP agnostic but, for
now, I prioritized maintaing the current libomptarget behavior for now,
and we might need more changes further down the line as we we decouple
libomptarget.
2025-11-20 18:39:56 +01:00
Robert Imschweiler
9a0fd22da1
Revert "[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid()" (#168547)
Reverts llvm/llvm-project#164392 due to fortran issues
2025-11-18 15:10:42 +00:00
Robert Imschweiler
65c4a534bd
[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() (#164392)
Use the implementation in libomptarget. If libomptarget is not
available, always return the UID / device number of the host / the
initial device.
2025-11-18 15:22:49 +01:00
Kevin Sala Penades
64ad5d976d
[Offload] Remove unused KernelArgsTy instantiation (#167197) 2025-11-08 20:54:32 -08:00
Joseph Huber
670c453aeb
[Offload] Remove handling for device memory pool (#163629)
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.
2025-11-06 10:15:18 -06:00
Alex Duran
426d1fe548
[OFFLOAD] Remove weak from __kmpc_* calls and gather them in one header (#164613)
Follow-up from #162652

---------

Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2025-10-24 15:42:20 +02:00
Nicole Aschenbrenner
16641ad8a2
[OpenMP] Adds omp_target_is_accessible routine (#138294)
Adds omp_target_is_accessible routine.
Refactors common code from omp_target_is_present to work for both
routines.

---------

Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-10-22 17:35:16 +02:00
Alex Duran
9ba54ca3ee
[OFFLOAD] Interop fixes for Windows (#162652)
On Windows, for a reason I don't fully understand boolean bits get extra
padding (even when asking for packed structures) in the structures that
messes the offsets between the compiler and the runtime.

Also, "weak" works differently on Windows than Linux (i.e., the "local"
routine has preference) which causes it to crash as we don't really have
an alternate implementation of __kmpc_omp_wait_deps. Given this, it
doesn't make sense to mark it as "weak" for Linux either.
2025-10-17 11:07:31 +02:00
Joseph Huber
23efc67e19
[Offload] Remove non-blocking allocation type (#159851)
Summary:
This was originally added in as a hack to work around CUDA's limitation
on allocation. The `libc` implementation now isn't even used for CUDA so
this code is never hit. Even if this case, this code never truly worked.

A true solution would be to use CUDA's virtual memory API instead to
allocate 2MiB slabs independenctly from the normal memory management
done in the stream.
2025-09-20 09:07:14 -05:00
Abhinav Gaba
5af3fa81cc
[Offload][OpenMP] Support shadow-pointer tracking for Fortran descriptors. (#158370)
This change adds support for saving full contents of attached Fortran
descriptors, and not just their pointee address, in the shadow-pointer
table.

With this, we now support:
* comparing full contents of descriptors to check whether a previous
shadow-pointer entry is stale;
* restoring the full contents of descriptors

And with that, we can now use ATTACH map-types (added in #149036) for
mapping Fortran pointer/allocatable arrays, and array-sections on them.
e.g.:

```f90
  integer, allocatable :: x(:)
  !$omp target enter data map(to: x(:))
```

as:

```
  void* addr_of_pointee = allocated(x) ? &x(1) : nullptr;
  int64_t sizeof_pointee = allocated(x) ? sizeof(x(:)) : 0

  addr_of_pointee,    addr_of_pointee, sizeof_pointee,     TO
  addr_of_descriptor, addr_of_pointee, size_of_descriptor, ATTACH
```
2025-09-15 10:37:38 -07:00
Joseph Huber
5d550bf41c
[OpenMP] Move `__omp_rtl_data_environment' handling to OpenMP (#157182)
Summary:
This operation is done every time we load a binary, this behavior should
be moved into OpenMP since it concerns an OpenMP specific data struct.
This is a little messy, because ideally we should only be using public
APIs, but more can be extracted later.
2025-09-08 09:58:38 -05:00
Abhinav Gaba
12769aa728
[Offload] Introduce ATTACH map-type support for pointer attachment. (#149036)
This patch introduces libomptarget support for the ATTACH map-type,
which can be used to implement OpenMP conditional compliant pointer
attachment, based on whether the pointer/pointee is newly mapped on a
given construct.

For example, for the following:

```c
  int *p;
  #pragma omp target enter data map(p[1:10])
```

The following maps can be emitted by clang:
```
  (A)
  &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
  &p, &p[1], sizeof(p), ATTACH
```

Without this map-type, these two possible maps could be emitted by
clang:
```
  (B)
  &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM

  (C)
  &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ
````

(B) does not perform any pointer attachment, while (C) also maps the
pointer p, which are both incorrect.

In terms of implementation, maps with the ATTACH map-type are handled
after all other maps have been processed, as it requires knowledge of
which new allocations happened as part of the construct. As per OpenMP
5.0, an attachment should happen only when either the pointer or the
pointee was newly mapped while handling the construct.

Maps with ATTACH map-type-bit do not increase/decrease the ref-count.

With OpenMP 6.1, `attach(always/never)` can be used to force/prevent
attachment. For `attach(always)`, the compiler will insert the ALWAYS
map-type, which would let libomptarget bypass the check about one of the
pointer/pointee being new. With `attach(never)`, the ATTACH map will not
be emitted at all.

The size argument of the ATTACH map-type can specify values greater than
`sizeof(void*)` which can be used to support pointer attachment on
Fortran descriptors. Note that this also requires shadow-pointer
tracking to also support them. That has not been implemented in this
patch.

This was worked upon in coordination with Ravi Narayanaswamy, who has
since retired. Happy retirement, Ravi!

---------

Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-08-17 15:17:04 -07:00
Kevin Sala Penades
5751e96f9a
[Offload][NFC] Re-enable clang-format for omptarget.h (#152937) 2025-08-10 15:46:37 -07:00
Ross Brunton
910d7e90bf
[Offload] Make olLaunchKernel test thread safe (#149497)
This sprinkles a few mutexes around the plugin interface so that the
olLaunchKernel CTS test now passes when ran on multiple threads.

Part of this also involved changing the interface for device synchronise
so that it can optionally not free the underlying queue (which
introduced a race condition in liboffload).
2025-08-08 10:57:04 +01:00
Alex Duran
66d1c37eb6
[OFFLOAD][OPENMP] 6.0 compatible interop interface (#143491)
The following patch introduces a new interop interface implementation
with the following characteristics:

* It supports the new 6.0 prefer_type specification
* It supports both explicit objects (from interop constructs) and
implicit objects (from variant calls).
* Implements a per-thread reuse mechanism for implicit objects to reduce
overheads.
* It provides a plugin interface that allows selecting the supported
interop types, and managing all the backend related interop operations
(init, sync, ...).
* It enables cooperation with the OpenMP runtime to allow progress on
OpenMP synchronizations.
* It cleanups some vendor/fr_id mismatchs from the current query
routines.
* It supports extension to define interop callbacks for library cleanup.
2025-08-06 16:34:39 +02:00
Callum Fare
b78bc35d16
[Offload] Don't check in generated files (#141982)
Previously we decided to check in files that we generate with tablegen.
The justification at the time was that it helped reviewers unfamiliar
with `offload-tblgen` see the actual changes to the headers in PRs.
After trying it for a while, it's ended up causing some headaches and is
also not how tablegen is used elsewhere in LLVM.

This changes our use of tablegen to be more conventional. Where
possible, files are still clang-formatted, but this is no longer a hard
requirement. Because `OffloadErrcodes.inc` is shared with libomptarget
it now gets generated in a more appropriate place.
2025-06-03 10:39:04 -05:00
Ross Brunton
050892d2f8
[Offload] Use new error code handling mechanism and lower-case messages (#139275)
[Offload] Use new error code handling mechanism

This removes the old ErrorCode-less error method and requires
every user to provide a concrete error code. All calls have been
updated.

In addition, for consistency with error messages elsewhere in LLVM, all
messages have been made to start lower case.
2025-05-20 08:50:20 -05:00
Ross Brunton
1532ee6916
[Offload] Add Error Codes to PluginInterface (#138258)
A new ErrorCode enumeration is present in PluginInterface which can
be used when returning an llvm::Error from offload and PluginInterface
functions.

This enum must be kept up to sync with liboffload's ol_errc_t enum, so
both are automatically generated from liboffload's enum definition.

Some error codes have also been shuffled around to allow for future
work. Note that this patch only adds the machinery; actual error codes
will be added in a future patch.

~~Depends on #137339 , please ignore first commit of this MR.~~ This has
been merged.
2025-05-19 09:38:34 -05:00
Joseph Huber
56bf0e7202
[OpenMP] Remove dependency on LLVM include directory from DeviceRTL (#136359)
Summary:
Currently we depend on a single LLVM include directory. This is actually
only required to define one enum, which is highly unlikely to change.
THis patch makes the `Environment.h` include directory more hermetic so
we no long depend on other libraries. In exchange, we get a simpler
dependency list for the price of hard-coding `1` somewhere. I think it's
a valid trade considering that this flag is highly unlikely to change at
this point.

@ronlieb AMD version
https://gist.github.com/jhuber6/3313e6f957be14dc79fe85e5126d2cb3
2025-04-21 15:21:47 -05:00
Alex
021a1f6974
[OFFLOAD] Stricter enforcement of user offload disable (#133470)
If user specifies offload is disabled (e.g.,
OMP_TARGET_OFFLOAD=disable), disable library almost completely. This
reduces resources spent to a minimum and ensures all APIs behave as if
the only available device is the host device.

Currently some of the APIs behave as if there were devices avaible for
offload even when under OMP_TARGET_OFFLOAD=disable.

---------

Co-authored-by: Joseph Huber <huberjn@outlook.com>
2025-03-28 17:28:14 -05:00
Jon Chesterfield
deb0f3c09b
[openmp][nfc] Use builtin align in the devicertl (#131918)
Noticed while extracting the smartstack as a test case
2025-03-18 21:31:49 +00:00
Ethan Luis McDonough
9e5c136d5a
[PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (#93365)
This pull request is the second part of an ongoing effort to extends PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:

- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds `PGODump` debug flag and only displays dump when the
aforementioned flag is set
2025-02-11 23:30:54 -06:00
Christian Clauss
1f56bb3137
[Offload][NFC] Fix typos discovered by codespell (#125119)
https://github.com/codespell-project/codespell

% `codespell
--ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes
--write-changes`
2025-01-31 09:35:29 -06:00