186 Commits

Author SHA1 Message Date
Michael Kruse
c32c1d0d21
[Runtimes] Default build must use its own output dirs (#168266)
Post-commit fix of #164794 reported at
https://github.com/llvm/llvm-project/pull/164794#issuecomment-3536253493

`LLVM_LIBRARY_OUTPUT_INTDIR` and `LLVM_RUNTIME_OUTPUT_INTDIR` are used by
`AddLLVM.cmake` as output directories. Unless we are in a
bootstrapping build, they must not point to directories found by
`find_package(LLVM)`, which may be read-only. MLIR, for
instance, sets these variables to its own build output
directory, and so should the runtimes.
2025-11-19 13:51:14 +01:00
Robert Imschweiler
9a0fd22da1
Revert "[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid()" (#168547)
Reverts llvm/llvm-project#164392 due to Fortran issues
2025-11-18 15:10:42 +00:00
Robert Imschweiler
65c4a534bd
[OpenMP] Implement omp_get_uid_from_device() / omp_get_device_from_uid() (#164392)
Use the implementation in libomptarget. If libomptarget is not
available, always return the UID / device number of the host / the
initial device.
2025-11-18 15:22:49 +01:00
Akash Banerjee
8aa7d823b0
[OpenMP][Flang] Emit default declare mappers implicitly for derived types (#140562)
This patch adds support for emitting default declare mappers for implicit
mapping of derived types when none is supplied by the user. This especially
helps with mapping allocatables of derived types.
2025-11-14 15:59:48 +00:00
Ethan Luis McDonough
38cade7cc6
[PGO][Offload] Fix missing names bug in GPU PGO (#166444)
After #163011 was merged, the tests in
[`offload/test/offloading/gpupgo`](https://github.com/llvm/llvm-project/compare/main...EthanLuisMcDonough:llvm-project:gpupgo-names-fix-pr?expand=1#diff-f769f6cebd25fa527bd1c1150cc64eb585c41cb8a8b325c2bc80c690e47506a1)
broke because the offload plugins were no longer able to find
`__llvm_prf_nm`. This pull request explicitly makes `__llvm_prf_nm`
visible to the host on GPU targets and reverses the changes made in
f7e9968a5ba99521e6e51161f789f0cc1745193f.
2025-11-10 10:11:53 -06:00
Joseph Huber
aaddd8d38a
[OpenMP] Fix tests relying on the heap size variable
Summary:
I made that an unimplemented error, but forgot that it was used for this
environment variable.
2025-11-06 13:00:26 -06:00
Joseph Huber
670c453aeb
[Offload] Remove handling for device memory pool (#163629)
Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.
2025-11-06 10:15:18 -06:00
agozillon
09318c6bff
[MLIR][OpenMP] Fix and simplify bounds offset calculation for 1-D GEP offsets (#165486)
Currently this is being calculated incorrectly and will result in
incorrect index offsets in more complicated array slices. This PR tries
to address it by refactoring and changing the calculation to be more
correct.
2025-10-31 00:54:31 +01:00
Nicole Aschenbrenner
16641ad8a2
[OpenMP] Adds omp_target_is_accessible routine (#138294)
Adds omp_target_is_accessible routine.
Refactors common code from omp_target_is_present to work for both
routines.

---------

Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-10-22 17:35:16 +02:00
Kaloyan Ignatov
1f7ddb61b3
[NFC][Offload][OMPT] Improve readability of liboffload OMPT tests (#163181)
- ompt_target_data_op_t, ompt_scope_endpoint_t and ompt_target_t are now
printed as strings instead of just numbers to ease debugging
- some missing clang-format clauses have been added
2025-10-22 10:48:39 +02:00
Abhinav Gaba
829804724b
[NFC][OpenMP] Update a test that was failing on aarch64. (#164456)
The failure was reported here:

https://github.com/llvm/llvm-project/pull/164039#issuecomment-3425429556

The test was checking for the "bad" behavior so as to keep track of it, but there seem to be some issues with the pointer arithmetic specific to aarch64.

The update for now is to not check for the "bad" behavior fully.

We may need to debug further if similar issues are encountered eventually once the codegen has been fixed.
2025-10-21 21:15:52 -07:00
Abhinav Gaba
f37b4459f0
[NFC][OpenMP] Add small class-member use_device_ptr/addr unit tests. (#164039)
Two of the tests are currently asserting, and two are emitting
unexpected results.

The asserting tests will be fixed using the ATTACH-style codegen from
#153683.

The other two involve `use_device_addr` on byrefs, and need more
follow-up codegen changes, that have been noted in a FIXME comment.
2025-10-20 13:14:33 -07:00
Jan Patrick Lehr
f7e9968a5b
[Offload] XFAIL pgo tests until resolved (#163722)
While people look into it, xfail the tests.
2025-10-16 11:43:55 +02:00
Joseph Huber
914fbe367e
[OpenMP] Disable a few more tests to get the bot green (#163614)
2025-10-15 14:14:15 -05:00
Jan Patrick Lehr
4b84e0f3f0
[OpenMP] Add test to print interop identifiers (#161434)
The test covers some of the identifier symbols in the interop runtime.

This test, for now, is to guard against complete breakage, which was the
result of the other `interop.c` test not being enabled on AMD and thus,
not caught by our buildbots.
2025-10-15 20:38:33 +02:00
agozillon
9155b318f2
[Flang][OpenMP] Defer descriptor mapping for assumed dummy argument types (#154349)
This PR adds deferral of descriptor maps until they are necessary for
assumed dummy argument types. The intent is to avoid a problem where a
user can inadvertently map a temporary local descriptor to device
without their knowledge and proceed to never unmap it. This temporary
local descriptor remains lodged in OpenMP device memory and the next
time another variable or descriptor residing in the same stack address
is mapped we incur a runtime OpenMP map error as we try to remap the
same address.

This fix was discussed with the OpenMP committee and applies to OpenMP
5.2 and below; future versions of OpenMP can avoid this issue via the
attach semantics added to the specification.
2025-10-09 17:52:41 +02:00
Akash Banerjee
ed12dc5e30
[Flang][OpenMP] Implicitly map nested allocatable components in derived types (#160766)
This PR adds support for nested derived types and their mappers to the
MapInfoFinalization pass.

- Generalize MapInfoFinalization to add child maps for arbitrarily
nested allocatables when a derived object is mapped via declare mapper.
- Traverse HLFIR designates rooted at the target block arg and build
full coordinate_of chains; append members with correct membersIndex.

This fixes #156461.
2025-10-02 16:15:16 +00:00
Joseph Huber
0fcce4fb7b
[OpenMP] Mark problematic tests as XFAIL / UNSUPPORTED (#161267)
Summary:
Several of these tests have been failing for literal years. Ideally we
make efforts to fix this, but keeping these broken has had serious
consequences on our testing infrastructure where failures are the norm
so almost all test failures are disregarded. I made a tracking issue for
the ones that have been disabled.

https://github.com/llvm/llvm-project/issues/161265
2025-09-29 15:17:55 -05:00
Dominik Adamski
e4d94f4f7f
[OpenMP][Flang] Fix no-loop test (#161162)
Fortran no-loop test is supported only for GPU.
2025-09-29 16:01:52 +02:00
Dominik Adamski
83ef38a274
[Flang][OpenMP] Enable no-loop kernels (#155818)
Enable the generation of no-loop kernels for Fortran OpenMP code. target
teams distribute parallel do pragmas can be promoted to no-loop kernels
if the user adds the -fopenmp-assume-teams-oversubscription and
-fopenmp-assume-threads-oversubscription flags.

If the OpenMP kernel contains reduction or num_teams clauses, it is not
promoted to no-loop mode.

The global OpenMP device RTL oversubscription flags no longer force
no-loop code generation for Fortran.
2025-09-26 13:57:51 +02:00
Akash Banerjee
3e7e60ae5c
Revert "[Flang][OpenMP] Implicitly map nested allocatable components in derived types" (#160759)
Reverts llvm/llvm-project#160116
2025-09-25 19:53:58 +01:00
Akash Banerjee
b4f1e0e5b1
[Flang][OpenMP] Implicitly map nested allocatable components in derived types (#160116)
This PR adds support for nested derived types and their mappers to the
MapInfoFinalization pass.

- Generalize MapInfoFinalization to add child maps for arbitrarily
nested allocatables when a derived object is mapped via declare mapper.
- Traverse HLFIR designates rooted at the target block arg and build
full coordinate_of chains; append members with correct membersIndex.
  
This fixes #156461.
2025-09-24 14:30:27 +01:00
Akash Banerjee
8afea0d0ea
[OpenMP][MLIR] Preserve to/from flags in mapper base entry for mappers (#159799)
With declare mapper, the parent base entry was emitted as `TARGET_PARAM`
only. The mapper received a map-type without `to/from`, causing
components to degrade to `alloc`-only (no copies), breaking allocatable
payload mapping. This PR preserves the map-type bits from the parent.

This fixes #156466.
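
The effect described above can be illustrated with a small bit-flag sketch in C. The flag values are assumptions chosen to resemble libomptarget's `OMP_TGT_MAPTYPE_*` constants, and the function names are hypothetical; this is not the code from the patch.

```c
#include <stdint.h>

/* Illustrative flag values (assumed, not taken from the patch). */
enum {
  MAP_TO           = 0x001,
  MAP_FROM         = 0x002,
  MAP_TARGET_PARAM = 0x020
};

/* Behavior described as the bug: the base entry carries TARGET_PARAM only,
   so the parent's copy bits are dropped. */
uint64_t base_entry_without_copy_bits(uint64_t parent_flags) {
  (void)parent_flags;
  return MAP_TARGET_PARAM;
}

/* Behavior described as the fix: keep the parent's to/from bits. */
uint64_t base_entry_preserving_copy_bits(uint64_t parent_flags) {
  return MAP_TARGET_PARAM | (parent_flags & (MAP_TO | MAP_FROM));
}

/* An entry is alloc-only when neither copy bit is set. */
int is_alloc_only(uint64_t flags) {
  return (flags & (MAP_TO | MAP_FROM)) == 0;
}
```

With a `tofrom` parent, the first helper yields an alloc-only entry while the second keeps both copy directions, which is the difference the commit message attributes to the fix.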
2025-09-19 19:34:09 +01:00
Kareem Ergawy
c286a427b9
[NFC][flang][do concurent] Add saxpy offload tests for OpenMP mapping (#155993)
Adds end-to-end tests for `do concurrent` offloading to the device.


PR stack:
- https://github.com/llvm/llvm-project/pull/155754
- https://github.com/llvm/llvm-project/pull/155987
- https://github.com/llvm/llvm-project/pull/155992
- https://github.com/llvm/llvm-project/pull/155993 ◀️
- https://github.com/llvm/llvm-project/pull/157638
- https://github.com/llvm/llvm-project/pull/156610
- https://github.com/llvm/llvm-project/pull/156837
2025-09-17 07:04:13 +02:00
Jan Patrick Lehr
311d78f2a1
[OpenMP] Fix force-usm test after #157182 (#159095)
The refactoring led to an additional data transfer. This changes the
assumed transfers in the check-strings to work with that changed
behavior.
2025-09-16 15:42:02 +02:00
Michał Górny
312b5615df
[offload] Fix finding libomptarget in runtimes build (#157856)
Per the logic in top-level CMakeLists, `libomptarget` is placed into
`LLVM_LIBRARY_OUTPUT_INTDIR` when this variable is set. Adjust the test
logic to include this directory in `-L` and `-Wl,-rpath` arguments as
well, in order to fix finding tests when building via the `runtimes`
top-level directory.

Signed-off-by: Michał Górny <mgorny@gentoo.org>
2025-09-10 16:31:22 +02:00
agozillon
8f16af3c20
[Flang][OpenMP] Fix mapping of character type with LEN > 1 specified (#154172)
Currently, there are a number of issues with mapping characters with LENs
specified (strings, effectively). They're represented as a char type in
FIR with a len parameter, and then later on they're expanded into an
array of characters when we're translating to the LLVM dialect. However,
we don't generate a bounds for these at lowering. The fix in this PR for
this is to generate a bounds from the LEN parameter and attach it to
the map on lowering from FIR to the LLVM dialect when we encounter this
type.
2025-09-09 16:36:04 +02:00
Joseph Huber
6d032c4df2
[OpenMP] Fix incorrect CUDA bc path after library change (#157547)
2025-09-08 17:27:59 -05:00
Julian Brown
c71da7d5e0
[OpenMP] Add tests for mapping of chained 'containing' structs (#156703)
This PR adds several new tests for mapping of chained structures, i.e.
those resembling:

  #pragma omp target map(tofrom: a->b->c)

These are currently XFAILed, although the first two tests actually work
with unified memory -- I'm not sure if it's possible to easily improve
the condition on the XFAILs in question to make them more accurate.

These cases are all fixed by the WIP PR
https://github.com/llvm/llvm-project/pull/153683.
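
A minimal C sketch of the kind of chained mapping these tests exercise is given here. The struct and function names are hypothetical and are not taken from the tests. When compiled without OpenMP, or when the region falls back to the host, the pragma has no mapping effect and the function simply increments the field; per the message above, the device case is what is currently XFAILed.

```c
/* Hypothetical types forming the chain a -> b -> c. */
struct Inner { int c; };
struct Outer { struct Inner *b; };

/* Increment the innermost field inside a target region that maps it
   through the pointer chain. Returns the updated value. */
int bump_chain(struct Outer *a) {
  #pragma omp target map(tofrom: a->b->c)
  {
    a->b->c += 1;
  }
  return a->b->c;
}
```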
2025-09-08 10:30:04 +01:00
Michał Górny
7a88ddd3b1
Revert "[Offload] Run unit tests as a part of check-offload" (#157346)
Reverts llvm/llvm-project#156675 due to regressions in standalone build
and test errors without all plugins enabled (#157345).
2025-09-07 15:12:15 +00:00
Jan Patrick Lehr
209d91d9e4
[Offload] Fix CHECK string in llvm-omp-device-info test (#156872)
2025-09-04 14:30:37 +02:00
Joseph Huber
99f61f3436
[Offload] Run unit tests as a part of check-offload (#156675)
Summary:
Add a dependency on the unit tests to the main check-offload test suite.
This matches what the other projects do: pass `llvm-lit` the
directory to run only the lit tests, and use `check-offload-unit` for
only the unit tests.
2025-09-03 10:26:44 -05:00
Jan Patrick Lehr
27e541645c
[Offload][OpenMP] Enable more tests on AMDGPU (#156626)
(Re)enables a couple of tests that were disabled on AMDGPU for some
reason. Pass for me locally.
2025-09-03 14:04:39 +02:00
Ross Brunton
70ddd838f0
[Offload] Update tablegen tests (#156041)
These were not updated after #154736.
2025-08-29 16:20:49 +01:00
Jan Patrick Lehr
bcb9634be8
[Offload][OpenMP] Tests require libc on GPU for printf (#155785)
These tests currently fail when libc is not configured to be built as
they require printf to be available in target regions.
2025-08-28 14:30:18 +02:00
Abhinav Gaba
bb1cb6a198
[NFC][OpenMP] Add several use_device_ptr/addr tests. (#154939)
Most tests are either compfailing or runfailing.

They should start passing once we start using ATTACH map-type based
codegen. (#153683)

Even after they start passing, there are a few places where the EXPECTED
and actual CHECKs are different, due to two main issues:
* use_device_ptr translation on `&p[0]` is not succeeding in looking-up
a previously mapped `&p[1]`
* privatization of byref use_device_addr operands is not happening
correctly.

The above should be fixed as separate standalone changes.
2025-08-25 14:23:26 -07:00
Akash Banerjee
6aafe6582d
Fix test added in 1fd1d634630754cc9b9c4b5526961d5856f64ff9
2025-08-18 13:29:23 +01:00
Akash Banerjee
1fd1d63463
[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048)
Add a new AutomapToTargetData pass. This gathers the declare target
enter variables which have the AUTOMAP modifier and adds
omp.declare_target_enter/exit mapping directives for fir.alloca and
fir.free operations on the AUTOMAP enabled variables.

Automap Ref: OpenMP 6.0 section 7.9.7.
2025-08-15 15:41:41 +01:00
Abhinav Gaba
2912c9c249
[NFC][Offload] Add missing maps to OpenMP offloading tests. (#153103)
A few tests were only mapping a pointee, like: `map(pp[0][0])`, on an
`int** pp`, but expecting the pointers, like `pp`, `pp[0]` to also be
mapped, which is incorrect.

This change fixes six such tests.
2025-08-14 12:22:28 -07:00
Akash Banerjee
1c7720ef78
Revert "[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048)"
This reverts commit 4e6d510eb3ec5b5e5ea234756ea1f0b283feee4a.
2025-08-12 20:19:45 +01:00
Akash Banerjee
4e6d510eb3
[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048)
Add a new AutomapToTargetData pass. This gathers the declare target
enter variables which have the AUTOMAP modifier and adds
omp.declare_target_enter/exit mapping directives for fir.alloca and
fir.free operations on the AUTOMAP enabled variables.

Automap Ref: OpenMP 6.0 section 7.9.7.
2025-08-12 15:18:15 +01:00
Amit Tiwari
2074e1320f
[Clang][OpenMP] Non-contiguous strided update (#144635)
This patch handles the strided update in the `#pragma omp target update
from(data[a:b:c])` directive where 'c' represents the strided access
leading to non-contiguous update in the `data` array when the offloaded
execution returns the control back to host from device using the `from`
clause.

Issue: where Clang CodeGen generates the info for a particular
`MapType` (to, from, etc.), it failed to detect the strided access.
Because of this, the `MapType` bits were incorrect when passed to the
runtime. This led to incorrect (contiguous) execution in the
libomptarget runtime code.

Added a minimal testcase that verifies the working of the patch.
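
To make the strided semantics concrete, the following host-only C sketch models which elements a section of the form `data[start:length:stride]` names and what a `from` update over that section copies. Both function names are hypothetical helpers for illustration; they are not part of Clang or libomptarget.

```c
#include <stddef.h>

/* Fill `out` with the indices named by data[start:length:stride] and
   return how many were written (always `length`). */
size_t section_indices(size_t start, size_t length, size_t stride,
                       size_t *out) {
  for (size_t i = 0; i < length; ++i)
    out[i] = start + i * stride;
  return length;
}

/* Host-side model of `target update from(data[start:length:stride])`:
   only the strided elements are copied from the device copy, so the
   elements in between keep their host values. */
void strided_update_from(int *host, const int *device, size_t start,
                         size_t length, size_t stride) {
  for (size_t i = 0; i < length; ++i) {
    size_t idx = start + i * stride;
    host[idx] = device[idx];
  }
}
```

For example, a section with start 0, length 4 and stride 2 touches indices 0, 2, 4 and 6. A contiguous update over the same range would also overwrite indices 1, 3 and 5, which is the incorrect behavior the message describes.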
2025-08-12 19:32:15 +05:30
Akash Banerjee
0998da27e9
Revert "[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#151989)"
This reverts commit 5a5e8ba0c388d57aecb359ed67919cda429fc7b1.
2025-08-11 13:52:39 +01:00
Akash Banerjee
5a5e8ba0c3
[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#151989)
Add a new `AutomapToTargetData` pass. This gathers the declare target
enter variables which have the `AUTOMAP` modifier and adds
`omp.declare_target_enter/exit` mapping directives for `fir.allocmem`
and `fir.freemem` operations on the `AUTOMAP` enabled variables.

Automap Ref: OpenMP 6.0 section 7.9.7.
2025-08-11 13:18:38 +01:00
hidekisaito
83e5a99ff6
[AMDGPU][Offload] Enable memory manager use for up to ~3GB allocation size in omp_target_alloc (#151882)
Enables AMD data center class GPUs to use memory manager memory pooling
up to 3GB allocation by default, up from the "1 << 13" threshold that
all plugin-nextgen devices use.
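
For a sense of scale, the two thresholds can be compared in a short C sketch. The `1 << 13` value is quoted from the message above and is presumably in bytes; the 3 GiB constant is an assumption standing in for "~3GB", since the exact value is not given here, and `uses_memory_pool` is a hypothetical predicate rather than the plugin's actual check.

```c
#include <stdint.h>

/* Default threshold quoted in the commit message: 1 << 13 (8192). */
#define DEFAULT_POOL_THRESHOLD ((uint64_t)1 << 13)

/* Assumed stand-in for "~3GB"; the real constant may differ. */
#define ASSUMED_LARGE_POOL_THRESHOLD ((uint64_t)3 << 30)

/* Hypothetical predicate: requests no larger than the threshold are
   served from the memory manager's pool, larger ones are not. */
int uses_memory_pool(uint64_t size, uint64_t threshold) {
  return size <= threshold;
}
```

Under these assumptions, a 1 MiB request bypasses the pool with the default threshold but is pooled with the larger one.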
2025-08-06 14:41:20 -07:00
Aiden Grossman
2e3fd547de
[Offload] Fix typo in shared_lib_fp_mapping.c
Made a typo in 963259ef6be4871e5252ff3ac9df737af5d2b4cb because I cannot
run tests and also did not review it. This should fix it...
2025-07-25 23:17:46 +00:00
Aiden Grossman
963259ef6b
[Offload] Remove uses of %T from lit tests (#150721)
This patch removes all the instances of %T from offload/ (only one test
contained this construction). %T has been deprecated for ~7 years and is
not recommended as it does not use a unique directory per test. Switch
to using %t to ensure we use a unique dir per test and so that we can
eventually remove %T.

I did not actually test this. A couple feeble attempts at
building/running the offload tests just leaves me with a ton of test
failures. Given how small this is I'm reasonably sure it works though.
2025-07-25 16:16:22 -07:00
agozillon
73272d6fc6
[Flang][OpenMP] Appropriately emit present/load/store in all cases in MapInfoFinalization (#150311)
Currently, we return early whenever we've already generated an
allocation for intermediate descriptor variables (required in certain
cases when we can't directly access the base address of a passed-in
descriptor function argument due to HLFIR/FIR restrictions). This,
unfortunately, skips over the presence check and load/store required to
set the intermediate descriptor allocation's values/data. This is fine in
most cases, but if a function happens to have a series of branches with
separate target regions capturing the same input argument, we'd emit the
present/load/store into the first branch with the first target inside of
it, and the secondary (or any following) branches would not have the
present/load/store. This would lead to the subsequent mapped values in
that branch being empty and then to a memory access violation on
device.

The fix for the moment is to emit a present/load/store at the relevant
location of every target utilising the input argument. This will likely
also fix possible issues with the input argument being
manipulated in between target regions (primarily resizing; the data
should remain the same as we're just copying an address around, in
theory at least). There are possible optimizations/simplifications to emit
less load/stores such as by raising the load/store out of the branches
when we can, but I'm inclined to leave this sort of optimization to
lower level passes such as an LLVM pass (which very possibly already
covers it).
2025-07-25 16:15:54 +02:00
hidekisaito
75e60e745b
[AMDGPU][Offload][LIT] Run unified_shared_memory tests on gfx950 (#150372)
Enables 9 more tests
2025-07-23 22:46:26 -07:00
Ye Luo
9f6784cc1f
[libomptarget] fix test offloading/disable_default_device.c
Fixes the incorrect lit command line introduced in 536ba87726d8dea862d964678dbb761ca32e21fb
2025-07-09 09:52:00 -05:00