llvm-project

Author	SHA1	Message	Date
Jan Patrick Lehr	4a8a0593bd	[Offload] Fix failing Fortran test w/ line number (#175247 ) This test also depends on the line number. Following similar approach as other with [[@LINE]] macro.	2026-01-09 21:12:28 +00:00
Joseph Huber	c722ef4874	[OpenMP] Remove testing LTO variant on CPU targets (#175187 ) Summary: This is only really meaningful for the NVPTX target. Not all build environments support host LTO and these are redundant tests, just clean this up and make it run faster.	2026-01-09 10:13:44 -06:00
Kareem Ergawy	e82399dac2	[flang][OpenMP] Prevent `omp.map.info` ops with user-defined mappers from being marked as parial maps (#175133 ) The following test was triggering a runtime crash on the host before launching the kernel: ```fortran program test_omp_target_map_bug_v5 implicit none type nested_type real, allocatable :: alloc_field(:) end type nested_type type nesting_type integer :: int_field type(nested_type) :: derived_field end type nesting_type type(nesting_type) :: config allocate(config%derived_field%alloc_field(1)) !$OMP TARGET ENTER DATA MAP(TO:config, config%derived_field%alloc_field) !$OMP TARGET config%derived_field%alloc_field(1) = 1.0 !$OMP END TARGET deallocate(config%derived_field%alloc_field) end program test_omp_target_map_bug_v5 ``` In particular, the runtime was producing a segmentation fault when the test is compiled with any optimization level > 0; if you compile with -O0 the sample ran fine. After debugging the runtime, it turned out the crash was happening at the point where the runtime calls the default mapper emitted by the compiler for `nesting_type; in particular at this point in the runtime: `c62cd2877c/offload/libomptarget/omptarget.cpp (L307)`. Bisecting the optimization pipeline using `-mllvm -opt-bisect-limit=N`, the first pass that triggered the issue on `O1` was the `instcombine` pass. Debugging this further, the issue narrows down to canonicalizing `getelementptr` instructions from using struct types (in this case the `nesting_type` in the sample above) to using addressing bytes (`i8`). In particular, in `O0`, you would see something like this: ```llvm define internal void @.omp_mapper._QQFnesting_type_omp_default_mapper(ptr noundef %0, ptr noundef %1, ptr noundef %2, i64 noundef %3, i64 noundef %4, ptr noundef %5) #6 { entry: %6 = udiv exact i64 %3, 56 %7 = getelementptr %_QFTnesting_type, ptr %2, i64 %6 .... } ``` ```llvm define internal void @.omp_mapper._QQFnesting_type_omp_default_mapper(ptr noundef %0, ptr noundef %1, ptr noundef %2, i64 noundef %3, i64 noundef %4, ptr noundef %5) #6 { entry: %6 = getelementptr i8, ptr %2, i64 %3 .... } ``` The `udiv exact` instruction emitted by the OMP IR Builder (see: `c62cd2877c/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (L9154)`) allows `instcombine` to assume that `%3` is divisible by the struct size (here `56`) and, therefore, replaces the result of the division with direct GEP on `i8` rather than the struct type. However, the runtime was calling `@.omp_mapper._QQFnesting_type_omp_default_mapper` not with `56` (the proper struct size) but with `48`! Debugging this further, I found that the size of `omp.map.info` operation to which the default mapper is attached computes the value of `48` because we set the map to partial (see: `c62cd2877c/flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp (L1146)` and `c62cd2877c/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (L4501-L4512)`). However, I think this is incorrect since the emitted mapper (and user-defined mappers in general) are defined on the whole struct type and should never be marked as partial. Hence, the fix in this PR.	2026-01-09 15:15:10 +01:00
Jan Patrick Lehr	13fb3f3b19	[Offload] Fix line numbers after #174804 (#174932 ) The changes in line numbers caused a few CHECK macros to now fail. This is fixed by this PR. Build w/ breakages: https://lab.llvm.org/staging/#/builders/105/builds/39748	2026-01-08 12:08:45 +01:00
Nick Sarnie	26b777444b	[offload][lit] XFAIL all failing tests on the Level Zero plugin (#174804 ) We finally got our buildbot added (to staging, at least) so we want to start running L0 tests in CI. We need `check-offload` to pass though, so XFAIL everything failing. There's a couple `UNSUPPORTED` as well, those are for sporadic fails. Also make set the `gpu` and `intelgpu` LIT variables when testing the `spirv64-intel` triple. We have no DeviceRTL yet so basically everything fails, but we manage to get ``` Total Discovered Tests: 432 Unsupported : 169 (39.12%) Passed : 67 (15.51%) Expectedly Failed: 196 (45.37%) ``` We still don't build the level zero plugin by default and these tests don't run unless the plugin was built, so this has no effect on most builds. --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2026-01-07 19:20:30 +00:00
Nick Sarnie	08a43f854c	[offload][lit] Use '%not' instead of 'not' in requires.c (#174506 ) Typo exposed by recent `not` behavior change, we need to make sure we're using the LLVM one. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2026-01-05 23:31:15 +00:00
Nick Sarnie	13e3527412	[offload][lit] Fix requires.c after 'not' behavior change (#174499 ) `not` behavior change in https://github.com/llvm/llvm-project/pull/174298 requires `--crash` passed now. Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2026-01-05 23:14:09 +00:00
Michał Górny	05349a9b43	[offload] [test] Mark bug 51781 test as requiring GPU (#174284 ) While the main problem with the test is that it requires LLD, given that it is unlikely to be testing anything meaningful for a CPU-only build, just mark it as requiring GPU. Fixes #100780 Signed-off-by: Michał Górny <mgorny@gentoo.org>	2026-01-03 19:25:51 +01:00
Amit Tiwari	230b437d05	[Clang][OpenMP] Handle check for non-contiguous mapping in pointer-based array sections (#157443 ) ### 1. ElementType deduction for pointer-based array sections Problem: Pointer-based array sections were previously ignored during `ElementType` deduction, leading to incorrect assumptions about array item types. This often resulted in out-of-bounds access, as seen in the assertion failure: ``` Assertion `idx < size()' failed. llvm-project/llvm/include/llvm/ADT/SmallVector.h:292: reference llvm::SmallVectorTemplateCommon<llvm::Value >::operatorsize_type [T = llvm::Value ] ``` Fix: Added a check in clang/lib/CodeGen/CGOpenMPRuntime.cpp to ensure `ElementType` is correctly detected for cases involving non-contiguous updates with a base pointer. Impact: Resolves failures in OpenMP_VV (formerly sollve_vv) and other offload/clang-OpenMP tests: All tests under: https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/tree/master/tests/5.0/target_update test_target_update_mapper_from_discontiguous.c test_target_update_mapper_to_discontiguous.c test_target_update_to_discontiguous.c test_target_update_from_discontiguous.c ### 2. Zero-dimension propagation in struct member mappings Problem: A zero-dimension entry for struct members introduced inconsistencies in complex mapping logic within OMPIRBuilder.cpp. Placeholder zeros propagated to emitNonContiguousDescriptor(), breaking reverse indexing logic and corrupting IR: Loops assume `Dims[I] >= 1`. When `Dims[I] == 0`: Reverse indexing still stores pointers to uninitialized allocas or mismatched slots. Runtime interprets `ArgSizes[I]` (derived from `Dims[I])` as dimensionality, causing size/offset calculations to collapse to zero → results in `size=0` async copy and plugin interface errors. Fix: Prepend a synthetic dimension of size 1 instead of appending a zero, preserving correctness in `targetDataUpdate()` for non-contiguous updates. Impact: Added dedicated test cases that previously failed on main.	2025-12-23 12:57:12 +05:30
Hansang Bae	c3a5ec0360	[NFC][Offload] Missing test change in #153683 (#172587 )	2025-12-17 09:03:54 -06:00
Abhinav Gaba	1fbf33cd40	[OpenMP][Clang] Use `ATTACH` map-type for list-items with base-pointers. (#153683 ) This adds support for using `ATTACH` map-type for proper pointer-attachment when mapping list-items that have base-pointers. For example, for the following: ```c int p; #pragma omp target enter data map(p[1:10]) ``` The following maps are now emitted by clang: ``` (A) &p[0], &p[1], 10 sizeof(p[1]), TO \| FROM &p, &p[1], sizeof(p), ATTACH ``` Previously, the two possible maps emitted by clang were: ``` (B) &p[0], &p[1], 10 * sizeof(p[1]), TO \| FROM (C) &p, &p[1], 10 * sizeof(p[1]), TO \| FROM \| PTR_AND_OBJ ```` (B) does not perform any pointer attachment, while (C) also maps the pointer p, both of which are incorrect. ----- With this change, we are using ATTACH-style maps, like `(A)`, for cases where the expression has a base-pointer. For example: ```cpp int p, pp; S ps, *pps; ... map(p[0]) ... map(p[10:20]) ... map(p) ... map(([20])p) ... map(ps->a) ... map(pps->p->a) ... map(pp[0][0]) ... map((pp + 10)[0]) ``` #### Grouping of maps based on attach base-pointers We also group mapping of clauses with the same base decl in the order of the increasing complexity of their base-pointers, e.g. for something like: ``` S spp; map(spp[0][0], spp[0][0].a), // attach-ptr: spp[0] map(spp[0]), // attach-ptr: spp map(spp), // attach-ptr: N/A ``` We first map `spp`, then `spp[0]` then `spp[0][0]` and `spp[0][0].a`. This allows us to also group "struct" allocation based on their attach pointers. This resolves the issues of us always mapping everything from the beginning of the symbol `spp`. Each group is mapped independently, and at the same level, like `spp[0][0]` and its member `spp[0][0].a`, we still get map them together as part of the same contiguous struct `spp[0][0]`. This resolves issue #141042. #### use_device_ptr/addr fixes The handling of `use_device_ptr/addr` was updated to use the attach-ptr information, and works for many cases that were failing before. It has to be done as part of this series because otherwise, the switch from ptr_to_obj to attach-style mapping would have caused regressions in existing use_device_ptr/addr tests. #### Handling of attach-pointers that are members of implicitly mapped structs: When a struct member-pointer, like `p` below, is a base-pointer in a `map` clause on a target construct (like `map(p[0:1])`, and the base of that struct is either the `this` pointer (implicitly or explicitly), or a struct that is implicitly mapped on that construct, we add an implicit `map(p)` so that we don't implicitly map the full struct. ```c struct S { int p; void f1() { #pragma omp target map(p[0:1]) // Implicitly map this->p, to ensure // that the implicit map of `this[:]` does // not map the full struct printf("%p %p\n", &p, p); } ``` #### Scope for improvement: We may be able to compute attach-ptr expr while collecting component-lists in Sema. * But we cache the computation results already, and `findAttachPtrExpr` is fairly simple, and fast. * There may be a better way to implement semantic expr comparison. #### Needs future work: * Attach-style maps not yet emitted for declare mappers. * Mapping of class member references: We are still using PTR_AND_OBJ maps for them. We will likely need to change that to handle `ref_ptr/ref_ptee`, and `attach` map-type-modifier on them. * Implicit capturing of "this" needs to map the full `this[0:1]` unless there is an explicit map on one of the members, or a map with a member as its base-pointer. * Implicit map added for capturing a class member pointer needs to also add a zero-length-array-section map. * `use_device_addr` on array-sections-on-pointers need further improvements (documented using FIXMEs) #### Why a large PR While it's unfortunate that this PR has gotten large and difficult to review, the issue is that all the functional changes have to be made together, to prevent regressions from partially implemented changes. For example, the changes to capturing were previously done separately (#145454), but they would still cause stability issues in absence of full attach-mapping. And attach-mapping needs those changes to be able to launch kernels. We extracted the utilities and functions, like those for finding attach-ptrs, or comparing exprs, out as a separate NFC PR that doesn't call those functions, just adds them (#155625). Maybe the change that adds a new error message for use_device_addr on array-sections with non-var base-pointers could have been extracted out too (but that would have had to be a follow-up change in that case, and we would get comp-fails with this PR when the erroneous case was not caught/diagnosed). --------- Co-authored-by: Alex Duran <alejandro.duran@intel.com>	2025-12-15 16:40:31 -08:00
Abhinav Gaba	f44740afff	[NFC][Offload] Fix minor debug print issues introduced in #170425 . (#172377 )	2025-12-15 15:10:24 -08:00
Kevin Sala Penades	35315a84b4	[offload] Fix CUDA args size by subtracting tail padding (#172249 ) This commit makes the cuLaunchKernel call to pass the total arguments size without tail padding.	2025-12-14 21:57:25 -08:00
Akash Banerjee	b360a782ca	Reland "[Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331 )" (#170851 ) Add support for OpenMP is_device_ptr clause for target directives. [MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367 This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.	2025-12-05 17:38:41 +00:00
theRonShark	be79a0d90f	Revert "[Flang][OpenMP] Add lowering support for is_device_ptr clause" (#170778 ) Reverts llvm/llvm-project#169331	2025-12-04 19:38:16 -05:00
Akash Banerjee	a77c4948a5	[Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331 ) Add support for OpenMP is_device_ptr clause for target directives. [MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367 This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.	2025-12-04 15:57:24 +00:00
Nick Sarnie	b3b83ac1e8	[offload][lit] Fix compilation of two offload tests (#169399 ) These are C tests, not C++, so no function parameters means unspecified number of parameters, not `void`. These compile fine on the current tested offload targets because an error is only [thrown](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDecl.cpp#L10695) if the calling convention doesn't support variadic arguments, which they happen to. When compiling this test for other targets that do not support variadic arguments, we get an error, which does not seem intentional. Just add `void` to the parameter list. --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2025-11-25 15:16:15 +00:00
Jan Leyonberg	3e86f05621	[OpenMP][flang] Lowering of OpenMP custom reductions to MLIR (#168417 ) This patch add support for lowering of custom reductions to MLIR. It also enhances the capability of the pass to automatically mark functions as "declare target" by traversing custom reduction initializers and combiners.	2025-11-24 16:00:46 -05:00
agozillon	173600880b	[Flang][OpenMP][MLIR] Initial declare target to for variables implementation (#119589 ) While the infrastructure for declare target to/enter and link for variables exists in the MLIR dialect and at the Flang level, the current lowering from MLIR -> LLVM IR isn't in place, it's only in place for variables that have the link clause applied. This PR aims to extend that lowering to an initial implementation that incorporates declare target to as well, which primarily requires changes in the OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP dialect was required to extend the declare target enumerator to include a default None field as well. This also requires a minor change to the Flang lowering's MapInfoFinlization.cpp pass to alter the map type for descriptors to deal with cases where a variable is marked declare to. Currently, when a descriptor variable is mapped declare target to the descriptor component can become attatched, and cannot be updated, this results in issues when an unusual allocation range is specified (effectively an off-by X error). The current solution is to map the descriptor always, as we always require an up-to-date version of this data. However, this also requires an interlinked PR that adds a more intricate type of mapping of structures/record types that clang currently implements, to circumvent the overwriting of the pointer in the descriptor. 3/3 required PRs to enable declare target to mapping, this PR should pass all tests and provide an all green CI. Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com	2025-11-24 21:22:49 +01:00
agozillon	20929abb85	[MLIR][OpenMP] Introduce overlapped record type map support (#119588 ) This PR introduces a new additional type of map lowering for record types that Clang currently supports, in which a user can map a top-level record type and then individual members with different mapping, effectively creating a sort of "overlapping" mapping that we attempt to cut around. This is currently most predominantly used in Fortran, when mapping descriptors and there data, we map the descriptor and its data with separate map modifiers and "cut around" the pointer data, so that wedo not overwrite it unless the runtime deems it a neccesary action based on its reference counting mechanism. However, it is a mechanism that will come in handy/trigger when a user explitily maps a record type (derived type or structure) and then explicitly maps a member with a different map type. These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file and phase, however, one Flang test that checks end-to-end IR compilation (as far as we care for now at least) was altered. 2/3 required PRs to enable declare target to mapping, should look at PR 3/3 to check for full green passes (this one will fail a number due to some dependencies). Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com	2025-11-24 21:20:29 +01:00
Akash Banerjee	8aa7d823b0	[OpenMP][Flang] Emit default declare mappers implicitly for derived types (#140562 ) This patch adds support to emit default declare mappers for implicit mapping of derived types when not supplied by user. This especially helps tackle mapping of allocatables of derived types.	2025-11-14 15:59:48 +00:00
Ethan Luis McDonough	38cade7cc6	[PGO][Offload] Fix missing names bug in GPU PGO (#166444 ) After #163011 was merged, the tests in [`offload/test/offloading/gpupgo`](https://github.com/llvm/llvm-project/compare/main...EthanLuisMcDonough:llvm-project:gpupgo-names-fix-pr?expand=1#diff-f769f6cebd25fa527bd1c1150cc64eb585c41cb8a8b325c2bc80c690e47506a1) broke because the offload plugins were no longer able to find `__llvm_prf_nm`. This pull request explicitly makes `__llvm_prf_nm` visible to the host on GPU targets and reverses the changes made in f7e9968a5ba99521e6e51161f789f0cc1745193f.	2025-11-10 10:11:53 -06:00
Joseph Huber	aaddd8d38a	[OpenMP] Fix tests relying on the heap size variable Summary: I made that an unimplemented error, but forgot that it was used for this environment variable.	2025-11-06 13:00:26 -06:00
Joseph Huber	670c453aeb	[Offload] Remove handling for device memory pool (#163629 ) Summary: This was a lot of code that was only used for upstream LLVM builds of AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so just use that. Simplifies code, can be added back if we start providing alternate forms but I don't think there's a single use-case that would justify it yet.	2025-11-06 10:15:18 -06:00
agozillon	09318c6bff	[MLIR][OpenMP] Fix and simplify bounds offset calculation for 1-D GEP offsets (#165486 ) Currently this is being calculated incorrectly and will result in incorrect index offsets in more complicated array slices. This PR tries to address it by refactoring and changing the calculation to be more correct.	2025-10-31 00:54:31 +01:00
Jan Patrick Lehr	f7e9968a5b	[Offload] XFAIL pgo tests until resolved (#163722 ) While people look into it, xfail the tests.	2025-10-16 11:43:55 +02:00
Joseph Huber	914fbe367e	[OpenMP] Disable a few more tests to get the bot green (#163614 )	2025-10-15 14:14:15 -05:00
Jan Patrick Lehr	4b84e0f3f0	[OpenMP] Add test to print interop identifiers (#161434 ) The test covers some of the identifier symbols in the interop runtime. This test, for now, is to guard against complete breakage, which was the result of the other `interop.c` test not being enabled on AMD and thus, not caught by our buildbots.	2025-10-15 20:38:33 +02:00
agozillon	9155b318f2	[Flang][OpenMP] Defer descriptor mapping for assumed dummy argument types (#154349 ) This PR adds deferral of descriptor maps until they are necessary for assumed dummy argument types. The intent is to avoid a problem where a user can inadvertently map a temporary local descriptor to device without their knowledge and proceed to never unmap it. This temporary local descriptor remains lodged in OpenMP device memory and the next time another variable or descriptor residing in the same stack address is mapped we incur a runtime OpenMP map error as we try to remap the same address. This fix was discussed with the OpenMP committee and applies to OpenMP 5.2 and below, future versions of OpenMP can avoid this issue via the attach semantics added to the specification.	2025-10-09 17:52:41 +02:00
Akash Banerjee	ed12dc5e30	[Flang][OpenMP] Implicitly map nested allocatable components in derived types (#160766 ) This PR adds support for nested derived types and their mappers to the MapInfoFinalization pass. - Generalize MapInfoFinalization to add child maps for arbitrarily nested allocatables when a derived object is mapped via declare mapper. - Traverse HLFIR designates rooted at the target block arg and build full coordinate_of chains; append members with correct membersIndex. This fixes #156461.	2025-10-02 16:15:16 +00:00
Joseph Huber	0fcce4fb7b	[OpenMP] Mark problematic tests as XFAIL / UNSUPPORTED (#161267 ) Summary: Several of these tests have been failing for literal years. Ideally we make efforts to fix this, but keeping these broken has had serious consequences on our testing infrastructure where failures are the norm so almost all test failures are disregarded. I made a tracking issue for the ones that have been disabled. https://github.com/llvm/llvm-project/issues/161265	2025-09-29 15:17:55 -05:00
Dominik Adamski	e4d94f4f7f	[OpenMP][Flang] Fix no-loop test (#161162 ) Fortran no-loop test is supported only for GPU.	2025-09-29 16:01:52 +02:00
Dominik Adamski	83ef38a274	[Flang][OpenMP] Enable no-loop kernels (#155818 ) Enable the generation of no-loop kernels for Fortran OpenMP code. target teams distribute parallel do pragmas can be promoted to no-loop kernels if the user adds the -fopenmp-assume-teams-oversubscription and -fopenmp-assume-threads-oversubscription flags. If the OpenMP kernel contains reduction or num_teams clauses, it is not promoted to no-loop mode. The global OpenMP device RTL oversubscription flags no longer force no-loop code generation for Fortran.	2025-09-26 13:57:51 +02:00
Akash Banerjee	3e7e60ae5c	Revert "[Flang][OpenMP] Implicitly map nested allocatable components in derived types" (#160759 ) Reverts llvm/llvm-project#160116	2025-09-25 19:53:58 +01:00
Akash Banerjee	b4f1e0e5b1	[Flang][OpenMP] Implicitly map nested allocatable components in derived types (#160116 ) This PR adds support for nested derived types and their mappers to the MapInfoFinalization pass. - Generalize MapInfoFinalization to add child maps for arbitrarily nested allocatables when a derived object is mapped via declare mapper. - Traverse HLFIR designates rooted at the target block arg and build full coordinate_of chains; append members with correct membersIndex. This fixes #156461.	2025-09-24 14:30:27 +01:00
Akash Banerjee	8afea0d0ea	[OpenMP][MLIR] Preserve to/from flags in mapper base entry for mappers (#159799 ) With declare mapper, the parent base entry was emitted as `TARGET_PARAM` only. The mapper received a map-type without `to/from`, causing components to degrade to `alloc`-only (no copies), breaking allocatable payload mapping. This PR preserves the map-type bits from the parent. This fixes #156466.	2025-09-19 19:34:09 +01:00
Kareem Ergawy	c286a427b9	[NFC][flang][do concurent] Add saxpy offload tests for OpenMP mapping (#155993 ) Adds end-to-end tests for `do concurrent` offloading to the device. PR stack: - https://github.com/llvm/llvm-project/pull/155754 - https://github.com/llvm/llvm-project/pull/155987 - https://github.com/llvm/llvm-project/pull/155992 - https://github.com/llvm/llvm-project/pull/155993 ◀️ - https://github.com/llvm/llvm-project/pull/157638 - https://github.com/llvm/llvm-project/pull/156610 - https://github.com/llvm/llvm-project/pull/156837	2025-09-17 07:04:13 +02:00
Jan Patrick Lehr	311d78f2a1	[OpenMP] Fix force-usm test after #157182 (#159095 ) The refactoring lead to an additional data transfer. This changes the assumed transfers in the check-strings to work with that changed behavior.	2025-09-16 15:42:02 +02:00
agozillon	8f16af3c20	[Flang][OpenMP] Fix mapping of character type with LEN > 1 specified (#154172 ) Currently, there's a number of issues with mapping characters with LEN's specified (strings effectively). They're represented as a char type in FIR with a len parameter, and then later on they're expanded into an array of characters when we're translating to the LLVM dialect. However, we don't generate a bounds for these at lowering. The fix in this PR for this is to generate a bounds from the LEN parameter and attatch it to the map on lowering from FIR to the LLVM dialect when we encounter this type.	2025-09-09 16:36:04 +02:00
Jan Patrick Lehr	27e541645c	[Offload][OpenMP] Enable more tests on AMDGPU (#156626 ) (Re)enables a couple of tests that were disabled on AMDGPU for some reason. Pass for me locally.	2025-09-03 14:04:39 +02:00
Akash Banerjee	6aafe6582d	Fix test added in 1fd1d634630754cc9b9c4b5526961d5856f64ff9	2025-08-18 13:29:23 +01:00
Akash Banerjee	1fd1d63463	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048 ) Add a new AutomapToTargetData pass. This gathers the declare target enter variables which have the AUTOMAP modifier. And adds omp.declare_target_enter/exit mapping directives for fir.alloca and fir.free oeprations on the AUTOMAP enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.	2025-08-15 15:41:41 +01:00
Akash Banerjee	1c7720ef78	Revert "[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048 )" This reverts commit 4e6d510eb3ec5b5e5ea234756ea1f0b283feee4a.	2025-08-12 20:19:45 +01:00
Akash Banerjee	4e6d510eb3	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048 ) Add a new AutomapToTargetData pass. This gathers the declare target enter variables which have the AUTOMAP modifier. And adds omp.declare_target_enter/exit mapping directives for fir.alloca and fir.free oeprations on the AUTOMAP enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.	2025-08-12 15:18:15 +01:00
Amit Tiwari	2074e1320f	[Clang][OpenMP] Non-contiguous strided update (#144635 ) This patch handles the strided update in the `#pragma omp target update from(data[a🅱️c])` directive where 'c' represents the strided access leading to non-contiguous update in the `data` array when the offloaded execution returns the control back to host from device using the `from` clause. Issue: Clang CodeGen where info is generated for the particular `MapType` (to, from, etc), it was failing to detect the strided access. Because of this, the `MapType` bits were incorrect when passed to runtime. This led to incorrect execution (contiguous) in the libomptarget runtime code. Added a minimal testcase that verifies the working of the patch.	2025-08-12 19:32:15 +05:30
Akash Banerjee	0998da27e9	Revert "[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#151989 )" This reverts commit 5a5e8ba0c388d57aecb359ed67919cda429fc7b1.	2025-08-11 13:52:39 +01:00
Akash Banerjee	5a5e8ba0c3	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#151989 ) Add a new `AutomapToTargetData` pass. This gathers the declare target enter variables which have the `AUTOMAP` modifier. And adds `omp.declare_target_enter/exit` mapping directives for `fir.allocmem` and `fir.freemem` oeprations on the `AUTOMAP` enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.	2025-08-11 13:18:38 +01:00
Aiden Grossman	2e3fd547de	[Offload] Fix typo in shared_lib_fp_mapping.c Made a typo in 963259ef6be4871e5252ff3ac9df737af5d2b4cb because I cannot run tests and also did not review it. This should fix it...	2025-07-25 23:17:46 +00:00
Aiden Grossman	963259ef6b	[Offload] Remove uses of %T from lit tests (#150721 ) This patch removes all the instances of %T from offload/ (only one test contained this construction). %T has been deprecated for ~7 years and is not reccomended as it does not use a unique directory per test. Switch to using %t to ensure we use a unique dir per test and so that we can eventually remove %T. I did not actually test this. A couple feeble attempts at building/running the offload tests just leaves me with a ton of test failures. Given how small this is I'm reasonably sure it works though.	2025-07-25 16:16:22 -07:00
agozillon	73272d6fc6	[Flang][OpenMP] Appropriately emit present/load/store in all cases in MapInfoFinalization (#150311 ) Currently, we return early whenever we've already generated an allocation for intermediate descriptor variables (required in certain cases when we can't directly access the base address of a passes in descriptor function argument due to HLFIR/FIR restrictions). This unfortunately, skips over the presence check and load/store required to set the intermediate descriptor allocations values/data. This is fine in most cases, but if a function happens to have a series of branches with seperate target regions capturing the same input argument, we'd emit the present/load/store into the first branch with the first target inside of it, the secondary (or any preceding) branches would not have the present/load/store, this would lead to the subsequent mapped values in that branch being empty and then leading to a memory access violation on device. The fix for the moment is to emit a present/load/store at the relevant location of every target utilising the input argument, this likely will also lead to fixing possible issues with the input argument being manipulated inbetween target regions (primarily resizing, the data should remain the same as we're just copying an address around, in theory at least). There's possible optimizations/simplifications to emit less load/stores such as by raising the load/store out of the branches when we can, but I'm inclined to leave this sort of optimization to lower level passes such as an LLVM pass (which very possibly already covers it).	2025-07-25 16:15:54 +02:00

1 2 3

123 Commits