llvm-project

Author	SHA1	Message	Date
Ferran Toda	f560e4cfb1	[MLIR][OpenMP] Add omp.fuse operation (#168898 ) This patch is a follow-up from #161213 and adds the omp.fuse loop transformation for the OpenMP dialect. Used for lowering a `!$omp fuse` in Flang. Added Lowering and end2end tests.	2026-02-17 15:34:27 +01:00
Abid Qadeer	deedc7bfe3	[Flang][OpenMP] Don't generate code for unreachable target regions. (#178937 ) When a target region is placed inside a constant false condition (e.g., `if (.false.)`), the dead code gets eliminated on the host side, removing the `omp.target` operation entirely. However, the device-side compilation pipeline is unaware of this elimination and attempts to generate kernel code. Since the host never created offload metadata for the eliminated target, the device-side kernel function lacks the "kernel" attribute, causing `OpenMPOpt` to fail with an assertion when it expects all outlined kernels to have this attribute. The problem can be seen with the following code: ```fortran program cele implicit none real :: V integer :: i if (.false.) then !$omp target teams distribute parallel do do i = 1, 5 V = V * 2 end do !$omp end target teams distribute parallel do end if end program ``` It currently fails with the following assertion: ``` Assertion `omp::isOpenMPKernel(*Kernel) && "Expected kernel function!"' failed. llvm/lib/Transforms/IPO/OpenMPOpt.cpp:4291 ``` This PR adds `DeleteUnreachableTargetsPass` that identifies `omp.target` operations in unreachable code blocks and removes them.	2026-02-16 09:31:42 +00:00
Aiden Grossman	6e6f76026d	[MLIR][OpenMP] Fix unused variable warning 7c07cb6542a0c5e4340e09a9a247e3e5123c6567 introduced a variable created in an if statement that is only used in an assertion. Per the coding guidelines, mark it [[maybe_unused]].	2026-02-10 20:40:29 +00:00
Jack Styles	8949c6d86b	[MLIR][OpenMP] Add Taskloop Collapse Support (#175924 ) Following work completed in #174386 and #174623, this patch adds support for collapse to Taskloop. Collapse allows the user to combine several nested loops into a single loop; for this to work with Taskloop, some changes are needed to how we process the loops and the tasks that run them. This patch brings Taskloop support in MLIR and Flang up to the OpenMP 4.5 level.	2026-02-05 08:59:00 +00:00
Chi-Chun, Chen	36dadddd74	[Flang][mlir][OpenMP] Add affinity clause to omp.task and Flang lowering (#179003 ) - Add MLIR OpenMP affinity clause - Lower flang task affinity to mlir - Emit TODO for iterator modifier and update negative test	2026-02-04 10:30:35 -06:00
Akash Banerjee	7c07cb6542	[MLIR][OpenMP] Fix recursive mapper emission. (#178453 ) Recursive types can cause re-entrant mapper emission. The mapper function is created by OpenMPIRBuilder before the callbacks run, so it may already exist in the LLVM module even though it is not yet registered in the ModuleTranslation mapping table. Reuse and register it to break the recursion. Added offloading test.	2026-01-29 16:38:33 +00:00
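The "register before recursing" idea in the commit above can be illustrated outside of MLIR. The following is only a minimal C++ sketch under stated assumptions: `TypeInfo`, `TypeTable`, `MapperTable`, and `getOrEmitMapper` are hypothetical names invented for this example and are not part of `OpenMPIRBuilder` or `ModuleTranslation`. It shows how recording an entry before visiting members lets a recursive type reuse its own mapper instead of re-entering emission.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in: a "type" lists the member types it needs mappers for.
struct TypeInfo {
  std::vector<std::string> members;
};

using TypeTable = std::map<std::string, TypeInfo>;
using MapperTable = std::map<std::string, std::string>;

// Returns the mapper name for `type`, creating it (and its members' mappers)
// on first use. The entry is registered *before* the members are visited, so
// a recursive type finds its own mapper and reuses it.
std::string getOrEmitMapper(const std::string &type, const TypeTable &types,
                            MapperTable &mappers, int &emitted) {
  auto it = mappers.find(type);
  if (it != mappers.end())
    return it->second; // already created, possibly still being filled in

  std::string name = "mapper." + type;
  mappers[type] = name; // register first to break the recursion
  ++emitted;

  auto info = types.find(type);
  if (info != types.end())
    for (const std::string &member : info->second.members)
      getOrEmitMapper(member, types, mappers, emitted);
  return name;
}
```

With a self-referential type such as a list node, each mapper is created exactly once; without the early registration the same call would recurse without bound.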
Walter Lee	b1f845df32	[MLIR][OpenMP] Fix unused variable warning for #137201 (#178659 ) Fixes 4cc80831ea5d39c186fc29692556b762ffb6478b.	2026-01-29 14:14:59 +00:00
Sergio Afonso	4cc80831ea	[MLIR][OpenMP] Simplify OpenMP device codegen (#137201 ) After removing host operations from the device MLIR module, it is no longer necessary to provide special codegen logic to prevent these operations from causing compiler crashes or miscompilations. This patch removes these now unnecessary code paths to simplify codegen logic. Some MLIR tests are now replaced with Flang tests, since the responsibility of dealing with host operations has been moved earlier in the compilation flow. MLIR tests holding target device modules are updated to no longer include now unsupported host operations.	2026-01-29 12:44:40 +00:00
Jakub Kuderski	59e44799bd	[mlir] Fix new clang-tidy warning llvm-type-switch-case-types. NFC. (#178487 ) Pre-committing this before landing the new check in https://github.com/llvm/llvm-project/pull/177892	2026-01-28 19:13:47 +00:00
Akash Banerjee	c856c3d045	[MLIR][OpenMP] Fix mapper being attached to partial maps. (#178247 ) Fix OpenMP mapper lowering by attaching user-defined/default mappers only to the base parent entry, not combined/segment entries. This prevents mapper calls with partial sizes. Added relevant tests.	2026-01-28 18:35:03 +00:00
Chaitanya	55f0ed91ef	[OpenMP][MLIR] Add thread_limit with dims modifier support (#171825 ) This PR adds support for the OpenMP 6.1 feature `thread_limit` with the dims modifier. LLVM IR translation for `thread_limit` with the dims modifier is marked as NYI.	2026-01-27 18:16:48 +05:30
Chaitanya	08654adc62	[OpenMP][MLIR] Add num_threads clause with dims modifier support (#171767 ) This PR adds support for the OpenMP 6.1 feature `num_threads` with the dims modifier. LLVM IR translation for `num_threads` with the dims modifier is marked as NYI.	2026-01-27 15:30:55 +05:30
Chaitanya	3aaeace4e2	[OpenMP][MLIR] Add num_teams clause with dims modifier support (#169883 ) This PR adds support for the OpenMP 6.1 feature `num_teams` with the dims modifier. LLVM IR translation for `num_teams` with the dims modifier is marked as NYI.	2026-01-27 10:55:40 +05:30
Jason Van Beusekom	0bdbf01e4e	[OpenMP][Flang][MLIR] Skip trip count calculation when bounds are null (#176469 ) Fixes a segfault when trip count values are null by skipping trip count calculation when we cannot determine if it is safe to hoist out the values. Of note I originally tried to modify `extractOnlyOmpNestedDir` to return the first OpenMPConstruct directive, skipping over any earlier directives (ie stores), which did work for the below generic test case: ```fortran program minimal_repro implicit none integer :: i, m integer :: res(10) = 0 !$omp target teams map(from:m,res) private(m) m = 5 !$omp distribute parallel do do i = 1, 10 res(i) = 5 + i end do !$omp end distribute parallel do !$omp end target teams end program minimal_repro ``` But that led to incorrect output in this test case as the trip count was hoisted out and calculated by m(1000000) instead of m(1) ```fortran program minimal_repro implicit none integer :: i, x integer :: m(1) = 0 integer :: res(10) = 0 m(1) = 10 x = 1000000 !$omp target teams map(res) x = 1 !$omp distribute parallel do do i = 1, m(x) res(i) = 5 + i end do !$omp end distribute parallel do !$omp end target teams print *, "Test completed successfully m =", m, " res=", res end program minimal_repro ``` Leading to a segfault, due to the loop bounds being calculated with m(1000000) ```mlir %c1000000_i32 = arith.constant 1000000 : i32 hlfir.assign %c1000000_i32 to %10#0 : i32, !fir.ref<i32> %c1_i32 = arith.constant 1 : i32 %12 = fir.load %10#0 : !fir.ref<i32> %13 = fir.convert %12 : (i32) -> i64 %14 = hlfir.designate %5#0 (%13) : (!fir.ref<!fir.array<1xi32>>, i64) -> !fir.ref<i32> %15 = fir.load %14 : !fir.ref<i32> ... omp.target host_eval(%c1_i32 -> %arg0, %15 -> %arg1, %c1_i32_1 -> %arg2 : i32, i32, i32) map_entries(%18 -> %arg3, %19 -> %arg4, %20 -> %arg5, %23 -> %arg6 : !fir.ref<!fir.array<10xi32>>, !fir.ref<i32>, !fir.ref<i32>, !fir.ref<!fir.array<1xi32>>) { ... omp.teams { ... 
omp.loop_nest (%arg8) : i32 = (%arg0) to (%arg1) inclusive step (%arg2) { ``` The wip commit for this change is here: `beafeae396` We would need to have some sort of intelligent hoisting for these cases, to allow hoisting, but for now I just created this PR to fix the bug. Fixes: #176030	2026-01-21 11:56:36 +00:00
Michael Klemm	9f19d1895d	[OpenMP] Fix truncation/extension bug when calling __kmpc_push_num_teams (#173067 ) This PR fixes a bug that occurred when the lower and upper bounds for the number of teams were not `int32` but a different type. In this case, an internal compiler error would trigger due to a mismatching call to `__kmpc_push_num_teams`.	2026-01-19 11:20:11 +01:00
Austin Jiang	e6cdfb75ac	Fix typos and spelling errors across codebase (#156270 ) Corrected various spelling mistakes such as 'occurred', 'receiver', 'initialized', 'length', and others in comments, variable names, function names, and documentation throughout the project. These changes improve code readability and maintain consistency in naming and documentation. Co-authored-by: Louis Dionne <ldionne.2@gmail.com>	2026-01-13 11:52:46 -05:00
Tom Eccles	804aa88317	[MLIR][OpenMP] Support cancel taskgroup inside of taskloop (#174815 ) Implementation follows exactly what is done for omp.wsloop and omp.task. See #137841. The change to the operation verifier is to allow a taskgroup cancellation point inside of a taskloop. This was already allowed for omp.cancel.	2026-01-09 11:43:54 +00:00
Tom Eccles	ddb706bbb0	[mlir][OpenMP] Don't allocate task context structure if not needed (#174588 ) Don't allocate a task context structure if none of the private variables needed it. This was already skipped when there were no private variables at all.	2026-01-09 10:49:06 +00:00
Jack Styles	b7c17ab957	[MLIR][OpenMP] Add Initial Taskloop Clause Support (#174623 ) Following on from the work to implement MLIR -> LLVM IR Translation for Taskloop, this adds support for the following clauses to be used alongside taskloop: - if - grainsize - num_tasks - untied - nogroup - final - mergeable - priority These clauses are ones which work directly through the relevant OpenMP Runtime functions, so their information just needed collecting from the relevant location and passing through to the appropriate runtime function. Remaining clauses retain their TODO message as they have not yet been implemented.	2026-01-09 10:34:03 +00:00
Tom Eccles	cc1bb845da	[mlir][OpenMP] Fix sanitizer error in buildTaskLikeBodyGenCallback (#174983 ) This is a fix for the asan bot after https://github.com/llvm/llvm-project/pull/174386 Failing bot: https://lab.llvm.org/buildbot/#/builders/24/builds/16371 This commit undoes a simplification I thought reduced copied+pasted code. I will merge it like this now to unblock the bot, and then work separately on a different way to share code between both callbacks.	2026-01-08 14:41:40 +00:00
Tom Eccles	1af1cc21c8	[mlir][OpenMP] Translation support for taskloop construct (#174386 ) This PR replaces #166903. This implements translation for taskloop, along with DSA clauses. Other clauses will follow immediately after this is merged. This patch was collaborative work by myself, @kaviya2510, and @Stylie777. I’ve left the commits unsquashed to make authorship clear. My only changes to other authors’ commits are to rebase and run clang-format. The taskloop implementation in the runtime works roughly like this: if the number of loop iterations to perform is more than some threshold, the current task is duplicated and each of the two resulting tasks gets half of the loop range. This continues recursively until each task has a small enough loop range to run itself in a single thread. This leads to two implementation complexities: - The runtime needs to be able to update the loop bounds used when executing the loop inside of the task. This has been implemented by forcing them to always have a fixed location inside of the structure produced when outlining the task. - When a task is duplicated, all data stored for the task’s (first)private variables needs to also be duplicated and appropriate constructors run. This is handled by a task duplication function invoked by the runtime. With regards to testing, most existing tests in the gfortran and fujitsu test suites require the reduction clause (not part of OpenMP 4.5). I wrote some tests of my own and was satisfied that it seems to be working. Co-authored-by: Kaviya Rajendiran <kaviyara2000@gmail.com> Co-authored-by: Jack Styles <jack.styles@arm.com>	2026-01-08 11:08:13 +00:00
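The recursive halving described in the commit above can be sketched in a few lines of C++. This is an illustration only, not the OpenMP runtime's implementation: `Range` and `splitRange` are hypothetical names, the ranges are half-open, and `threshold` is assumed to be at least 1 so that the recursion terminates.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Range = std::pair<std::int64_t, std::int64_t>; // half-open [lo, hi)

// Hypothetical illustration: recursively halve [lo, hi) until each piece
// holds at most `threshold` iterations (threshold >= 1 is assumed), and
// collect the resulting leaf ranges in order.
void splitRange(std::int64_t lo, std::int64_t hi, std::int64_t threshold,
                std::vector<Range> &leaves) {
  if (hi - lo <= threshold) {
    leaves.emplace_back(lo, hi); // small enough to run as a single task
    return;
  }
  std::int64_t mid = lo + (hi - lo) / 2;
  splitRange(lo, mid, threshold, leaves); // the current task keeps one half
  splitRange(mid, hi, threshold, leaves); // the duplicated task gets the other
}
```

Because every split rewrites the bounds a task works on, the bounds need a location the splitting code can always find, which is the motivation the commit gives for keeping them at a fixed position in the outlined task structure.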
Chi-Chun, Chen	5fb43838af	[mlir][OpenMP] Lower device clause for target data/enter/exit/update (#174665 ) Extend OpenMP device clause lowering for target data, target enter data, target exit data, and target update to accept non-constant values. Previously, only constant device IDs could be lowered to LLVM IR. Add Flang tests to validate device clause handling and mark the feature as supported in the OpenMPSupport documentation. New tests cover: - target teams - target teams distribute - target teams distribute parallel do - target teams distribute parallel do simd - target data Tests for target update and target enter/exit were already present in Flang.	2026-01-07 11:19:14 -06:00
Tom Eccles	07d07be73d	[mlir][OpenMP] Fix infinite loop after #174105 (#174736 )	2026-01-07 10:48:16 +00:00
Chi-Chun, Chen	3f5d91bfbc	[Flang][OpenMP] Implement device clause lowering for target directive (#173509 ) Add lowering support for the OpenMP `device` clause on the `target` directive in Flang. The device expression is propagated through MLIR OpenMP and passed to the host-side `__tgt_target_kernel` call.	2026-01-06 11:10:03 -06:00
Tom Eccles	188d13db20	[mlir][OpenMP] don't add compiler-generated barrier in single threaded code (#174105 ) We add barriers to the firstprivate copy region when they are required to avoid a race condition with the lastprivate clause. The problem is that these barriers are added by the compiler not implied by user code so it is the compiler's problem to avoid deadlock. I came across a testcase whilst working on taskloop support that looks a bit like this ``` !$omp parallel !$omp single !$omp taskloop firstprivate(a) lastprivate(a) ... !$omp end single !$omp end parallel ``` This is so that there are multiple threads for the generated tasks to be distributed over, but we don't generate the tasks afresh in every thread. The problem comes when the taskloop requires a barrier to prevent the datarace between firstprivate and lastprivate. This barrier will then be generated inside of SINGLE and so only one thread will encounter the barrier: leading to a deadlock. This patch works around the problem by detecting this situation statically and then not generating the barrier. There are cases where we cannot detect this statically (e.g. if the TASKLOOP is inside a function call inside of SINGLE). The program will still deadlock in this case after my patch. I'm unsure what the solution would be for that case. I want to fix this simple case in LLVM 22 before engaging in a longer discussion as to whether there is a better way to handle the more general case. Testing using wsloop because I want to land this (or not) independently of taskloop. Note that for wsloop it would be up to the programmer to remember to use the nowait clause, but nowait cannot be used to control generation of this barrier because it refers to the barrier after the construct not after firstprivate copyin (before the construct execution).	2026-01-06 10:22:41 +00:00
NimishMishra	11d9694b75	[flang][mlir] Add support for implicit linearization in omp.simd (#150386 ) Up till OpenMP version 4.5, the loop iteration variable in the associated do-construct of simd is linear with a linear step equal to the increment of the loop. This PR implements this functionality. For versions > 4.5, such an implicit linear clause is not assumed for the loop iteration variable. Fixes https://github.com/llvm/llvm-project/issues/171006	2026-01-03 21:37:43 -08:00
Krish Gupta	c646d1bd7d	[MLIR][OpenMP] Fix type mismatch in linear clause for INTEGER(8) variables (#173982 ) Fixes #173332 The compiler was crashing when compiling OpenMP `parallel do simd` with a `linear` clause on `INTEGER(8)` variables. The assertion failure occurred during MLIR-to-LLVM translation: Cannot create binary operator with two operands of differing type! Root Cause: The bug was in `LinearClauseProcessor::updateLinearVar()` where the step value (i32) and induction variable were multiplied without normalizing to the linear variable's type (i64), causing type mismatches in LLVM IR generation. Solution: Updated the translation logic to cast both the induction variable and step value to `linearVarTypes[index]` before performing arithmetic operations. This ensures type consistency for both integer and floating-point linear variables. Testing: - Added integration test verifying successful compilation to LLVM IR - Added lowering test for MLIR generation with various linear clause forms - Verified the exact reproducer from the issue now compiles without errors	2026-01-02 11:52:33 +00:00
Akash Banerjee	b360a782ca	Reland "[Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331 )" (#170851 ) Add support for OpenMP is_device_ptr clause for target directives. [MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367 This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.	2025-12-05 17:38:41 +00:00
NimishMishra	290b32a699	[llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop and omp.simd (#139386 ) This patch adds support for LLVM translation of linear clause on omp.wsloop (except for linear modifiers).	2025-12-04 20:39:17 -08:00
theRonShark	be79a0d90f	Revert "[Flang][OpenMP] Add lowering support for is_device_ptr clause" (#170778 ) Reverts llvm/llvm-project#169331	2025-12-04 19:38:16 -05:00
Akash Banerjee	a77c4948a5	[Flang][OpenMP] Add lowering support for is_device_ptr clause (#169331 ) Add support for OpenMP is_device_ptr clause for target directives. [MLIR][OpenMP] Add OpenMPToLLVMIRTranslation support for is_device_ptr #169367 This PR adds support for the OpenMP is_device_ptr clause in the MLIR to LLVM IR translation for target regions. The is_device_ptr clause allows device pointers (allocated via OpenMP runtime APIs) to be used directly in target regions without implicit mapping.	2025-12-04 15:57:24 +00:00
Mehdi Amini	4c09e45f1d	[MLIR] Apply clang-tidy fixes for llvm-qualified-auto in OpenMPToLLVMIRTranslation.cpp (NFC)	2025-12-03 07:01:47 -08:00
Tom Eccles	8ec2112ec8	[OMPIRBuilder] re-land cancel barriers patch #164586 (#169931 ) A barrier will pause execution until all threads reach it. If some go to a different barrier then we deadlock. This manifests in that the finalization callback must only be run once. Fix by ensuring we always go through the same finalization block whether the thread is cancelled or not and no matter which cancellation point causes the cancellation. The old callback only affected PARALLEL, so it has been moved into the code generating PARALLEL. For this reason, we don't need similar changes for other cancellable constructs. We need to create the barrier on the shared exit from the outlined function instead of only on the cancelled branch to make sure that threads exiting normally (without cancellation) meet the same barriers as those which were cancelled. For example, previously we might have generated code like ``` ... %ret = call i32 @__kmpc_cancel(...) %cond = icmp eq i32 %ret, 0 br i1 %cond, label %continue, label %cancel continue: // do the rest of the callback, eventually branching to %fini br label %fini cancel: // Populated by the callback: // unsafe: if any thread makes it to the end without being cancelled // it won't reach this barrier and then the program will deadlock %unused = call i32 @__kmpc_cancel_barrier(...) br label %fini fini: // run destructors etc ret ``` In the new version the barrier is moved into fini. I generate it after the destructors because the standard describes the barrier as occurring after the end of the parallel region. ``` ... %ret = call i32 @__kmpc_cancel(...) %cond = icmp eq i32 %ret, 0 br i1 %cond, label %continue, label %cancel continue: // do the rest of the callback, eventually branching to %fini br label %fini cancel: br label %fini fini: // run destructors etc // safe so long as every exit from the function happens via this block: %unused = call i32 @__kmpc_cancel_barrier(...) 
ret ``` To achieve this, the barrier is now generated alongside the finalization code instead of in the callback. This is the reason for the changes to the unit test. I'm unsure if I should keep the incorrect barrier generation callback only on the cancellation branch in clang with the OMPIRBuilder backend because that would match clang's ordinary codegen. Right now I have opted to remove it entirely because it is a deadlock waiting to happen. --- This re-lands #164586 with a small fix for a failing buildbot running address sanitizer on clang lit tests. In the previous version of the patch I added an insertion point guard "just to be safe" and never removed it. There isn't insertion point guarding on the other route out of this function and we do not preserve the insertion point around getFiniBB either so it is not needed here. The problem flagged by the sanitizers was because the saved insertion point pointed to an instruction which was then removed inside the FiniCB for some clang codegen functions. The instruction was freed when it was removed. Then accessing it to restore the insertion point was a use after free bug.	2025-12-01 10:07:19 +00:00
Tom Eccles	58fa7e4ccd	Revert "[OMPIRBuilder] always leave PARALLEL via the same barrier" (#169829 ) Reverts llvm/llvm-project#164586 Reverting due to buildbot failure: https://lab.llvm.org/buildbot/#/builders/169/builds/17519	2025-11-27 16:19:52 +00:00
Jack Styles	47ae3eaa29	[MLIR][OpenMP] Add MLIR Lowering Support for dist_schedule (#152736 ) `dist_schedule` was previously supported in Flang/Clang but was not implemented in MLIR; instead, a user would get a "not yet implemented" error. This patch adds support for the `dist_schedule` clause to be lowered to LLVM IR when used in an `omp.distribute` or `omp.wsloop` section. Some rework was required to ensure that MLIR/LLVM emits the correct schedule type for the clause, as it uses a different schedule type from other OpenMP directives/clauses in the runtime library. This patch also ensures that when using dist_schedule or a chunked schedule clause, the correct llvm loop parallel accesses details are added.	2025-11-27 14:16:44 +00:00
Tom Eccles	0e5633fcd9	[OMPIRBuilder] always leave PARALLEL via the same barrier (#164586 ) A barrier will pause execution until all threads reach it. If some go to a different barrier then we deadlock. This manifests in that the finalization callback must only be run once. Fix by ensuring we always go through the same finalization block whether the thread is cancelled or not and no matter which cancellation point causes the cancellation. The old callback only affected PARALLEL, so it has been moved into the code generating PARALLEL. For this reason, we don't need similar changes for other cancellable constructs. We need to create the barrier on the shared exit from the outlined function instead of only on the cancelled branch to make sure that threads exiting normally (without cancellation) meet the same barriers as those which were cancelled. For example, previously we might have generated code like ``` ... %ret = call i32 @__kmpc_cancel(...) %cond = icmp eq i32 %ret, 0 br i1 %cond, label %continue, label %cancel continue: // do the rest of the callback, eventually branching to %fini br label %fini cancel: // Populated by the callback: // unsafe: if any thread makes it to the end without being cancelled // it won't reach this barrier and then the program will deadlock %unused = call i32 @__kmpc_cancel_barrier(...) br label %fini fini: // run destructors etc ret ``` In the new version the barrier is moved into fini. I generate it after the destructors because the standard describes the barrier as occurring after the end of the parallel region. ``` ... %ret = call i32 @__kmpc_cancel(...) %cond = icmp eq i32 %ret, 0 br i1 %cond, label %continue, label %cancel continue: // do the rest of the callback, eventually branching to %fini br label %fini cancel: br label %fini fini: // run destructors etc // safe so long as every exit from the function happens via this block: %unused = call i32 @__kmpc_cancel_barrier(...) 
ret ``` To achieve this, the barrier is now generated alongside the finalization code instead of in the callback. This is the reason for the changes to the unit test. I'm unsure if I should keep the incorrect barrier generation callback only on the cancellation branch in clang with the OMPIRBuilder backend because that would match clang's ordinary codegen. Right now I have opted to remove it entirely because it is a deadlock waiting to happen.	2025-11-27 14:13:25 +00:00
Kareem Ergawy	f481f5bef9	[OpenMP][flang] Add initial support for by-ref reductions on the GPU (#165714 ) Adds initial support for GPU by-ref reductions. The main problem for reduction by reference is that, prior to this PR, we were shuffling (from remote lanes within the same warp or across different warps within the block) pointers/references to the private reduction values rather than the private reduction values themselves. In particular, this diff adds support for reductions on scalar allocatables where reductions happen on loops nested in `target` regions. For example: ```fortran integer :: i real, allocatable :: scalar_alloc allocate(scalar_alloc) scalar_alloc = 0 !$omp target map(tofrom: scalar_alloc) !$omp parallel do reduction(+: scalar_alloc) do i = 1, 1000000 scalar_alloc = scalar_alloc + 1 end do !$omp end target ``` This PR supports by-ref reductions on the intra- and inter-warp levels. There are still steps to be taken for full support of by-ref reductions, for example: * Inter-block value combination is not yet supported. Therefore, `target teams distribute parallel do` is still not supported. * Support for dynamically-sized arrays still needs to be added. * Support for more than one allocatable/array on the same `reduction` clause still needs to be added.	2025-11-26 11:59:22 +01:00
Aiden Grossman	51dd3ec13c	[MLIR][OpenMP] Bail early in sortMapIndices if indices are the same (#169474 ) If we are given the same index in the comparator callback, simply return false. Otherwise we will end up adding invalid items to occludedChildren, causing extra items to get removed that should not be, resulting in failures that manifest in different forms (assertions, asan failures, ubsan failures, etc.).	2025-11-25 06:23:12 -05:00
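The requirement behind the fix above is general: a comparator handed to a sort must return false when asked to compare an element with itself, and should not perform side effects in that case. A minimal C++ sketch of the guard, assuming a hypothetical `sortIndicesByKey` helper over plain integer keys rather than the real `sortMapIndices` code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not the MLIR implementation): returns the indices of
// `keys` ordered by ascending key value.
std::vector<std::size_t> sortIndicesByKey(const std::vector<int> &keys) {
  std::vector<std::size_t> indices(keys.size());
  std::iota(indices.begin(), indices.end(), 0); // 0, 1, 2, ...
  std::sort(indices.begin(), indices.end(),
            [&](std::size_t a, std::size_t b) {
              // Bail early: an element is never ordered before itself, and
              // no bookkeeping should run for a self-comparison.
              if (a == b)
                return false;
              return keys[a] < keys[b];
            });
  return indices;
}
```

In this toy version the guard is redundant because `keys[a] < keys[a]` is already false; it matters when the comparator also records state, as the commit describes for `occludedChildren`.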
Jan Leyonberg	3e86f05621	[OpenMP][flang] Lowering of OpenMP custom reductions to MLIR (#168417 ) This patch adds support for lowering of custom reductions to MLIR. It also enhances the capability of the pass to automatically mark functions as "declare target" by traversing custom reduction initializers and combiners.	2025-11-24 16:00:46 -05:00
agozillon	173600880b	[Flang][OpenMP][MLIR] Initial declare target to for variables implementation (#119589 ) While the infrastructure for declare target to/enter and link for variables exists in the MLIR dialect and at the Flang level, the current lowering from MLIR -> LLVM IR isn't in place; it's only in place for variables that have the link clause applied. This PR aims to extend that lowering to an initial implementation that incorporates declare target to as well, which primarily requires changes in the OpenMPToLLVMIRTranslation phase. However, a minor addition to the OpenMP dialect was required to extend the declare target enumerator to include a default None field as well. This also requires a minor change to the Flang lowering's MapInfoFinalization.cpp pass to alter the map type for descriptors to deal with cases where a variable is marked declare to. Currently, when a descriptor variable is mapped declare target to, the descriptor component can become attached and cannot be updated; this results in issues when an unusual allocation range is specified (effectively an off-by X error). The current solution is to map the descriptor always, as we always require an up-to-date version of this data. However, this also requires an interlinked PR that adds a more intricate type of mapping of structures/record types that clang currently implements, to circumvent the overwriting of the pointer in the descriptor. 3/3 required PRs to enable declare target to mapping, this PR should pass all tests and provide an all green CI. Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com	2025-11-24 21:22:49 +01:00
agozillon	20929abb85	[MLIR][OpenMP] Introduce overlapped record type map support (#119588 ) This PR introduces a new additional type of map lowering for record types that Clang currently supports, in which a user can map a top-level record type and then individual members with different mapping, effectively creating a sort of "overlapping" mapping that we attempt to cut around. This is currently predominantly used in Fortran when mapping descriptors and their data: we map the descriptor and its data with separate map modifiers and "cut around" the pointer data, so that we do not overwrite it unless the runtime deems it a necessary action based on its reference counting mechanism. However, it is a mechanism that will come in handy/trigger when a user explicitly maps a record type (derived type or structure) and then explicitly maps a member with a different map type. These additions were predominantly in the OpenMPToLLVMIRTranslation.cpp file and phase; however, one Flang test that checks end-to-end IR compilation (as far as we care for now at least) was altered. 2/3 required PRs to enable declare target to mapping, should look at PR 3/3 to check for full green passes (this one will fail a number due to some dependencies). Co-authored-by: Raghu Maddhipatla raghu.maddhipatla@amd.com	2025-11-24 21:20:29 +01:00
agozillon	09318c6bff	[MLIR][OpenMP] Fix and simplify bounds offset calculation for 1-D GEP offsets (#165486 ) Currently this is being calculated incorrectly and will result in incorrect index offsets in more complicated array slices. This PR tries to address it by refactoring and changing the calculation to be more correct.	2025-10-31 00:54:31 +01:00
Pranav Bhandarkar	e2ad554991	[Flang][mlir] - Translation of delayed privatization for deferred target-tasks (#155348 ) This PR adds support for translation of the private clause on deferred target tasks - that is `omp.target` operations with the `nowait` clause. An offloading call for a deferred target-task is not blocking - the offloading (target-generating) host task continues its execution after issuing the offloading call. Therefore, the key problem we need to solve is to ensure that the data needed for private variables to be initialized in the target task persists even after the host task has completed. We do this in a new pass called `PrepareForOMPOffloadPrivatizationPass`. For a privatized variable that needs its host counterpart for initialization (such as the shape of the data from the descriptor when an allocatable is privatized or the value of the data when an allocatable is firstprivatized), - the pass allocates memory on the heap. - it then initializes this memory by using the `init` and `copy` (for firstprivate) regions of the corresponding `omp::PrivateClauseOp`. - Finally the memory allocated on the heap is freed using the `dealloc` region of the same `omp::PrivateClauseOp` instance. This step is not straightforward though, because we cannot simply free the memory that's going to be used by another thread without any synchronization. So, for deallocation, we create a `omp.task` after the `omp.target` and synchronize the two with a dummy dependency (using the `depend` clause). In this newly created `omp.task` we do the deallocation.	2025-10-22 12:18:56 -05:00
agozillon	f2b20d3410	[Flang][OpenMP][Dialect] Swap to using MLIR dialect enum to encode map flags (#164043 ) This PR shifts from using the LLVM OpenMP enumerator bit flags to an OpenMP dialect specific enumerator. This allows us to better represent map types that wouldn't be of interest to the LLVM backend and runtime in the dialect. Primarily things like ref_ptr/ref_ptee/ref_ptr_ptee/attach_none/attach_always/attach_auto which are of interest to the compiler for certain transformations (primarily in the FIR transformation passes dealing with mapping), but the runtime has no need to know about them. It also means if another OpenMP implementation comes along they won't need to stick to the same bit flag system LLVM chose/do leg work to address it.	2025-10-21 21:54:25 +02:00
Mehdi Amini	936e03867f	[MLIR] Apply clang-tidy fixes for performance-unnecessary-value-param in OpenMPToLLVMIRTranslation.cpp (NFC)	2025-10-17 05:58:15 -07:00
Jakub Kuderski	8bab6c4e8c	[mlir] Simplify unreachable type switch cases. NFC. (#162032 ) Use `DefaultUnreachable` from https://github.com/llvm/llvm-project/pull/161970.	2025-10-06 09:23:25 -04:00
Michael Kruse	419594230f	[mlir][omp] Add omp.tile operation (#160292 ) Add the `omp.tile` loop transformations for the OpenMP dialect. Used for lowering a standalone `!$omp tile` in Flang.	2025-10-02 17:12:14 +00:00
Jan Svoboda	c580ad488e	[clang] Use the VFS to create the OpenMP region entry ID (#160918 ) This PR uses the VFS to create the OpenMP target entry instead of going straight to the real file system. This matches the behavior of other input files of the compiler.	2025-09-26 12:25:37 -07:00
Dominik Adamski	83ef38a274	[Flang][OpenMP] Enable no-loop kernels (#155818 ) Enable the generation of no-loop kernels for Fortran OpenMP code. target teams distribute parallel do pragmas can be promoted to no-loop kernels if the user adds the -fopenmp-assume-teams-oversubscription and -fopenmp-assume-threads-oversubscription flags. If the OpenMP kernel contains reduction or num_teams clauses, it is not promoted to no-loop mode. The global OpenMP device RTL oversubscription flags no longer force no-loop code generation for Fortran.	2025-09-26 13:57:51 +02:00
Akash Banerjee	8afea0d0ea	[OpenMP][MLIR] Preserve to/from flags in mapper base entry for mappers (#159799 ) With declare mapper, the parent base entry was emitted as `TARGET_PARAM` only. The mapper received a map-type without `to/from`, causing components to degrade to `alloc`-only (no copies), breaking allocatable payload mapping. This PR preserves the map-type bits from the parent. This fixes #156466.	2025-09-19 19:34:09 +01:00


382 Commits