llvm-project

Author	SHA1	Message	Date
agozillon	e0054e984c	[MLIR][OpenMP] Emit nullary check for mapped pointer members and appropriate size select based on results (#124604 ) This PR aims to fix a mapping error when trying to map nullary elements of a record type (primary example is allocatables/pointer types in Fortran at the moment). This should be legal to map, just not write to without pointing to anything within the target region. A common Fortran OpenMP idiom/example where this is useful can be found in the added Fortran offload example. The runtime error arises when we try to map the pointer member utilising a prescribed constant size that we receive from the lowered type, resulting in mapping of data that will be non-existent when there is no allocated data. The fix in this case is to emit a runtime check to see if the data has been allocated, if it hasn't been we select a size of 0, if it has we emit the usual type size.	2025-01-29 17:51:33 +01:00
Jeremy Morse	749443a307	[NFC][DebugInfo] Mop up final instruction-insertion call sites (#124289 ) These are the final places in the monorepo that make use of instruction insertion for methods like insertBefore and moveBefore. As part of the RemoveDIs project, instead use iterators for insertion. (see: https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 ).	2025-01-27 16:07:27 +00:00
Anchu Rajendran S	afcbcae668	[mlir][OpenMP] inscan reduction modifier and scan op mlir support (#114737 ) The scan directive allows specifying scan reductions within a worksharing loop, worksharing loop simd or simd directive, which should have an `InScan` modifier associated with it. This change adds the MLIR support for it. Related PR: [Parsing and Semantic Support for scan](https://github.com/llvm/llvm-project/pull/102792)	2025-01-22 09:53:54 -08:00
Kareem Ergawy	937cbce14c	Revert "[flang][OpenMP] Enable delayed privatization by default `omp.wsloop` (#122471 )" (#123324 ) This seems to have caused some regressions in Fujitsu's test-suite: https://linaro.atlassian.net/browse/LLVM-1521 This reverts commit 6f82408bb53f57a859953d8f1114f1634a5d3ee9.	2025-01-22 10:16:40 +01:00
Thirumalai Shaktivel	c2aa11d148	[Flang] Add LLVM lowering support for UNTIED clause in Task (#121052 ) Implementation details: The UNTIED clause is recognized by setting the flag=0 for the default case or performing logical OR to flag if other clauses are specified, and this flag is passed as an argument to the `__kmpc_omp_task_alloc` runtime call. Resubmitting the PR with fix for the failure, as it was reverted here: 927a70daf31b1610627f346b0dc140eda72144b9 and previously merged here: https://github.com/llvm/llvm-project/pull/115283	2025-01-21 09:10:25 +05:30
Kareem Ergawy	6b3ba6677d	[flang][OpenMP] Unconditionally create `after_alloca` block in `allocatePrivateVars` (#123168 ) While https://github.com/llvm/llvm-project/pull/122866 fixed some issues, it introduced a regression in worksharing loops. The new bug comes from the fact that we now conditionally create the `after_alloca` block based on the number of successors of the alloca insertion point. This is unnecessary; we can just always create the block. If we do this, we respect the post-conditions expected after calling `allocatePrivateVars` (i.e. that the `afterAlloca` block has a single predecessor).	2025-01-16 19:08:38 +01:00
Kareem Ergawy	6f82408bb5	[flang][OpenMP] Enable delayed privatization by default `omp.wsloop` (#122471 ) This enables delayed privatization by default for `omp.wsloop` ops, with one caveat! I had to work around the "impure" alloc region issue that is being resolved at the moment. The workaround detects whether the alloc region's argument is used in the region and at the same time defined in a block that does not dominate the chosen alloca insertion point. If so, we move the alloca insertion point below the defining instruction of the alloc region argument. This basically reverts to the non-delayed-privatization behavior.	2025-01-16 15:44:59 +01:00
Thirumalai Shaktivel	1d890b06ee	[Flang, OpenMP] Add LLVM lowering support for PRIORITY in TASK (#120710 ) Implementation details: The PRIORITY clause is recognized by setting the flags = 32 to the `__kmpc_omp_task_alloc` runtime call. Also, store the priority-value to the `kmp_task_t` struct member	2025-01-16 10:02:30 +05:30
Kareem Ergawy	a32c45631b	[flang][OpenMP] Generalize fixing `alloca` IP pre-condition for `private` ops (#122866 ) This PR generalizes a fix that we implemented previously for `omp.wsloop`s. The fix makes sure the pre-condition that the `alloca` block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix to `allocatePrivateVars` so that it kicks in for any op, not just `omp.wsloop`. This handles a bug uncovered by [a test](https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/blob/master/tests/4.5/target_simd/test_target_simd_safelen.F90) in the OpenMP_VV test suite.	2025-01-15 14:52:10 +01:00
Sergio Afonso	9bc8828093	[OMPIRBuilder][MLIR] Add support for target 'if' clause (#122478 ) This patch implements support for handling the 'if' clause of OpenMP 'target' constructs in the OMPIRBuilder and updates MLIR to LLVM IR translation of the `omp.target` MLIR operation to make use of this new feature.	2025-01-15 10:16:19 +00:00
Sergio Afonso	d2d4c3bd59	[MLIR][OpenMP] LLVM IR translation of host_eval (#116052 ) This patch adds support for processing the `host_eval` clause of `omp.target` to populate default and runtime kernel launch attributes. Specifically, these relate to the `num_teams`, `thread_limit` and `num_threads` clauses attached to operations nested inside of `omp.target`. As a result, the `thread_limit` clause of `omp.target` is also supported. The implementation of `initTargetDefaultAttrs()` is intended to reflect clang's own processing of multiple constructs and clauses in order to define a default number of teams and threads to be used as kernel attributes and to populate global variables in the target device module. One side effect of this change is that it is no longer possible to translate to LLVM IR target device MLIR modules unless they have a supported target triple. This is because the local `getGridValue()` function in the `OpenMPIRBuilder` only works for certain architectures, and it is called whenever the maximum number of threads has not been explicitly defined. This limitation also matches clang. Evaluating the collapsed loop trip count of SPMD and Generic-SPMD kernels remains unsupported.	2025-01-14 13:07:38 +00:00
Sergio Afonso	fabc443e93	[OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (#116051 ) This patch introduces a `TargetKernelRuntimeAttrs` structure to hold host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count values passed to the runtime kernel offloading call. Additionally, kernel type information is used to influence target device code generation and the `IsSPMD` flag is replaced by `ExecFlags`, which provides more granularity.	2025-01-14 12:34:37 +00:00
Sergio Afonso	27bc6bdaba	[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050 ) This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs` structure used to simplify passing default and constant values for number of teams and threads, and possibly other target kernel-related information in the future. This is used to forward values passed to `createTarget` to `createTargetInit`, which previously used a default unrelated set of values.	2025-01-14 11:08:55 +00:00
Sergio Afonso	9d7d8d2c87	[MLIR][OpenMP] Add host_eval clause to omp.target (#116049 ) This patch adds the `host_eval` clause to the `omp.target` operation. Additionally, it updates its op verifier to make sure all uses of block arguments defined by this clause fall within one of the few cases where they are allowed. MLIR to LLVM IR translation fails on translation of this clause with a not-yet-implemented error.	2025-01-14 10:21:46 +00:00
Kareem Ergawy	42da12063f	[flang][OpenMP] Extend delayed privatization for `omp.simd` (#122156 ) Adds support for delayed privatization for `simd` directives. This PR includes PFT down to LLVM IR lowering.	2025-01-12 07:46:58 +01:00
Kareem Ergawy	6f9e688203	[flang][OpenMP] Fix reduction init region block management (#122079 ) Replaces https://github.com/llvm/llvm-project/pull/121886 Fixes https://github.com/llvm/llvm-project/issues/120254 (hopefully 🤞) ## Problem Consider the following example: ```fortran program test real :: x(1) integer :: i !$omp parallel do reduction(+:x) do i = 1,1 x = 1 end do !$omp end parallel do end program ``` The HLFIR+OMP IR for this example looks like this: ```mlir func.func @_QQmain() { ... omp.parallel { %5 = fir.embox %4#0(%3) : (!fir.ref<!fir.array<1xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<1xf32>> %6 = fir.alloca !fir.box<!fir.array<1xf32>> ... omp.wsloop private(@_QFEi_private_ref_i32 %1#0 -> %arg0 : !fir.ref<i32>) reduction(byref @add_reduction_byref_box_1xf32 %6 -> %arg1 : !fir.ref<!fir.box<!fir.array<1xf32>>>) { omp.loop_nest (%arg2) : i32 = (%c1_i32) to (%c1_i32_0) inclusive step (%c1_i32_1) { ... omp.yield } } omp.terminator } return } ``` The problem addressed by this PR is related to: the `alloca` in the `omp.parallel` region + the related `reduction` clause on the `omp.wsloop` op. When we try translate the reduction from MLIR to LLVM, we have to choose an `alloca` insertion point. This happens in `convertOmpWsloop` where at entry to that function, this is what the LLVM module looks like: ```llvm define void @_QQmain() { %tid.addr = alloca i32, align 4 ... entry: %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @1) br label %omp.par.entry omp.par.entry: %tid.addr.local = alloca i32, align 4 ... br label %omp.par.region omp.par.region: br label %omp.par.region1 omp.par.region1: ... %5 = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, align 8 ``` Now, when we choose an `alloca` insertion point for the reduction, this is the chosen block `omp.par.entry` (without the changes in this PR). The problem is that the allocation needed for the reduction needs to reference the `%5` SSA value. 
This results in inserting allocations in `omp.par.entry` that reference allocations in a later block `omp.par.region1`, which causes the `Instruction does not dominate all uses!` error. ## Possible solution - take 2: This PR contains a more localized solution than https://github.com/llvm/llvm-project/pull/121886. It makes sure that on entry to `initReductionVars`, the IR builder is at a point where we can start inserting the initialization region; to make things cleaner, we still split the builder insertion point to a dedicated `omp.reduction.init`. This way we avoid splitting after the latest allocation block, which is what was causing the issue.	2025-01-09 16:11:18 +01:00
agozillon	fa56e8bb64	[OpenMP][MLIR] Fix threadprivate lowering when compiling for target when target operations are in use (#119310 ) Currently the compiler will ICE in programs like the following on the device lowering pass: ``` program main implicit none type i1_t integer :: val(1000) end type i1_t integer :: i type(i1_t), pointer :: newi1 type(i1_t), pointer :: tab=>null() integer, dimension(:), pointer :: tabval !$omp THREADPRIVATE(tab) allocate(newi1) tab=>newi1 tab%val(:)=1 tabval=>tab%val !$omp target teams distribute parallel do do i = 1, 1000 tabval(i) = i end do !$omp end target teams distribute parallel do end program main ``` This is due to the fact that THREADPRIVATE returns a result operation, and this operation can actually be used by other LLVM dialect (or other dialect) operations. However, we currently skip the lowering of threadprivate, so we effectively never generate and bind an LLVM-IR result to the threadprivate operation result. So when we later go on to lower dependent LLVM dialect operations, we are missing the required LLVM-IR result, try to access and use it and then ICE. The fix in this particular PR is to allow compilation of threadprivate for device as well as host, and simply treat the device compilation as a no-op, binding the LLVM-IR result of threadprivate with no alterations and binding it, which will allow the rest of the compilation to proceed, where we'll eventually discard the host segment in any case. The other possible solution to this I can think of, is doing something similar to Flang's passes that occur prior to CodeGen to the LLVM dialect, where they erase/no-op certain unrequired operations or transform them to lower level series of operations. And we would erase/no-op threadprivate on device as we'd never have these in target regions. 
The main issues I can see with this are that we currently do not specialise this stage based on whether we're compiling for device or host, so it's setting a precedent and adding another point of having to understand the separation between target and host compilation. I am also not sure we'd necessarily want to enforce this at a dialect level in case someone else wishes to add a different lowering flow or translation flow. Another possible issue is that a target operation we have/utilise would depend on the result of threadprivate, meaning we'd not be allowed to entirely erase/no-op it; I am not sure of any situations where this may be an issue currently though.	2025-01-03 18:01:01 +01:00
Kaviya Rajendiran	d3eb65f15d	[MLIR][OpenMP] Lowering aligned clause to LLVM IR for SIMD directive (#119536 ) This patch, - Added a translation support for aligned clause in SIMD directive by passing the alignment details to "llvm.assume" intrinsic. - Updated the insertion point for llvm.assume intrinsic call in "OMPIRBuilder.cpp". - Added a check in aligned clause MLIR lowering, to ensure that the alignment value must be a power of 2.	2025-01-03 16:22:38 +05:30
Thirumalai Shaktivel	cbe583b0bd	[Flang] Add translation support for MutexInOutSet and InOutSet (#120715 ) Implementation details: Mutexinoutset and Inoutset are recognized as flag=0x4 and 0x8 respectively; the flag is set in `kmp_depend_info` and passed as an argument to the `__kmpc_omp_task_with_deps` runtime call	2024-12-26 15:02:09 +05:30
Muhammad Omair Javaid	927a70daf3	Revert "[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283 )" This reverts commit 919aead1db64b2f1444842bc75a3af7836238671. It breaks following LLVM bots: https://lab.llvm.org/buildbot/#/builders/199 https://lab.llvm.org/buildbot/#/builders/143 https://lab.llvm.org/buildbot/#/builders/17	2024-12-24 01:47:24 +05:00
Thirumalai Shaktivel	919aead1db	[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283 ) Implementation details: The UNTIED clause is recognized by setting the flag=0 for the default case or performing logical OR to flag if other clauses are specified, and this flag is passed as an argument to the `__kmpc_omp_task_alloc` runtime call.	2024-12-20 16:36:51 +05:30
Ivan R. Ivanov	7c9404c279	[flang][OpenMP] Add frontend support for ompx_bare clause (#111106 )	2024-12-13 21:44:43 +09:00
Jie Fu	46ec271e03	[mlir] Fix -Wunused-variable in OpenMPToLLVMIRTranslation.cpp (NFC) /llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp:3921:12: error: unused variable 'varType' [-Werror,-Wunused-variable] Type varType = mapInfoOp.getVarType(); ^ 1 error generated.	2024-12-12 22:11:41 +08:00
Kareem Ergawy	f9734b9df1	[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in `omp.target` ops (#116576 ) This PR adds support to translate the `private` clause from MLIR to LLVMIR when used on allocatables in the context of an `omp.target` op. This replaces https://github.com/llvm/llvm-project/pull/113208. Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the latest commit is relevant to the PR.	2024-12-12 14:39:58 +01:00
Kareem Ergawy	0e70e0edd5	[reapply (#118463 )][OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#119170 ) This reapplies PR #118463 after introducing a fix for a bug uncovered by the test suite. The problem is that when the alloca block is terminated with a conditional branch, this violates a pre-condition of `allocatePrivateVars` (which assumes the alloca block has a single successor). This new PR includes a test that reproduces the issue. Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for delayed privatization. This also refactors a few bits of code to isolate the logic needed for `firstprivate` initialization in a shared util that can be used across constructs that need it. The same is done for `dealloc` regions.	2024-12-09 14:32:04 +01:00
NimishMishra	9eb4056144	[mlir][llvm] Translation support for task detach (#116601 ) This PR adds translation support for task detach. Essentially, if the `detach` clause is present on a task, emit a `__kmpc_task_allow_completion_event` on it, and store its return (of type `kmp_event_t*`) into the `event_handle`.	2024-12-08 06:09:52 -08:00
Kareem Ergawy	c54616ea48	Revert "[OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#118463 )" (#118848 )	2024-12-05 20:49:13 +01:00
Kareem Ergawy	0993335134	[OpenMP][OMPIRBuilder] Add delayed privatization support for `wsloop` (#118463 ) Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for delayed privatization. This also refactors a few bits of code to isolate the logic needed for `firstprivate` initialization in a shared util that can be used across constructs that need it. The same is done for `dealloc` regions. Parent PR: https://github.com/llvm/llvm-project/pull/118447. Only the latest commit is relevant for this PR.	2024-12-05 05:59:52 +01:00
Kareem Ergawy	7f72d71de7	[OpenMP][OMPIRBuilder] Refactor reduction initialization logic into one util (#118447 ) This refactors the logic needed to emit init logic for reductions by moving some duplicated code into a shared util. The logic for doing so is quite involved and is needed for any construct that has reductions. Moreover, when a construct has both private and reduction clauses, both sets of clauses need to cooperate with each other when emitting the logic needed for allocation and initialization. Therefore, this PR clearly sets the boundaries for the logic needed to initialize reductions.	2024-12-05 05:23:49 +01:00
NimishMishra	b9e3a769b9	[flang][mlir][llvm][OpenMP] Add lowering and translation support for mergeable clause on task (#114662 ) Add FIR generation and LLVMIR translation support for mergeable clause on task construct. If mergeable clause is present on a task, the relevant flag in `ompt_task_flag_t` is set and passed to `__kmpc_omp_task_alloc`.	2024-11-26 02:40:26 -08:00
Tom Eccles	a6385a3fc8	[mlir][OpenMP][NFC] use llvm::zip_equal for firstprivate copy region translation (#116416 ) I think this is a bit easier to read.	2024-11-18 10:25:19 +00:00
agozillon	b5db75bfce	[OpenMP][MLIR] Descriptor explicit member map lowering changes (#113556 ) This is one of 3 PRs in a PR stack that aims to add support for explicit mapping of allocatable members in derived types. The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp changes, which are small and seek to alter the current member mapping to add an additional map insertion for pointers. Effectively, if the member is a pointer (currently indicated by having a varPtrPtr field) we add an additional map for the pointer and then alter the subsequent mapping of the member (the data) to utilise the member rather than the parent's base pointer. This appears to be necessary in certain cases when mapping pointer data within record types to avoid segfaulting on device (due to incorrect data mapping). In general this record type mapping may be simplifiable in the future. There are also additions of tests which should help to showcase the effect of the changes above.	2024-11-16 12:26:29 +01:00
agozillon	d84d0caf28	[Flang][OpenMP] Update MapInfoFinalization to use BlockArgs Interface and modify use_device_ptr/addr to be order independent (#113919 ) This patch primarily updates the MapInfoFinalization pass to utilise the BlockArgument interface. It also shuffles newly added arguments the MapInfoFinalization passes to the end of the BlockArg/Relevant MapInfo lists, instead of one prior to the owning descriptor type. During this it was noted that the use_device_ptr/addr handling of target data was a little bit too order dependent so I've attempted to make it less so, as we cannot depend on argument ordering to be the same as Fortran for any future frontends.	2024-11-14 15:47:37 +01:00
Tom Eccles	8269c400b4	[mlir][OpenMP][NFC] delayed privatisation cleanup (#115298 ) Upstreaming some code cleanups ahead of supporting delayed task execution. - Make allocatePrivateVars not need to be a template (it will need to operate separately on firstprivate and private variables for delayed task execution so it can't index into lists of all variables in the operation). - Use llvm::SmallVectorImpl for function arguments - collectPrivatizationDecls already reserves size for privateDecls so we don't need to do that in callers - Use llvm::zip_equal instead of C-style array indexing	2024-11-07 12:27:31 +00:00
Tom Eccles	28452acac0	[mlir][OpenMP] delayed privatisation for TASK (#114785 ) This uses essentially an identical implementation to that used for ParallelOp. The private variable allocation and deallocation use shared functions to avoid code duplication. FIRSTPRIVATE variable copying uses duplicated code for now because I anticipate the implementation diverging in the near future once I store data for firstprivate variables in the task description structure. After enabling delayed privatisation for TASK in flang, one more test in the fujitsu test suite passes (I haven't looked into why).	2024-11-06 13:19:12 +00:00
Sergio Afonso	d3e796c2d0	[MLIR][OpenMP] Update not-yet-implemented errors, NFC (#114966 ) This patch improves not-yet-implemented error diagnostics to more closely follow the format used by Flang lowering for the same kind of errors. This helps keep some level of uniformity from a user perspective.	2024-11-05 12:48:54 +00:00
Sergio Afonso	6c28530ed0	[Flang][OpenMP] Properly bind arguments of composite operations (#113682 ) When composite constructs are lowered, clauses for each leaf construct are lowered before creating the set of loop wrapper operations, using these outside values to populate their operand lists. Then, when the loop nest associated to that composite construct is lowered, the binding of Fortran symbols to the entry block arguments defined by these loop wrappers is performed, resulting in the creation of `hlfir.declare` operations in the entry block of the `omp.loop_nest`. This approach prevents `hlfir.declare` operations related to the binding and other operations resulting from the evaluation of the clauses from being inserted between loop wrapper operations, which would be an illegal MLIR representation. However, this introduces the problem of entry block arguments defined by a wrapper that then should be used by one of its nested wrappers, because the corresponding Fortran symbol would still be mapped to an outside value at the time of gathering the list of operands for the nested wrapper. This patch adds operand re-mapping logic to update wrappers without changing when clauses are evaluated or where the `hlfir.declare` creation is performed.	2024-10-31 16:39:53 +00:00
Sergio Afonso	bd6c21460f	[MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (#114037 ) This patch improves error reporting in the MLIR to LLVM IR translation pass for the 'omp' dialect by emitting descriptive errors when encountering clauses not yet supported by that pass. Additionally, not-yet-implemented errors previously missing for some clauses are added, to avoid silently ignoring them. Error messages related to inlining of `omp.private` and `omp.declare_reduction` regions have been updated to use the same format.	2024-10-31 11:59:51 +00:00
Sergio Afonso	21a6032eca	[MLIR][OpenMP] Simplify translation to LLVM IR error handling (#114036 ) This patch unifies the handling of errors passed through the OpenMPIRBuilder and removes some redundant error messages through the introduction of a custom `ErrorInfo` subclass. Additionally, the current list of operations and clauses unsupported by the MLIR to LLVM IR translation pass is added to a new Lit test to check they are being reported to the user.	2024-10-31 11:34:24 +00:00
Sergio Afonso	a1f2fb6078	[MLIR][OpenMP] Prevent composite omp.simd related crashes (#113680 ) This patch updates the translation of `omp.wsloop` with a nested `omp.simd` to prevent uses of block arguments defined by the latter from triggering null pointer dereferences. This happens because the inner `omp.simd` operation representing composite `do simd` constructs is currently skipped and not translated, but this results in block arguments defined by it not being mapped to an LLVM value. The proposed solution is to map these block arguments to the LLVM value associated to the corresponding operand, which is defined above.	2024-10-29 17:05:12 +00:00
Sergio Afonso	d87964de78	[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533 ) This patch implements an approach to communicate errors between the OMPIRBuilder and its users. It introduces `llvm::Error` and `llvm::Expected` objects to replace the values returned by callbacks passed to `OMPIRBuilder` codegen functions. These functions then check the result for errors when callbacks are called and forward them back to the caller, which has the flexibility to recover, exit cleanly or dump a stack trace. This prevents a failed callback from leaving the IR in an invalid state and still continuing the codegen process, triggering unrelated assertions or segmentation faults. In the case of MLIR to LLVM IR translation of the 'omp' dialect, this change results in the compiler emitting errors and exiting early instead of triggering a crash for not-yet-implemented errors. The behavior in Clang and openmp-opt stays unchanged, since callbacks will continue always returning 'success'.	2024-10-25 11:30:16 +01:00
Kareem Ergawy	ad70f3e095	[flang][OpenMP] Support `target enter\|update\|exit .. nowait` (#113305 ) Extends `nowait` support for other device directives. This PR refactors the task generation utils used for the `target` directive so that they are general enough to be reused for other device directives as well.	2024-10-23 10:48:54 +02:00
Tom Eccles	621fcf892b	[mlir][OpenMP] rewrite conversion of privatisation for omp.parallel (#111844 ) The existing conversion inlined private alloc regions and firstprivate copy regions in mlir, then undid the modification of the mlir module before completing the conversion. To make this work, LLVM IR had to be generated using the wrong mapping for privatised values and then later fixed inside of OpenMPIRBuilder. This approach violated an assumption in OpenMPIRBuilder that private variables would be values not constants. Flang sometimes generates code where private variables are promoted to globals, the address of which is treated as a constant in LLVM IR. This prevented the incorrect values for the private variable from being replaced by OpenMPIRBuilder, ultimately resulting in programs producing incorrect results. This patch rewrites delayed privatisation for omp.parallel to work more similarly to reductions: translating directly into LLVMIR with correct mappings for private variables. RFC: https://discourse.llvm.org/t/rfc-openmp-fix-issue-in-mlir-to-llvmir-translation-for-delayed-privatisation/81225 Tested against the gfortran testsuite and our internal test suite. Linaro's post-commit bots will check against the fujitsu test suite. I decided to add the new tests as flang integration tests rather than in mlir/test/Target/LLVMIR: - The regression test is for an issue filed against flang. I wanted to keep the reproducer similar to the code in the ticket. - I found the "worst case" CFG test difficult to reason about in the abstract; it helped me to think about what was going on in terms of a Fortran program. Fixes #106297	2024-10-16 14:43:57 +01:00
Sergio Afonso	15d85769f1	[Flang][OpenMP] Support lowering of simd reductions (#112194 ) This patch enables lowering to MLIR of the reduction clause of `simd` constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at that stage it will result in errors being emitted rather than silently ignoring it as it is currently done. On composite `do simd` constructs, this lowering error will remain untriggered, as the `omp.simd` operation in that case is currently ignored. The MLIR representation, however, will now contain `reduction` information.	2024-10-16 10:27:50 +01:00
Kareem Ergawy	d0d03805f8	[flang][OpenMP] Support `target ... nowait` (#111823 ) Adds MLIR to LLVM lowering support for `target ... nowait`. This leverages the already existing code-gen patterns for `task` by treating `target ... nowait` as `task ... if(1)` and `target` (without `nowait`) as `task ... if(0)`; similar to what clang does.	2024-10-15 14:39:16 +02:00
Sergio Afonso	7ec3209493	[MLIR][OpenMP] Named recipe op's block args accessors (NFC) (#112192 ) This patch adds extra class declarations to the `omp.declare_reduction` and `omp.private` operations to access the entry block arguments defined by their regions. Some existing accesses to these arguments are updated to use the new named methods to improve code readability.	2024-10-15 11:50:30 +01:00
NimishMishra	aec87a2143	[llvm][mlir][flang][OpenMP] Emit __atomic_load and __atomic_compare_exchange libcalls for complex types in atomic update (#92364 ) This patch adds functionality to emit relevant libcalls in case atomicrmw instruction can not be emitted (for instance, in case of complex types). The IRBuilder is modified to directly emit __atomic_load and __atomic_compare_exchange libcalls. The added functions follow a similar codegen path as Clang, so that LLVM Flang generates almost similar IR as Clang. Fixes https://github.com/llvm/llvm-project/issues/83760 and https://github.com/llvm/llvm-project/issues/75138 Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>	2024-10-02 23:32:36 -07:00
Sergio Afonso	5894d4e8e4	[MLIR][OpenMP] Use map format to represent use_device_{addr,ptr} (#109810 ) This patch updates the `omp.target_data` operation to use the same formatting as `map` clauses on `omp.target` for `use_device_addr` and `use_device_ptr`. This is done so the mapping that is being enforced between op arguments and associated entry block arguments is explicit. The way it is achieved is by marking these clauses as entry block argument-defining and adjusting printer/parsers accordingly. As a result of this change, block arguments for `use_device_addr` come before those for `use_device_ptr`, which is the opposite of the previous undocumented situation. Some unit tests are updated based on this change, in addition to those updated because of the format change.	2024-10-01 16:45:59 +01:00
Sergio Afonso	d0f67773b2	[MLIR][OpenMP] Normalize handling of entry block arguments (#109808 ) This patch introduces a new MLIR interface for the OpenMP dialect aimed at providing a uniform way of verifying and handling entry block arguments defined by OpenMP clauses. The approach consists in defining a set of overridable methods that return the number of block arguments the operation holds regarding each of the clauses that may define them. These by default return 0, but they are overridden by the corresponding clause through the `extraClassDeclaration` mechanism. Another set of interface methods to get the actual lists of block arguments is defined, which is implemented based on the previously described methods. These implicitly define a standardized ordering between the list of block arguments associated to each clause, based on the alphabetical ordering of their names. They should be the preferred way of matching operation arguments and entry block arguments to that operation's first region. Some updates are made to the printing/parsing of `omp.parallel` to follow the expected order between `private` and `reduction` clauses, as well as the MLIR to LLVM IR translation pass to access block arguments using the new interface. Unit tests are updated for operations impacted by additional verification checks and sorting of entry block arguments.	2024-10-01 15:04:27 +01:00
Pranav Bhandarkar	47d42cfa59	[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization in `omp.target` ops. (#109668 ) This patch adds support to translate the `private` clause on `omp.target` ops from MLIR to LLVMIR. This first cut only handles non-allocatables. Also, this is for delayed privatization.	2024-09-30 21:58:44 -05:00

254 Commits