254 Commits

Author SHA1 Message Date
agozillon
e0054e984c
[MLIR][OpenMP] Emit nullary check for mapped pointer members and appropriate size select based on results (#124604)
This PR aims to fix a mapping error when trying to map nullary elements
of a record type (primary example is allocatables/pointer types in
Fortran at the moment). This should be legal to map, just not write to
without pointing to anything within the target region. A common Fortran
OpenMP idiom/example where this is useful can be found in the added
Fortran offload example.

The runtime error arises when we try to map the pointer member utilising
a prescribed constant size that we receive from the lowered type,
resulting in mapping of data that will be non-existent when there is no
allocated data. The fix in this case is to emit a runtime check to see
if the data has been allocated, if it hasn't been we select a size of 0,
if it has we emit the usual type size.
2025-01-29 17:51:33 +01:00
Jeremy Morse
749443a307
[NFC][DebugInfo] Mop up final instruction-insertion call sites (#124289)
These are the final places in the monorepo that make use of instruction
insertion for methods like insertBefore and moveBefore. As part of the
RemoveDIs project, instead use iterators for insertion. (see:
https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939
).
2025-01-27 16:07:27 +00:00
Anchu Rajendran S
afcbcae668
[mlir][OpenMP] inscan reduction modifier and scan op mlir support (#114737)
Scan directive allows to specify scan reductions within an worksharing
loop, worksharing loop simd or simd directive which should have an
`InScan` modifier associated with it. This change adds the mlir support
for the same.

Related PR: [Parsing and Semantic Support for
scan](https://github.com/llvm/llvm-project/pull/102792)
2025-01-22 09:53:54 -08:00
Kareem Ergawy
937cbce14c
Revert "[flang][OpenMP] Enable delayed privatization by default omp.wsloop (#122471)" (#123324)
This seems to have caused some regressions in Fujitsu's test-suite:
https://linaro.atlassian.net/browse/LLVM-1521

This reverts commit 6f82408bb53f57a859953d8f1114f1634a5d3ee9.
2025-01-22 10:16:40 +01:00
Thirumalai Shaktivel
c2aa11d148
[Flang] Add LLVM lowering support for UNTIED clause in Task (#121052)
Implementation details:
The UNTIED clause is recognized by setting the flag=0 for the default
case or performing logical OR to flag if other clauses are specified,
and this flag is passed as an argument to the `__kmpc_omp_task_alloc`
runtime call.


Resubmitting the PR with fix for the failure, as it was reverted here:
927a70daf31b1610627f346b0dc140eda72144b9
and previously merged here: https://github.com/llvm/llvm-project/pull/115283
2025-01-21 09:10:25 +05:30
Kareem Ergawy
6b3ba6677d
[flang][OpenMP] Unconditionally create after_alloca block in allocatePrivateVars (#123168)
While https://github.com/llvm/llvm-project/pull/122866 fixed some
issues, it introduced a regression in worksharing loops. The new bug
comes from the fact that we now conditionally created the `after_alloca`
block based on the number of sucessors of the alloca insertion point.
This is unneccessary, we can just alway create the block. If we do this,
we respect the post condtions expected after calling
`allocatePrivateVars` (i.e. that the `afterAlloca` block has a single
predecessor.
2025-01-16 19:08:38 +01:00
Kareem Ergawy
6f82408bb5
[flang][OpenMP] Enable delayed privatization by default omp.wsloop (#122471)
This enable delayed privatization by default for `omp.wsloop` ops, with
one caveat! I had to workaround the "impure" alloc region issue that
being resolved at the moment. The workaround detects whether the alloc
region's argument is used in the region and at the same time defined in
block that does not dominate the chosen alloca insertion point. If so,
we move the alloca insertion point below the defining instruction of the
alloc region argument. This basically reverts to the
non-delayed-privatizaiton behavior.
2025-01-16 15:44:59 +01:00
Thirumalai Shaktivel
1d890b06ee
[Flang, OpenMP] Add LLVM lowering support for PRIORITY in TASK (#120710)
Implementation details:
The PRIORITY clause is recognized by setting the flags = 32 to the 
`__kmpc_omp_task_alloc` runtime call. Also, store the priority-value 
to the `kmp_task_t` struct member
2025-01-16 10:02:30 +05:30
Kareem Ergawy
a32c45631b
[flang][OpenMP] Generalize fixing alloca IP pre-condition for private ops (#122866)
This PR generalizes a fix that we implemented previously for
`omp.wsloop`s. The fix makes sure the pre-condtion that the `alloca`
block has a single successor whenever we inline delayed privatizers is
respected. I simply moved the fix to `allocatePrivateVars` so that it
kicks in for any op not just `omp.wsloop`.

This handles a bug uncovered by [a
test](https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/blob/master/tests/4.5/target_simd/test_target_simd_safelen.F90)
in the OpenMP_VV test suite.
2025-01-15 14:52:10 +01:00
Sergio Afonso
9bc8828093
[OMPIRBuilder][MLIR] Add support for target 'if' clause (#122478)
This patch implements support for handling the 'if' clause of OpenMP
'target' constructs in the OMPIRBuilder and updates MLIR to LLVM IR
translation of the `omp.target` MLIR operation to make use of this new
feature.
2025-01-15 10:16:19 +00:00
Sergio Afonso
d2d4c3bd59
[MLIR][OpenMP] LLVM IR translation of host_eval (#116052)
This patch adds support for processing the `host_eval` clause of
`omp.target` to populate default and runtime kernel launch attributes.
Specifically, these related to the `num_teams`, `thread_limit` and
`num_threads` clauses attached to operations nested inside of
`omp.target`. As a result, the `thread_limit` clause of `omp.target` is
also supported.

The implementation of `initTargetDefaultAttrs()` is intended to reflect
clang's own processing of multiple constructs and clauses in order to
define a default number of teams and threads to be used as kernel
attributes and to populate global variables in the target device module.

One side effect of this change is that it is no longer possible to
translate to LLVM IR target device MLIR modules unless they have a
supported target triple. This is because the local `getGridValue()`
function in the `OpenMPIRBuilder` only works for certain architectures,
and it is called whenever the maximum number of threads has not been
explicitly defined. This limitation also matches clang.

Evaluating the collapsed loop trip count of SPMD and Generic-SPMD
kernels remains unsupported.
2025-01-14 13:07:38 +00:00
Sergio Afonso
fabc443e93
[OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (#116051)
This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count
values passed to the runtime kernel offloading call.

Additionally, kernel type information is used to influence target device
code generation and the `IsSPMD` flag is replaced by `ExecFlags`, which
provides more granularity.
2025-01-14 12:34:37 +00:00
Sergio Afonso
27bc6bdaba
[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050)
This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs`
structure used to simplify passing default and constant values for
number of teams and threads, and possibly other target kernel-related
information in the future.

This is used to forward values passed to `createTarget` to
`createTargetInit`, which previously used a default unrelated set of
values.
2025-01-14 11:08:55 +00:00
Sergio Afonso
9d7d8d2c87
[MLIR][OpenMP] Add host_eval clause to omp.target (#116049)
This patch adds the `host_eval` clause to the `omp.target` operation.
Additionally, it updates its op verifier to make sure all uses of block
arguments defined by this clause fall within one of the few cases where
they are allowed.

MLIR to LLVM IR translation fails on translation of this clause with a
not-yet-implemented error.
2025-01-14 10:21:46 +00:00
Kareem Ergawy
42da12063f
[flang][OpenMP] Extend delayed privatization for omp.simd (#122156)
Adds support for delayed privatization for `simd` directives. This PR
includes PFT down to LLVM IR lowering.
2025-01-12 07:46:58 +01:00
Kareem Ergawy
6f9e688203
[flang][OpenMP] Fix reduction init region block management (#122079)
Replaces https://github.com/llvm/llvm-project/pull/121886
Fixes https://github.com/llvm/llvm-project/issues/120254 (hopefully 🤞)

## Problem

Consider the following example:
```fortran
program test
  real :: x(1)
  integer :: i
  !$omp parallel do reduction(+:x)
    do i = 1,1
      x = 1
    end do
  !$omp end parallel do
end program
```

The HLFIR+OMP IR for this example looks like this:
```mlir
  func.func @_QQmain() {
    ...
    omp.parallel {
      %5 = fir.embox %4#0(%3) : (!fir.ref<!fir.array<1xf32>>, !fir.shape<1>) -> !fir.box<!fir.array<1xf32>>
      %6 = fir.alloca !fir.box<!fir.array<1xf32>>
      ...
      omp.wsloop private(@_QFEi_private_ref_i32 %1#0 -> %arg0 : !fir.ref<i32>) reduction(byref @add_reduction_byref_box_1xf32 %6 -> %arg1 : !fir.ref<!fir.box<!fir.array<1xf32>>>) {
        omp.loop_nest (%arg2) : i32 = (%c1_i32) to (%c1_i32_0) inclusive step (%c1_i32_1) {
          ...
          omp.yield
        }
      }
      omp.terminator
    }
    return
  }
```

The problem addressed by this PR is related to: the `alloca` in the
`omp.parallel` region + the related `reduction` clause on the
`omp.wsloop` op. When we try translate the reduction from MLIR to LLVM,
we have to choose an `alloca` insertion point. This happens in
`convertOmpWsloop` where at entry to that function, this is what the
LLVM module looks like:

```llvm
define void @_QQmain() {
  %tid.addr = alloca i32, align 4
  ...

entry:
  %omp_global_thread_num = call i32 @__kmpc_global_thread_num(ptr @1)
  br label %omp.par.entry

omp.par.entry:
  %tid.addr.local = alloca i32, align 4
  ...
  br label %omp.par.region

omp.par.region:
  br label %omp.par.region1

omp.par.region1:
  ...
  %5 = alloca { ptr, i64, i32, i8, i8, i8, i8, [1 x [3 x i64]] }, align 8
```

Now, when we choose an `alloca` insertion point for the reduction, this
is the chosen block `omp.par.entry` (without the changes in this PR).
The problem is that the allocation needed for the reduction needs to
reference the `%5` SSA value. This results in inserting allocations in
`omp.par.entry` that reference allocations in a later block
`omp.par.region1` which causes the `Instruction does not dominate all
uses!` error.

## Possible solution - take 2:

This PR contains a more localized solution than
https://github.com/llvm/llvm-project/pull/121886. It makes sure that on
entry to `initReductionVars`, the IR builder is at a point where we can
starting inserting initialization region; to make things cleaner, we
still split the builder insertion point to a dedicated
`omp.reduction.init`. This way we avoid splitting after the latest
allocation block; which is what causing the issue.
2025-01-09 16:11:18 +01:00
agozillon
fa56e8bb64
[OpenMP][MLIR] Fix threadprivate lowering when compiling for target when target operations are in use (#119310)
Currently the compiler will ICE in programs like the following on the
device lowering pass:

```
program main
    implicit none

    type i1_t
       integer :: val(1000)
    end type i1_t
    integer :: i
    type(i1_t), pointer :: newi1
    type(i1_t), pointer :: tab=>null()

    integer, dimension(:), pointer :: tabval

!$omp THREADPRIVATE(tab)

allocate(newi1)

tab=>newi1
tab%val(:)=1
tabval=>tab%val

!$omp target teams distribute parallel do
  do i = 1, 1000
   tabval(i) = i
 end do
!$omp end target teams distribute parallel do

end program main
```

This is due to the fact that THREADPRIVATE returns a result operation,
and this operation can actually be used by other LLVM dialect (or other
dialect) operations. However, we currently skip the lowering of
threadprivate, so we effectively never generate and bind an LLVM-IR
result to the threadprivate operation result. So when we later go on to
lower dependent LLVM dialect operations, we are missing the required
LLVM-IR result, try to access and use it and then ICE. The fix in this
particular PR is to allow compilation of threadprivate for device as
well as host, and simply treat the device compilation as a no-op,
binding the LLVM-IR result of threadprivate with no alterations and
binding it, which will allow the rest of the compilation to proceed,
where we'll eventually discard the host segment in any case.

The other possible solution to this I can think of, is doing something
similar to Flang's passes that occur prior to CodeGen to the LLVM
dialect, where they erase/no-op certain unrequired operations or
transform them to lower level series of operations. And we would
erase/no-op threadprivate on device as we'd never have these in target
regions.

The main issues I can see with this are that we currently do not
specialise this stage based on wether we're compiling for device or
host, so it's setting a precedent and adding another point of having to
understand the separation between target and host compilation. I am also
not sure we'd necessarily want to enforce this at a dialect level incase
someone else wishes to add a different lowering flow or translation
flow. Another possible issue is that a target operation we have/utilise
would depend on the result of threadprivate, meaning we'd not be allowed
to entirely erase/no-op it, I am not sure of any situations where this
may be an issue currently though.
2025-01-03 18:01:01 +01:00
Kaviya Rajendiran
d3eb65f15d
[MLIR][OpenMP] Lowering aligned clause to LLVM IR for SIMD directive (#119536)
This patch,
- Added a translation support for aligned clause in SIMD directive by passing the alignment details to "llvm.assume" intrinsic.
- Updated the insertion point for llvm.assume intrinsic call in "OMPIRBuilder.cpp".
- Added a check in aligned clause MLIR lowering, to ensure that the alignment value must be a power of 2.
2025-01-03 16:22:38 +05:30
Thirumalai Shaktivel
cbe583b0bd
[Flang] Add translation support for MutexInOutSet and InOutSet (#120715)
Implementatoin details:
Both Mutexinoutset and Inoutset is recognized as flag=0x4 
and 0x8 respectively, the flags is set to `kmp_depend_info` and 
passed as argument to `__kmpc_omp_task_with_deps` runtime call
2024-12-26 15:02:09 +05:30
Muhammad Omair Javaid
927a70daf3 Revert "[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283)"
This reverts commit 919aead1db64b2f1444842bc75a3af7836238671.
It breaks following LLVM bots:
https://lab.llvm.org/buildbot/#/builders/199
https://lab.llvm.org/buildbot/#/builders/143
https://lab.llvm.org/buildbot/#/builders/17
2024-12-24 01:47:24 +05:00
Thirumalai Shaktivel
919aead1db
[Flang OpenMP] Add LLVM translation support for UNTIED in Task (#115283)
Implementation details:
The UNTIED clause is recognized by setting the flag=0 for the default
case or performing logical OR to flag if other clauses are specified,
and this flag is passed as an argument to the `__kmpc_omp_task_alloc`
runtime call.
2024-12-20 16:36:51 +05:30
Ivan R. Ivanov
7c9404c279
[flang][OpenMP] Add frontend support for ompx_bare clause (#111106) 2024-12-13 21:44:43 +09:00
Jie Fu
46ec271e03 [mlir] Fix -Wunused-variable in OpenMPToLLVMIRTranslation.cpp (NFC)
/llvm-project/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp:3921:12:
 error: unused variable 'varType' [-Werror,-Wunused-variable]
      Type varType = mapInfoOp.getVarType();
           ^
1 error generated.
2024-12-12 22:11:41 +08:00
Kareem Ergawy
f9734b9df1
[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in omp.target ops (#116576)
This PR adds support to translate the `private` clause from MLIR to
LLVMIR when used on allocatables in the context of an `omp.target` op.

This replaces https://github.com/llvm/llvm-project/pull/113208.

Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the
latest commit is relevant to the PR.
2024-12-12 14:39:58 +01:00
Kareem Ergawy
0e70e0edd5
[reapply (#118463)][OpenMP][OMPIRBuilder] Add delayed privatization support for wsloop (#119170)
This reapplies PR #118463 after introducing a fix for a bug uncovere by
the test suite. The problem is that when the alloca block is terminated
with a conditional branch, this violates a pre-condition of
`allocatePrivateVars` (which assumes the alloca block has a single
successor). This new PR includes a test that reproduces the issue.

Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for
delayed privatization. This also refactors a few bit of code to isolate
the logic needed for `firstprivate` initialization in a shared util that
can be used across constructs that need it. The same is done for
`dealloc` regions.
2024-12-09 14:32:04 +01:00
NimishMishra
9eb4056144
[mlir][llvm] Translation support for task detach (#116601)
This PR adds translation support for task detach. Essentially, if the
`detach` clause is present on a task, emit a
`__kmpc_task_allow_completion_event` on it, and store its return (of
type `kmp_event_t*`) into the `event_handle`.
2024-12-08 06:09:52 -08:00
Kareem Ergawy
c54616ea48
Revert "[OpenMP][OMPIRBuilder] Add delayed privatization support for wsloop (#118463)" (#118848) 2024-12-05 20:49:13 +01:00
Kareem Ergawy
0993335134
[OpenMP][OMPIRBuilder] Add delayed privatization support for wsloop (#118463)
Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for
delayed privatization. This also refactors a few bit of code to isolate
the logic needed for `firstprivate` initialization in a shared util that
can be used across constructs that need it. The same is done for
`dealloc`
regions.

Parent PR: https://github.com/llvm/llvm-project/pull/118447. Only latest
commit is relevant for this PR.
2024-12-05 05:59:52 +01:00
Kareem Ergawy
7f72d71de7
[OpenMP][OMPIRBuilder] Refactor reduction initialization logic into one util (#118447)
This refactors the logic needed to emit init logic for reductions by
moving some duplicated code into a shared util. The logic for doing is
quite involved and is needed for any construct that has reductions.
Moreover, when a construct has both private and reduction clauses, both
sets of clauses need to cooperate with each other when emitting the
logic needed for allocation and initialization. Therefore, this PR
clearly sets the boundaries for the logic needed to initialize
reductions.
2024-12-05 05:23:49 +01:00
NimishMishra
b9e3a769b9
[flang][mlir][llvm][OpenMP] Add lowering and translation support for mergeable clause on task (#114662)
Add FIR generation and LLVMIR translation support for mergeable clause
on task construct. If mergeable clause is present on a task, the
relevant flag in `ompt_task_flag_t` is set and passed to
`__kmpc_omp_task_alloc`.
2024-11-26 02:40:26 -08:00
Tom Eccles
a6385a3fc8
[mlir][OpenMP][NFC] use llvm::zip_equal for firstprivate copy region translation (#116416)
I think this is a bit easier to read.
2024-11-18 10:25:19 +00:00
agozillon
b5db75bfce
[OpenMP][MLIR] Descriptor explicit member map lowering changes (#113556)
This is one of 3 PRs in a PR stack that aims to add support for explicit
mapping of allocatable members in derived types.

The primary changes in this PR are the OpenMPToLLVMIRTranslation.cpp
changes, which are small and seek to alter the current member mapping to
add an additional map insertion for pointers. Effectively, if the member
is a pointer (currently indicated by having a varPtrPtr field) we add an
additional map for the pointer and then alter the subsequent mapping of
the member (the data) to utilise the member rather than the parents base
pointer. This appears to be necessary in certain cases when mapping
pointer data within record types to avoid segfaulting on device (due to
incorrect data mapping). In general this record type mapping may be
simplifiable in the future.

There are also additions of tests which should help to showcase the
affect of the changes above.
2024-11-16 12:26:29 +01:00
agozillon
d84d0caf28
[Flang][OpenMP] Update MapInfoFinalization to use BlockArgs Interface and modify use_device_ptr/addr to be order independent (#113919)
This patch primarily updates the MapInfoFinalization pass to utilise the
BlockArgument interface. It also shuffles newly added arguments the
MapInfoFinalization passes to the end of the BlockArg/Relevant MapInfo
lists, instead of one prior to the owning descriptor type.

During this it was noted that the use_device_ptr/addr handling of target
data was a little bit too order dependent so I've attempted to make it
less so, as we cannot depend on argument ordering to be the same as
Fortran for any future frontends.
2024-11-14 15:47:37 +01:00
Tom Eccles
8269c400b4
[mlir][OpenMP][NFC] delayed privatisation cleanup (#115298)
Upstreaming some code cleanups ahead of supporting delayed task
execution.
- Make allocatePrivateVars not need to be a template (it will need to
operate separately on firstprivate and private variables for delayed
task execution so it can't index into lists of all variables in the
operation).
 - Use llvm::SmallVectorImpl for function arguments
- collectPrivatizationDecls already reserves size for privateDecls so we
don't need to do that in callers
 - Use llvm::zip_equal instead of C-style array indexing
2024-11-07 12:27:31 +00:00
Tom Eccles
28452acac0
[mlir][OpenMP] delayed privatisation for TASK (#114785)
This uses essentially an identical implementation to that used for
ParallelOp. The private variable allocation and deallocation use shared
functions to avoid code duplication. FIRSTPRIVATE variable copying uses
duplicated code for now because I anticipate the implementation
diverging in the near future once I store data for firstprivate
variables in the task description structure.

After enabling delayed privatisation for TASK in flang, one more test in
the fujitsu test suite passes (I haven't looked into why).
2024-11-06 13:19:12 +00:00
Sergio Afonso
d3e796c2d0
[MLIR][OpenMP] Update not-yet-implemented errors, NFC (#114966)
This patch improves not-yet-implemented error diagnostics to more
closely follow the format used by Flang lowering for the same kind of
errors. This helps keep some level of uniformity from a user
perspective.
2024-11-05 12:48:54 +00:00
Sergio Afonso
6c28530ed0
[Flang][OpenMP] Properly bind arguments of composite operations (#113682)
When composite constructs are lowered, clauses for each leaf construct
are lowered before creating the set of loop wrapper operations, using
these outside values to populate their operand lists. Then, when the
loop nest associated to that composite construct is lowered, the binding
of Fortran symbols to the entry block arguments defined by these loop
wrappers is performed, resulting in the creation of `hlfir.declare`
operations in the entry block of the `omp.loop_nest`.

This approach prevents `hlfir.declare` operations related to the binding
and other operations resulting from the evaluation of the clauses from
being inserted between loop wrapper operations, which would be an
illegal MLIR representation. However, this introduces the problem of
entry block arguments defined by a wrapper that then should be used by
one of its nested wrappers, because the corresponding Fortran symbol
would still be mapped to an outside value at the time of gathering the
list of operands for the nested wrapper.

This patch adds operand re-mapping logic to update wrappers without
changing when clauses are evaluated or where the `hlfir.declare`
creation is performed.
2024-10-31 16:39:53 +00:00
Sergio Afonso
bd6c21460f
[MLIR][OpenMP] Emit descriptive errors for all unsupported clauses (#114037)
This patch improves error reporting in the MLIR to LLVM IR translation
pass for the 'omp' dialect by emitting descriptive errors when
encountering clauses not yet supported by that pass.

Additionally, not-yet-implemented errors previously missing for some
clauses are added, to avoid silently ignoring them.

Error messages related to inlining of `omp.private` and
`omp.declare_reduction` regions have been updated to use the same
format.
2024-10-31 11:59:51 +00:00
Sergio Afonso
21a6032eca
[MLIR][OpenMP] Simplify translation to LLVM IR error handling (#114036)
This patch unifies the handling of errors passed through the
OpenMPIRBuilder and removes some redundant error messages through the
introduction of a custom `ErrorInfo` subclass.

Additionally, the current list of operations and clauses unsupported by
the MLIR to LLVM IR translation pass is added to a new Lit test to check
they are being reported to the user.
2024-10-31 11:34:24 +00:00
Sergio Afonso
a1f2fb6078
[MLIR][OpenMP] Prevent composite omp.simd related crashes (#113680)
This patch updates the translation of `omp.wsloop` with a nested
`omp.simd` to prevent uses of block arguments defined by the latter from
triggering null pointer dereferences.

This happens because the inner `omp.simd` operation representing
composite `do simd` constructs is currently skipped and not translated,
but this results in block arguments defined by it not being mapped to an
LLVM value. The proposed solution is to map these block arguments to the
LLVM value associated to the corresponding operand, which is defined
above.
2024-10-29 17:05:12 +00:00
Sergio Afonso
d87964de78
[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533)
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.

This prevents a failed callback to leave the IR in an invalid state and
still continue the codegen process, triggering unrelated assertions or
segmentation faults. In the case of MLIR to LLVM IR translation of the
'omp' dialect, this change results in the compiler emitting errors and
exiting early instead of triggering a crash for not-yet-implemented
errors. The behavior in Clang and openmp-opt stays unchanged, since
callbacks will continue always returning 'success'.
2024-10-25 11:30:16 +01:00
Kareem Ergawy
ad70f3e095
[flang][OpenMP] Support target enter|update|exit .. nowait (#113305)
Extends `nowait` support for other device directives. This PR refactors
the task generation utils used for the `target` directive so that they
are general enough to be reused for other device directives as well.
2024-10-23 10:48:54 +02:00
Tom Eccles
621fcf892b
[mlir][OpenMP] rewrite conversion of privatisation for omp.parallel (#111844)
The existing conversion inlined private alloc regions and firstprivate
copy regions in mlir, then undoing the modification of the mlir module
before completing the conversion. To make this work, LLVM IR had to be
generated using the wrong mapping for privatised values and then later
fixed inside of OpenMPIRBuilder.

This approach violated an assumption in OpenMPIRBuilder that private
variables would be values not constants. Flang sometimes generates code
where private variables are promoted to globals, the address of which is
treated as a constant in LLVM IR. This caused the incorrect values for
the private variable from being replaced by OpenMPIRBuilder: ultimately
resulting in programs producing incorrect results.

This patch rewrites delayed privatisation for omp.parallel to work more
similarly to reductions: translating directly into LLVMIR with correct
mappings for private variables.

RFC:
https://discourse.llvm.org/t/rfc-openmp-fix-issue-in-mlir-to-llvmir-translation-for-delayed-privatisation/81225

Tested against the gfortran testsuite and our internal test suite.
Linaro's post-commit bots will check against the fujitsu test suite.

I decided to add the new tests as flang integration tests rather than in
mlir/test/Target/LLVMIR:
- The regression test is for an issue filed against flang. i wanted to
keep the reproducer similar to the code in the ticket.
- I found the "worst case" CFG test difficult to reason about in
abstract it helped me to think about what was going on in terms of a
Fortran program.

Fixes #106297
2024-10-16 14:43:57 +01:00
Sergio Afonso
15d85769f1
[Flang][OpenMP] Support lowering of simd reductions (#112194)
This patch enables lowering to MLIR of the reduction clause of `simd`
constructs. Lowering from MLIR to LLVM IR remains unimplemented, so at
that stage it will result in errors being emitted rather than silently
ignoring it as it is currently done.

On composite `do simd` constructs, this lowering error will remain
untriggered, as the `omp.simd` operation in that case is currently
ignored. The MLIR representation, however, will now contain `reduction`
information.
2024-10-16 10:27:50 +01:00
Kareem Ergawy
d0d03805f8
[flang][OpenMP] Support target ... nowait (#111823)
Adds MLIR to LLVM lowering support for `target ... nowait`. This
leverages the already existings code-gen patterns for `task` by treating
`target ... nowait` as `task ... if(1)` and `target` (without `nowait`)
as `task ... if(0)`; similar to what clang does.
2024-10-15 14:39:16 +02:00
Sergio Afonso
7ec3209493
[MLIR][OpenMP] Named recipe op's block args accessors (NFC) (#112192)
This patch adds extra class declarations to the `omp.declare_reduction`
and `omp.private` operations to access the entry block arguments defined
by their regions. Some existing accesses to these arguments are updated
to use the new named methods to improve code readability.
2024-10-15 11:50:30 +01:00
NimishMishra
aec87a2143
[llvm][mlir][flang][OpenMP] Emit __atomic_load and __atomic_compare_exchange libcalls for complex types in atomic update (#92364)
This patch adds functionality to emit relevant libcalls in case
atomicrmw instruction can not be emitted (for instance, in case of
complex types). The IRBuilder is modified to directly emit __atomic_load
and __atomic_compare_exchange libcalls. The added functions follow a
similar codegen path as Clang, so that LLVM Flang generates almost
similar IR as Clang.

Fixes https://github.com/llvm/llvm-project/issues/83760 and
https://github.com/llvm/llvm-project/issues/75138

Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>
2024-10-02 23:32:36 -07:00
Sergio Afonso
5894d4e8e4
[MLIR][OpenMP] Use map format to represent use_device_{addr,ptr} (#109810)
This patch updates the `omp.target_data` operation to use the same
formatting as `map` clauses on `omp.target` for `use_device_addr` and
`use_device_ptr`. This is done so the mapping that is being enforced
between op arguments and associated entry block arguments is explicit.

The way it is achieved is by marking these clauses as entry block
argument-defining and adjusting printer/parsers accordingly.

As a result of this change, block arguments for `use_device_addr` come
before those for `use_device_ptr`, which is the opposite of the previous
undocumented situation. Some unit tests are updated based on this
change, in addition to those updated because of the format change.
2024-10-01 16:45:59 +01:00
Sergio Afonso
d0f67773b2
[MLIR][OpenMP] Normalize handling of entry block arguments (#109808)
This patch introduces a new MLIR interface for the OpenMP dialect aimed
at providing a uniform way of verifying and handling entry block
arguments defined by OpenMP clauses.

The approach consists in defining a set of overrideable methods that
return the number of block arguments the operation holds regarding each
of the clauses that may define them. These by default return 0, but they
are overriden by the corresponding clause through the
`extraClassDeclaration` mechanism.

Another set of interface methods to get the actual lists of block
arguments is defined, which is implemented based on the previously
described methods. These implicitly define a standardized ordering
between the list of block arguments associated to each clause, based on
the alphabetical ordering of their names. They should be the preferred
way of matching operation arguments and entry block arguments to that
operation's first region.

Some updates are made to the printing/parsing of `omp.parallel` to
follow the expected order between `private` and `reduction` clauses, as
well as the MLIR to LLVM IR translation pass to access block arguments
using the new interface. Unit tests of operations impacted by additional
verification checks and sorting of entry block arguments.
2024-10-01 15:04:27 +01:00
Pranav Bhandarkar
47d42cfa59
[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization in omp.target ops. (#109668)
This patch adds support to translate the `private` clause on
`omp.target` ops from MLIR to LLVMIR. This first cut only handles
non-allocatables. Also, this is for delayed privatization.
2024-09-30 21:58:44 -05:00