8 Commits

Author SHA1 Message Date
Kareem Ergawy
dcb124e820
[flang][OpenMP] Enable delayed privatization by default omp.wsloop (#125732)
Reapplies #122471

This is based on https://github.com/llvm/llvm-project/pull/125699, only
the latest commit is relevant.

With changes in this PR and the parent one, the previously reported
failures in the Fujitsu(*) test suite should hopefully be resolved (I
verified all the 14 reported failures and they pass now).

(*) https://linaro.atlassian.net/browse/LLVM-1521
2025-02-06 19:11:04 +01:00
Tom Eccles
aeaafce464
[mlir][OpenMP][flang] make private variable allocation implicit in omp.private (#124019)
The intention of this work is to give MLIR->LLVMIR conversion freedom to
control how the private variable is allocated so that it can be
allocated on the stack in ordinary cases or as part of a structure used
to give closure context for tasks which might outlive the current stack
frame. See RFC:

https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084

For example, a privatizer for an integer used to look like
```mlir
  omp.private {type = private} @x.privatizer : !fir.ref<i32> alloc {
  ^bb0(%arg0: !fir.ref<i32>):
    %0 = ... allocate proper memory for the private clone ...
    omp.yield(%0 : !fir.ref<i32>)
  }
```

After this change, allocation become implicit in the operation:
```mlir
  omp.private {type = private} @x.privatizer : i32
```

For more complex types that require initialization after allocation, an
init region can be used:
``` mlir
  omp.private {type = private} @x.privatizer : !some.type init {
  ^bb0(%arg0: !some.pointer<!some.type>, %arg1: !some.pointer<!some.type>):
    // initialize %arg1, using %arg0 as a mold for allocations
    omp.yield(%arg1 : !some.pointer<!some.type>)
  } dealloc {
    ^bb0(%arg0: !some.pointer<!some.type>):
    ... deallocate memory allocated by the init region ...
    omp.yield
  }
```

This patch lays the groundwork for delayed task execution but is not
enough on its own.

After this patch all gfortran tests which previously passed still pass.
There
are the following changes to the Fujitsu test suite:
- 0380_0009 and 0435_0009 are fixed
- 0688_0041 now fails at runtime. This patch is testing firstprivate
variables with tasks. Previously we got lucky with the undefined
behavior and won the race. After these changes we no longer get lucky.
This patch lays the groundwork for a proper fix for this issue.

In flang the lowering re-uses the existing lowering used for reduction
init and dealloc regions.

In flang, before this patch we hit a TODO with the same wording when
generating the copy region for firstprivate polymorphic variables. After
this patch the box-like fir.class is passed by reference into the copy
region, leading to a different path that didn't hit that old TODO but
the generated code still didn't work so I added a new TODO in
DataSharingProcessor.
2025-01-31 09:35:26 +00:00
Kareem Ergawy
937cbce14c
Revert "[flang][OpenMP] Enable delayed privatization by default omp.wsloop (#122471)" (#123324)
This seems to have caused some regressions in Fujitsu's test-suite:
https://linaro.atlassian.net/browse/LLVM-1521

This reverts commit 6f82408bb53f57a859953d8f1114f1634a5d3ee9.
2025-01-22 10:16:40 +01:00
Valentin Clement (バレンタイン クレメン)
12ba74e181
[flang] Do not produce result for void runtime call (#123155)
Runtime function call to a void function are producing a ssa value
because the FunctionType result is set to NoneType with is later
translated to a empty struct. This is not an issue when going to LLVM IR
but it breaks when lowering a gpu module to PTX. This patch update the
RTModel to correctly set the FunctionType result type to nothing.

This is one runtime call before this patch at the LLVM IR dialect step.
```
%45 = llvm.call @_FortranAAssign(%arg0, %1, %44, %4) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> !llvm.struct<()>
```

After the patch the call would be correctly formed
```
llvm.call @_FortranAAssign(%arg0, %1, %44, %4) : (!llvm.ptr, !llvm.ptr, !llvm.ptr, i32) -> ()
```

Without the patch it would lead to error like:
```
ptxas /tmp/mlir-cuda_device_mod-nvptx64-nvidia-cuda-sm_60-e804b6.ptx, line 10; error   : Output parameter cannot be an incomplete array.
ptxas /tmp/mlir-cuda_device_mod-nvptx64-nvidia-cuda-sm_60-e804b6.ptx, line 125; error   : Call has wrong number of parameters
```

The change is pretty much mechanical.
2025-01-16 12:34:38 -08:00
Kareem Ergawy
6f82408bb5
[flang][OpenMP] Enable delayed privatization by default omp.wsloop (#122471)
This enable delayed privatization by default for `omp.wsloop` ops, with
one caveat! I had to workaround the "impure" alloc region issue that
being resolved at the moment. The workaround detects whether the alloc
region's argument is used in the region and at the same time defined in
block that does not dominate the chosen alloca insertion point. If so,
we move the alloca insertion point below the defining instruction of the
alloc region argument. This basically reverts to the
non-delayed-privatizaiton behavior.
2025-01-16 15:44:59 +01:00
Sergio Afonso
0a17bdfc36
[MLIR][OpenMP] Remove terminators from loop wrappers (#112229)
This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating
accordingly the verifier for the `LoopWrapperInterface`.

Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.

There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s adding some noise to this patch, but actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
2024-10-15 11:28:39 +01:00
Tom Eccles
f2027a9388
[flang][OpenMP] use reduction alloc region (#102525)
I removed the `*-hlfir*` tests because they are duplicate now that the
other tests have been updated to use the HLFIR lowering.

3/3
Part 1: https://github.com/llvm/llvm-project/pull/102522
Part 2: https://github.com/llvm/llvm-project/pull/102524
2024-08-22 14:12:07 +01:00
Tom Eccles
b6b0f975a6
[flang][OpenMP] Support reduction of POINTER variables (#95148)
Just treat them the same as ALLOCATABLE. gfortran doesn't allow POINTER
objects in a REDUCTION clause, but so far as I can tell the standard
explicitly allows it (openmp5.2 section 5.5.5).
2024-06-14 10:11:12 +01:00