303 Commits

Author SHA1 Message Date
Razvan Lupusoru
4128cf3b26
[flang][acc] Lower do and do concurrent loops specially in acc regions (#149614)
When OpenACC is enabled and Fortran loops are annotated with `acc loop`,
they are lowered to `acc.loop` operation. And rest of the contained
loops use the normal FIR lowering path.

Hovever, the OpenACC specification has special provisions related to
contained loops and their induction variable. In order to adhere to
this, we convert all valid contained loops to `acc.loop` in order to
store this information appropriately.

The provisions in the spec that motivated this change (line numbers are
from OpenACC 3.4):
- 1353 Loop variables in Fortran do statements within a compute
construct are predetermined to be private to the thread that executes
the loop.
- 3783 When do concurrent appears without a loop construct in a kernels
construct it is treated as if it is annotated with loop auto. If it
appears in a parallel construct or an accelerator routine then it is
treated as if it is annotated with loop independent.

By valid loops - we convert do loops and do concurrent loops which have
induction variable. Loops which are unstructured are not handled.
2025-07-29 10:03:22 -07:00
Maksim Levental
a3a007ad5f
[mlir][NFC] update flang/Lower create APIs (8/n) (#149912)
See https://github.com/llvm/llvm-project/pull/147168 for more info.
2025-07-21 19:54:29 -04:00
delaram-talaashrafi
0dae924c1f
[openacc][flang] Support two type bindName representation in acc routine (#149147)
Based on the OpenACC specification — which states that if the bind name
is given as an identifier it should be resolved according to the
compiled language, and if given as a string it should be used unmodified
— we introduce two distinct `bindName` representations for `acc routine`
to handle each case appropriately: one as an array of `SymbolRefAttr`
for identifiers and another as an array of `StringAttr` for strings.

To ensure correct correspondence between bind names and devices, this
patch also introduces two separate sets of device attributes. The
routine operation is extended accordingly, along with the necessary
updates to the OpenACC dialect and its lowering.
2025-07-17 09:38:02 -07:00
nvptm
a7f595efd8
[flang][acc] Create UseDeviceOp for both results of hlfir.declare (#148017)
A sample such as 
```
program test
  integer :: N = 100
  real*8 :: b(-1:N)
  !$acc data copy(b)
  !$acc host_data use_device(b)
  call vadd(b)
  !$acc end host_data
  !$acc end data
end

```
is lowered to
```
    %13:2 = hlfir.declare %11(%12) {uniq_name = "_QFEb"} : (!fir.ref<!fir.array<?xf64>>, !fir.shapeshift<1>) -> (!fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>)
    %14 = acc.copyin var(%13#0 : !fir.box<!fir.array<?xf64>>) -> !fir.box<!fir.array<?xf64>> {dataClause = #acc<data_clause acc_copy>, name = "b"}
    acc.data dataOperands(%14 : !fir.box<!fir.array<?xf64>>) {
      %15 = acc.use_device var(%13#0 : !fir.box<!fir.array<?xf64>>) -> !fir.box<!fir.array<?xf64>> {name = "b"}
      acc.host_data dataOperands(%15 : !fir.box<!fir.array<?xf64>>) {
        fir.call @_QPvadd(%13#1) fastmath<contract> : (!fir.ref<!fir.array<?xf64>>) -> ()
        acc.terminator
      }
      acc.terminator
    }
    acc.copyout accVar(%14 : !fir.box<!fir.array<?xf64>>) to var(%13#0 : !fir.box<!fir.array<?xf64>>) {dataClause = #acc<data_clause acc_copy>, name = "b"}
```
Note that while the use_device clause is applied to %13#0, the argument
passed to vadd is %13#1. To avoid problems later in lowering, this
change additionally applies the use_device clause to %13#1, so that the
resulting MLIR is
```
   %13:2 = hlfir.declare %11(%12) {uniq_name = "_QFEb"} : (!fir.ref<!fir.array<?xf64>>, !fir.shapeshift<1>) -> (!fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>)
    %14 = acc.copyin var(%13#0 : !fir.box<!fir.array<?xf64>>) -> !fir.box<!fir.array<?xf64>> {dataClause = #acc<data_clause acc_copy>, name = "b"}
    acc.data dataOperands(%14 : !fir.box<!fir.array<?xf64>>) {
      %15 = acc.use_device var(%13#0 : !fir.box<!fir.array<?xf64>>) -> !fir.box<!fir.array<?xf64>> {name = "b"}
      %16 = acc.use_device varPtr(%13#1 : !fir.ref<!fir.array<?xf64>>) -> !fir.ref<!fir.array<?xf64>> {name = "b"}
      acc.host_data dataOperands(%15, %16 : !fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>) {
        fir.call @_QPvadd(%13#1) fastmath<contract> : (!fir.ref<!fir.array<?xf64>>) -> ()
        acc.terminator
      }
      acc.terminator
    }
    acc.copyout accVar(%14 : !fir.box<!fir.array<?xf64>>) to var(%13#0 : !fir.box<!fir.array<?xf64>>) {dataClause = #acc<data_clause acc_copy>, name = "b"}
  
```
2025-07-17 09:04:44 -07:00
Razvan Lupusoru
b54cfa46a7
[flang][acc] Implement MappableType's generatePrivateInit (#148302)
The recipe body generation was moved from lowering into FIR's
implementation of MappableType API. And now since all Fortran variable
types implement this type, lowering of OpenACC was updated to use this
API directly. No test changes were needed - all of the private,
firstprivate, and recipe tests get the same body as before.
2025-07-14 10:53:54 -07:00
Razvan Lupusoru
4859b92b7f
[flang][acc] Update FIR ref, heap, and pointer to be MappableType (#147834)
The MappableType OpenACC type interface is a richer interface that
allows OpenACC dialect to be capable to better interact with a source
dialect, FIR in this case. fir.box and fir.class types already
implemented this interface. Now the same is being done with the other
FIR types that represent variables.

One additional notable change is that fir.array no longer implements
this interface. This is because MappableType is primarily intended for
variables - and FIR variables of this type have storage associated and
thus there's a pointer-like type (fir.ref/heap/pointer) that holds the
array type.

The end goal of promoting these FIR types to MappableType is that we
will soon implement ability to generate recipes outside of the frontend
via this interface.
2025-07-10 15:23:57 -07:00
Vijay Kandiah
c4138a24dc
[mlir][acc][flang] Lower nested ACC loops with tile clause as collapsed loops (#147801)
In the case of nested loops, `acc.loop` is meant to subsume all of the
loops that it applies to (when explicitly described as doing so in the
OpenACC specification). So when there is a `acc loop tile(...)` present
on nested Fortran DO loops, `acc.loop` should apply to the `n` loops
that `tile` applies to. This change lowers such nested Fortran loops
with tile clause into a collapsed `acc.loop` with `n` IVs, loop bounds,
and step, in a similar fashion to the current lowering for acc loops
with `collapse` clause.
2025-07-09 15:47:11 -05:00
delaram-talaashrafi
1f7effc887
[mlir][acc][flang] Use SymbolRefAttr for func_name in ACC routine (#146951)
Changed the type of the `func_name` attribute from SymbolNameAttr to
SymbolRefAttr. SymbolNameAttr is typically used when defining a symbol
(e.g., `sym_name`), while SymbolRefAttr is appropriate for referencing
existing operations. This change ensures that MLIR can correctly track
the link to the referenced `func.func` operation.
2025-07-03 14:55:49 -07:00
Peter Klausler
9fd22cb56d
[flang][NFC] Move new code to right place (#144551)
Some new code was added to flang/Semantics that only depends on
facilities in flang/Evaluate. Move it into Evaluate and clean up some
minor stylistic problems.
2025-06-19 13:42:46 -07:00
Razvan Lupusoru
775ad3e49c
[flang][acc] Ensure all acc.loop get a default parallelism determination mode (#143623)
This PR updates the flang lowering to explicitly implement the OpenACC
rules:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
- Loops in serial regions are `seq` if they have no other parallelism
marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce
this.
2025-06-11 07:16:58 -07:00
Peter Klausler
b994a4c04f
[flang][NFC] Clean up code in two new functions (#142037)
Two recently-added functions in Semantics/tools.h need some cleaning up
to conform to the coding style of the project. One of them should
actually be in Parser/tools.{h,cpp}, the other doesn't need to be
defined in the header.
2025-06-10 14:44:41 -07:00
Andre Kuhlenschmidt
df03f7ed4c
[flang][openacc] use location of end directive for exit operations (#140763)
Make sure to preserve the location of the end statement on data
declarations for use in debugging OpenACC runtime.
2025-05-23 13:39:11 -07:00
Yang Zaizhou
5530474e3e
[Flang][OpenMP] fix crash on sematic error in atomic capture clause (#140710)
Fix a crash caused by an invalid expression in the atomic capture
clause, due to the `checkForSymbolMatch` function not accounting for
`GetExpr` potentially returning null.

Fix https://github.com/llvm/llvm-project/issues/139884
2025-05-23 07:15:10 -05:00
Scott Manley
651db24a9c
[OpenACC] rename private/firstprivate recipe attributes (#140719)
Make private and firstprivate recipe attribute names consistent with
reductionRecipes attribute
2025-05-21 07:38:08 -05:00
Scott Manley
a6303099fd
[OpenACC] unify reduction and private-like init region recipes (#140652)
Between firstprivate, private and reduction init regions, the difference
is largely whether or not the temp that is created is initialized or
not. Some recent fixes were made to privatization (#135698, #137869) but
did not get propagated to reductions, even though they need to return
the yield the same things from their init regions.

To mitigate this discrepancy in the future, refactor the init region
recipes so they can be shared between the three recipe ops.

Also add "none" to the OpenACC_ReductionOperator enum for better error
checking.
2025-05-20 05:27:09 -05:00
Slava Zakharin
09b772e2ef
[flang] Postpone hlfir.end_associate generation for calls. (#138786)
If we generate hlfir.end_associate at the end of the statement,
we get easier optimizable HLFIR, because there are no compiler
generated operations with side-effects in between the call
and the consumers. This allows more hlfir.eval_in_mem to reuse
the LHS instead of allocating temporary buffer.

I do not think the same can be done for hlfir.copy_out always, e.g.:
```
subroutine test2(x)
  interface
     function array_func2(x,y)
       real:: x(*), array_func2(10), y
     end function array_func2
  end interface
  real :: x(:)
  x = array_func2(x, 1.0)
end subroutine test2
```

If we postpone the copy-out until after the assignment, then
the result may be wrong.
2025-05-12 14:03:15 -07:00
Andre Kuhlenschmidt
4d9479fa8f
[flang][openacc] Allow open acc routines from other modules. (#136012)
OpenACC routines annotations in separate compilation units currently get
ignored, which leads to errors in compilation. There are two reason for
currently ignoring open acc routine information and this PR is
addressing both.
- The module file reader doesn't read back in openacc directives from
module files.
  - Simple fix in `flang/lib/Semantics/mod-file.cpp`
- The lowering to HLFIR doesn't generate routine directives for symbols
imported from other modules that are openacc routines.
- This is the majority of this diff, and is address by the changes that
start in `flang/lib/Lower/CallInterface.cpp`.
2025-05-09 11:12:24 -07:00
Razvan Lupusoru
1a6b0413e0
[flang][acc] Fix issue with privatization recipe for box ref (#137869)
When privatizing allocatable/pointer arrays, the code was creating a
temporary but this was a box type. This led to inconsistency between the
input and output of recipe.

The updated logic now creates storage when a box ref is requested.
2025-04-29 13:13:29 -07:00
khaki3
b95cc91aaf
[flang][acc] Remove an unused variable (#137731)
Fixes what https://github.com/llvm/llvm-project/pull/137691 introduced.
2025-04-28 16:30:47 -07:00
khaki3
dd2a1590c3
[flang][acc] Generate constructors and destructors for common blocks (#137691) 2025-04-28 16:11:29 -07:00
Krzysztof Parzyszek
9ea5254f77
[flang][OpenACC][OpenMP] Separate implementations of ATOMIC constructs (#137517)
The OpenMP implementation of the ATOMIC construct will change in the
near future to accommodate atomic conditional-update and conditional-
update-capture operations. This patch separates the shared implemen-
tations to avoid interfering with OpenACC.
2025-04-28 15:43:39 -05:00
Razvan Lupusoru
e79d8f6892
[flang][acc] Update stride calculation to include inner-dimensions (#136613)
The acc.bounds operation allows specifying stride - but it did not
clarify what it meant. The dialect was updated to specifically note that
stride must capture inner dimension sizes when specified for outer
dimensions.

Flang lowering was also updated for OpenACC to adhere to this. This was
already the case for descriptor-based arrays - but now this is also
being done for all arrays.
2025-04-21 16:03:12 -07:00
Razvan Lupusoru
91c2607aac
[flang][acc] Avoid implicitly privatizing IVs already privatized (#136181)
When generating `acc.loop`, the IV was always implicitly privatized.
However, if the user explicitly privatized it, the IR generated wasn't
quite right.

For example:
```
  !$acc loop private(i)
  do i = 1, n
    a(i) = b(i)
  end do
```

The IR generated looked like:
```
    %65 = acc.private varPtr(%19#0 : !fir.ref<i32>) -> !fir.ref<i32>
{implicit = true, name = "i"}
    %66:2 = hlfir.declare %65 {uniq_name = "_QFEi"} : (!fir.ref<i32>) ->
(!fir.ref<i32>, !fir.ref<i32>)
    %67 = acc.private varPtr(%66#0 : !fir.ref<i32>) -> !fir.ref<i32>
{name = "i"}
    acc.loop  private(@privatization_ref_i32 -> %65 : !fir.ref<i32>,
@privatization_ref_i32 -> %67 : !fir.ref<i32>) control(%arg0 : i32) =
(%c1_i32_46 : i32) to (%c10_i32_47 : i32)  step (%c1_i32_48 : i32) {
      fir.store %arg0 to %66#0 : !fir.ref<i32>
```

In order to fix this, we first process all of the clauses. Then when
attempting to generate implicit private IV, we look for an already
existing data clause operation.

The result is the following IR:
```
    %65 = acc.private varPtr(%19#0 : !fir.ref<i32>) -> !fir.ref<i32>
{name = "i"}
    %66:2 = hlfir.declare %65 {uniq_name = "_QFEi"} : (!fir.ref<i32>) ->
(!fir.ref<i32>, !fir.ref<i32>)
    acc.loop  private(@privatization_ref_i32 -> %65 : !fir.ref<i32>)
control(%arg0 : i32) = (%c1_i32_46 : i32) to (%c10_i32_47 : i32)  step
(%c1_i32_48 : i32) {
      fir.store %arg0 to %66#0 : !fir.ref<i32>
```
2025-04-17 13:11:52 -07:00
Scott Manley
b581bd3429
[flang][OpenACC] use correct type when create private box init recipe (#135698)
The recipe for initializing private box types was incorrect because
hlfir::createTempFromMold() is not a suitable utility function when the
box element type is a trivial type.
2025-04-15 14:21:19 -05:00
Razvan Lupusoru
c3d8c625af
[flang][acc] Fill-in name for privatized loop iv (#126601)
When the loop induction variable implicit private clause was being
generated, the name was left empty. The intent is that the data clause
operation holds the source language variable name. Thus, add the missing
name now.
2025-02-11 09:16:59 -08:00
Razvan Lupusoru
83edbd4958
[flang][acc] Ensure data exit action is generated for present & nocreate (#126560)
The acc.delete operation has semantics of decrementing present counter
and deleting the data when the counter reaches zero. Since both
acc.present and acc.nocreate are both intended to increment present
counter, this matching exit action must be inserted.

This is also what was specified in OpenACC dialect documentation:
https://mlir.llvm.org/docs/Dialects/OpenACCDialect/#operation-categories
2025-02-10 13:04:10 -08:00
Razvan Lupusoru
bd30838422
[flang][acc] Improve acc lowering around fir.box and arrays (#125600)
The current implementation of OpenACC lowering includes explicit
expansion of following cases:
- Creation of `acc.bounds` operations for all arrays, including those
whose dimensions are captured in the type (eg `!fir.array<100xf32>`)
- Expansion of box types by only putting the box's address in the data
clause. The address was extracted with a `fir.box_addr` operation and
the bounds were filled with `fir.box_dims` operation.

However, with the creation of the new type interface `MappableType`, the
idea is that specific type-based semantics can now be used. This also
really simplifies representation in the IR. Consider the following
example:
```
subroutine sub(arr)
  real :: arr(:)
  !$acc enter data copyin(arr)
end subroutine
```

Before the current PR, the relevant acc dialect IR looked like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
  ...
  %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2:3 = fir.box_dims %1#0, %c0 : (!fir.box<!fir.array<?xf32>>, index)
-> (index, index, index)
  %c0_0 = arith.constant 0 : index
  %3 = arith.subi %2#1, %c1 : index
  %4 = acc.bounds lowerbound(%c0_0 : index) upperbound(%3 : index)
extent(%2#1 : index) stride(%2#2 : index) startIdx(%c1 : index)
{strideInBytes = true}
  %5 = fir.box_addr %1#0 : (!fir.box<!fir.array<?xf32>>) ->
!fir.ref<!fir.array<?xf32>>
  %6 = acc.copyin varPtr(%5 : !fir.ref<!fir.array<?xf32>>) bounds(%4) ->
!fir.ref<!fir.array<?xf32>> {name = "arr", structured = false}
  acc.enter_data dataOperands(%6 : !fir.ref<!fir.array<?xf32>>)
```

After the current change, it looks like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
  ...
  %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
  %2 = acc.copyin var(%1#0 : !fir.box<!fir.array<?xf32>>) ->
!fir.box<!fir.array<?xf32>> {name = "arr", structured = false}
  acc.enter_data dataOperands(%2 : !fir.box<!fir.array<?xf32>>)
```

Restoring the old behavior can be done with following command line
options:
`--openacc-unwrap-fir-box=true --openacc-generate-default-bounds=true`
2025-02-04 08:08:16 -08:00
jeanPerier
662133a278
[flang][OpenMP][OpenACC] remove libEvaluate dependency in passes (#123784)
Move OpenACC/OpenMP helpers from Lower/DirectivesCommon.h that are also
used in OpenACC and OpenMP mlir passes into a new
Optimizer/Builder/DirectivesCommon.h so that parser and evaluate headers
are not included in Optimizer libraries (this both introduce
compile-time and link-time pointless overheads).

This should fix https://github.com/llvm/llvm-project/issues/123377
2025-01-21 20:32:42 +01:00
khaki3
f3d3ec86d1
[flang][acc] Add a missing acc.delete generation for the copyin clause (#122539)
We are missing the deletion part of the copyin clause after a region or
in a destructor. This PR completes its implementation for data regions,
compute regions, and global declarations.

Example:
```f90
subroutine sub()
  real :: x(1:10)
  !$acc data copyin(x)
  !$acc end data
end subroutine sub
```
We are getting the following:
```mlir
    %5 = acc.copyin varPtr(%2#0 : !fir.ref<!fir.array<10xf32>>) bounds(%4) -> !fir.ref<!fir.array<10xf32>> {name = "x"}
    acc.data dataOperands(%5 : !fir.ref<!fir.array<10xf32>>) {
      acc.terminator
    }
    return
```
With this PR, we'll get:
```mlir
    %5 = acc.copyin varPtr(%2#0 : !fir.ref<!fir.array<10xf32>>) bounds(%4) -> !fir.ref<!fir.array<10xf32>> {name = "x"}
    acc.data dataOperands(%5 : !fir.ref<!fir.array<10xf32>>) {
      acc.terminator
    }
    acc.delete accPtr(%5 : !fir.ref<!fir.array<10xf32>>) bounds(%4) {dataClause = #acc<data_clause acc_copyin>, name = "x"}
    return
```
2025-01-10 23:39:16 -08:00
Matthias Springer
c870632ef6
[flang] Fix some memory leaks (#121050)
This commit fixes some but not all memory leaks in Flang. There are
still 91 tests that fail with ASAN.

- Use `mlir::OwningOpRef` instead of `std::unique_ptr`. The latter does
not free allocations of nested blocks.
- Pass `ModuleOp` as value instead of reference.
- Add few missing deallocations in test cases and other places.
2024-12-25 09:42:03 +01:00
Kareem Ergawy
e532241b02
Re-apply (#117867): [flang][OpenMP] Implicitly map allocatable record fields (#120374)
This re-applies #117867 with a small fix that hopefully prevents build
bot failures. The fix is avoiding `dyn_cast` for the result of
`getOperation()`. Instead we can assign the result to `mlir::ModuleOp`
directly since the type of the operation is known statically (`OpT` in
`OperationPass`).
2024-12-18 09:19:45 +01:00
Kareem Ergawy
dc936f3c19
Revert "[flang][OpenMP] Implicitly map allocatable record fields (#117867)" (#120360) 2024-12-18 06:52:24 +01:00
Kareem Ergawy
db09014a07
[flang][OpenMP] Implicitly map allocatable record fields (#117867)
This is a starting PR to implicitly map allocatable record fields.

This PR contains the following changes:
1. Re-purposes some of the utils used in `Lower/OpenMP.cpp` so that
   these utils work on the `mlir::Value` level rather than the
   `semantics::Symbol` level. This takes one step towards to enabling
   MLIR passes to more easily do some lowering themselves (e.g. creating
   `omp.map.bounds` ops for implicitely caputured data like this PR
   does).
2. Adds support for implicitely capturing and mapping allocatable fields
   in record types.

There is quite some distant to still cover to have full support for
this. I added a number of todos to guide further development.

Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>

Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>
2024-12-18 05:37:58 +01:00
Razvan Lupusoru
a0eb794da8
[MLIR][acc] Introduce varType to acc data clause operations (#119007)
The acc data clause operations hold an operand named `varPtr`. This was
intended to hold a pointer to a variable - where the element type of
that pointer specifies the type of the variable. However, for both
memref and llvm dialects, this assumption is not true. This is because
memref element type for cases like memref<10xf32> is simply f32 and for
LLVM, after opaque pointers, the variable type is no longer recoverable.

Thus, introduce varType to ensure that appropriate semantics are kept.

Both the parser and printer for this new type attribute allow it to not
be specified in cases where a dialect's getElementType() applied to
`varPtr`'s type has a recoverable type. And more specifically, for FIR,
no changes are needed in the MLIR unit tests.
2024-12-09 15:14:48 -08:00
jeanPerier
c4204c0b29
[flang] replace fir.complex usages with mlir complex (#110850)
Core patch of
https://discourse.llvm.org/t/rfc-flang-replace-usages-of-fir-complex-by-mlir-complex-type/82292.
After that, the last step is to remove fir.complex from FIR types.
2024-10-03 17:10:57 +02:00
Valentin Clement (バレンタイン クレメン)
5257fa19c9
[flang][openacc] Attach post allocate action on the correct operation (#106805)
In some cases (when using stat), the action was attached to the
invisible fir.result op. Apply same fix as in #89662.
2024-08-30 22:45:56 -07:00
jeanPerier
a527248a3c
[flang][acc] allow and ignore DIR between ACC and loops (#106522)
The current pattern was failing OpenACC semantics in acc parse tree
canonicalization:

```
!acc loop
!dir vector aligned
do i=1,n
...
```

Fix it by moving the directive before the OpenACC construct node.

Note that I think it could make sense to propagate the $dir info to the
acc.loop, at least with classic flang, the $dir seems to make a
difference. This is not done here since few directives are supported
anyway.
2024-08-30 08:27:38 +02:00
Razvan Lupusoru
7634a96589
[flang][acc] Improve lowering of Fortran optional in data clause (#102224)
Fortran optional arguments are effectively null references. To deal with
this possibility, flang lowering of OpenACC data clauses creates three
if-else regions when preparing the data pointer for the data clause:
1) Load box value from box reference
2) Load box addr from box value
3) Load box dims from box value

However, this pattern makes it more complicated to find the original box
reference. Effectively, the first if-else region to get the box value is
not needed - since the value can be loaded before the corresponding
`fir.box_addr` and `fir.box_dims` operations. Thus, reduce the number of
if-else regions by deferring the box load to the use sites.

For non-optional cases, the old functionality is left alone - which
preloads the box value.
2024-08-07 08:04:06 -07:00
Slava Zakharin
40278bb119
[mlir][acc] Added async to data clause operations. (#97307)
As long as the data clause operations are not tightly
"associated" with the compute/data operations (e.g.
they can be optimized as SSA producers and made block
arguments), the information about the original async()
clause should be attached to the data clause operations
to make it easier to generate proper runtime actions
for them. This change propagates the async() information
from the OpenACC data/compute constructs to the data clause
operations. This change also adds the CurrentDeviceIdResource
to guarantee proper ordering of the operations that read
and write the current device identifier.
2024-07-03 02:03:46 -07:00
Alexander Shaposhnikov
77d8cfb3c5
[Flang] Switch to common::visit more call sites (#90018)
Switch to common::visit more call sites.

Test plan: ninja check-all
2024-06-17 12:59:04 -07:00
khaki3
3af717d661
[flang] Add parsing of DO CONCURRENT REDUCE clause (#92518)
Derived from #92480. This PR supports parsing of the DO CONCURRENT
REDUCE clause in Fortran 2023. Following the style of the OpenMP parser
in MLIR, the front end accepts both arbitrary operations and procedures
for the REDUCE clause. But later Semantics can notify type errors and
resolve procedure names.
2024-05-30 11:34:19 -07:00
Slava Zakharin
1710c8cf0f
[flang] Lowering changes for assigning dummy_scope to hlfir.declare. (#90989)
The lowering produces fir.dummy_scope operation if the current
function has dummy arguments. Each hlfir.declare generated
for a dummy argument is then using the result of fir.dummy_scope
as its dummy_scope operand. This is only done for HLFIR.

I was not able to find a reliable way to identify dummy symbols
in `genDeclareSymbol`, so I added a set of registered dummy symbols
that is alive during the variables instantiation for the current
function. The set is initialized during the mapping of the dummy
argument symbols to their MLIR values. It is reset right after
all variables are instantiated - this is done to avoid generating
hlfir.declare operations with dummy_scope for the clones of
the dummy symbols (e.g. this happens with OpenMP privatization).

If this can be done in a cleaner way, please advise.
2024-05-08 16:48:14 -07:00
Christian Sigg
fac349a169
Reapply "[mlir] Mark isa/dyn_cast/cast/... member functions depreca… (#90406)
…ted. (#89998)" (#90250)

This partially reverts commit 7aedd7dc754c74a49fe84ed2640e269c25414087.

This change removes calls to the deprecated member functions. It does
not mark the functions deprecated yet and does not disable the
deprecation warning in TypeSwitch. This seems to cause problems with
MSVC.
2024-04-28 22:01:42 +02:00
dyung
7aedd7dc75
Revert "[mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (#89998)" (#90250)
This reverts commit 950b7ce0b88318f9099e9a7c9817d224ebdc6337.

This change is causing build failures on a bot
https://lab.llvm.org/buildbot/#/builders/216/builds/38157
2024-04-26 12:09:13 -07:00
Christian Sigg
950b7ce0b8
[mlir] Mark isa/dyn_cast/cast/... member functions deprecated. (#89998)
See https://mlir.llvm.org/deprecation and
https://discourse.llvm.org/t/preferred-casting-style-going-forward.
2024-04-26 16:28:30 +02:00
Valentin Clement (バレンタイン クレメン)
7c0da7993e
[flang][cuda] Use fir.cuda_deallocate for automatic deallocation (#89662)
Automatic deallocation of allocatable that are cuda device variable must
use the fir.cuda_deallocate operation. This patch update the automatic
deallocation code generation to use this operation when the variable is
a cuda variable.

This patch has also the side effect to correctly call
`attachDeclarePostDeallocAction` for OpenACC declare variable on
automatic deallocation as well. Update the code in
`attachDeclarePostDeallocAction` so we do not attach on fir.result but
on the correct last op.
2024-04-24 08:43:54 -07:00
jeanPerier
a4798bb0b6
[flang][NFC] use mlir::SymbolTable in lowering (#86673)
Whenever lowering is checking if a function or global already exists in
the mlir::Module, it was doing module->lookup.

On big programs (~5000 globals and functions), this causes important
slowdowns because these lookups are linear. Use mlir::SymbolTable to
speed-up these lookups. The SymbolTable has to be created from the
ModuleOp and maintained in sync. It is therefore placed in the
converter, and FirOPBuilders can take a pointer to it to speed-up the
lookups.

This patch does not bring mlir::SymbolTable to FIR/HLFIR passes, but
some passes creating a lot of runtime calls could benefit from it too.
More analysis will be needed.

As an example of the speed-ups, this patch speeds-up compilation of
Whizard compare_amplitude_UFO.F90 from 5 mins to 2 mins on my machine
(there is still room for speed-ups).
2024-04-02 14:29:29 +02:00
Razvan Lupusoru
14e17ea1f6
[flang][acc] Add support for lowering combined constructs (#86696)
PR#80319 added support to record combined construct semantics via an
attribute. Add lowering support for this.
2024-03-26 12:52:13 -07:00
Krzysztof Parzyszek
84115494d6
[flang][Lower] Convert OMP Map and related functions to evaluate::Expr (#81626)
The related functions are `gatherDataOperandAddrAndBounds` and
`genBoundsOps`. The former is used in OpenACC as well, and it was
updated to pass evaluate::Expr instead of parser objects.

The difference in the test case comes from unfolded conversions of index
expressions, which are explicitly of type integer(kind=8).

Delete now unused `findRepeatableClause2` and `findClause2`.

Add `AsGenericExpr` that takes std::optional. It already returns
optional Expr. Making it accept an optional Expr as input would reduce
the number of necessary checks when handling frequent optional values in
evaluator.

[Clause representation 4/6]
2024-03-20 15:00:29 -05:00
Tom Eccles
3b0a426b3f
[flang][NFC] move extractSequenceType helper out of OpenACC to share code (#84957)
Moving extractSequenceType to FIRType.h so that this can also be used
from OpenMP.

OpenMP array reductions 5/6
Previous PR: https://github.com/llvm/llvm-project/pull/84955
Next PR: https://github.com/llvm/llvm-project/pull/84958
2024-03-20 10:09:50 +00:00