199 Commits

Author SHA1 Message Date
Erich Keane
e5e3e4bdb5
[OpenACC] Add firstprivate recipe helper methods to ACC dialect (#153604)
Like we did for the 'private' clause, this adds an easier to use helper
function to add the 'firstprivate' clause + recipe to the Parallel and
Serial ops.
2025-08-14 13:07:59 -07:00
Erich Keane
25c07763f7
[OpenACC][CIR] Implement 'private' clause lowering. (#151360)
The private clause is the first with 'recipes', so a lot of
infrastructure is included here, including some MLIR dialect changes
that allow simple adding of a privatization. We'll likely get similar
for firstprivate and reduction.

Also, we have quite a bit of infrastructure in clause lowering to make
sure we have most cases we could think of covered.

At the moment, ONLY private is implemented, so all it requires is an
'init' segment (that doesn't call any copy operations), and potentially
a 'destroy' segment. However, actually calling 'init' functions on each
of the elements is not yet properly implemented, and will be in a
followup patch.

This patch implements all of that, and adds tests in a way that will be
useful for firstprivate as well.
2025-08-01 09:27:15 -07:00
Razvan Lupusoru
4128cf3b26
[flang][acc] Lower do and do concurrent loops specially in acc regions (#149614)
When OpenACC is enabled and Fortran loops are annotated with `acc loop`,
they are lowered to the `acc.loop` operation. The rest of the contained
loops use the normal FIR lowering path.

However, the OpenACC specification has special provisions related to
contained loops and their induction variable. In order to adhere to
this, we convert all valid contained loops to `acc.loop` in order to
store this information appropriately.

The provisions in the spec that motivated this change (line numbers are
from OpenACC 3.4):
- 1353 Loop variables in Fortran do statements within a compute
construct are predetermined to be private to the thread that executes
the loop.
- 3783 When do concurrent appears without a loop construct in a kernels
construct it is treated as if it is annotated with loop auto. If it
appears in a parallel construct or an accelerator routine then it is
treated as if it is annotated with loop independent.

By valid loops we mean do loops and do concurrent loops that have an
induction variable. Unstructured loops are not handled.
2025-07-29 10:03:22 -07:00
Longsheng Mou
3eb49c482c
[mlir][NFC] Use hasOneBlock instead of llvm::hasSingleElement(region) (#149809) 2025-07-24 10:11:21 +08:00
delaram-talaashrafi
0dae924c1f
[openacc][flang] Support two type bindName representation in acc routine (#149147)
Based on the OpenACC specification — which states that if the bind name
is given as an identifier it should be resolved according to the
compiled language, and if given as a string it should be used unmodified
— we introduce two distinct `bindName` representations for `acc routine`
to handle each case appropriately: one as an array of `SymbolRefAttr`
for identifiers and another as an array of `StringAttr` for strings.

To ensure correct correspondence between bind names and devices, this
patch also introduces two separate sets of device attributes. The
routine operation is extended accordingly, along with the necessary
updates to the OpenACC dialect and its lowering.
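
The identifier-versus-string distinction can be sketched in Python as
follows. This is only a toy model of the rule described above, not the
flang lowering; `resolve_bind_name` and `toy_mangle` are hypothetical
names, and the `_QP` prefix is used purely for illustration.

```python
def resolve_bind_name(name, is_identifier, mangle):
    """Return the symbol name that an `acc routine` bind clause refers to.

    Identifiers go through the language's name resolution (`mangle`);
    string literals are used exactly as written.
    """
    return mangle(name) if is_identifier else name


def toy_mangle(name):
    # Hypothetical mangler for illustration only; real Fortran name
    # mangling is more involved than prefixing a lowercase name.
    return "_QP" + name.lower()
```

With these helpers, an identifier `Foo` resolves to a mangled symbol,
while the string `"Foo"` is passed through unchanged.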
2025-07-17 09:38:02 -07:00
Razvan Lupusoru
4859b92b7f
[flang][acc] Update FIR ref, heap, and pointer to be MappableType (#147834)
The MappableType OpenACC type interface is a richer interface that
lets the OpenACC dialect interact better with a source
dialect, FIR in this case. fir.box and fir.class types already
implemented this interface. Now the same is being done with the other
FIR types that represent variables.

One additional notable change is that fir.array no longer implements
this interface. This is because MappableType is primarily intended for
variables - and FIR variables of this type have storage associated and
thus there's a pointer-like type (fir.ref/heap/pointer) that holds the
array type.

The end goal of promoting these FIR types to MappableType is that we
will soon implement ability to generate recipes outside of the frontend
via this interface.
2025-07-10 15:23:57 -07:00
Erich Keane
857815f3fa
[OpenACC][CIR] Implement 'rest' of update clause lowering (#146414)
This implements the async, wait, if, and if_present (as well as
device_type, but that is a detail of async/wait) lowering. All of
these are implemented the same way they are for the compute constructs,
so this is a pretty mild amount of changes.
2025-07-01 06:05:08 -07:00
Erich Keane
a99fee6989
[OpenACC][CIR] Implement 'exit data' construct + clauses (#146167)
Similar to 'enter data', except the data clauses have a 'getdeviceptr'
operation before, so that they can properly use the 'exit' operation
correctly. While this is a touch awkward, it fits perfectly into the
existing infrastructure.

Same as with 'enter data', we had to add some add-functions for async
and wait.
2025-06-30 06:19:43 -07:00
Erich Keane
33d20828d1
[OpenACC][CIR] Implement enter-data + clause lowering (#146146)
'enter data' is a new construct type that requires one of the data
clauses, so we had to wait for all clauses to be ready before we could
commit this. Most of the clauses are simple, but there is a little bit
of work to get 'async' and 'wait' to have similar interfaces in the ACC
dialect, where helpers were added.
2025-06-27 13:47:42 -07:00
Erich Keane
3463aba45f
[OpenACC][CIR] Implement copyin/copyout/create lowering for compute/c… (#145976)
…ombined

This patch does the lowering of copyin (represented as an
acc.copyin/acc.delete pair), copyout (acc.create/acc.copyout), and create
(acc.create/acc.delete).

Additionally, it found a few problems with #144806, so it fixes those as
well.
2025-06-27 07:25:58 -07:00
Razvan Lupusoru
4847bd5ae4
[mlir][acc] Add support for data clause modifiers (#144806)
The OpenACC data clause operations have been updated to support the
OpenACC 3.4 data clause modifiers. This includes ensuring verifiers
check that only supported ones are used on relevant operations.

In order to support a seamless update from encoding the modifiers in the
data clause to this attribute, the following considerations were made:
- Ensure that modifier builders which do not take modifier are still
available.
- All data clause enum values are left in place until a complete
transition is made to the new modifiers.
2025-06-24 07:48:06 -07:00
Razvan Lupusoru
34a1b8ce25
[acc] acc.loop verifier now requires parallelism determination flag (#143720)
The OpenACC specification for `acc loop` describes that a loop's
parallelism determination mode is either auto, independent, or seq. The
rules are as follows.
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
- Additionally, loops marked with gang, worker, or vector are not
guaranteed to be parallel. Specifically noted in 2.9.7 auto clause: If
not, or if it is unable to make a determination, it must treat the auto
clause as if it is a seq clause, and it must ignore any gang, worker, or
vector clauses on the loop construct.

The verifier for `acc.loop` was updated to enforce this marking because
the context in which a loop appears is not trivially determined once IR
transformations begin. For example, orphaned loops are implicitly
`independent`, but after inlining into an `acc.kernels` region they
would be implicitly considered `auto`. Thus now the verifier requires
that a frontend specifically generates acc dialect with this marking
since it knows the context.
2025-06-11 12:37:08 -07:00
Erich Keane
574f77a1ee
[OpenACC][CIR] Add parallelism determ. to all acc.loops (#143751)
PR #143720 adds a requirement to the ACC dialect that every acc.loop
must have a seq, independent, or auto attribute for the 'default'
device_type. The standard has rules for how this can be intuited:

orphan/parallel/parallel loop: independent
kernels/kernels loop: auto
serial/serial loop: seq, unless there is a gang/worker/vector, at which
point it should be 'auto'.
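
The table above can be sketched in Python as follows. This is only an
illustration of the rule as stated in this message, not the CIR lowering
code; the function name `default_loop_mode` and the string encodings of
the constructs are assumptions.

```python
def default_loop_mode(parent, has_gang_worker_vector=False):
    """Return the implicit mode for a loop with no seq/auto/independent clause.

    `parent` is the enclosing construct: None for an orphaned loop, or one of
    "parallel", "kernels", "serial" (combined constructs use the same name).
    """
    if parent is None or parent == "parallel":
        return "independent"
    if parent == "kernels":
        return "auto"
    if parent == "serial":
        # A serial loop is 'seq' unless gang/worker/vector is present.
        return "auto" if has_gang_worker_vector else "seq"
    raise ValueError(f"unknown parent construct: {parent!r}")
```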

This patch implements all of these rules as a 'cleanup' step on the IR
generation for combined/loop operations. Note that the test impact is
much less since I inadvertently have my 'operation' terminating curly
matching the end curly from 'attribute' instead of the front of the
line, so I've added sufficient tests to ensure I captured the above.
2025-06-11 12:04:26 -07:00
Razvan Lupusoru
775ad3e49c
[flang][acc] Ensure all acc.loop get a default parallelism determination mode (#143623)
This PR updates the flang lowering to explicitly implement the OpenACC
rules:
- As per OpenACC 3.3 standard section 2.9.6 independent clause: A loop
construct with no auto or seq clause is treated as if it has the
independent clause when it is an orphaned loop construct or its parent
compute construct is a parallel construct.
- As per OpenACC 3.3 standard section 2.9.7 auto clause: When the parent
compute construct is a kernels construct, a loop construct with no
independent or seq clause is treated as if it has the auto clause.
- Loops in serial regions are `seq` if they have no other parallelism
marking such as gang, worker, vector.

For now the `acc.loop` verifier has not yet been updated to enforce
this.
2025-06-11 07:16:58 -07:00
Scott Manley
651db24a9c
[OpenACC] rename private/firstprivate recipe attributes (#140719)
Make private and firstprivate recipe attribute names consistent with
reductionRecipes attribute
2025-05-21 07:38:08 -05:00
khaki3
f9dbfb1566
[flang][acc] Update assembly formats to include asyncOnly, async, and wait (#140122)
The async implementation is inconsistent in terms of the assembly
format. While renaming `UpdateOp`'s `async` to `asyncOnly`, this PR
handles `asyncOnly` along with async operands in every operation.

Regarding `EnterDataOp` and `ExitDataOp`, they do not accept device
types; thus, the async and the wait clauses without values lead to the
`async` and the `wait` attributes (not `asyncOnly` nor `waitOnly`). This
PR also processes them with async and wait operands all together.
2025-05-15 12:56:15 -07:00
Erich Keane
ac4bb42b97
[OpenACC][CIR] Implement 'gang' lowering for 'loop' (#138968)
This clause requires an entire additional collection to keep track of
the gang 'kind' or 'type'. That work is maintained in the OpenACC
dialect functions. Otherwise, this is effectively the same as the
worker/vectors.
2025-05-09 05:35:06 -07:00
Erich Keane
f4e7ba02cc
[OpenACC][CIR] Implement 'worker'/'vector' lowering (#138765)
This patch implements worker and vector lowering for the loop construct,
which are fairly simple clauses, except that they also have a 'no
argument' form which requires a touch more work. Else, these are just
like a handful of other clauses where we just keep the device_type array
and operands in sync.
2025-05-07 13:48:17 -07:00
Erich Keane
bb09f79f0f
[OpenACC] Implement tile/collapse lowering (#138576)
These two ended up being pretty similar in frontend implementation, and
fairly trivial when doing lowering. The collapse clause just results in
a normal device_type style attribute with some mild additional
complexity, and 'tile' just uses the current infrastructure for 'with
segments'.
2025-05-06 13:11:49 -07:00
Erich Keane
4efcc52ed8
[OpenACC][CIR] Implement Loop lowering of seq/auto/independent (#138164)
These just add a standard 'device_type' flag to the acc.loop, so
implement that lowering. This also modifies the dialect to add helpers
for these as well, to be consistent with the previous ones.
2025-05-01 14:30:11 -07:00
Susan Tan (ス-ザン タン)
a073bb5afd
[mlir][acc] Add LegalizeDataValues support for DeclareEnterOp (#138008)
The patch extends the existing LegalizeDataValues to support the
DeclareEnter and DeclareExit pair.
Unlike other ops, DeclareEnter and DeclareExit don't have a region
defined, so we use dominance/post-dominance information to ensure that
only the uses within the region dominated by DeclareEnter and
post-dominated by DeclareExit are updated with data on device.
2025-05-01 13:46:34 -07:00
Razvan Lupusoru
e8f590e0e3
[mlir][acc] Improve acc.loop support as a container (#137887)
Dialects which have their own loop representation that is not
representable with numeric bounds + steps cannot be represented cleanly
with acc.loop. In such a case, we permit the dialect's representation,
with acc.loop merely encompassing its loop representation. This
limitation became obvious when looking at range / random iterator C++
loops.

The API of acc.loop was updated to test for this differentiation.
Additionally, the verifier was updated to check for consistent bounds
and whether inner-loops are contained when it works as a container.
2025-04-30 13:48:51 -07:00
Razvan Lupusoru
740f674917
[mlir][acc] Fix extraneous space when printing acc.loop (#137839)
The acc.loop printer inserted two spaces after the operation. This
occurred because the custom combined loop attribute printer was not
conditional - and thus the tablegen inserted an automatic space before
invoking the custom printer. Then for each additional attribute it also
inserted a space in beginning.

Since lit tests were not sensitive to this, no tests needed to be
updated. But the issue with the extraneous space is resolved.
2025-04-29 14:46:31 -07:00
Kazu Hirata
bbbb178a35
[mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp:2905:17: error: unused
  variable 'var' [-Werror,-Wunused-variable]

  mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp:2908:42: error: unused
  variable 'dataClauseOptional' [-Werror,-Wunused-variable]
2025-04-28 22:18:55 -07:00
Susan Tan (ス-ザン タン)
dfdc50be8e
[mlir][acc] Remove declare attribute verification (#137676)
The part that verifies the declare attributes are preserved in the
verifier can fail easily during the FIR lowering pipeline. For example,
during FIR lowering to FIRCG, fir.declare can be removed. Thus, any
fir.declare that has acc.declare attributes will cause a verifier
failure. Since the declare attribute only existed to simplify the effort
of locating acc declare enter and exit points, which can be easily
replaced by a def-use chain traversal, we are considering removing the
verification of declare attributes in this MR.

Example:

```
%1 = fir.alloca !fir.array<10xf32> {bindc_name = "arr", uniq_name = "_QMmmFsubEarr"}
%2 = fir.shape %c10 : (index) -> !fir.shape<1>
%3 = fir.declare %1(%2) {acc.declare = #acc.declare<dataClause =  acc_create>, uniq_name = "_QMmmFsubEarr"} : (!fir.ref<!fir.array<10xf32>>, !fir.shape<1>) -> !fir.ref<!fir.array<10xf32>>
%4 = acc.create varPtr(%3 : !fir.ref<!fir.array<10xf32>>) -> !fir.ref<!fir.array<10xf32>> {name = "arr"}
%5 = acc.declare_enter dataOperands(%4 : !fir.ref<!fir.array<10xf32>>) 
```

the acc.declare_enter itself is enough to identify when the data region
starts.
2025-04-28 13:11:26 -07:00
Erich Keane
abfb2ce2f5
[OpenACC][NFCI] Implement 'helpers' for all of the clauses I've used so far (#137396)
As a follow up to 3c4dff3ac6884b85fe93fe512c5bdaf014738c45 I audited all
uses of 'process clause and use additive methods', and added explicit
functions to the construct to make it easier for the next project to
attempt to use this mechanism (vs construct all operands/etc in advance,
then add all at once).

I've only done ones that I have attempted to use so far (as a catch-up,
so no var-list clauses, and no constructs that can't be used without a
var-list, and no loop, and no compound constructs). I intend to do those
"as I go" with the lowering of each of those things instead.

---------

Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
2025-04-28 06:06:42 -07:00
Erich Keane
3c4dff3ac6
[NFC][OpenACC] addDeviceType to init/shutdown ops (#137372)
As a first step of attempting to make for a 'better' interface to
lowering to the OpenACC dialect, this patch adds a helper function to
InitOp and ShutdownOp to make adding a device-type clause easier.
2025-04-25 13:14:44 -07:00
Valentin Clement (バレンタイン クレメン)
09b012fa2d
[flang][openacc] Fix wait clause printer (#137263)
The wait clause printer fails on cases like:

```
!$acc serial device_type(nvidia) wait
!$acc end serial
```
2025-04-25 07:35:42 -07:00
nvptm
3633de7029
[mlir][acc] Handle OpenACC host_data in LegalizeDataValues (#134767)
`LegalizeDataValuesInRegion` is intended to replace the SSA values used
in a region with the output of data operations, but misses the handling
of the OpenACC `host_data` construct. As a result, currently

```
 !$acc host_data use_device(%var)
   ...%var...
 !$acc end host_data

```
is lowered to

```
 %dev_var = acc.use_device(%var)
 acc.host_data data_operands(%dev_var) {
   ...%var...
 }
```

This pull request updates the LegalizeDataValuesInRegion to handle
HostDataOp such that lowering results in

```
 %dev_var = acc.use_device(%var)
 acc.host_data data_operands(%dev_var) {
   ...%dev_var...
 }
```
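
The use replacement described above can be sketched in Python as
follows. This is a toy model operating on value names, not the actual
pass; `legalize_region_uses` and its arguments are hypothetical.

```python
def legalize_region_uses(data_operands, region_uses):
    """Replace host values used in a region with the results of data operations.

    `data_operands` maps each host value name to the value produced by its
    data operation (e.g. "%var" -> "%dev_var"). Uses that have no data
    operation are left untouched.
    """
    return [data_operands.get(use, use) for use in region_uses]
```

For the example above, `legalize_region_uses({"%var": "%dev_var"}, ["%var"])`
yields `["%dev_var"]`, matching the desired lowering.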
2025-04-14 16:29:17 -07:00
Razvan Lupusoru
83edbd4958
[flang][acc] Ensure data exit action is generated for present & nocreate (#126560)
The acc.delete operation has the semantics of decrementing the present
counter and deleting the data when the counter reaches zero. Since
acc.present and acc.nocreate are both intended to increment the present
counter, a matching exit action must be inserted.

This is also what was specified in OpenACC dialect documentation:
https://mlir.llvm.org/docs/Dialects/OpenACCDialect/#operation-categories
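
The counter semantics described above can be sketched with a toy Python
model. This is not the OpenACC runtime; `PresentTable` and its methods
are hypothetical and only illustrate why each increment needs a matching
exit action.

```python
class PresentTable:
    """Toy model of the present counter for device data."""

    def __init__(self):
        self.counts = {}

    def enter(self, var):
        # Entry actions (acc.present and acc.nocreate, per the description
        # above) increment the present counter.
        self.counts[var] = self.counts.get(var, 0) + 1

    def delete(self, var):
        # acc.delete decrements the counter and drops the data at zero.
        # Returns True when the data is actually deleted.
        self.counts[var] -= 1
        if self.counts[var] == 0:
            del self.counts[var]
            return True
        return False
```

If an entry action has no matching `delete`, the counter never returns
to zero and the data is never released.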
2025-02-10 13:04:10 -08:00
Razvan Lupusoru
1c583c19bb
[acc][mlir] Add functionality for categorizing OpenACC variable types (#126167)
OpenACC specification describes the following type categories: scalar,
array, composite, and aggregate (which includes arrays, composites, and
others such as Fortran pointer/allocatable).

The decision on how to do implicit mapping depends on a variable's
category. Since the acc dialect's only means of distinguishing between
types is through the attached interfaces, add an API to get the type
category.

In addition to defining the new API, attempt to provide a base
implementation for memref which matches what OpenACC spec describes.
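
The category relationship described above (aggregate covering arrays,
composites, and others) can be sketched with Python flags. This is an
illustrative model only; `VarCategory` and `is_aggregate` are
hypothetical names and do not reflect the dialect's actual API.

```python
from enum import Flag, auto


class VarCategory(Flag):
    SCALAR = auto()
    ARRAY = auto()
    COMPOSITE = auto()
    OTHER_AGGREGATE = auto()  # e.g. Fortran pointer/allocatable
    # Aggregate is the union of arrays, composites, and the others.
    AGGREGATE = ARRAY | COMPOSITE | OTHER_AGGREGATE


def is_aggregate(category):
    """Return True if the category falls under the aggregate umbrella."""
    return bool(category & VarCategory.AGGREGATE)
```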
2025-02-10 08:03:38 -08:00
Razvan Lupusoru
bd30838422
[flang][acc] Improve acc lowering around fir.box and arrays (#125600)
The current implementation of OpenACC lowering includes explicit
expansion of following cases:
- Creation of `acc.bounds` operations for all arrays, including those
whose dimensions are captured in the type (eg `!fir.array<100xf32>`)
- Expansion of box types by only putting the box's address in the data
clause. The address was extracted with a `fir.box_addr` operation and
the bounds were filled with `fir.box_dims` operation.

However, with the creation of the new type interface `MappableType`, the
idea is that specific type-based semantics can now be used. This also
really simplifies representation in the IR. Consider the following
example:
```
subroutine sub(arr)
  real :: arr(:)
  !$acc enter data copyin(arr)
end subroutine
```

Before the current PR, the relevant acc dialect IR looked like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
  ...
  %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2:3 = fir.box_dims %1#0, %c0 : (!fir.box<!fir.array<?xf32>>, index)
-> (index, index, index)
  %c0_0 = arith.constant 0 : index
  %3 = arith.subi %2#1, %c1 : index
  %4 = acc.bounds lowerbound(%c0_0 : index) upperbound(%3 : index)
extent(%2#1 : index) stride(%2#2 : index) startIdx(%c1 : index)
{strideInBytes = true}
  %5 = fir.box_addr %1#0 : (!fir.box<!fir.array<?xf32>>) ->
!fir.ref<!fir.array<?xf32>>
  %6 = acc.copyin varPtr(%5 : !fir.ref<!fir.array<?xf32>>) bounds(%4) ->
!fir.ref<!fir.array<?xf32>> {name = "arr", structured = false}
  acc.enter_data dataOperands(%6 : !fir.ref<!fir.array<?xf32>>)
```

After the current change, it looks like:
```
func.func @_QPsub(%arg0: !fir.box<!fir.array<?xf32>> {fir.bindc_name =
"arr"}) {
  ...
  %1:2 = hlfir.declare %arg0 dummy_scope %0 {uniq_name = "_QFsubEarr"} :
(!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
(!fir.box<!fir.array<?xf32>>, !fir.box<!fir.array<?xf32>>)
  %2 = acc.copyin var(%1#0 : !fir.box<!fir.array<?xf32>>) ->
!fir.box<!fir.array<?xf32>> {name = "arr", structured = false}
  acc.enter_data dataOperands(%2 : !fir.box<!fir.array<?xf32>>)
```

Restoring the old behavior can be done with the following command line
options:
`--openacc-unwrap-fir-box=true --openacc-generate-default-bounds=true`
2025-02-04 08:08:16 -08:00
Razvan Lupusoru
0d63a3d757
[mlir][acc] Update LegalizeDataValues pass to allow MappableType (#125134)
With the addition of the new type interface MappableType, the
LegalizeDataValues pass should not assume it can obtain a pointer to
the data (i.e. acc::getVarPtr() is no longer guaranteed to return a
value; acc::getVar() must be used instead).

Thus update the pass to ensure it handles any var used in its data
clause operations.
2025-01-31 07:55:06 -08:00
Razvan Lupusoru
cbcb7ad32e
[mlir][acc] Introduce MappableType interface (#122146)
OpenACC data clause operations previously required that the variable
operand implemented PointerLikeType interface. This was a reasonable
constraint because the dialects currently mixed with `acc` do use
pointers to represent variables. However, this forces the "pointer"
abstraction to be exposed too early and some cases are not cleanly
representable through this approach (more specifically FIR's `fir.box`
abstraction).

Thus, relax this by allowing a variable to be a type which implements
either `PointerLikeType` interface or `MappableType` interface.
2025-01-09 10:27:37 -08:00
Razvan Lupusoru
a0eb794da8
[MLIR][acc] Introduce varType to acc data clause operations (#119007)
The acc data clause operations hold an operand named `varPtr`. This was
intended to hold a pointer to a variable - where the element type of
that pointer specifies the type of the variable. However, for both
memref and llvm dialects, this assumption is not true. This is because
memref element type for cases like memref<10xf32> is simply f32 and for
LLVM, after opaque pointers, the variable type is no longer recoverable.

Thus, introduce varType to ensure that appropriate semantics are kept.

Both the parser and printer for this new type attribute allow it to not
be specified in cases where a dialect's getElementType() applied to
`varPtr`'s type has a recoverable type. And more specifically, for FIR,
no changes are needed in the MLIR unit tests.
2024-12-09 15:14:48 -08:00
Razvan Lupusoru
c0a1597029
[mlir][acc] Consistency between acc.loop and acc compute ops (#114549)
- GangPrivate and GangFirstPrivate renamed to just Private and
Firstprivate respectively. This makes compute ops consistent with the
loop op (and also with the acc spec wording for the clause).
- Added getBody to all compute ops
- Verifier for firstprivate ops / recipes is enabled
2024-11-01 10:53:51 -07:00
Razvan Lupusoru
ac9ee61857
[acc] Improve LegalizeDataValues pass to handle data constructs (#112990)
Renames LegalizeData to LegalizeDataValues since this pass fixes up SSA
values. LegalizeData suggested that it fixed data mapping.

This change also adds support to fix up ssa values for data clause
operations. Effectively, compute regions within a data region use the
ssa values from data operations also. The ssa values within data regions
but not within compute regions are not updated.

This change is to support the requirement in the OpenACC spec which
notes that a visible data clause is not just one on the current compute
construct but on the lexically containing data construct or visible
declare directive.
2024-10-21 09:49:58 -07:00
Valentin Clement (バレンタイン クレメン)
65bd5ed84f
[mlir][openacc] Update verifier to catch missing device type attribute (#111586)
Operands with device_type support need the corresponding attribute, but
the verifier did not catch the case where it was missing. The custom
parser usually constructs it but creating the op from python could lead
to a segfault in the printer. This patch updates the verifier so we
catch this early on.
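
The kind of check described above can be sketched in Python as follows.
This is a toy model, not the dialect's verifier; `verify_device_types`
is a hypothetical name, and the one-entry-per-operand size check is an
assumption added for illustration.

```python
def verify_device_types(operands, device_types):
    """Return an error string, or None when the pairing is consistent."""
    if operands and device_types is None:
        # The case the verifier previously missed.
        return "operands present but device_type attribute is missing"
    if device_types is not None and len(device_types) != len(operands):
        # Assumed rule: one device_type entry per operand.
        return "device_type attribute size does not match operand count"
    return None
```

Rejecting the inconsistent state in the verifier means later consumers
such as the printer never see it.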
2024-10-09 13:07:09 -07:00
Kazu Hirata
6f52c1e6b1
[OpenACC] Avoid repeated hash lookups (NFC) (#108795) 2024-09-16 06:43:58 -07:00
khaki3
26d92826a5
[mlir][flang] Add an interface of OpenACC compute regions for further getAllocaBlock support (#100675)
This PR implements `ComputeRegionOpInterface` to define `getAllocaBlock`
of OpenACC loop and compute constructs (parallel/kernels/serial). The
primary objective here is to accommodate local variables in OpenACC
compute regions. The change in `fir::FirOpBuilder::getAllocaBlock`
allows local variable allocation inside loops and kernels.
2024-07-26 13:52:27 -07:00
Vijay Kandiah
8d5ba7598a
[mlir][openacc] Added custom builder for acc::ParallelOp (#98191)
This change adds a custom builder for `acc::ParallelOp`. This enables
users to only specify the operands they would need for their
`acc::ParallelOp` while building it. They can specify nothing to create
an empty `acc.parallel`, or all of the 11 operands listed
[here](https://mlir.llvm.org/docs/Dialects/OpenACCDialect/#operands-27),
or anywhere in between following the specified order in this custom
builder. Unspecified operands are left empty. Additionally, users can
later set the optional attributes such as `numGangsDeviceType` using the
available attribute setters for `acc::ParallelOp`.
2024-07-09 15:58:36 -05:00
Slava Zakharin
40278bb119
[mlir][acc] Added async to data clause operations. (#97307)
As long as the data clause operations are not tightly
"associated" with the compute/data operations (e.g.
they can be optimized as SSA producers and made block
arguments), the information about the original async()
clause should be attached to the data clause operations
to make it easier to generate proper runtime actions
for them. This change propagates the async() information
from the OpenACC data/compute constructs to the data clause
operations. This change also adds the CurrentDeviceIdResource
to guarantee proper ordering of the operations that read
and write the current device identifier.
2024-07-03 02:03:46 -07:00
Kareem Ergawy
d0413438ec
[flang][OpenMP] Handle omp.private in FirOpBuilder::getAllocaBlock() (#93927)
Fixes a crash uncovered by
[pr89651](https://github.com/llvm/llvm-test-suite/blob/main/Fortran/gfortran/regression/gomp/pr89651.f90)
in the test suite.

Fixes a crash caused by missing handling of `omp.private` ops in
`FirOpBuilder::getAllocaBlock()`.
2024-06-04 05:03:39 +02:00
Jakub Kuderski
971b852546
[mlir][NFC] Simplify type checks with isa predicates (#87183)
For more context on isa predicates, see:
https://github.com/llvm/llvm-project/pull/83753.
2024-04-01 11:40:09 -04:00
Razvan Lupusoru
a435e1f63b
[acc] Add attribute for combined constructs (#80319)
Combined constructs are decomposed into separate operations. However,
this does not adhere to the `acc` dialect's goal of being able to
regenerate clauses semantically equivalent to the user's intent. Thus,
add an attribute to keep track of the combined constructs.
2024-03-07 10:06:47 -08:00
Valentin Clement (バレンタイン クレメン)
4c9717c3be
[mlir][openacc] Add private/reduction in legalize data pass (#80882)
This is a follow up to #80351 and adds private and reduction operands
from acc.loop, acc.parallel and acc.serial operations.
2024-02-06 13:21:13 -08:00
Valentin Clement (バレンタイン クレメン)
6b42625b1f
[mlir][openacc] Simplify IR with acc.loop control (#80387)
When the new `acc.loop` design was introduced, some of the loop
information such as `gang`/`vector`/`worker` was also updated to support
`device_type`.
Because of a conflict in parsing/printing, the keyword-only values for
`async`/`gang`/`vector`/`worker` were printed/parsed with an empty set
of parentheses `()`. To make the IR clearer to read and similar across
the operations, the loop control part is now prefixed by `control`,
which removes the need for the empty `()`.
2024-02-05 14:22:36 -08:00
Valentin Clement
0d091206dd
[mlir][openacc] Add legalize data pass for compute operation (#80351)
This patch adds a simple pass to replace the uses inside compute operation. It
replaces the `varPtr` values with their corresponding `accPtr` values gathered
through the dataClauseOperands.

Private and reduction variables are not included in this pass since they will
normally be replaced when they are materialized.

Reland with fix for dependencies
2024-02-05 13:40:41 -08:00
Valentin Clement
4b6062619a
Revert "[mlir][openacc] Add legalize data pass for compute operation (#80351)"
This reverts commit fa7d0d3e35f74486ccb0faa88ec706defe7dd2d2.
2024-02-05 12:57:54 -08:00
Valentin Clement
9ac6eb5bec
[mlir][openacc] Add MLIRSupport to MLIROpenACCTransforms 2024-02-05 12:42:47 -08:00