7 Commits

Author SHA1 Message Date
Razvan Lupusoru
a88274f008
[mlir][acc] Support lazy remark message construction (#180665)
The OpenACC remark emission utilities previously only accepted Twine for
message construction. However, complex remarks often require additional
logic to build messages, such as resolving variable names. This results
in unnecessary work when remarks are disabled.

Add an overload that accepts a lambda for message generation, which is
only invoked when remark emission is enabled. Update ACCLoopTiling to
use this lazy API for tile size reporting.

Additionally, getVariableName now returns numeric strings for constant
integer values. This is also being used by ACCLoopTiling along with the
lazy remark update.
2026-02-10 10:13:19 -08:00
Razvan Lupusoru
ab7217a089
[acc][flang] Add isDeviceData APIs for device data detection (#176219)
Add comprehensive APIs to detect device-resident data across OpenACC
type and operation interfaces. This enables passes to identify data that
is already on the device (e.g., CUF device/managed/constant memory, GPU
address spaces) and handle it appropriately.

New interface methods:
- PointerLikeType::isDeviceData(Value): Returns true if the pointer
points to device data.
- MappableType::isDeviceData(Value): Returns true if the variable
represents device data.
- GlobalVariableOpInterface::isDeviceData(): Returns true if the global
variable is device data.

New utilities in OpenACCUtils:
- acc::isDeviceValue(Value): Checks if a value represents device data by
querying type interfaces, PartialEntityAccessOpInterface for base
entities, and AddressOfGlobalOpInterface for global symbols.
- acc::isValidValueUse(Value, Region): Checks if a value is legal in an
OpenACC region by verifying it comes from a data operation, is only used
by private clauses, or is device data.

Updated isValidSymbolUse to check
GlobalVariableOpInterface::isDeviceData()
for symbols referencing device-resident globals.

FIR implementations check for CUF data attributes (device, managed,
constant, shared, unified) on operations, block arguments, and globals.
The implementation traces through fir.rebox, fir.embox, fir.declare,
hlfir.declare, and fir.address_of to find the underlying data source.

Memref implementations check for gpu::AddressSpaceAttr on the memref
type.

Updated ACCImplicitData to use acc::isDeviceValue for generating
acc.deviceptr clauses for device-resident data instead of
copyin/copyout.

Updated OpenACCSupport::isValidValueUse to fallback to the new
acc::isValidValueUse utility.
2026-01-15 20:56:26 +00:00
Razvan Lupusoru
575d6892bc
[mlir][acc] Introduce acc loop tiling pass (#171692)
This pass implements the OpenACC loop tiling transformation for acc.loop
operations that have the tile clause (OpenACC 3.4 spec, section 2.9.8).

The tile clause specifies that the iterations of the associated loops
should be divided into tiles (rectangular blocks). The pass transforms a
single or nested acc.loop with tile clauses into a structure of "tile
loops" (iterating over tiles) containing "element loops" (iterating
within tiles).

For example, tiling a 2-level nested loop with tile(T1, T2):
```
  // Before tiling:
  acc.loop tile(T1, T2) control(%i, %j) = ...

  // After tiling:
  acc.loop control(%i) step (s1*T1) {        // tile loop 1
    acc.loop control(%j) step (s2*T2) {      // tile loop 2
      acc.loop control(%ii) = (%i) to (min(ub1, %i+s1*T1)) {
        acc.loop control(%jj) = (%j) to (min(ub2, %j+s2*T2)) {
          // loop body using %ii, %jj
        }
      }
    }
  }
```

Key features:
- Handles constant tile sizes and wildcard tile sizes ('*') which use a
configurable default tile size
- Properly handles collapsed loops with tile counts exceeding collapse
count by uncollapsing loops before tiling
- Distributes gang/worker/vector attributes appropriately: gang -> tile
loops, vector -> element loops
- Validates that tile size types are not wider than loop IV types
- Emits optimization remarks for tiling decisions

Three test files are added:
- acc-loop-tiling.mlir: Tests single and nested loop tiling with
constant tile sizes, unknown tile sizes (*), and loops with collapse
attributes
- acc-loop-tiling-invalid.mlir: Tests error diagnostic when tile size
type is wider than the loop IV type
- acc-loop-tiling-remarks.mlir: Tests optimization remarks emitted for
tiling decisions including default tile size selection

Co-authored-by: Vijay Kandiah <vkandiah@nvidia.com>
2025-12-10 14:18:28 -08:00
Valentin Clement (バレンタイン クレメン)
bf81bdec66
[mlir][acc] Add isValidValueUse to OpenACCSupport (#171538)
Add a new API `isValidValueUse ` to OpenACCSupport. This is used in
ACCImplicitData to check value that are already legal in the OpenACC
region and do not require implicit clause to be generated. An example
would be a CUDA Fortran device variable that is already on the GPU.
2025-12-10 08:19:04 -08:00
Razvan Lupusoru
afd1ffb4d3
[mlir][acc] Check legality of symbols in acc regions (#167957)
This PR adds a new utility function to check whether symbols used in
OpenACC regions are legal for offloading. Functions must be marked with
`acc routine` or be built-in intrinsics. Global symbols must be marked
with `acc declare`.

The utility is designed to be extensible, and the OpenACCSupport
analysis has been updated to allow handling of additional symbols that
do not necessarily use OpenACC attributes but are marked in a way that
still guarantees the symbol will be available when offloading. For
example, in the Flang implementation, CUF attributes can be validated as
legal symbols.
2025-11-14 12:52:30 -08:00
Razvan Lupusoru
4c6a38c32c
[acc] Expand OpenACCSupport to provide getRecipeName and emitNYI (#165628)
Extends OpenACCSupport utilities to include recipe name generation and
error reporting for unsupported features, providing foundation for
variable privatization handling.

Changes:
- Add RecipeKind enum (private, firstprivate, reduction) for APIs that
request a specific kind of recipe
- Add getRecipeName() API to OpenACCSupport and OpenACCUtils that
generates recipe names from types (e.g.,
"privatization_memref_5x10xf32_")
- Add emitNYI() API to OpenACCSupport for graceful handling of
not-yet-implemented cases
- Generalize MemRefPointerLikeModel template to support
UnrankedMemRefType
- Add unit tests and integration tests for new APIs
2025-10-29 16:09:33 -07:00
Razvan Lupusoru
f3599e55d1
[mlir][acc] Add OpenACCSupport for extensible dialect handling (#164510)
The OpenACC dialect must coexist with source language dialects (FIR,
CIR, etc.) to enable offloading. While type interfaces
(`PointerLikeType` and `MappableType`) provide the primary contract for
variable mapping, some scenarios require pipeline-specific customization
or need to express information that cannot be adequately captured
through operation and type interfaces alone.

This commit introduces the `OpenACCSupport` analysis, which provides
extensible support APIs that can be customized per-pipeline. The
analysis follows the Concept-Model pattern used in MLIR's
`AliasAnalysis` and is never invalidated, persisting throughout the pass
pipeline.

The initial API, `getVariableName(Value) -> string`, retrieves variable
names from MLIR values by:
- Checking for `acc.var_name` attributes
- Extracting names from ACC data clause operations (e.g., `acc.copyin`)
- Walking through `ViewLikeOpInterface` operations to find the source

This will be used in the implicit data mapping pass to automatically
generate device mappings with correct user-visible variable names.

Usage: Passes call `getAnalysis<OpenACCSupport>()` to get a cached
instance with either the default or a previously- registered custom
implementation. Custom implementations can be registered in a setup pass
by calling `setImplementation()` before the consumer pass runs.
2025-10-22 08:26:21 -07:00