llvm-project

shylie/llvm-project

Fork 0

Commit Graph

Author	SHA1	Message	Date
Matthias Springer	db8a119e8f	[mlir][ArmSME] Fix invalid rewriter API usage (#76123 ) When operations are modified in-place, the rewriter must be notified. This commit fixes `mlir/test/Conversion/ArmSMEToLLVM/unsupported.mlir`, `mlir/test/Dialect/ArmSME/tile-zero-masks.mlir` and `mlir/test/Dialect/ArmSME/vector-ops-to-llvm.mlir` when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` enabled.	2023-12-21 17:39:36 +09:00
Benjamin Maxwell	eaff02f28e	[mlir][ArmSME] Switch to an attribute-based tile allocation scheme (#73253 ) This reworks the ArmSME dialect to use attributes for tile allocation. This has a number of advantages and corrects some issues with the previous approach: * Tile allocation can now be done ASAP (i.e. immediately after `-convert-vector-to-arm-sme`) * SSA form for control flow is now supported (e.g.`scf.for` loops that yield tiles) * ArmSME ops can be converted to intrinsics very late (i.e. after lowering to control flow) * Tests are simplified by removing constants and casts * Avoids correctness issues with representing LLVM `immargs` as MLIR values - The tile ID on the SME intrinsics is an `immarg` (so is required to be a compile-time constant), `immargs` should be mapped to MLIR attributes (this is already the case for intrinsics in the LLVM dialect) - Using MLIR values for `immargs` can lead to invalid LLVM IR being generated (and passes such as -cse making incorrect optimizations) As part of this patch we bid farewell to the following operations: ```mlir arm_sme.get_tile_id : i32 arm_sme.cast_tile_to_vector : i32 to vector<[4]x[4]xi32> arm_sme.cast_vector_to_tile : vector<[4]x[4]xi32> to i32 ``` These are now replaced with: ```mlir // Allocates a new tile with (indeterminate) state: arm_sme.get_tile : vector<[4]x[4]xi32> // A placeholder operation for lowering ArmSME ops to intrinsics: arm_sme.materialize_ssa_tile : vector<[4]x[4]xi32> ``` The new tile allocation works by operations implementing the `ArmSMETileOpInterface`. This interface says that an operation needs to be assigned a tile ID, and may conditionally allocate a new SME tile. Operations allocate a new tile by implementing... ```c++ std::optional<arm_sme::ArmSMETileType> getAllocatedTileType() ``` ...and returning what type of tile the op allocates (ZAB, ZAH, etc). Operations that don't allocate a tile return `std::nullopt` (which is the default behaviour). Currently the following ops are defined as allocating: ```mlir arm_sme.get_tile arm_sme.zero arm_sme.tile_load arm_sme.outerproduct // (if no accumulator is specified) ``` Allocating operations become the roots for the tile allocation pass, which currently just (naively) assigns all transitive uses of a root operation the same tile ID. However, this is enough to handle current use cases. Once tile IDs have been allocated subsequent rewrites can forward the tile IDs to any newly created operations.	2023-11-30 10:22:22 +00:00
Cullen Rhodes	fb54fec726	[mlir][ArmSME] Implement tile allocation This patch adds a pass '-allocate-sme-tiles' to the ArmSME dialect that implements allocation of SME ZA tiles. It does this at the 'func.func' op level by replacing 'arm_sme.get_tile_id' ops with 'arith.constant' ops that represent the tile number. The tiles in use in a given function are tracked by an integer function attribute 'arm_sme.tiles_in_use' that is a 16-bit tile mask with a bit for each 128-bit element tile (ZA0.Q-ZA15.Q), the smallest ZA tile granule. This is initialized on the first 'arm_sme.get_tile_id' rewrite and updated on each subsequent rewrite. Mixing of different element tile types is supported. Section B2.3.2 of the SME spec [1] describes how the 128-bit element tiles overlap with other element tiles. Depends on D154941 [1] https://developer.arm.com/documentation/ddi0616/aa Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D154955	2023-07-18 08:46:40 +00:00

Author

SHA1

Message

Date

Matthias Springer

db8a119e8f

[mlir][ArmSME] Fix invalid rewriter API usage (#76123 )

When operations are modified in-place, the rewriter must be notified.
This commit fixes `mlir/test/Conversion/ArmSMEToLLVM/unsupported.mlir`,
`mlir/test/Dialect/ArmSME/tile-zero-masks.mlir` and
`mlir/test/Dialect/ArmSME/vector-ops-to-llvm.mlir` when running with
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS` enabled.

2023-12-21 17:39:36 +09:00

Benjamin Maxwell

eaff02f28e

[mlir][ArmSME] Switch to an attribute-based tile allocation scheme (#73253 )

This reworks the ArmSME dialect to use attributes for tile allocation.
This has a number of advantages and corrects some issues with the
previous approach:

* Tile allocation can now be done ASAP (i.e. immediately after
`-convert-vector-to-arm-sme`)
* SSA form for control flow is now supported (e.g.`scf.for` loops that
yield tiles)
* ArmSME ops can be converted to intrinsics very late (i.e. after
lowering to control flow)
 * Tests are simplified by removing constants and casts
* Avoids correctness issues with representing LLVM `immargs` as MLIR
values
- The tile ID on the SME intrinsics is an `immarg` (so is required to be
a compile-time constant), `immargs` should be mapped to MLIR attributes
(this is already the case for intrinsics in the LLVM dialect)
- Using MLIR values for `immargs` can lead to invalid LLVM IR being
generated (and passes such as -cse making incorrect optimizations)

As part of this patch we bid farewell to the following operations:

```mlir
arm_sme.get_tile_id : i32
arm_sme.cast_tile_to_vector : i32 to vector<[4]x[4]xi32>
arm_sme.cast_vector_to_tile : vector<[4]x[4]xi32> to i32
```

These are now replaced with:
```mlir
// Allocates a new tile with (indeterminate) state:
arm_sme.get_tile : vector<[4]x[4]xi32>
// A placeholder operation for lowering ArmSME ops to intrinsics:
arm_sme.materialize_ssa_tile : vector<[4]x[4]xi32>
```

The new tile allocation works by operations implementing the
`ArmSMETileOpInterface`. This interface says that an operation needs to
be assigned a tile ID, and may conditionally allocate a new SME tile.

Operations allocate a new tile by implementing...
```c++
std::optional<arm_sme::ArmSMETileType> getAllocatedTileType()
```
...and returning what type of tile the op allocates (ZAB, ZAH, etc).

Operations that don't allocate a tile return `std::nullopt` (which is
the default behaviour).

Currently the following ops are defined as allocating:
```mlir
arm_sme.get_tile
arm_sme.zero
arm_sme.tile_load
arm_sme.outerproduct // (if no accumulator is specified)
```

Allocating operations become the roots for the tile allocation pass,
which currently just (naively) assigns all transitive uses of a root
operation the same tile ID. However, this is enough to handle current
use cases.

Once tile IDs have been allocated subsequent rewrites can forward the
tile IDs to any newly created operations.

2023-11-30 10:22:22 +00:00

Cullen Rhodes

fb54fec726

[mlir][ArmSME] Implement tile allocation

This patch adds a pass '-allocate-sme-tiles' to the ArmSME dialect that
implements allocation of SME ZA tiles.

It does this at the 'func.func' op level by replacing
'arm_sme.get_tile_id' ops with 'arith.constant' ops that represent the
tile number. The tiles in use in a given function are tracked by an
integer function attribute 'arm_sme.tiles_in_use' that is a 16-bit tile
mask with a bit for each 128-bit element tile (ZA0.Q-ZA15.Q), the
smallest ZA tile granule. This is initialized on the first
'arm_sme.get_tile_id' rewrite and updated on each subsequent rewrite.
Mixing of different element tile types is supported.

Section B2.3.2 of the SME spec [1] describes how the 128-bit element
tiles overlap with other element tiles.

Depends on D154941

[1] https://developer.arm.com/documentation/ddi0616/aa

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D154955

2023-07-18 08:46:40 +00:00

3 Commits