llvm-project

Author	SHA1	Message	Date
Cullen Rhodes	2dd3f42083	[mlir][ArmSME] Lower vector.broadcast to ArmSME This adds support for lowering vector.broadcast ops to SME, if the source is either a scalar, 0-d vector, or 1-d vector, and the result is a 2-d scalable vector that aligns with SME tiles. This follows on from D157005, which introduced a vector to tile slice op that moves a 1-d scalable vector to a slice of a 2-d scalable vector (tile). The lowering from vector.broadcast is similar, so a couple of helper functions are added to prevent duplication. Lowering of vector.broadcast contributes towards a path from linalg.fill to SME. Depends on D157005 Reviewed By: awarzynski, dcaballe Differential Revision: https://reviews.llvm.org/D158586	2023-08-29 09:43:16 +00:00
Cullen Rhodes	3b4b6cbba5	[mlir][ArmSME] Add move vector to tile slice op and lowerings This adds a 'move_vector_to_tile_slice' op to the ArmSME dialect that moves a 1-D scalable vector to a slice of a 2-D tile at a given index. This is lowered to the 'llvm.aarch64.sme.write.horiz' intrinsic that maps to the MOVA (vector to tile, single) SME instruction [1] when lowering to LLVM. Like the SME load and store instructions, this operates on ZA tile slices, which are 1-D vectors of horizontally or vertically contiguous elements within a ZA tile. This patch extends the lowering of 'arith.constant' to SME to support non-zero constants using this new op. This requires materializing a loop that broadcasts the constant to each tile slice with the 'move_vector_to_tile_slice' op. Unlike load and store, this is done during conversion from Vector to ArmSME, rather than ArmSME to SCF. The latter would require a higher-level custom op in the ArmSME dialect like 'tile_load' and 'tile_store', which isn't necessary. We may also remove the load and store ops in the future in favour of lowering straight from Vector, at which point this would converge. Currently only horizontal tile slices are supported. A future patch will extend this mechanism to support 'vector.broadcast'. Depends on D156980 D157004 [1] https://developer.arm.com/documentation/ddi0602 Reviewed By: awarzynski, dcaballe Differential Revision: https://reviews.llvm.org/D157005	2023-08-29 09:29:22 +00:00
Cullen Rhodes	dfa10ec2e6	[mlir][ArmSME] Extend arm_sme.zero for all types The arm_sme.zero op currently only supports 8-bit element tiles. This extends the op and lowering from 'arith.constant dense<0>' -> 'arm_sme.zero' to support all tile types. The lowering from arm_sme.zero to intrinsics is not updated as part of this patch and will be done separately. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D156980	2023-08-11 12:44:56 +00:00
Cullen Rhodes	12e1a9b876	[mlir][ArmSME] Extend vector.transfer_write lowering Enables the lowering of other tile types and values to match the vector.store -> arm_sme.tile_store lowering. Reviewed By: awarzynski, dcaballe Differential Revision: https://reviews.llvm.org/D156976	2023-08-11 12:33:09 +00:00
Cullen Rhodes	781883ea62	[mlir][ArmSME] Split lowering of arith.constant from vector.transfer_write An 'arith.constant dense<0>' is currently lowered to 'arm_sme.zero' as part of the 'vector.transfer_write' lowering during '-vector-to-arm-sme' conversion. This patch makes this lowering independent of the 'vector.transfer_write'. This can then be extended for further tile types and non-zero constants. Reviewed By: awarzynski Differential Revision: https://reviews.llvm.org/D156802	2023-08-03 08:57:33 +00:00
Cullen Rhodes	ca9a3354d0	[mlir][ArmSME] Add tile load op and extend tile store tile size support This extends the existing 'arm_sme.tile_store' op to support all tile sizes and adds a new op 'arm_sme.tile_load', as well as lowerings from vector -> custom ops and custom ops -> intrinsics. Currently there's no lowering for i128. Depends on D154867 Reviewed By: awarzynski, dcaballe Differential Revision: https://reviews.llvm.org/D155306	2023-07-25 08:28:36 +00:00
Andrzej Warzynski	447bb5bee4	[mlir][ArmSME] Introduce new lowering layer (Vector -> ArmSME) At the moment, the lowering from the Vector dialect to SME looks like this: * Vector --> SME LLVM IR intrinsics This patch introduces a new lowering layer between the Vector dialect and the Arm SME extension: * Vector --> ArmSME dialect (custom Ops) --> SME LLVM IR intrinsics. This is motivated by 2 considerations: 1. Storing `ZA` to memory (e.g. `vector.transfer_write`) requires an `scf.for` loop over all rows of `ZA`. Similar logic will apply to "load to ZA from memory". This is a rather complex transformation and a custom Op seems justified. 2. As discussed in [1], we need to prevent the LLVM type converter from having to convert types unsupported in LLVM, e.g. `vector<[16]x[16]xi8>`. A dedicated abstraction layer with custom Ops opens a path to some fine tuning (e.g. custom type converters) that will allow us to avoid this. To facilitate this change, two new custom SME Ops are introduced: * `TileStoreOp`, and * `ZeroOp`. Note that no new functionality is added - these Ops merely model what's already supported. In particular, the following tile size is assumed (dimension and element size are fixed): * `vector<[16]x[16]xi8>` The new lowering layer is introduced via a conversion pass between the Vector and the SME dialects. You can use the `-convert-vector-to-sme` flag to run it. The following function: ``` func.func @example(%arg0 : memref<?x?xi8>) { // (...) %cst = arith.constant dense<0> : vector<[16]x[16]xi8> vector.transfer_write %cst, %arg0 : vector<[16]x[16]xi8>, memref<?x?xi8> return } ``` would be lowered to: ``` func.func @example(%arg0: memref<?x?xi8>) { // (...) %0 = arm_sme.zero : vector<[16]x[16]xi8> arm_sme.tile_store %arg0[%c0, %c0], %0 : memref<?x?xi8>, vector<[16]x[16]xi8> return } ``` Later, a mechanism will be introduced to guarantee that `arm_sme.zero` and `arm_sme.tile_store` operate on the same virtual tile. For `i8` elements this is not required as there is only one tile. In order to lower the above output to LLVM, use * `-convert-vector-to-llvm="enable-arm-sme"`. [1] https://github.com/openxla/iree/issues/14294 Reviewed By: WanderAway Differential Revision: https://reviews.llvm.org/D154867	2023-07-18 08:04:59 +00:00

7 Commits
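
The loop described in 3b4b6cbba5 and 2dd3f42083 (materializing a loop that writes a 1-D vector into each horizontal slice of a 2-D tile) can be modelled in plain Python. This is only a conceptual sketch, not the MLIR lowering itself: the function names mirror the op names from the commit messages, but the list-of-lists tile, the `vscale` parameter, and the `min_elems` default (4, as for a 32-bit element tile) are illustrative assumptions.

```python
from typing import List

Tile = List[List[int]]


def move_vector_to_tile_slice(tile: Tile, vector: List[int], slice_index: int) -> Tile:
    """Model of the op: return a copy of `tile` whose horizontal slice
    (row) at `slice_index` is replaced by `vector`."""
    if len(vector) != len(tile[slice_index]):
        raise ValueError("vector length must match the tile slice length")
    updated = [row[:] for row in tile]  # SSA-style: do not mutate the input tile
    updated[slice_index] = list(vector)
    return updated


def broadcast_to_tile(vector: List[int], num_slices: int) -> Tile:
    """Model of the materialized loop: write `vector` into every slice."""
    tile: Tile = [[0] * len(vector) for _ in range(num_slices)]
    for slice_index in range(num_slices):
        tile = move_vector_to_tile_slice(tile, vector, slice_index)
    return tile


def splat_constant(value: int, vscale: int = 1, min_elems: int = 4) -> Tile:
    """Model of a non-zero constant: splat the scalar to a 1-D vector,
    then broadcast that vector to each tile slice.
    `vscale` and `min_elems` are hypothetical stand-ins for the scalable
    vector length; the tile is (min_elems * vscale) square."""
    n = min_elems * vscale
    return broadcast_to_tile([value] * n, n)
```

Calling `splat_constant(7)` under these assumptions yields a 4x4 tile filled with 7, and `broadcast_to_tile([1, 2, 3, 4], 4)` repeats the vector in every row, which is the 1-D source case of the vector.broadcast lowering.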