llvm-project

Author	SHA1	Message	Date
Kojo Acquah	04bf1a4090	Update `LowerContractionToSMMLAPattern` to ignore matvec (#88288) Patterns in `LowerContractionToSMMLAPattern` are designed to handle vector-to-matrix multiplication but not matrix-to-vector. This leads to the following error when processing `rhs` with rank < 2: ``` iree-compile: /usr/local/google/home/kooljblack/code/iree-build/llvm-project/tools/mlir/include/mlir/IR/BuiltinTypeInterfaces.h.inc:268: int64_t mlir::detail::ShapedTypeTrait<mlir::VectorType>::getDimSize(unsigned int) const [ConcreteType = mlir::VectorType]: Assertion `idx < getRank() && "invalid index for shaped type"' failed. ``` Updates the patterns to explicitly check the rhs rank and fail on cases that cannot be processed.	2024-04-10 13:18:47 -04:00
Kojo Acquah	c511c90680	[mlir][ArmNeon] Updates LowerContractionToSMMLAPattern with vecmat unroll patterns (#86005) Updates smmla unrolling patterns to handle vecmat contracts where `dimM=1`. This includes explicit vecmats of the form `<1x8xi8> x <8x8xi8> --> <1x8xi32>` and implied vecmats with the leading dim folded: `<8xi8> x <8x8xi8> --> <8xi32>`. Since smmla operates on two `<2x8xi8>` input vectors to produce a `<2x2xi32>` accumulator, half of each 2x2 accumulator tile is dummy data not pertinent to the computation, which halves throughput.	2024-04-03 19:24:18 -04:00
Kojo Acquah	fe84369cc6	[mlir][ArmNeon] Implements unrolling patterns for LowerContractionToSMMLAPattern (#84848) This patch updates `LowerContractionToSMMLAPattern` to unroll larger vector contracts into multiple smmla instructions. It now accepts tiles up to [8,8,8] (previously only [2,2,8]). The N/M dimensions must be powers of 2. `vector.extract_strided_slice` and `vector.insert_strided_slice` divide the contract into tiles that are processed one after another.	2024-03-19 13:09:33 -04:00
Kojo Acquah	cb6ff746e0	[mlir][ArmNeon] Implements LowerVectorToArmNeon Pattern for SMMLA (#81895) This patch adds the `LowerVectorToArmNeonPattern` pattern to the ArmNeon dialect. This pattern inspects `vector.contract` ops that can be mapped 1-1 to an `arm.neon.smmla` intrinsic. The contract ops must be separated into tiles whose inputs fit those of a single smmla op (`2x8xi8` inputs and a `2x2xi32` output). The `vector.contract` inputs must be sign-extended from narrow types (<=i8) to be converted. If all conditions are met, an smmla op is inserted with additional `vector.shape_cast` ops to handle linearizing the input and output dimensions.	2024-03-08 14:50:13 -08:00

4 Commits
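
The tiling these commit messages describe can be illustrated with a small scalar model. This is a minimal sketch in plain C++, not the MLIR pattern itself: `smmlaTile` and `tiledContract` are hypothetical names, the rhs is assumed to be stored as N x K (one row per output column), and K is fixed at 8. Out-of-range rows are zero-padded, so a vecmat (M = 1) fills only half of each 2x2 accumulator, which is the dummy data mentioned in #86005.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper types for this sketch (not MLIR types).
using I8Matrix = std::vector<std::vector<int8_t>>;
using I32Matrix = std::vector<std::vector<int32_t>>;
using Tile2x8 = std::array<std::array<int8_t, 8>, 2>;
using Tile2x2 = std::array<std::array<int32_t, 2>, 2>;

// Scalar model of one smmla: acc (2x2 i32) += lhs (2x8 i8) * rhs (2x8 i8)^T.
void smmlaTile(const Tile2x8 &lhs, const Tile2x8 &rhs, Tile2x2 &acc) {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      for (int k = 0; k < 8; ++k)
        acc[i][j] += int32_t(lhs[i][k]) * int32_t(rhs[j][k]);
}

// Contracts lhs (M x 8) with rhs (N x 8, assumed stored transposed) into an
// M x N result by walking 2x2 output tiles. Every row must hold 8 elements.
I32Matrix tiledContract(const I8Matrix &lhs, const I8Matrix &rhs) {
  const std::size_t M = lhs.size(), N = rhs.size();
  I32Matrix out(M, std::vector<int32_t>(N, 0));
  for (std::size_t m = 0; m < M; m += 2) {
    for (std::size_t n = 0; n < N; n += 2) {
      Tile2x8 a{}, b{}; // zero-initialized, so padding rows stay zero
      Tile2x2 acc{};
      for (std::size_t r = 0; r < 2; ++r)
        for (std::size_t k = 0; k < 8; ++k) {
          if (m + r < M) a[r][k] = lhs[m + r][k];
          if (n + r < N) b[r][k] = rhs[n + r][k];
        }
      smmlaTile(a, b, acc);
      // Copy back only the in-range part; padded results are discarded.
      for (std::size_t r = 0; r < 2; ++r)
        for (std::size_t c = 0; c < 2; ++c)
          if (m + r < M && n + c < N) out[m + r][n + c] = acc[r][c];
    }
  }
  return out;
}
```

Calling `tiledContract` with a one-row lhs models the vecmat case: each tile still performs a full 2x2 accumulation, but only the first accumulator row is copied into the result.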