llvm-project

Author	SHA1	Message	Date
Felix Schneider	d5a0fb39ae	[mlir][vector] Handle empty `MaskOp` in `LowerVectorMask`, `MaskOpRewritePattern` (#72031 ) This patch adds handling of an empty `MaskOp` to `MaskOpRewritePattern` and thereby fixes a crash. It also pulls the `MaskOp` canonicalization patterns into `LowerVectorMask` so that empty `MaskOp`s are folded away in the Pass. Fix https://github.com/llvm/llvm-project/issues/71036	2023-11-12 08:12:28 +01:00
Quinn Dawkins	bc81f8c87e	[mlir][vector] Drop incorrect startRootUpdate calls in vector distribution (#71988 ) Fixes asan failures in https://lab.llvm.org/buildbot/#/builders/5/builds/38191 introduced by #71964.	2023-11-10 17:07:39 -05:00
Quinn Dawkins	aa2376a083	[mlir][vector] Notify the rewriter when sinking out of warp ops (#71964 ) A number of the warp distribution patterns work by rewriting a warp op in place by moving a contained op outside. This notifies the rewriter that the warp op is changing in this case.	2023-11-10 14:45:18 -05:00
Han-Chung Wang	2bac720101	[mlir][vector] Take dim sizes into account in DropInnerMostUnitDims. (#71752 ) A `stride == 1` does not imply that we can drop the dim, because it could load more than one element. We should also take source sizes and vector sizes into account; otherwise the pattern generates invalid IR. E.g., ```mlir func.func @foo(%arg0: memref<1x1xf32>) -> vector<4x8xf32> { %c0 = arith.constant 0 : index %cst = arith.constant 0.000000e+00 : f32 %0 = vector.transfer_read %arg0[%c0, %c0], %cst : memref<1x1xf32>, vector<4x8xf32> return %0 : vector<4x8xf32> } ``` Fixes https://github.com/openxla/iree/issues/15493	2023-11-10 09:27:59 -08:00
Quinn Dawkins	d4d2891447	[mlir][vector] Add distribution pattern for vector.create_mask (#71619 ) This is the last step needed for basic support for distributing masked vector code. The lane id gets delinearized based on the distributed mask shape and then compared against the original mask sizes to compute the bounds for the distributed mask. Note that the distribution of masks is implicit on the shape specified by the warp op. As a result, it is the responsibility of the consumer of the mask to ensure the distributed mask will match its own distribution semantics.	2023-11-10 10:09:37 -05:00
Quinn Dawkins	df49a97ab2	[mlir][vector] Root the transfer write distribution pattern on the warp op (#71868 ) Currently when there is a mix of transfer read ops and transfer write ops that need to be distributed, because the pattern for write distribution is rooted on the transfer write, it is hard to guarantee that the write gets distributed after the read when the two aren't directly connected by SSA. This is likely still relatively unsafe when there are undistributable ops, but structurally these patterns are a bit difficult to work with. For now pattern benefits give fairly good guarantees for happy paths.	2023-11-10 08:49:33 -05:00
Quinn Dawkins	7360d5d30f	[mlir][vector] Fix cases with multiple yielded transfer_read ops (#71625 ) This fixes two bugs: 1) When deciding whether a transfer read could be propagated out of a warp op, it looked for the first yield operand that was produced by a transfer read. If this transfer read wasn't ready to be distributed, the pattern would not re-check for any other transfer reads that could have been propagated. 2) When dropping dead warp results, we do so by updating the warp op signature and splicing in the old region. This does not add the ops in the body of the warp op back to the pattern applicator's worklist, and thus those operations won't be DCE'd. This is a problem for patterns like the one for transfer reads that will still see the dead operation as a user.	2023-11-09 11:35:54 -05:00
Quinn Dawkins	771f5759df	[mlir][vector] Add pattern to distribute masked reads (#71610 ) Because the distribution is based on types, supporting general masked reads requires first materializing the permutation map in IR to align the elements of the mask with the elements read by the transfer op. For now just support cases with the trivial permutation map.	2023-11-09 09:24:26 -05:00
Quinn Dawkins	25ec1fa969	[mlir][vector] Add support for distributing masked writes (#71482 ) General distribution of masked writes requires materializing the permutation on the vector of the write in IR to ensure the vector lines up with the mask. For now just support cases with trivial permutation maps.	2023-11-07 17:54:49 -05:00
Quinn Dawkins	796d48b080	[mlir][vector] Add leading unit dim folding patterns for masked transfers (#71466 ) This handles `vector.transfer_read`, `vector.transfer_write`, and `vector.constant_mask`. The unit dims are only relevant for masks created by `create_mask` and `constant_mask` if the mask size for the unit dim is non-one, in which case all subsequent sizes must also be zero. From the perspective of the vector transfers, however, these unit dims can just be dropped directly.	2023-11-06 20:40:14 -05:00
Quinn Dawkins	98dcd98a1a	[mlir][vector] Hoist uniform scalar loop code after scf.for distribution (#71422 ) After propagation of `vector.warp_execute_on_lane_0` through `scf.for`, uniform operations like those on the loop iterators can now be hoisted out of the inner warp op.	2023-11-06 14:16:15 -05:00
saienduri	24cf476bd6	[mlir] Add support for vector.store sub-byte emulation. (#70293 )	2023-11-01 18:57:21 -07:00
Matthias Springer	1df6504ac2	[mlir][vector] LISH: Implement `SubsetOpInterface` for transfer_read/write (#70629 ) - Implement `SubsetOpInterface`, `SubsetExtractionOpInterface`, `SubsetInsertionOpInterface` for `vector.transfer_read` and `vector.transfer_write`. - Move all tensor subset hoisting test cases from `Linalg` to `loop-invariant-subset-hoisting.mlir`. (Removing 1 duplicate test case.)	2023-11-01 12:19:30 +09:00
tyb0807	674261b203	[mlir][Vector] Add narrow type emulation pattern for vector.maskedload (#68443 )	2023-10-27 10:49:58 +02:00
Andrzej Warzyński	5270df3d17	[mlir][vector] Add scalable vectors to tests for vector.contract (#70039 ) Update the remaining tests for matrix multiplication (_matmul_) in: * vector-contract-to-outerproduct-transforms.mlir with cases for scalable vectors. Note that in order for the "vector.contract -> vector.outerproduct" patterns to work, only the non-reduction dimension can be scalable (*). For Matmul operations that is set to be the N dimension (i.e. rows of the output matrix), which matches how matrix multiplication is normally implemented for e.g. Arm's SVE. However, making the M dimension scalable (i.e. columns of the output matrix) should work as well. Making both parallel dimensions scalable is left as a TODO for when support for 2-D scalable vectors is more established (this is work-in-progress as part of the effort to support Arm's SME in MLIR). The change in: `UnrolledOuterProductGenerator` is a "bug fix" to make sure that the conversion pattern correctly propagates scalability when creating `arith.extf` operations. (*) The conversion tested in this file unrolls along the reduction dimension, which is not supported for scalable vectors.	2023-10-27 09:38:36 +01:00
Lei Zhang	3049ac44e6	[mlir][vector] Enable transfer op hoisting with dynamic indices (#68500 ) Recent changes (https://github.com/llvm/llvm-project/pull/66930) disabled vector transfer ops hoisting with view-like intermediate ops. The recommended way is to fold subview ops into transfer op indices before invoking hoisting. That would mean now we see transfer op indices involving dynamic values, instead of static constant values before with subview ops. Therefore hoisting won't kick in anymore. This breaks downstream users. To fix it, this commit enables hoisting transfer ops with dynamic indices by using `ValueBoundsConstraintSet` to prove ranges are disjoint in `isDisjointTransferIndices`. Given that utility is used in many places including op folders, right now we introduce a flag to it and only set as true for "heavy" transforms in hoisting and load-store forwarding.	2023-10-15 16:37:54 -07:00
Andrzej Warzyński	e01c8673ba	[mlir][vector] Restore assert and fix typos (#68581 ) Follow-up for #68400 - restoring an assert that was accidentally removed and fixing a typo in a diagnostic.	2023-10-09 14:22:26 +01:00
Andrzej Warzynski	c91d3b0b08	[mlir][vector] Constrain patterns: vector.contract -> vector.outerproduct This patch constrains the patterns for converting `vector.contract` to `vector.outerproduct` so that * the reduction dimension is _not unrolled_ if the corresponding dimension is scalable. This is necessary as the current lowering is incorrect for scalable dims. Indeed, the following unrolling for `vector.contract` would be invalid if the corresponding dimension was scalable (K is the size of the reduction dimension): ``` // K times. This is valid if K _is not_ scalable. %lhs = vector.extract %LHS[0] %rhs = vector.extract %RHS[0] vector.outerproduct %lhs, %rhs %lhs = vector.extract %LHS[1] %rhs = vector.extract %RHS[1] vector.outerproduct %lhs, %rhs // ... ``` Instead, a `for` loop should be generated: ``` // This would be valid regardless of whether K is scalable or not scf.for %k = 0 to K step 1 %lhs = vector.extract LHS[%k] %rhs = vector.extract RHS[%k] vector.outerproduct %lhs, %rhs ``` However, the lowering of: * `vector.extract` of vector slices with dynamic indices is incomplete and hence the implementation proposed above (with `scf.for`) wouldn't work just yet, i.e. it wouldn't be possible to lower it further. Instead, this patch disables unrolling in cases when the reduction dimension is scalable, i.e. where the generated code would be functionally incorrect. In order to document unsupported cases, a dedicated test file is added: * "vector-contract-to-outerproduct-transforms-unsupported.mlir" This is the first patch in a series of patches that strives to update these patterns (and to test them) for scalable vectors. Resolves #68400	2023-10-06 16:07:07 +00:00
MaheshRavishankar	f28f09dcf0	[mlir][Vector] Add Broadcast -> CastOp reordering to SinkVectorBroadcasting patterns. (#68257 ) Also fix an issue with sinking a broadcast across elementwise ops: `arith.cmpf` is elementwise, but its result type is not the same as the operand type, so sinking created illegal IR. There is a similar issue with `vector.fma`, which only accepts vector operand types, while broadcasts can have scalar sources. Sinking the broadcast across it would result in an illegal `vector.fma` (with scalar operands).	2023-10-04 21:27:24 -07:00
tyb0807	a3af099785	[mlir][NFC] Fix comment explaining ConverVectorLoad (#67864 ) The new number of elements should be the original one divided by a scale factor computed from old and new bit width.	2023-09-30 00:52:05 +02:00
Cullen Rhodes	8c07d5ec6d	[mlir][vector] don't emit non-rank 1 masked load and store (#67656 ) The following patterns - TransferReadToVectorLoadLowering - TransferWriteToVectorStoreLowering attempt to generate invalid vector.maskedload and vector.maskedstore ops for non rank-1 vector types. These ops operate on 1-D vectors. This patch adds a check to prevent this.	2023-09-28 13:06:50 +01:00
Cullen Rhodes	9816edc9f3	[mlir][vector] add result type to vector.extract assembly format (#66499 ) The vector.extract assembly format currently only contains the source type, for example: %1 = vector.extract %0[1] : vector<3x7x8xf32> it's not immediately obvious if this is the source or result type. This patch improves the assembly format to make this clearer, so the above becomes: %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>	2023-09-28 11:11:16 +01:00
Oleksandr "Alex" Zinenko	a509a18731	[mlir][vector] proper masking support for contract lowering (#67145 ) Support all known permutations when lowering masked vector.contract to vector.outerproduct, and not just the canonical permutation.	2023-09-25 13:38:46 +02:00
Diego Caballero	98f6289a34	[mlir][Vector] Add support for Value indices to vector.extract/insert `vector.extract/insert` ops only support constant indices. This PR is extending them so that arbitrary values can be used instead. This work is part of the RFC: https://discourse.llvm.org/t/rfc-psa-remove-vector-extractelement-and-vector-insertelement-ops-in-favor-of-vector-extract-and-vector-insert-ops Differential Revision: https://reviews.llvm.org/D155034	2023-09-22 00:39:32 +00:00
Benjamin Maxwell	2f11ce5579	[mlir][VectorOps] Extend vector.constant_mask to support 'all true' scalable dims (#66638 ) This extends `vector.constant_mask` so that mask dim sizes that correspond to a scalable dimension are treated as if they're implicitly multiplied by vscale. Currently this is limited to mask dim sizes of 0 or the size of the dim/vscale. This allows constant masks to represent all true and all false scalable masks (and some variations): ``` // All true scalable mask %mask = vector.constant_mask [8] : vector<[8]xi1> // All false scalable mask %mask = vector.constant_mask [0] : vector<[8]xi1> // First two scalable rows %mask = vector.constant_mask [2,4] : vector<4x[4]xi1> ```	2023-09-20 14:54:42 +01:00
Andrzej Warzyński	59fbba9490	[mlir][vector] Make ReorderElementwiseOpsOnBroadcast support vector.splat (#66596 ) Extend `ReorderElementwiseOpsOnBroadcast` so that the broadcasting op can be either `vector.broadcast` (already supported) or `vector.splat` (support added in this patch).	2023-09-20 09:56:43 +01:00
Nicolas Vasilache	04ba475e85	[mlir][Vector] Add a rewrite pattern for better low-precision ext(bit… (#66648 ) …cast) expansion This revision adds a rewrite for sequences of vector `ext(bitcast)` to use a more efficient sequence of vector operations comprising `shuffle` and `bitwise` ops. Such patterns appear naturally when writing quantization / dequantization functionality with the vector dialect. The rewrite performs a simple enumeration of each of the bits in the result vector and determines its provenance in the source vector. The enumeration is used to generate the proper sequence of `shuffle`, `andi`, `ori` with shifts. The rewrite currently only applies to 1-D non-scalable vectors and bails out if the final vector element type is not a multiple of 8. This is a failsafe heuristic determined empirically: if the resulting type is not an even number of bytes, further complexities arise that are not improved by this pattern: the heavy lifting still needs to be done by LLVM.	2023-09-18 19:02:46 +02:00
Jie Fu	dd6dde1166	[mlir][Vector] Fix -Wunused-function in VectorEmulateNarrowType.cpp (NFC) /data/llvm-project/mlir/lib/Dialect/Vector/Transforms/VectorEmulateNarrowType.cpp:229:21: error: unused function 'operator<<' [-Werror,-Wunused-function] static raw_ostream &operator<<(raw_ostream &os, ^ 1 error generated.	2023-09-18 21:47:33 +08:00
frgossen	06f9ffa050	Fix unused variable (#66644 )	2023-09-18 09:35:20 -04:00
Nicolas Vasilache	bf7c490ab7	[mlir][Vector] Add a rewrite pattern for better low-precision bitcast… (#66387 ) …(trunci) expansion This revision adds a rewrite for sequences of vector `bitcast(trunci)` to use a more efficient sequence of vector operations comprising `shuffle` and `bitwise` ops. Such patterns appear naturally when writing quantization / dequantization functionality with the vector dialect. The rewrite performs a simple enumeration of each of the bits in the result vector and determines its provenance in the pre-trunci vector. The enumeration is used to generate the proper sequence of `shuffle`, `andi`, `ori` followed by an optional final `trunci`/`extui`. The rewrite currently only applies to 1-D non-scalable vectors and bails out if the final vector element type is not a multiple of 8. This is a failsafe heuristic determined empirically: if the resulting type is not an even number of bytes, further complexities arise that are not improved by this pattern: the heavy lifting still needs to be done by LLVM.	2023-09-18 15:08:18 +02:00
Matthias Springer	5cf714bb2f	[mlir][SCF] scf.for: Consistent API around `initArgs` (#66512 ) * Always use the auto-generated `getInitArgs` function. Remove the hand-written `getInitOperands` duplicate. * Remove `hasIterOperands` and `getNumIterOperands`. The names were inconsistent because the "arg" is called `initArgs` in TableGen. Use `getInitArgs().size()` instead. * Fix verification around ops with no results.	2023-09-18 09:13:43 +02:00
Andrzej Warzyński	57cf6896cd	[mlir][vector] Fix vector.broadcast lowering for scalable vectors (#66344 ) This patch makes sure that the following case is lowered correctly ("duplication"): ``` func.func @broadcast_scalable_duplication(%arg0: vector<[32]xf32>) -> vector<1x[32]xf32> { %res = vector.broadcast %arg0 : vector<[32]xf32> to vector<1x[32]xf32> return %res : vector<1x[32]xf32> } ```	2023-09-15 16:35:47 +01:00
Christopher Bate	831041be79	[mlir][vector] Cleanup VectorUnroll and create a generic tile iteration utility This change refactors some of the utilities used to unroll larger vector computations into smaller vector computations. In fact, the indexing computations used here are rather generic and are useful in other dialects or downstream projects. Therefore, a utility for iterating over all possible tile offsets for a particular pair of static (shape, tiled shape) is introduced in IndexingUtils and replaces the existing computations in the vector unrolling transformations. This builds off of the refactoring of IndexingUtils introduced in 203fad476b7e. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D150000	2023-09-14 20:34:44 -06:00
Cullen Rhodes	f75d46a7ec	[mlir][ArmSME] Lower vector.outerproduct to FMOPA/BFMOPA (#65621 ) This patch adds support for lowering vector.outerproduct to the ArmSME MOPA intrinsic for the following types: vector<[8]xf16>, vector<[8]xf16> -> vector<[8]x[8]xf16> vector<[8]xbf16>, vector<[8]xbf16> -> vector<[8]x[8]xbf16> vector<[4]xf32>, vector<[4]xf32> -> vector<[4]x[4]xf32> vector<[2]xf64>, vector<[2]xf64> -> vector<[2]x[2]xf64> The FP variants are lowered to FMOPA (non-widening) [1] and BFloat to BFMOPA (non-widening) [2]. Note that at the ISA level these variants are implemented by different architecture features, which are listed below: FMOPA (non-widening) * half-precision - +sme2p1,+sme-f16f16 * single-precision - +sme * double-precision - +sme-f64f64 BFMOPA (non-widening) * half-precision - +sme2p1,+b16b16 There's currently no way to target different features when lowering to ArmSME. Integration tests are added for F32 and F64. We use QEMU to run the integration tests, but SME2 support isn't available yet (it's targeted for 9.0), so integration tests for these variants are excluded. Masking is currently unsupported. Depends on #65450. [1] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/FMOPA--non-widening---Floating-point-outer-product-and-accumulate- [2] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/BFMOPA--non-widening---BFloat16-floating-point-outer-product-and-accumulate-	2023-09-14 08:31:52 +01:00
Daniil Dudkin	4a831250b8	[mlir][vector] Rename vector reductions: `maxf` → `maximumf`, `minf` → `minimumf` This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671. Here, we are addressing task 2.1 from the plan, which involves renaming the vector reductions to align with the semantics of the corresponding LLVM intrinsics. Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D158618	2023-09-13 22:49:07 +00:00
Andrzej Warzyński	22f96ab6fb	[mlir][vector] Refine vector.transfer_read hoisting/forwarding (#65770 ) Make sure that when analysing a `vector.transfer_read` that's a candidate for either hoisting or store-to-load forwarding, `memref.collapse_shape` Ops are correctly included in the alias analysis. This is done by either * making sure that relevant users are taken into account, or * source Ops are correctly identified.	2023-09-12 10:33:58 +01:00
Daniil Dudkin	8a6e54c9b3	[mlir][arith] Rename operations: `maxf` → `maximumf`, `minf` → `minimumf` (#65800 ) This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671. This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.	2023-09-11 22:02:19 -07:00
Benjamin Maxwell	ccef726d09	[mlir][VectorOps] Don't drop scalable dims when lowering transfer_reads/writes (in VectorToLLVM) This is a follow-on to D158753, and allows the lowering of a transfer read/write of n-D vectors with a single trailing scalable dimension to primitive vector ops. The final conversion to LLVM depends on D158517 and D158752, without these patches type conversion will fail (or an assert is hit in the LLVM backend) if the final IR contains an array of scalable vectors. This patch adds `transform.apply_patterns.vector.lower_create_mask` which allows the lowering of vector.create_mask/constant_mask to be tested independently of --convert-vector-to-llvm. Reviewed By: c-rhodes, awarzynski, dcaballe Differential Revision: https://reviews.llvm.org/D159482	2023-09-11 16:47:51 +00:00
Benjamin Maxwell	8dffb71cba	[mlir][VectorOps] Add lowering for vector.shape_cast of scalable vectors This adds a lowering similar to the general shape_cast lowering, but instead moves elements a (scalable) subvector at a time via vector.scalable.extract/insert. It is restricted to the case where both the source and result vector types have a single trailing scalable dimension (due to limitations of the insert/extract ops). The current lowerings are now disabled for scalable vectors, as they produce incorrect results at runtime (due to assuming a fixed number of elements). Examples of casts that now work: // Flattening: %v = vector.shape_cast %arg0 : vector<4x[8]xi8> to vector<[32]xi8> // Un-flattening: %v = vector.shape_cast %arg0 : vector<[8]xi32> to vector<2x1x[4]xi32> Reviewed By: awarzynski, nicolasvasilache Differential Revision: https://reviews.llvm.org/D159217	2023-09-07 15:58:44 +00:00
Cullen Rhodes	067bd7d051	[mlir][vector] Use optional for outerproduct accumulator instead of variadic This was introduced before the Optional directive and uses Variadic, but it's really optional. Reviewed By: nicolasvasilache, benmxwl-arm, dcaballe Differential Revision: https://reviews.llvm.org/D159259	2023-09-01 05:50:01 +00:00
Benjamin Maxwell	296d5cb60c	[mlir][BuiltinTypes] Return VectorType from VectorType::Builder conversion operator 0-D vectors are now supported, so the special case of returning just the element type can now be removed. A few callers that relied on the old behaviour have been updated. Reviewed By: awarzynski, nicolasvasilache Differential Revision: https://reviews.llvm.org/D159122	2023-08-30 13:47:06 +00:00
yzhang93	f4bef787bc	Add narrow type emulation pattern for vector.transfer_read Reviewed By: mravishankar, hanchung Differential Revision: https://reviews.llvm.org/D158757	2023-08-29 13:15:19 -07:00
Lei Zhang	d243378722	[mlir][vector] Use dyn_cast in if conditions Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D158336	2023-08-22 08:27:40 -07:00
Andrzej Warzynski	f9070b2dfb	[mlir][vector] Enable CastAwayElementwiseLeadingOneDim for scalable vec This patch effectively enables the CastAwayElementwiseLeadingOneDim rewrite pattern for scalable vectors. To this end, `ExtractOp::inferReturnTypes` is updated so that scalable dimensions are correctly recognised. The change to ExtractOp will likely also make other conversion patterns valid for scalable vectors, but this patch focuses on just one case. Other conversion patterns will be enabled in the forthcoming patches. Depends on D157993 Differential Revision: https://reviews.llvm.org/D158335	2023-08-22 11:40:46 +00:00
Andrzej Warzynski	576b184d6e	[mlir][vector] Add support for scalable vectors in `trimLeadingOneDims` This patch updates one specific hook in "VectorDropLeadUnitDim.cpp" to make sure that "scalable dims" are handled correctly. While this change affects multiple patterns, I am only adding one regression test that captures one specific case that affects me right now. I am also adding Vector dialect to the list of dependencies of `-test-vector-to-vector-lowering`. Otherwise my test case won't work as a standalone test. Differential Revision: https://reviews.llvm.org/D157993	2023-08-22 08:45:59 +00:00
Lei Zhang	199442ea2c	[mlir][vector] Fix uniform transfer_read distribution If the original shape and the distributed shape are the same, we don't distribute at all--every thread is handling the whole. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D158235	2023-08-17 17:38:55 -07:00
Mahesh Ravishankar	0f8bab8d59	[mlir] Revamp implementation of sub-byte load/store emulation. When handling sub-byte emulation, the sizes of the converted `memref`s also need to be updated (this was not done in the current implementation). This adds the additional complexity of having to linearize the `memref`s as well. Consider a `memref<3x3xi4>` where the `i4` elements are packed. This has a overall size of 5 bytes (rounded up to number of bytes). This can only be represented by a `memref<5xi8>`. A `memref<3x2xi8>` would imply an implicit padding of 4 bits at the end of each row. So incorporate linearization into the sub-byte load-store emulation. This patch also updates some of the utility functions to make better use of statically available information using `OpFoldResult` and `makeComposedFoldedAffineApplyOps`. Reviewed By: hanchung, yzhang93 Differential Revision: https://reviews.llvm.org/D158125	2023-08-17 20:27:53 +00:00
Lei Zhang	73ddc4474b	[mlir][vector] Enable distribution over multiple dimensions This commit starts enabling vector distribution over multiple dimensions. It requires delinearizing the lane ID to match the expected rank. shape_cast and transfer_read now can properly handle multiple dimensions. Reviewed By: hanchung Differential Revision: https://reviews.llvm.org/D157931	2023-08-16 12:08:43 -07:00
Matthias Springer	a02ad6c177	[mlir][bufferization] Generalize getAliasingOpResults to getAliasingValues This revision is needed to support bufferization of `cf.br`/`cf.cond_br`. It will also be useful for better analysis of loop ops. This revision generalizes `getAliasingOpResults` to `getAliasingValues`. An OpOperand can now not only alias with OpResults but also with BlockArguments. In the case of `cf.br` (will be added in a later revision): a `cf.br` operand will alias with the corresponding argument of the destination block. If an op does not implement the `BufferizableOpInterface`, the analysis is conservative. It previously assumed that an OpOperand may alias with each OpResult. It now assumes that an OpOperand may alias with each OpResult and each BlockArgument of the entry block. Differential Revision: https://reviews.llvm.org/D157957	2023-08-15 15:02:47 +02:00
Andrzej Warzynski	12b4951866	[mlir][vector] Add missing support for scalable vectors This patch adds the missing logic so that the `TransferReadPermutationLowering` can be used for scalable vectors. To this end: * TransferOp custom C++ builder is updated to support scalable vectors, * `TransferOpReduceRank` is also updated to support scalable vectors. This pattern is relevant when lowering `linalg.matmul` via `vector_multi_reduction` for scalable vectors. I've also updated relevant code in `TransferOpReduceRank` not to use `llvm::to_vector` for constructing `SmallVector` from `ArrayRef`. That hook doesn't work for `ArrayRef<bool>` (*), so for consistency I switched to an explicit constructor (so that both `newShape` and `newScalableDim` are constructed in a similar fashion). (*) IIUC, that's due to how implicit narrowing conversions between `bool` and `bool` work. Note that these narrowing conversions change when using initializer lists, see https://en.cppreference.com/w/cpp/language/list_initialization. Depends on D157092 Differential Revision: https://reviews.llvm.org/D157268	2023-08-10 09:08:30 +00:00
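
The size arithmetic described in commit 0f8bab8d59 above (a packed `memref<3x3xi4>` occupies 5 bytes, whereas `memref<3x2xi8>` would imply 4 bits of padding per row) and the scale factor mentioned in commit a3af099785 can be sketched as follows. This is only an illustration under stated assumptions: the helper names `packedByteCount`, `rowPaddedByteCount`, and `emulationScale` are hypothetical and are not part of the MLIR API, and the actual patterns operate on `OpFoldResult` values rather than plain integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an MLIR API): number of bytes needed to store
// `numElements` tightly packed elements of `bitWidth` bits each, rounded up
// to a whole number of bytes.
inline int64_t packedByteCount(int64_t numElements, int64_t bitWidth) {
  assert(numElements >= 0 && bitWidth > 0 && "expected a valid count and bit width");
  return (numElements * bitWidth + 7) / 8;
}

// Hypothetical helper: number of bytes used when every row is padded to a
// whole number of bytes, i.e. the layout a non-linearized 2-D i8 memref
// would imply.
inline int64_t rowPaddedByteCount(int64_t rows, int64_t cols, int64_t bitWidth) {
  return rows * packedByteCount(cols, bitWidth);
}

// Hypothetical helper: scale factor between the emulated (old) and the
// container (new) bit width. The emulated element count is the original
// count divided by this factor.
inline int64_t emulationScale(int64_t oldBits, int64_t newBits) {
  assert(oldBits > 0 && newBits % oldBits == 0 && "new width must be a multiple of the old width");
  return newBits / oldBits;
}
```

For the `memref<3x3xi4>` case, `packedByteCount(3 * 3, 4)` gives 5 (hence `memref<5xi8>`), while `rowPaddedByteCount(3, 3, 4)` gives 6, which is the reason the commit linearizes the memref before emulation. With `emulationScale(4, 8) == 2`, eight `i4` elements map to four `i8` elements.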

284 Commits