The hoistRedundantVectorTransfers function does not verify loop
bounds when hoisting vector transfers. This is not safe in general,
since the loop may have a zero trip count. This PR uses ValueBounds
to verify that the lower bound of the loop is less than its upper
bound before hoisting. Trip-count verification is currently behind an
option `verifyNonZeroTrip`, which is false by default.
Zero trip count loops can arise in GPU code generation, where a loop
bound can be dependent on a thread id. If not all threads execute the
loop body, then hoisting out of the loop can cause these threads to
execute the transfers when they are not supposed to.
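As a hedged illustration (the loop structure and value names are made up
for this example), consider a loop whose lower bound is a thread id.
Threads with `%tid >= %ub` have a zero trip count, so hoisting the
transfer pair out of the loop would make them execute it anyway:
```mlir
// Illustrative sketch only: %tid could come from e.g. gpu.thread_id.
scf.for %i = %tid to %ub step %c1 {
  %v = vector.transfer_read %mem[%c0], %pad : memref<8xf32>, vector<8xf32>
  %r = arith.addf %v, %v : vector<8xf32>
  vector.transfer_write %r, %mem[%c0] : vector<8xf32>, memref<8xf32>
}
// Hoisting the transfer_read/transfer_write pair out of the loop would
// execute the transfers even for threads whose loop body never runs.
```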
---------
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
The pack_paddings attribute in the structured.pad TD Op is used to set
the `nofold` attribute in the generated tensor.pad Op. The current name
is confusing and suggests that there's a relation with the tensor.pack
Op. This patch renames it to `nofold_flags` to better match the actual
usage.
The SCF helper for tiling an operation implementing the TilingInterface
and greedily fusing consumers requires an uninterrupted chain of
operations implementing the tiling interface to succeed. There can be
cases with intermediate ops that don't implement the interface but have
producers that could be fused if various canonicalization/simplification
patterns could run in between fusion steps.
This adds an option to SCFTileAndFuseOptions for a pattern set to run,
between fusion steps, on the ops that result from fusion/tiling. Removed
and newly inserted slices are tracked for continued fusion applications.
See this RFC for more discussion:
https://discourse.llvm.org/t/rfc-split-fusion-portions-of-the-tilinginterface-into-a-new-interface/81155
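A minimal sketch of what this could look like (the `cleanupPatterns`
member name is an assumption based on this description, not a verified
API, and `rewriter`/`consumerOp` are stand-ins):
```c++
// Sketch: run simplification patterns between fusion steps so that
// producers hidden behind interface-less intermediate ops become fusable.
MLIRContext *context = rewriter.getContext();
RewritePatternSet cleanupPatterns(context);
tensor::ExtractSliceOp::getCanonicalizationPatterns(cleanupPatterns, context);

scf::SCFTileAndFuseOptions tileAndFuseOptions;
// Assumed member name; see SCFTileAndFuseOptions for the actual API.
tileAndFuseOptions.cleanupPatterns =
    FrozenRewritePatternSet(std::move(cleanupPatterns));
FailureOr<scf::SCFTileAndFuseResult> result =
    scf::tileConsumerAndFuseProducersUsingSCF(rewriter, consumerOp,
                                              tileAndFuseOptions);
```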
The newly added hook simply returns `false` for Ops for which there's no
"vectorization logic" in the Linalg Vectorizer (i.e. the `vectorize()`
method). It's added so that the following two TD ops expose an
identical level of functionality (that's not the case ATM):
* `transform.structured.vectorize_children_and_apply_patterns`
* `transform.structured.vectorize`
Specifically, ATM, the former works only for Linalg Ops, while the
latter works for all Ops that the vectorizer supports (*). With this
change, I am making sure that both TD ops behave consistently.
Note, this shouldn't affect any of the current uses of the vectorizer.
(*) This is implemented via the `vectorize()` method in
Vectorization.cpp.
When a contraction with a zero-filled, "identity-mapped" destination
has a linalg.add as its single user, and the add's `other` operand
dominates the contraction, replace the add by the contraction itself
with `other` as its destination.
Benefits include the elision of an elementwise op (the linalg.add) and
the removal of a tensor.empty destination, which is likely to require
an allocation upon bufferization.
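A sketch of the intended rewrite (names, shapes and the choice of
`linalg.matmul` as the contraction are illustrative):
```mlir
// Before: the matmul writes into a zero-filled destination and its only
// user is a linalg.add with %other as the other operand.
%zero = linalg.fill ins(%cst0 : f32) outs(%empty : tensor<4x4xf32>) -> tensor<4x4xf32>
%mm = linalg.matmul ins(%a, %b : tensor<4x2xf32>, tensor<2x4xf32>)
                    outs(%zero : tensor<4x4xf32>) -> tensor<4x4xf32>
%res = linalg.add ins(%mm, %other : tensor<4x4xf32>, tensor<4x4xf32>)
                  outs(%empty2 : tensor<4x4xf32>) -> tensor<4x4xf32>

// After: the add is gone and %other becomes the contraction's destination.
%res = linalg.matmul ins(%a, %b : tensor<4x2xf32>, tensor<2x4xf32>)
                     outs(%other : tensor<4x4xf32>) -> tensor<4x4xf32>
```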
Group all patterns that re-order vector.transpose and vector.broadcast
Ops (*) under `populateSinkVectorOpsPatterns`. These patterns are
normally used to "sink" redundant Vector Ops, hence the grouping.
Example:
```mlir
%at = vector.transpose %a, [1, 0] : vector<4x2xf32> to vector<2x4xf32>
%bt = vector.transpose %b, [1, 0] : vector<4x2xf32> to vector<2x4xf32>
%r = arith.addf %at, %bt : vector<2x4xf32>
```
would get converted to:
```mlir
%0 = arith.addf %a, %b : vector<4x2xf32>
%r = vector.transpose %0, [1, 0] : vector<4x2xf32> to vector<2x4xf32>
```
This patch also moves all tests for these patterns so that all of them
are:
* run under one test-flag: `test-vector-sink-patterns`,
* located in one file: "vector-sink.mlir".
To facilitate this change:
* `-test-sink-vector-broadcast` is renamed as
`test-vector-sink-patterns`,
* "sink-vector-broadcast.mlir" is renamed as "vector-sink.mlir",
* tests for `ReorderCastOpsOnBroadcast` and
`ReorderElementwiseOpsOnTranspose` patterns are moved from
"vector-reduce-to-contract.mlir" to "vector-sink.mlir",
* `ReorderElementwiseOpsOnTranspose` patterns are removed from
`populateVectorReductionToContractPatterns` and added to (newly
created) `populateSinkVectorOpsPatterns`,
* `ReorderCastOpsOnBroadcast` patterns are removed from
`populateVectorReductionToContractPatterns` - these are already
present in `populateSinkVectorOpsPatterns`.
This should give us better layering and more straightforward testing.
For the latter, the goal is to be able to easily identify which pattern
a particular test is exercising (especially when it's a specific
pattern).
NOTES FOR DOWNSTREAM USERS
In order to preserve the current functionality, please make sure to add
`populateSinkVectorOpsPatterns` wherever you are using
`populateVectorReductionToContractPatterns`. Also, rename
`populateSinkVectorBroadcastPatterns` to `populateSinkVectorOpsPatterns`.
(*) I didn't notice any other re-order patterns.
In order to support arbitrarily sized input data for conv2d, implement
the TilingInterface for Winograd operations. Before converting Winograd
operations into nested loops with matrix multiplies, first tile the
input of conv2d to a supported size.
Add a transform operation structured.decompose_winograd_op to decompose
Winograd operations. Before applying the transform op, use
tile_using_for to tile the input data to a supported size. The test case
shows how to tile and decompose Winograd operations.
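A hedged sketch of how this might be driven from the transform dialect
(the exact operand/result signature of the new op is assumed to follow
the common single-handle form, and the tile size is illustrative):
```mlir
// Tile the Winograd op to a supported size first, then decompose it.
%tiled, %loop = transform.structured.tile_using_for %winograd tile_sizes [4]
  : (!transform.any_op) -> (!transform.any_op, !transform.any_op)
%decomposed = transform.structured.decompose_winograd_op %tiled
  : (!transform.any_op) -> !transform.any_op
```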
The implementations of these methods are legacy, and they are removed in
favor of the `scf::tileUsingSCF` methods as replacements. To get the
latter on par with the requirements of the deprecated methods, the
tiling now allows one to specify the maximum number of tiles to use
instead of specifying the tile sizes. When tiling to `scf.forall`, this
specification is used to generate the `num_threads` version of the
operation.
A slight deviation from the previous implementation is that the
deprecated method always generated the `num_threads` variant of the
`scf.forall` operation. Now this is instead driven by the tiling options
specified. This reduces the indexing math generated when the tile sizes
are specified.
**Moving from `linalg::tileToForallOp` to `scf::tileUsingSCF`**
```
OpBuilder b;
TilingInterface op;
ArrayRef<OpFoldResult> numThreads;
ArrayAttr mapping;
FailureOr<ForallTilingResult> result = linalg::tileToForallOp(b, op, numThreads, mapping);
```
can be replaced by
```
scf::SCFTilingOptions options;
options.setNumThreads(numThreads);
options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
options.setMapping(mapping.getValue()); /* note: setMapping takes an ArrayRef<Attribute> */
FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
```
This generates the `numThreads` version of the `scf.forall` for the
inter-tile loops, i.e.
```
... = scf.forall (%arg0, %arg1) in (%nt0, %nt1) shared_outs(...)
```
**Moving from `linalg::tileToForallOpUsingTileSizes` to
`scf::tileUsingSCF`**
```
OpBuilder b;
TilingInterface op;
ArrayRef<OpFoldResult> tileSizes;
ArrayAttr mapping;
FailureOr<ForallTilingResult> result = linalg::tileToForallOpUsingTileSizes(b, op, tileSizes, mapping);
```
can be replaced by
```
scf::SCFTilingOptions options;
options.setTileSizes(tileSizes);
options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
options.setMapping(mapping.getValue()); /* note: setMapping takes an ArrayRef<Attribute> */
FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
```
Also note that `linalg::tileToForallOpUsingTileSizes` would effectively
call `linalg::tileToForallOp` by computing the `numThreads` from the
`op` and `tileSizes`, and generate the `numThreads` version of the
`scf.forall`. That is no longer the case. Instead, this will directly
generate the `tileSizes` version of the `scf.forall` op
```
... = scf.forall (%arg0, %arg1) = (%lb0, %lb1) to (%ub0, %ub1) step(%step0, %step1) shared_outs(...)
```
If you actually want to use the `numThreads` version, it is up to the
caller to compute the `numThreads` and call `options.setNumThreads`
instead of `options.setTileSizes`. Note that there is a slight
difference between the `numThreads` version and the tile-size version:
the former requires an additional `affine.max` on the tile size to
ensure non-negative tile sizes. When lowering to the `numThreads`
version, this `affine.max` is not needed since, by construction, the
tile sizes are non-negative. In the previous implementation, the
`numThreads` version generated by the
`linalg::tileToForallOpUsingTileSizes` method would avoid generating the
`affine.max` operation. To get to the same state, downstream users will
have to additionally normalize the `scf.forall` operation.
**Changes to `transform.structured.tile_using_forall`**
The transform dialect op that called into `linalg::tileToForallOp` and
`linalg::tileToForallOpUsingTileSizes` has been modified to call
`scf::tileUsingSCF`. The transform dialect op always generates the
`numThreads` version of the `scf.forall` op. So when `tile_sizes` are
specified for the transform dialect op, the `tile_sizes` version of the
`scf.forall` is first generated by the `scf::tileUsingSCF` method and
then normalized to get back to the same state. So there is no functional
change to `transform.structured.tile_using_forall`. It always generates
the `numThreads` version of the `scf.forall` op (as it did before this
change).
---------
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
This patch adds a check for the correct number of `loops` results of the
`transform.structured.tile_using_for` Op to the verifier, fixing a
crash.
Fix https://github.com/llvm/llvm-project/issues/98008
Detected with ASAN. `Operation::getLoc()` was called after erasing the
operation.
Reverts 48cf6b6bbe7a22bfcd98f82dc7afd21c9decd22f, which attempted to fix
the use-after-free. (But the use-after-free is still there when the
`hasFailed` branch is taken.)
This patch enables continuous tiling of a target structured op using
diminishing tile sizes. In cases where the tensor dimensions are not
exactly divisible by the tile size, we are left with leftover tensor
chunks that are irregularly tiled. This approach enables tiling of the
leftover chunk with a smaller tile size and repeats this process
recursively using exponentially diminishing tile sizes. This eventually
generates a chain of loops that apply tiling using diminishing tile
sizes.
Adds a `continuous_tile_sizes` op to the transform dialect. This op,
when given a tile size and a dimension, computes a series of diminishing
tile sizes that can be used to tile the target along the given
dimension. Additionally, this op also generates a series of chunk sizes
that the corresponding tile sizes should be applied to along the given
dimension. For illustration, a dimension of size 147 with an initial
tile size of 32 could be covered by chunks of sizes 128, 16, 2 and 1,
tiled with sizes 32, 16, 2 and 1 respectively.
Adds a `multiway` attribute to `transform.structured.split` that enables
multiway splitting of a single target op along the given dimension, as
specified in a list enumerating the chunk sizes.
This patch adds more precise side effects to the current ops with memory
effects, allowing us to determine which OpOperand/OpResult/BlockArgument
an operation reads or writes, rather than just recording the reading and
writing of values. This makes it convenient to use precise side effects
for analysis and optimization.
Related discussions:
https://discourse.llvm.org/t/rfc-add-operandindex-to-sideeffect-instance/79243
The `TilingInterface` methods have return values that allow the
interface implementation to return multiple operations, and also to
return the tiled values explicitly. This avoids the assumption that the
interface returns a single operation whose results are the expected
tiled values. Make
`PartialReductionOpInterface::tileToPartialReduction` return
`TilingResult` as well for the same reason.
Similarly, make `PartialReductionOpInterface::mergeReductions` also
return a list of generated operations and values to use as replacements.
This is just a refactoring to allow for the deprecation of
`linalg::tileReductionUsingForall` in favor of the
`scf::tileReductionUsingSCF` method.
This patch adds support for reducing operations with multiple results
using PartialReductionOpInterface. Also adds an implementation of
PartialReductionOpInterface for multiple results for linalg.generic.
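For illustration, a multi-result reduction of the kind this enables
(a sketch; shapes and body are made up):
```mlir
// Computes a row-wise sum and max in one linalg.generic with two results.
%res:2 = linalg.generic {
    indexing_maps = [affine_map<(d0, d1) -> (d0, d1)>,
                     affine_map<(d0, d1) -> (d0)>,
                     affine_map<(d0, d1) -> (d0)>],
    iterator_types = ["parallel", "reduction"]}
    ins(%in : tensor<4x8xf32>) outs(%sum, %max : tensor<4xf32>, tensor<4xf32>) {
  ^bb0(%e : f32, %s : f32, %m : f32):
    %0 = arith.addf %e, %s : f32
    %1 = arith.maximumf %e, %m : f32
    linalg.yield %0, %1 : f32, f32
} -> (tensor<4xf32>, tensor<4xf32>)
```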
This patch is a first pass at making the syntax consistent across the
`LinalgTransformOp`s that use dynamic index lists for size parameters.
Previously, there were two different forms: inline the types in the
list, or place them in the functional-style tuple. This patch goes for
the latter.
In order to do this, the `printPackedOrDynamicIndexList`,
`printDynamicIndexList` and their `parse` counterparts were modified so
that the types can be optionally provided to the corresponding custom
directives.
All affected ops now use tablegen `assemblyFormat`, so custom
`parse`/`print` functions have been removed. There are a couple of ops
that will likely add dynamic size support, and once that happens, care
should be taken that the assembly remains consistent with the changes in
this patch.
The affected ops are as follows: `pack`, `pack_greedily`,
`tile_using_forall`. The `tile_using_for` and `vectorize` ops already
used this syntax, but their custom assembly was removed.
---------
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
This patch modifies the definition of `PadOp` to take transform params
and handles for the `pad_to_multiple_of` operand.
---------
Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
This adds patterns to convert the Linalg matmul and batch_matmul ops to
their transposed variants. By default, the LHS matrix is transposed.
Our work enabling a lowering path from linalg.matmul to ArmSME has
revealed that the current lowering results in non-contiguous memory
accesses for the A matrix and very poor performance.
These patterns provide a simple option to fix this.
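A sketch of the rewrite (shapes illustrative; `%empty` stands for a
suitably shaped init tensor):
```mlir
// Before:
%mm = linalg.matmul ins(%A, %B : tensor<16x8xf32>, tensor<8x32xf32>)
                    outs(%C : tensor<16x32xf32>) -> tensor<16x32xf32>

// After (the LHS is transposed by default):
%At = linalg.transpose ins(%A : tensor<16x8xf32>)
                       outs(%empty : tensor<8x16xf32>) permutation = [1, 0]
%mm = linalg.matmul_transpose_a ins(%At, %B : tensor<8x16xf32>, tensor<8x32xf32>)
                                outs(%C : tensor<16x32xf32>) -> tensor<16x32xf32>
```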
The previous implementation inappropriately used `notifyMatchFailure` to
emit a failure message and then used
`emitDefaultSilenceableFailure`. This patch changes this to use the more
appropriate `emitSilenceableFailure` with an error message. Additionally,
a failure test has been added.
The previous implementation did an early successful return on
`rank <= 1` without adding the original op to the transform results.
This resulted in errors about the number of returns. This patch fixes
this by adding the original op to the results. Additionally, we first
check whether the op is elementwise and return a silenceable failure
early if not.
Transform interfaces are implemented, directly or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and a transitive dependency
footprint. Split out the interfaces into a separate library. This in
turn requires flipping the dependency of the interface on the dialect,
which had crept in because both co-existed in one library. The interface
shouldn't depend on the transform dialect either.
As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
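For example, the implicit walk can be replaced with an explicit match
(a sketch; `%root` is an illustrative handle name):
```mlir
// Explicitly collect payload ops of the desired kind instead of relying
// on the interpreter to walk the payload IR based on the entry-point type.
%matmuls = transform.structured.match ops{["linalg.matmul"]} in %root
  : (!transform.any_op) -> !transform.any_op
```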
This lets `transform.structured.convert_to_loops` return handles to the
generated loops, making this transformation more useful for
(transformation-)nesting purposes. This is modelled after SCF's
`transform.loop.forall_to_for`, which returns handles to loops.
Introduced in commit aa2a96a24ae3a8cc04635ab6ede474c5f2665053 with a
note that the op might move out of the `Linalg` dialect, but no reason
was given for not returning handles. As far as I can see, this transform
always produces loops.
A `transform.structured.flatten_elementwise` op is implemented for
flattening the iteration space and (applicable) operands/results to a
single dimension.
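A hedged usage sketch (assuming the op follows the common single-handle
transform form; the signature is not verified here):
```mlir
%flattened = transform.structured.flatten_elementwise %target
  : (!transform.any_op) -> !transform.any_op
```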
Added support for vectorizing tensor.unpack. The unpack Op is split into
a `vector.transfer_read`, `vector.transpose`, `vector.shape_cast` and a
`vector.transfer_write`.
Rename listener callback names:
* `notifyOperationRemoved` -> `notifyOperationErased`
* `notifyBlockRemoved` -> `notifyBlockErased`
The current callback names are misnomers. The callbacks are triggered
when an operation/block is erased, not when it is removed (unlinked).
E.g.:
```c++
/// Notify the listener that the specified operation is about to be erased.
/// At this point, the operation has zero uses.
///
/// Note: This notification is not triggered when unlinking an operation.
virtual void notifyOperationErased(Operation *op) {}
```
This change is in preparation for adding listener support to the dialect
conversion. The dialect conversion internally unlinks IR before erasing
it at a later point in time. There is an important difference between
"remove" and "erase"; listener callback names should be accurate to
avoid confusion.
This PR adds a direct vectorization lowering of `tensor.pack` into
`mask(vector.transfer_read)`->`vector.shape_cast`->`vector.transpose`->`vector.transfer_write`.
Using `LoopLikeOpInterface` as the basis for the implementation unifies
all the tiling logic for both `scf.for` and `scf.forall`. The only
difference is the actual loop generation. This is a follow-up to
https://github.com/llvm/llvm-project/pull/72178
Instead of many entry points for each loop type, the loop type is now
passed as part of the options passed to the tiling method.
This is a breaking change, with the following changes:
1) The `scf::tileUsingSCFForOp` is renamed to `scf::tileUsingSCF`
2) The `scf::tileUsingSCFForallOp` is deprecated. The same
functionality is obtained by using `scf::tileUsingSCF` and setting
the loop type in `scf::SCFTilingOptions` passed into this method to
`scf::SCFTilingOptions::LoopType::ForallOp` (using the
`setLoopType` method).
3) The `scf::tileConsumerAndFusedProducerGreedilyUsingSCFForOp` is
renamed to `scf::tileConsumerAndFuseProducerUsingSCF`. The use of
the `controlFn` in `scf::SCFTileAndFuseOptions` allows implementing
any strategy, with the default callback implementing the greedy fusion.
4) The `scf::SCFTilingResult` and `scf::SCFTileAndFuseResult` now use
`SmallVector<LoopLikeOpInterface>`.
5) To make `scf::ForallOp` implement the needed parts of
`LoopLikeOpInterface`, the `getOutputBlockArguments()` method is
replaced with `getRegionIterArgs()`.
These changes now bring the tiling and fusion capabilities using
`scf.forall` on par with what was already supported by `scf.for`.
The pattern rewriter documentation states that "*all* IR mutations [...]
are required to be performed via the `PatternRewriter`." This commit
adds two functions that were missing from the rewriter API:
`moveOpBefore` and `moveOpAfter`.
After an operation was moved, the `notifyOperationInserted` callback is
triggered. This allows listeners such as the greedy pattern rewrite
driver to react to IR changes.
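A minimal sketch of using the new API inside a pattern (the op and
operand names are illustrative):
```c++
// Move the operand's producer via the rewriter rather than mutating the
// IR directly; this triggers `notifyOperationInserted` on any listener.
LogicalResult matchAndRewrite(SomeOp op,
                              PatternRewriter &rewriter) const override {
  Operation *producer = op.getOperand().getDefiningOp();
  if (!producer)
    return failure();
  rewriter.moveOpBefore(producer, op);
  return success();
}
```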
This commit narrows the discrepancy between the kind of IR modification
that can be performed and the kind of IR modifications that can be
listened to.
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
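For instance, an in-place update under the renamed API (a sketch; the
attribute name is illustrative):
```c++
// Previously spelled `rewriter.updateRootInPlace(op, ...)`.
rewriter.modifyOpInPlace(op, [&] {
  op->setAttr("some_attr", rewriter.getUnitAttr());
});
```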
In the process, a couple of test transform dialect ops are added just
for testing. These operations are not intended to be used as fully
fleshed-out transformation ops, but are rather operations added for
testing.
A separate operation is added to `LinalgTransformOps.td` to convert a
`TilingInterface` operation to loops using the
`generateScalarImplementation` method implemented by the
operation. Eventually, this and other operations related to tiling
using the `TilingInterface` need to move to a better place (i.e., out
of the `Linalg` dialect).
Adjust the silenceable failure message as we lower `tensor.unpack` as a
combination of `linalg.transpose` + `tensor.collapse_shape` and
`tensor.extract_slice`.
The test was failing due to a different transform sequence declaration (an unnamed transform sequence was used, while it should now be a named transform sequence). The test is now fixed.
The current vectorization of 1D depthwise convolutions in Linalg is
_sub-optimal_ for tensors with a low number of channels, e.g.:
```mlir
linalg.depthwise_conv_1d_nwc_wc
{dilations = dense<1> : vector<1xi64>,
strides = dense<1> : vector<1xi64>}
ins(%input, %filter : tensor<1x8x3xi8>, tensor<1x3xi8>)
outs(%output : tensor<1x8x3xi8>) -> tensor<1x8x3xi8>
```
That's due to the fact that ultimately (i.e. at the LLVM level),
vectorization happens along the trailing dimension (i.e. the channel
dimension). In this case it leads to vectors with 3 elements (or worse,
e.g. if there's only 1 channel). For comparison, a 128-bit wide vector
register can hold 16 x i8.
Instead, this patch adds an option to flatten/collapse the channel
dimension into the width dimension of the input/filter/output using the
`vector.shape_cast` operation:
```mlir
%sc_input = vector.shape_cast %input : vector<1x8x3xi8> to vector<1x24xi8>
%sc_output = vector.shape_cast %output : vector<1x8x3xi8> to vector<1x24xi8>
%b_filter = vector.broadcast %filter : vector<3xi8> to vector<1x8x3xi8>
%sc_filter = vector.shape_cast %b_filter : vector<1x8x3xi8> to vector<1x24xi8>
```
This new vectorization mode is implemented in `depthwiseConv` by
inserting `vector.shape_cast` Ops before and after
`depthwiseConv1dSliceAsMulAcc` is invoked. It can be selected through
e.g. a transform dialect attribute:
```mlir
transform.structured.vectorize_children_and_apply_patterns %conv {flatten_1d_depthwise_conv}
```
A forthcoming patch will implement a strategy to automatically switch
between the two implementations, depending on the shape of the input
tensors.
Co-authored-by: Bradley Smith <bradley.smith@arm.com>
`TileUsingForOp` has an optional `interchange` Attribute, which was
given in curly braces like this: `{interchange = [...]}`. The way this
was parsed meant that no `attr-dict` could be attached to the Op.
This patch adds printing/parsing of an `attr-dict` to the Op and
prints/parses the `interchange` Attribute separately from the
discardable Attributes.