45 Commits

Author SHA1 Message Date
Max191
f1595ecfdc
[mlir] Fix bug in UnPackOp tiling implementation causing infinite loop (#113571)
This fixes a bug in the tiling implementation of tensor.unpack that was
causing an infinite loop when certain unpack ops get tiled and fused as
a producer. The tiled implementation of tensor.unpack sometimes needs to
create an additional tensor.extract_slice on the result of the tiled
unpack op, but this slice was getting added to the `generatedSlices` of
the tiling result. The `generatedSlices` are used to find the next
producers to fuse, so it caused an infinite loop of fusing the same
unpack op after it was already in the loop. This fixes the bug by adding
the slice of the source instead of the result.

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
2024-10-24 21:32:45 -04:00
Abhishek Varma
b8c974f093
[MLIR][TilingInterface] Extend consumer fusion for multi-use of producer shared by terminator ops (#110105)
-- This commit extends consumer fusion to take place even if the
producer has multiple uses.
-- The multiple uses of the producer essentially means that besides the
consumer op in concern, the only other uses of the producer are
allowed in :-
   1. scf.yield
   2. tensor.parallel_insert_slice

Signed-off-by: Abhishek Varma <abhvarma@amd.com>
2024-09-30 14:51:06 +05:30
MaheshRavishankar
d5f0969c96
[mlir][TilingInterface] Avoid looking at operands for getting slices to continue tile + fuse. (#107882)
Current implementation of `scf::tileConsumerAndFuseProducerUsingSCF`
looks at operands of tiled/tiled+fused operations to see if they are
produced by `extract_slice` operations to populate the worklist used to
continue fusion. This implicit assumption does not always work. Instead
make the implementations of `getTiledImplementation` return the slices
to use to continue fusion.

This is a breaking change

- To continue to get the same behavior of
`scf::tileConsumerAndFuseProducerUsingSCF`, change all out-of-tree
implementation of `TilingInterface::getTiledImplementation` to return
the slices to continue fusion on. All in-tree implementations have been
adapted to this.
- This change touches parts that required a simplification to the
`ControlFn` in `scf::SCFTileAndFuseOptions`. It now returns a
`std::optional<scf::SCFTileAndFuseOptions::ControlFnResult>` object that
should be `std::nullopt` if fusion is not to be performed.

Signed-off-by: MaheshRavishankar <mahesh.revishankar@gmail.com>
2024-09-11 22:15:43 -07:00
Yun-Fly
a9ba1b6dd5
[mlir][scf] Extend consumer fuse to single nested scf.for (#108318)
Refactor current consumer fusion based on `addInitOperandsToLoopNest` to support single nested `scf.for`, E.g.

```
%0 = scf.for() {
  %1 = scf.for() {
     tiledProducer
  }
  yield %1
}
%2 = consumer ins(%0)
```

Compared with #94190, this PR fix build failure by making C++17 happy.
2024-09-12 12:01:23 +08:00
Kazu Hirata
335538c271 Revert "[mlir][scf] Extend consumer fuse to single nested scf.for (#94190)"
This reverts commit 2d4bdfba96d4cf88b12226b2b511bf55ee5e6559.

A build breakage is reported at:

https://lab.llvm.org/buildbot/#/builders/138/builds/3524
2024-09-11 19:18:37 -07:00
Yun-Fly
2d4bdfba96
[mlir][scf] Extend consumer fuse to single nested scf.for (#94190)
Refactor current consumer fusion based on `addInitOperandsToLoopNest` to support single nested `scf.for`, E.g.

```
%0 = scf.for() {
  %1 = scf.for() {
     tiledProducer
  }
  yield %1
}
%2 = consumer ins(%0)
```
2024-09-12 10:02:57 +08:00
Yun-Fly
c8763f04bf
[mlir][tensor] Fix consumer fusion for tensor.pack without explicit outer_dims_perm attribute (#106687) 2024-09-04 09:19:09 +08:00
Yun-Fly
f06563a5c0
[mlir][tensor] Add consumer fusion for tensor.pack op. (#103715)
Add missing `getIterationDomainTileFromOperandTile` and `getTiledImplementationFromOperandTile` to `tensor.pack` and enable fusing it as a consumer. NOTE that, it only expects perfect tiling scenario without padding semantic currently.
2024-08-23 10:07:17 +08:00
MaheshRavishankar
6740d701bd
[mlir][Linalg] Deprecate linalg::tileToForallOp and linalg::tileToForallOpUsingTileSizes (#91878)
The implementation of these methods are legacy and they are removed in
favor of using the `scf::tileUsingSCF` methods as replacements. To get
the latter on par with requirements of the deprecated methods, the
tiling allows one to specify the maximum number of tiles to use instead
of specifying the tile sizes. When tiling to `scf.forall` this
specification is used to generate the `num_threads` version of the
operation.

A slight deviation from previous implementation is that the deprecated
method always generated the `num_threads` variant of the `scf.forall`
operation. Instead now this is driven by the tiling options specified.
This reduces the indexing math generated when the tile sizes are
specified.

**Moving from `linalg::tileToForallOp` to `scf::tileUsingSCF`**

```
OpBuilder b;
TilingInterface op;
ArrayRef<OpFoldResult> numThreads;
ArrayAttr mapping;
FailureOr<ForallTilingResult> result =linalg::tileToForallOp(b, op, numThreads, mapping);
```

can be replaced by
```
scf::SCFTilingOptions options;
options.setNumThreads(numThreads);
options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
options.setMapping(mapping.getValue()); /*note the difference that setMapping takes an ArrayRef<Attribute> */
FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
```

This generates the `numThreads` version of the `scf.forall` for the
inter-tile loops, i.e.

```
... = scf.forall (%arg0, %arg1) in (%nt0, %nt1) shared_outs(...)
```

**Moving from `linalg::tileToForallOpUsingTileSizes` to
`scf::tileUsingSCF`**

```
OpBuilder b;
TilingInterface op;
ArrayRef<OpFoldResult> tileSizes;
ArrayAttr mapping;
FailureOr<ForallTilingResult> result =linalg::tileToForallOpUsingTileSizes(b, op, tileSizes, mapping);
```

can be replaced by
```
scf::SCFTilingOptions options;
options.setTileSizes(tileSizes);
options.setLoopType(scf::SCFTilingOptions::LoopType::ForallOp);
options.setMapping(mapping.getValue()); /*note the difference that setMapping takes an ArrayRef<Attribute> */
FailureOr<scf::SCFTilingResult> result = scf::tileUsingSCF(b, op, options);
```

Also note that `linalg::tileToForallOpUsingTileSizes` would effectively
call the `linalg::tileToForallOp` by computing the `numThreads` from the
`op` and `tileSizes` and generate the `numThreads` version of the
`scf.forall`. That is not the case anymore. Instead this will directly
generate the `tileSizes` version of the `scf.forall` op

```
... = scf.forall(%arg0, %arg1) = (%lb0, %lb1) to (%ub0, %ub1) step(%step0, %step1) shared_outs(...)
```

If you actually want to use the `numThreads` version, it is upto the
caller to compute the `numThreads` and set `options.setNumThreads`
instead of `options.setTileSizes`. Note that there is a slight
difference in the num threads version and tile size version. The former
requires an additional `affine.max` on the tile size to ensure
non-negative tile sizes. When lowering to `numThreads` version this
`affine.max` is not needed since by construction the tile sizes are
non-negative. In previous implementations, the `numThreads` version
generated when using the `linalg::tileToForallOpUsingTileSizes` method
would avoid generating the `affine.max` operation. To get the same
state, downstream users will have to additionally normalize the
`scf.forall` operation.

**Changes to `transform.structured.tile_using_forall`**

The transform dialect op that called into `linalg::tileToForallOp` and
`linalg::tileToForallOpUsingTileSizes` have been modified to call
`scf::tileUsingSCF`. The transform dialect op always generates the
`numThreads` version of the `scf.forall` op. So when `tile_sizes` are
specified for the transform dialect op, first the `tile_sizes` version
of the `scf.forall` is generated by the `scf::tileUsingSCF` method which
is then further normalized to get back to the same state. So there is no
functional change to `transform.structured.tile_using_forall`. It always
generates the `numThreads` version of the `scf.forall` op (as it did
before this change).

---------

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2024-07-31 12:32:07 -07:00
Yun-Fly
7ef08eacd5
[mlir][scf] Extend option to yield replacement for multiple results case (#93144)
This patch extends the functionality of yielding replacement for multiple 
results case and adds another optional argument called `yieldResultNumber` 
indicating which result(s) need yield. If not given, all of results will be yield 
by default.
2024-06-28 20:43:52 +08:00
Abhishek Varma
2b2ce50fe8
[MLIR][SCF] Add an API to fuse consumer to a producer within scf loop (#88712)
This commit adds an API (`tileAndFuseConsumerOfSlice`) to fuse consumer to a producer within
scf.for/scf.forall loop.

To support this two new methods are added to the `TilingInterface`
- `getIterationDomainTileFromOperandTile`
- `getTiledImplementationFromOperandTile`.

Consumer operations that implement this method can be used to be fused with tiled producer operands in a manner similar to (but essentially the inverse of) the fusion of an untiled producer with a tiled consumer.

Note that this only does one `tiled producer` -> `consumer` fusion. This could be called repeatedly for fusing multiple consumers. The current implementation also is conservative in when this kicks in (like single use of the value returned by the inter-tile loops that surround the tiled producer, etc.) These can be relaxed over time.

Signed-off-by: Abhishek Varma <abhvarma@amd.com>

---------

Signed-off-by: Abhishek Varma <abhvarma@amd.com>
Signed-off-by: Abhishek Varma <avarma094@gmail.com>
Co-authored-by: cxy <chenxunyu1993@gmail.com>
2024-06-01 11:23:41 -07:00
srcarroll
2c1c67674c
[mlir][transform] Consistent linalg transform op syntax for dynamic index lists (#90897)
This patch is a first pass at making consistent syntax across the
`LinalgTransformOp`s that use dynamic index lists for size parameters.
Previously, there were two different forms: inline types in the list, or
place them in the functional style tuple. This patch goes for the
latter.

In order to do this, the `printPackedOrDynamicIndexList`,
`printDynamicIndexList` and their `parse` counterparts were modified so
that the types can be optionally provided to the corresponding custom
directives.

All affected ops now use tablegen `assemblyFormat`, so custom
`parse`/`print` functions have been removed. There are a couple ops that
will likely add dynamic size support, and once that happens it should be
made sure that the assembly remains consistent with the changes in this
patch.

The affected ops are as follows: `pack`, `pack_greedily`,
`tile_using_forall`. The `tile_using_for` and `vectorize` ops already
used this syntax, but their custom assembly was removed.

---------

Co-authored-by: Oleksandr "Alex" Zinenko <ftynse@gmail.com>
2024-05-08 09:11:53 -05:00
lhunloh
47bc565ca7
[MLIR] [Transforms] Let transform.structured.convert_to_loops return handles to loops (#83984)
This lets `transform.structured.convert_to_loops` return handles to the
generated loops, making this transformation more useful to use for
(transformation-)nesting purposes. This is modelled after SCFs
`transform.loop.forall_to_for` which returns handles to loops.

Introduced in commit aa2a96a24ae3a8cc04635ab6ede474c5f2665053, with a
note that they might move out of the `Linalg`-Dialect, but no reason
given for the non-return of handles. As far as I can see, this transform
always returns loops.
2024-03-06 17:07:30 -05:00
Congcong Cai
0597644a64
[mlir][transform] replace original op to loop ops (#83537) 2024-03-05 03:58:12 +08:00
MaheshRavishankar
76ead96c1d
[mlir][TilingInterface] Use LoopLikeOpInterface in tiling using SCF to unify tiling with scf.for and scf.forall. (#77874)
Using `LoopLikeOpInterface` as the basis for the implementation unifies
all the tiling logic for both `scf.for` and `scf.forall`. The only
difference is the actual loop generation. This is a follow up to
https://github.com/llvm/llvm-project/pull/72178
Instead of many entry points for each loop type, the loop type is now
passed as part of the options passed to the tiling method.

This is a breaking change with the following changes

1) The `scf::tileUsingSCFForOp` is renamed to `scf::tileUsingSCF`
2) The `scf::tileUsingSCFForallOp` is deprecated. The same
   functionality is obtained by using `scf::tileUsingSCF` and setting
   the loop type in `scf::SCFTilingOptions` passed into this method to
   `scf::SCFTilingOptions::LoopType::ForallOp` (using the
   `setLoopType` method).
3) The `scf::tileConsumerAndFusedProducerGreedilyUsingSCFForOp` is
   renamed to `scf::tileConsumerAndFuseProducerUsingSCF`. The use of
   the `controlFn` in `scf::SCFTileAndFuseOptions` allows implementing
   any strategy with the default callback implemeting the greedy fusion.
4) The `scf::SCFTilingResult` and `scf::SCFTileAndFuseResult` now use
   `SmallVector<LoopLikeOpInterface>`.
5) To make `scf::ForallOp` implement the parts of
   `LoopLikeOpInterface` needed, the `getOutputBlockArguments()`
   method is replaced with `getRegionIterArgs()`

These changes now bring the tiling and fusion capabilities using
`scf.forall` on par with what was already supported by `scf.for`
2024-01-25 21:26:23 -08:00
MaheshRavishankar
974ded9725
[mlir][Linalg] Change linalg.transpose to use the output indexing map as identity. (#77951)
This makes it consistent with how other linalg operations represent
indexing maps.
2024-01-12 14:17:51 -08:00
MaheshRavishankar
aa2a96a24a
[mlir][TilingInterface] Move TilingInterface tests to use transform dialect ops. (#77204)
In the process a couple of test transform dialect ops are added just
for testing. These operations are not intended to use as full flushed
out of transformation ops, but are rather operations added for testing.

A separate operation is added to `LinalgTransformOps.td` to convert a
`TilingInterface` operation to loops using the
`generateScalarImplementation` method implemented by the
operation. Eventually this and other operations related to tiling
using the `TilingInterface` need to move to a better place (i.e. out
of `Linalg` dialect)
2024-01-11 21:31:03 -08:00
MaheshRavishankar
4a020018ce
[NFC] Simplify the tiling implementation using cloning. (#72178)
The current implementation of tiling using `scf.for` is convoluted to
make sure that the destination passing style of the untiled program is
preserved. The addition of support to tile using `scf.forall` (adapted
from the transform operation in Linalg) in
https://github.com/llvm/llvm-project/pull/67083 used cloning of the
tiled operations to better streamline the implementation. This PR adapts
the other tiling methods to use a similar approach, making the
transformations (and handling destination passing style semantics) more
systematic.

---------

Co-authored-by: Abhishek-Varma <avarma094@gmail.com>
2023-11-20 09:05:48 -08:00
MaheshRavishankar
d871daea81
[mlir][TilingInterface] Add scf::tileUsingSCFForallOp method to tile using the interface to generate scf::forall. (#67083)
Similar to `scf::tileUsingSCFForOp` that is a method that tiles
operations that implement the `TilingInterface`, using `scf.for`
operations, this method introduces tiling of operations using
`scf.forall`. Most of this implementation is derived from
`linalg::tileToForallOp` method. Eventually that method will either be
deprecated or moved to use the method introduced here.
2023-10-19 23:21:45 -07:00
MaheshRavishankar
93c42299bd
[mlir][TilingInterface] NFC code changes separated out from introduction of scf::tileUsingSCFForallop. (#67081)
This patch contains NFC changes that are precursor to the introduction
of `scf::tileUsingSCFForallOp` method introduced in
https://github.com/llvm/llvm-project/pull/67083.
2023-09-26 13:42:27 -07:00
Daniil Dudkin
8a6e54c9b3
[mlir][arith] Rename operations: maxfmaximumf, minfminimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
2023-09-11 22:02:19 -07:00
Matthias Springer
2268459985 [mlir][Interfaces] TilingInterface: Add test case for linalg.copy on memrefs
Differential Revision: https://reviews.llvm.org/D153347
2023-06-21 08:52:25 +02:00
Mahesh Ravishankar
9db7d4edd8 [mlir][TilingInterface] Add an option to tile and fuse to yield replacement for the fused producer.
This patch adds an option to the method that fuses a producer with a
tiled consumer, to also yield from the tiled loops a value that can be
used to replace the original producer. This is only valid if it can be
assertained that the slice of the producer computed within each
iteration of the tiled loop nest does not compute slices of the
producer redundantly. The analysis to derive this is very involved. So
this is left to the caller to assertain.  A test is added that mimics
the `scf::tileConsumerAndFuseProducersGreedilyUsingSCFForOp`, but also
yields the values of all fused producers. This can be used as a
reference for how a caller could use this functionality.

Differential Revision: https://reviews.llvm.org/D141028
2023-01-16 18:30:13 +00:00
Mahesh Ravishankar
0478401c43 [mlir][TilingInterface] Add test for tile + fuse of sequence of reductions.
This just adds a test. With CSE of single block ops, and other
previously landed changes, this works at HEAD. Just adding a test that
triggered this line of work that I missed adding.

Differential Revision: https://reviews.llvm.org/D139385
2022-12-07 02:19:19 +00:00
Hanhan Wang
0a1569a400 [mlir][NFC] Remove trailing whitespaces from *.td and *.mlir files.
This is generated by running

```
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.td
sed --in-place 's/[[:space:]]\+$//' mlir/**/*.mlir
```

Reviewed By: rriddle, dcaballe

Differential Revision: https://reviews.llvm.org/D138866
2022-11-28 15:26:30 -08:00
Oleg Shyshkov
3598c24983 [mlir][linalg] Change linalg.broadcast dimensions attribute to represent added dimensions.
Original [RFC](discourse.llvm.org/t/rfc-primitive-ops-add-broadcastop-to-linalg/66313) defined `dimensions` as a map from input to init, but a discussion in reviews.llvm.org/D138291 concluded that it's more natural for `dimensions` to represent added dims. Also this way is more consistent with `linalg.reduce`.

Differential Revision: https://reviews.llvm.org/D138408
2022-11-21 13:16:41 +01:00
Oleg Shyshkov
fd64de3212 [mlir][linalg] Add BroadcastOp to Linalg structured ops.
[[RFC] Primitive Ops: add BroadcastOp to Linalg](https://discourse.llvm.org/t/rfc-primitive-ops-add-broadcastop-to-linalg/66313?u=olegshyshkov)

Differential Revision: https://reviews.llvm.org/D137331
2022-11-04 12:07:18 +01:00
Nicolas Vasilache
d4c4e49196 [mlir][Linalg] Drop usage of tileWithLinalgTilingOptions in the structured.tile transform
This is on a path to deprecation.
Context: https://discourse.llvm.org/t/psa-retire-tileandfuselinalgops-method/63850

As the interface-based transformation is more generic, some additional folding of AffineMin/MaxOp and some extra canonicalizations are needed.
This can be further evolved.

Differential Revision: https://reviews.llvm.org/D137195
2022-11-01 14:36:24 -07:00
Oleg Shyshkov
acdd576d05 [mlir][linalg] Fix linalg.transpose region builder.
The region should yield the first argument (input) not the last argument
(output). Also fix a few tests that were affected by this bug.

Differential Revision: https://reviews.llvm.org/D136924
2022-10-28 11:19:02 +02:00
Alexander Belyaev
451b1ff376 [mlir] Add lower-to-loops tests for linalg.map/reduce/transpose.
Differential Revision: https://reviews.llvm.org/D136691
2022-10-25 17:10:43 +02:00
Matthias Springer
81ca5aa452 [mlir][tensor][NFC] Rename linalg.init_tensor to tensor.empty
tensor.empty/linalg.init_tensor produces an uninititalized tensor that can be used as a destination operand for destination-style ops (ops that implement `DestinationStyleOpInterface`).

This change makes it possible to implement `TilingInterface` for non-destination-style ops without depending on the Linalg dialect.

RFC: https://discourse.llvm.org/t/rfc-add-tensor-from-shape-operation/65101

Differential Revision: https://reviews.llvm.org/D135129
2022-10-04 17:25:35 +09:00
Mahesh Ravishankar
7ee34550f5 [mlir][TilingInterface] Fix iter_args handling in tile (and fuse).
The current approach for handling `iter_args` was to replace all uses
of the value that is used as `init` value with the corresponding
region block argument within the `scf.for`. This is not always
correct. Instead a more deliberate approach needs to be taken to
handle these. If the slice being fused represents a slice of the
destination operand of the untiled op, then
- Make the destination of the fused producer the `init` value of the
  loop nest
- For the tiled and fused producer op created, replace the slice of
  the destination operand with a slice of the corresponding region
  iter arg of the innermost loop of the generated loop nest

Differential Revision: https://reviews.llvm.org/D134411
2022-09-26 19:09:29 +00:00
Mahesh Ravishankar
a235562c0a [mlir][TilingInterface] Enabling tiling tensor.pad using TilingInterface.
Update the implementation of `TilingInterface` for `tensor.pad`
operations to allow tiling the op using the existing patterns for the
interface. Verify that tests that pass with existing pad tiling
patterns producer the same results through TilingInterface patterns.

Reviewed By: antiagainst

Differential Revision: https://reviews.llvm.org/D132720
2022-08-26 16:29:32 +00:00
lorenzo chelini
954de25a92 [MLIR] TilingInterface: Avoid map when tile divides iteration domain
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131080
2022-08-04 19:43:59 +02:00
Mahesh Ravishankar
6f03a10e4f [mlir][TilingInterface] Add a method to generate scalar implementation of the op.
While The tiling interface provides a mechanism for operations to be
tiled into tiled version of the op (or another op at the same level of
abstraction), the `generateScalarImplementation` method added here is
the "exit point" after all transformations have been done. Ops that
implement this method are expected to generate IR that are directly
lowerable to backend dialects like LLVM or SPIR-V dialects.

Differential Revision: https://reviews.llvm.org/D130612
2022-07-28 16:37:15 +00:00
lorenzo chelini
2ed7c3fd84 [MLIR][SCF] Enable better bufferization for TileConsumerAndFuseProducersUsingSCFForOp
Replace iterators of the outermost loop with region arguments of the innermost
one. The changes avoid later `bufferization` passes to insert allocation within
the body of the innermost loop.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D130083
2022-07-21 10:14:26 +02:00
lorenzo chelini
7f1c03171d Revert "[RFC][MLIR][SCF] Enable better bufferization for TileConsumerAndFuseProducersUsingSCFForOp"
This reverts commit 9e6585030533e901a8c24dcb05b38d3f0d10331f.
2022-07-21 09:40:30 +02:00
lorenzo chelini
9e65850305 [RFC][MLIR][SCF] Enable better bufferization for TileConsumerAndFuseProducersUsingSCFForOp
Replace iterators of the outermost loop with region arguments of the innermost
one. The changes avoid later `bufferization` passes to insert allocation within
the body of the innermost loop.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D130083
2022-07-21 08:56:50 +02:00
Mahesh Ravishankar
485190df95 [mlir][Linalg] Deprecate tileAndFuseLinalgOps method and associated patterns.
The `tileAndFuseLinalgOps` is a legacy approach for tiling + fusion of
Linalg operations. Since it was also intended to work on operations
with buffer operands, this method had fairly complex logic to make
sure tile and fuse was correct even with side-effecting linalg ops.
While complex, it still wasnt robust enough. This patch deprecates
this method and thereby deprecating the tiling + fusion method for ops
with buffer semantics. Note that the core transformation to do fusion
of a producer with a tiled consumer still exists. The deprecation here
only removes methods that auto-magically tried to tile and fuse
correctly in presence of side-effects.

The `tileAndFuseLinalgOps` also works with operations with tensor
semantics. There are at least two other ways the same functionality
exists.
1) The `tileConsumerAndFuseProducers` method. This does a similar
   transformation, but using a slightly different logic to
   automatically figure out the legal tile + fuse code. Note that this
   is also to be deprecated soon.
2) The prefered way uses the `TilingInterface` for tile + fuse, and
   relies on the caller to set the tiling options correctly to ensure
   that the generated code is correct.
As proof that (2) is equivalent to the functionality provided by
`tileAndFuseLinalgOps`, relevant tests have been moved to use the
interface, where the test driver sets the tile sizes appropriately to
generate the expected code.

Differential Revision: https://reviews.llvm.org/D129901
2022-07-21 05:05:06 +00:00
Mahesh Ravishankar
b8a1f00d41 [mlir][TilingInterface] Add support for interchange to tiling patterns that use the TilingInterface.
Differential Revision: https://reviews.llvm.org/D129956
2022-07-20 05:24:17 +00:00
Alex Zinenko
81b62f7feb [mlir] Handle linalg.index correctly in TilingInterface
The existing implementation of the TilingInterface for Linalg ops was not
modifying the `linalg.index` ops contained within other Linalg ops (they need
to be summed up with the values of respective tile loop induction variables),
which led to the interface-based tiling being incorrect for any Linalg op with
index semantics.

In the process, fix the function performing the index offsetting to use the
pattern rewriter API instead of RAUW as it is being called from patterns and
may mess up the internal state of the rewriter. Also rename the function to
clearly catch all uses.

Depends On D129365

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D129366
2022-07-12 12:36:33 +00:00
Mahesh Ravishankar
2f637fe730 [mlir][TilingInterface] Enable tile and fuse using TilingInterface.
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
  implementation of the operation based on the tile of the result
  needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
  source is defined by an operation that implements the
  `TilingInterface` with a tiled implementation that produces the
  extracted slice in-place (using the method added to
  `TilingInterface`).
- A pattern is added that takes a sequence of operations that
  implement the `TilingInterface` (for now `LinalgOp`s), tiles the
  consumer, and greedily fuses its producers iteratively.

Differential Revision: https://reviews.llvm.org/D127809
2022-06-21 17:22:58 +00:00
Mahesh Ravishankar
c584771f54 Revert "[mlir][TilingInterface] Enable tile and fuse using TilingInterface."
This reverts commit ea75511319d9dff8c38c8794c3949c40b63a38d7 due to build failures.
2022-06-21 16:56:59 +00:00
Mahesh Ravishankar
ea75511319 [mlir][TilingInterface] Enable tile and fuse using TilingInterface.
This patch implements tile and fuse transformation for ops that
implement the tiling interface. To do so,
- `TilingInterface` needs a new method that generates a tiled
  implementation of the operation based on the tile of the result
  needed.
- A pattern is added that replaces a `tensor.extract_slice` whose
  source is defined by an operation that implements the
  `TilingInterface` with a tiled implementation that produces the
  extracted slice in-place (using the method added to
  `TilingInterface`).
- A pattern is added that takes a sequence of operations that
  implement the `TilingInterface` (for now `LinalgOp`s), tiles the
  consumer, and greedily fuses its producers iteratively.

Differential Revision: https://reviews.llvm.org/D127809
2022-06-21 16:47:14 +00:00
Mahesh Ravishankar
cf6a7c1947 [mlir][TilingInterface] Add pattern to tile using TilingInterface and implement TilingInterface for Linalg ops.
This patch adds support for tiling operations that implement the
TilingInterface.
- It separates the loop constructs that are used to iterate over tile
  from the implementation of the tiling itself. For example, the use
  of destructive updates is more related to use of scf.for for
  iterating over tiles that are tensors.
- To test the transformation, TilingInterface is implemented for
  LinalgOps. The separation of the looping constructs used from the
  implementation of tile code generation greatly simplifies the
  latter.
- The implementation of TilingInterface for LinalgOp is kept as an
  external model for now till this approach can be fully flushed out
  to replace the existing tiling + fusion approaches in Linalg.

Differential Revision: https://reviews.llvm.org/D127133
2022-06-13 20:37:44 +00:00