183 Commits

Author SHA1 Message Date
Matthias Springer
c9b3638126 [mlir][scf][bufferize] Fix bufferizesToMemoryRead with 0 loop iterations
There was a bug in scf.for loop bufferization that could lead to a missing buffer copy (alloc was there, but not the copy).

Differential Revision: https://reviews.llvm.org/D135053
2022-10-24 14:34:41 +02:00
Matthias Springer
b169643f3a [mlir][interfaces] Remove getDestinationOperands from TilingInterface
`getDestinationOperands` was almost a duplicate of `DestinationStyleOpInterface::getOutputOperands`. Now that the interface has been moved to mlir/Interfaces, it is no longer needed.

Differential Revision: https://reviews.llvm.org/D136240
2022-10-24 09:26:19 +02:00
Peiming Liu
d3f5f33067 [mlir][scf] support 1:N type conversion for scf.for.
scf.for used to support only 1:1 type conversion; this patch adds support for 1:N type conversion.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D136314
2022-10-21 21:11:55 +00:00
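
To illustrate the 1:N conversion described in the commit above: each original loop-carried value may expand into several converted values, so the flattened list grows and the position of each expansion has to be tracked. The following is a minimal standalone sketch, not the MLIR implementation. `Type`, `FlattenedTypes`, `flattenTypes`, and `exampleConvert` are hypothetical names, and the expansion of a "sparse_tensor" into three buffers is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type: only its name is tracked.
using Type = std::string;

// The converted types plus, for each original value, the (offset, count)
// range that its replacements occupy in the flattened list.
struct FlattenedTypes {
  std::vector<Type> types;
  std::vector<std::pair<std::size_t, std::size_t>> ranges;
};

// Applies a 1:N conversion to every original type and records where the
// replacements of each original value are stored.
FlattenedTypes flattenTypes(
    const std::vector<Type> &original,
    const std::function<std::vector<Type>(const Type &)> &convert) {
  FlattenedTypes result;
  for (const Type &type : original) {
    std::vector<Type> converted = convert(type);
    result.ranges.emplace_back(result.types.size(), converted.size());
    result.types.insert(result.types.end(), converted.begin(),
                        converted.end());
  }
  return result;
}

// Example conversion (an assumption for illustration): one type expands
// into three buffers, every other type is kept as-is (the 1:1 case).
std::vector<Type> exampleConvert(const Type &type) {
  if (type == "sparse_tensor")
    return {"memref_pointers", "memref_indices", "memref_values"};
  return {type};
}
```

With a 1:1 conversion every range has a count of one, which is the case that was already supported.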
Mehdi Amini
6d4baa7442 Apply clang-tidy fixes for performance-unnecessary-value-param in TileUsingInterface.cpp (NFC) 2022-10-12 05:03:45 +00:00
Mehdi Amini
2a6f0fb34a Apply clang-tidy fixes for performance-for-range-copy in TileUsingInterface.cpp (NFC) 2022-10-12 05:03:45 +00:00
Mehdi Amini
23f989a2e3 Apply clang-tidy fixes for readability-simplify-boolean-expr in BufferizableOpInterfaceImpl.cpp (NFC) 2022-10-12 01:16:36 +00:00
Nicolas Vasilache
7915027926 [mlir][Linalg] Retire LinalgStrategyTileAndFusePass and filter-based pattern.
Context: https://discourse.llvm.org/t/psa-retire-linalg-filter-based-patterns/63785

In the process, also retire `tileConsumerAndFuseProducers` that is now replaced by `tileConsumerAndFuseProducerGreedilyUsingSCFForOp`.

Context: https://discourse.llvm.org/t/psa-retire-tileandfuselinalgops-method/63850

When performing this replacement, a change of behavior appeared: the older `tileConsumerAndFuseProducers` would split the parallel
and non-parallel dimensions automatically and perform a first level of tile-and-fuse on parallel dimensions only and then introduce a
second level of tiling-only on the reduction dimensions. The newer `tileConsumerAndFuseProducerGreedilyUsingSCFForOp` on the other hand
does not perform this breakdown. As a consequence, the transform specification has been updated to produce the same output.

Additionally, replace some uses of `unsigned` by `int64_t` where possible without pulling in larger interface changes (left for a future PR).

Context: https://www.youtube.com/watch?v=Puio5dly9N8

Lastly, tests that were performing tile and fuse and distribute on tensors are retired: the generated IR mixing scf.for, tensors and
distributed processor ids was racy at best.

Differential Revision: https://reviews.llvm.org/D135559
2022-10-10 07:04:01 -07:00
Adrian Kuegel
67bcf9825a [mlir][SCF] Apply ClangTidyPerformance finding (NFC) 2022-09-30 12:47:32 +02:00
Jakub Kuderski
abc362a107 [mlir][arith] Change dialect name from Arithmetic to Arith
Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22.

Tested with:
`ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples`

and `bazel build --config=generic_clang @llvm-project//mlir:all`.

Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini

Differential Revision: https://reviews.llvm.org/D134762
2022-09-29 11:23:28 -04:00
Mahesh Ravishankar
97f919820b [mlir][TilingInterface] NFC Refactor of tile and fuse using TilingInterface.
This patch refactors the tiling and tile + fuse implementation using
`TilingInterface`. Primarily, it exposes the functionality as simple
utility functions instead of as a Pattern to allow calling it from a
pattern as it is done in the test today or from within the transform
dialect (in the future). This is a step towards deprecating similar
methods in Linalg dialect.

- The utility methods do not erase the root operations.
- The return value provides the values to use for replacements.

Differential Revision: https://reviews.llvm.org/D134144
2022-09-28 20:25:33 +00:00
Mahesh Ravishankar
7ee34550f5 [mlir][TilingInterface] Fix iter_args handling in tile (and fuse).
The current approach for handling `iter_args` was to replace all uses
of the value that is used as `init` value with the corresponding
region block argument within the `scf.for`. This is not always
correct. Instead a more deliberate approach needs to be taken to
handle these. If the slice being fused represents a slice of the
destination operand of the untiled op, then
- Make the destination of the fused producer the `init` value of the
  loop nest
- For the tiled and fused producer op created, replace the slice of
  the destination operand with a slice of the corresponding region
  iter arg of the innermost loop of the generated loop nest

Differential Revision: https://reviews.llvm.org/D134411
2022-09-26 19:09:29 +00:00
Johannes Reifferscheid
eaf20c4fc2 [mlir] Fix a cast that should be a dyn_cast.
This fixes a crash for certain IR, see the new test case for an
example.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D134424
2022-09-22 13:13:21 +02:00
Christopher Bate
f5fe92f693 [mlir][SCF] Fix loop pipelining unable to handle ops with regions
This change allows the SCF LoopPipelining transform to handle ops with
nested regions within the pipelined `scf.for` body. The op and nested
regions are treated as a single unit from the transform's perspective.
This change also makes explicit the requirement that only ops whose
parent Block is the loop body Block are allowed to be scheduled by the
caller.

Reviewed By: ThomasRaoux, nicolasvasilache

Differential Revision: https://reviews.llvm.org/D133965
2022-09-20 21:58:53 -06:00
Johannes Reifferscheid
d1536ee48c Fix clang-format. 2022-09-08 11:05:12 +02:00
Johannes Reifferscheid
6247988e07 One-shot-bufferize: fix for inconsistent while arg types in before/after.
Currently, if the `before` and `after` regions of a while op have
tensor args at different indices, this leads to a crash.

This moves the pass-through check for args to the handling of the
condition block, since that is where the results are produced, so
it's also where copies must be made.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D133477
2022-09-08 10:24:11 +02:00
Johannes Reifferscheid
fb9fc79809 One-shot-bufferize: allow non-tensor arguments in scf.while/for.
Currently, one-shot-bufferize crashes as soon as there's
a mixture of tensor and non-tensor arguments. This seems
to happen for no good reason.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D133419
2022-09-07 15:54:25 +02:00
Mehdi Amini
b285d708a7 Apply clang-tidy fixes for performance-for-range-copy in TileUsingInterface.cpp (NFC) 2022-09-07 09:40:59 +00:00
Mehdi Amini
8eab900170 Apply clang-tidy fixes for llvm-qualified-auto in Bufferize.cpp (NFC) 2022-09-07 09:40:59 +00:00
Matthias Springer
4cd7362083 [mlir][SCF] foreach_thread: Capture shared output tensors explicitly
This change refines the semantics of scf.foreach_thread. Tensors that the terminator inserts into must now be passed to the region explicitly via `shared_outs`. Inside the body of the op, those tensors are then accessed via block arguments.

The body of a scf.foreach_thread is now treated as a repetitive region. I.e., op dominance can no longer be used in conflict detection when using a value that is defined outside of the body. Such uses may now be considered as conflicts (if there is at least one read and one write in the body), effectively privatizing the tensor. Shared outputs are not privatized when they are used via their corresponding block arguments.

As part of this change, it was also necessary to update the "tiling to scf.foreach_thread", such that the generated tensor.extract_slice ops use the scf.foreach_thread's block arguments. This is implemented by cloning the TilingInterface op inside the scf.foreach_thread, rewriting all of its outputs with block arguments and then calling the tiling implementation. Afterwards, the cloned op is deleted again.

Differential Revision: https://reviews.llvm.org/D133114
2022-09-02 14:54:04 +02:00
Matthias Springer
547942841f [mlir][interfaces] Drop dest/tileDestOperands from TilingInterface
`getTiledImplementation`/`generateResultTileValue` only compute the tiled operation, but do not insert the result into any tensor.

Differential Revision: https://reviews.llvm.org/D133015
2022-09-01 08:53:53 +02:00
Michele Scuttari
67d0d7ac0a [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838
2022-08-31 12:28:45 +02:00
Michele Scuttari
039b969b32 Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.
2022-08-30 22:21:55 +02:00
Michele Scuttari
2be8af8f0e [MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D132838
2022-08-30 21:56:31 +02:00
Matthias Springer
86974e32a4 [mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.while)
This change implements the same functionality as D132860, but for scf.while.

Differential Revision: https://reviews.llvm.org/D132927
2022-08-30 16:58:21 +02:00
Matthias Springer
9d6096c56f [mlir][SCF][bufferize][NFC] Move scf.if buffer type computation to getBufferType
A part of the functionality of `bufferize` is extracted into `getBufferType`. Also, bufferized scf.yields inside scf.if are now created with the correct bufferized type from the get-go.

Differential Revision: https://reviews.llvm.org/D132862
2022-08-30 16:48:10 +02:00
Matthias Springer
123c4b0251 [mlir][SCF][bufferize] Support different iter_arg/init_arg types (scf.for)
Even though iter_arg and init_arg of an scf.for loop may have the same tensor type, their bufferized memref types are not necessarily equal. It is sometimes necessary to insert a cast in case of differing layout maps.

Differential Revision: https://reviews.llvm.org/D132860
2022-08-30 16:35:32 +02:00
Matthias Springer
111c919665 [mlir][bufferization] Generalize getBufferType
This change generalizes getBufferType. This function can be used to predict the buffer type of any tensor value (not just BlockArguments) without changing any IR. It also subsumes getMemorySpace. This is useful for loop bufferization, where the precise buffer type of an iter_arg cannot be known without examining the loop body.

Differential Revision: https://reviews.llvm.org/D132859
2022-08-30 16:26:44 +02:00
Jeff Niu
5b569ed2cd [mlir] Add Block::eraseArguments that erases a subrange
This patch adds an `eraseArguments` function that erases a subrange of
a block's arguments. This can be used in place of the terrible pattern

```
block->eraseArguments(llvm::to_vector(llvm::seq(...)));
```

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D132890
2022-08-29 15:34:21 -07:00
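
As a rough illustration of what erasing a subrange of block arguments amounts to, here is a standalone sketch on a plain `std::vector`. `ToyBlock` is a hypothetical stand-in and not the MLIR `Block` class. The `(start, num)` parameter shape is an assumption for this sketch, since the commit message does not spell out the signature.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a block: arguments are tracked by name only.
struct ToyBlock {
  std::vector<std::string> arguments;

  // Erases the `num` arguments starting at index `start` in a single call,
  // without first materializing a list of indices to erase.
  void eraseArguments(std::size_t start, std::size_t num) {
    assert(start <= arguments.size() && num <= arguments.size() - start &&
           "subrange must lie within the argument list");
    auto first = arguments.begin() + static_cast<std::ptrdiff_t>(start);
    arguments.erase(first, first + static_cast<std::ptrdiff_t>(num));
  }
};
```

For example, calling `eraseArguments(1, 3)` on a block with five arguments leaves only the first and the last one.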
Benjamin Kramer
9fa59e7643 [mlir] Use C++17 structured bindings instead of std::tie where applicable. NFCI 2022-08-09 13:34:17 +02:00
lorenzo chelini
954de25a92 [MLIR] TilingInterface: Avoid map when tile divides iteration domain
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131080
2022-08-04 19:43:59 +02:00
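
The title of the commit above suggests the following reasoning: when tiling a range, the last tile may be smaller than the others, so the size of each tile is normally a bounded expression of the loop induction variable. If the tile size evenly divides the range, every tile is full and the size is a constant. The sketch below shows this arithmetic on plain integers. `boundedTileSize` and `tileDividesDomain` are hypothetical helper names, not functions from the patch.

```cpp
#include <algorithm>
#include <cstdint>

// Size of the tile that starts at `iv` when tiling a range ending at `ub`:
// full tiles have `tileSize` elements, the last one may be smaller.
std::int64_t boundedTileSize(std::int64_t iv, std::int64_t ub,
                             std::int64_t tileSize) {
  return std::min(tileSize, ub - iv);
}

// True if `tileSize` evenly divides the non-empty range [lb, ub). In that
// case boundedTileSize returns `tileSize` for every tile start, so the
// bounded expression is not needed.
bool tileDividesDomain(std::int64_t lb, std::int64_t ub,
                       std::int64_t tileSize) {
  return tileSize > 0 && ub > lb && (ub - lb) % tileSize == 0;
}
```

For a range of 128 elements and a tile size of 32, all four tiles have 32 elements. For a range of 100 elements, the tile starting at 96 has only 4.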
Mahesh Ravishankar
6f03a10e4f [mlir][TilingInterface] Add a method to generate scalar implementation of the op.
While the tiling interface provides a mechanism for operations to be
tiled into a tiled version of the op (or another op at the same level of
abstraction), the `generateScalarImplementation` method added here is
the "exit point" after all transformations have been done. Ops that
implement this method are expected to generate IR that is directly
lowerable to backend dialects like the LLVM or SPIR-V dialects.

Differential Revision: https://reviews.llvm.org/D130612
2022-07-28 16:37:15 +00:00
Alex Zinenko
70e99f387a [mlir] Make ViewLikeInterface Range work with attributes
While most methods in ViewLikeInterface accept an `OpFoldResult` for
the offset/size/stride that may be static, represented as `Attribute`,
or dynamic, represented as `Value`, the `Range` abstraction only
accepted `Values`. This can often lead to known-constant
offset/size/strides being materialized into constant operations and
hinder further constant propagation without explicitly running the
constant folding pass. This often leads to more complicated addressing
code than necessary being emitted. Switch `Range` to use
`OpFoldResult`. Code that uses `Range` currently keeps materializing the
constants to minimize the effect of this change on the IR. Further
commits will make use of this.

Reviewed By: nicolasvasilache, mravishankar

Differential Revision: https://reviews.llvm.org/D129633
2022-07-27 08:52:13 +00:00
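
The benefit described in the commit above, keeping known constants visible instead of hiding them behind dynamic values, can be sketched with a C++17 `std::variant`. This is a standalone illustration and not MLIR code. `Value`, `FoldResult`, `Range`, `getConstant`, and `lastIndex` are hypothetical stand-ins defined here, and the choice of folding the last accessed index is an assumption made for the example.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-in for a dynamic SSA value: only its name is tracked.
struct Value {
  std::string name;
};

// Either a compile-time constant or a dynamic value.
using FoldResult = std::variant<std::int64_t, Value>;

// A range whose offset, size, and stride may each be static or dynamic.
struct Range {
  FoldResult offset;
  FoldResult size;
  FoldResult stride;
};

// Returns the constant if `v` is static, std::nullopt if it is dynamic.
std::optional<std::int64_t> getConstant(const FoldResult &v) {
  if (const std::int64_t *constant = std::get_if<std::int64_t>(&v))
    return *constant;
  return std::nullopt;
}

// Folds the last accessed index, offset + (size - 1) * stride, when all
// three components are static. Returns std::nullopt otherwise.
std::optional<std::int64_t> lastIndex(const Range &range) {
  std::optional<std::int64_t> offset = getConstant(range.offset);
  std::optional<std::int64_t> size = getConstant(range.size);
  std::optional<std::int64_t> stride = getConstant(range.stride);
  if (!offset || !size || !stride)
    return std::nullopt;
  return *offset + (*size - 1) * *stride;
}
```

If the range only stored dynamic values, the constants would first have to be recovered from their defining operations before any such folding could happen.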
lorenzo chelini
2ed7c3fd84 [MLIR][SCF] Enable better bufferization for TileConsumerAndFuseProducersUsingSCFForOp
Replace iterators of the outermost loop with region arguments of the innermost
one. This prevents later `bufferization` passes from inserting allocations within
the body of the innermost loop.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D130083
2022-07-21 10:14:26 +02:00
lorenzo chelini
7f1c03171d Revert "[RFC][MLIR][SCF] Enable better bufferization for TileConsumerAndFuseProducersUsingSCFForOp"
This reverts commit 9e6585030533e901a8c24dcb05b38d3f0d10331f.
2022-07-21 09:40:30 +02:00
lorenzo chelini
9e65850305 [RFC][MLIR][SCF] Enable better bufferization for TileConsumerAndFuseProducersUsingSCFForOp
Replace iterators of the outermost loop with region arguments of the innermost
one. This prevents later `bufferization` passes from inserting allocations within
the body of the innermost loop.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D130083
2022-07-21 08:56:50 +02:00
Mahesh Ravishankar
b8a1f00d41 [mlir][TilingInterface] Add support for interchange to tiling patterns that use the TilingInterface.
Differential Revision: https://reviews.llvm.org/D129956
2022-07-20 05:24:17 +00:00
Kazu Hirata
10bcfeebfa [mlir] Remove unused using (NFC)
Identified with misc-unused-using-decls.
2022-07-17 18:08:48 -07:00
Kazu Hirata
c27d815249 [mlir] Use value instead of getValue (NFC) 2022-07-14 00:19:59 -07:00
Jacques Pienaar
136d746ec7 [mlir] Flip accessors to prefixed form (NFC)
Another mechanical sweep to keep diff small for flip to _Prefixed.
2022-07-10 21:19:11 -07:00
Nicolas Vasilache
7fbf55c927 [mlir][Tensor] Move ParallelInsertSlice to the tensor dialect
This is mostly NFC and will allow tensor.parallel_insert_slice to gain
rank-reducing semantics by reusing the vast majority of the tensor.insert_slice impl.

Depends on D128857

Differential Revision: https://reviews.llvm.org/D128920
2022-07-04 01:53:12 -07:00
Nicolas Vasilache
b994d388ae [mlir][SCF] Add a ParallelCombiningOpInterface to decouple scf::PerformConcurrently from its contained operations
This allows purging references of scf.ForeachThreadOp and scf.PerformConcurrentlyOp from
ParallelInsertSliceOp.
This will allow moving the op closer to tensor::InsertSliceOp with which it should share much more
code.

In the future, the decoupling will also allow extending the type of ops that can be used in the
parallel combinator as well as semantics related to multiple concurrent inserts to the same
result.

Differential Revision: https://reviews.llvm.org/D128857
2022-07-01 00:16:02 -07:00
Matthias Springer
76f7e4b7a3 [mlir][SCF][bufferize][NFC] Utilize recently added helper function
This should have been part of D128666.

Differential Revision: https://reviews.llvm.org/D128885
2022-06-30 09:54:52 +02:00
Jacques Pienaar
04235d07ad [mlir] Update flipped accessors (NFC)
Follow up with memref flipped and flipping any intermediate changes
made.
2022-06-28 13:11:26 -07:00
Matthias Springer
04dac2ca7c [mlir][SCF][bufferize][NFC] Implement resolveConflicts for ParallelInsertSliceOp
This was previously implemented as part of the BufferizableOpInterface of ForEachThreadOp. Moving the implementation to ParallelInsertSliceOp to be consistent with the remaining ops and to have a nice example op that can serve as a blueprint for other ops.

Differential Revision: https://reviews.llvm.org/D128666
2022-06-28 12:18:22 +02:00
Matthias Springer
f164814f2f [mlir][SCF][bufferize] Small simplification and more comments
Differential Revision: https://reviews.llvm.org/D128651
2022-06-27 17:04:29 +02:00
Matthias Springer
c0b0b6a00a [mlir][bufferize] Infer memory space in all bufferization patterns
This change updates all remaining bufferization patterns (except for scf.while) and the remaining bufferization infrastructure to infer the memory space whenever possible instead of falling back to "0". (If a default memory space is set in the bufferization options, we still fall back to that value if the memory space could not be inferred.)

Differential Revision: https://reviews.llvm.org/D128423
2022-06-27 16:32:52 +02:00
Matthias Springer
45b995cda4 [mlir][bufferize][NFC] Change signature of allocateTensorForShapedValue
Add a failure return value and bufferization options argument. This is to keep a subsequent change smaller.

Differential Revision: https://reviews.llvm.org/D128278
2022-06-27 16:00:06 +02:00
Nicolas Vasilache
a0f843fdaf [SCF] Add thread_dim_mapping attribute to scf.foreach_thread
An optional thread_dim_mapping index array attribute specifies for each
virtual thread dimension, how it remaps 1-1 to a set of concrete processing
element resources (e.g. a CUDA grid dimension or a level of concrete nested
async parallelism). At this time, the specification is backend-dependent and
is not verified by the op, beyond being an index array attribute.
It is the responsibility of the lowering to interpret the index array in the
context of the concrete target the op is lowered to, or to ignore it when
the specification is ill-formed or unsupported for a particular target.

Differential Revision: https://reviews.llvm.org/D128633
2022-06-27 04:58:36 -07:00
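
Since the commit above leaves the interpretation of `thread_dim_mapping` to each lowering, the following is only one possible interpretation, sketched on plain integers. It assumes the mapping is a permutation in which entry `i` names the concrete dimension that virtual thread dimension `i` is assigned to, and that an ill-formed mapping is ignored. `isOneToOneMapping` and `remapThreadIds` are hypothetical helpers, not part of MLIR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// True if `mapping` is a permutation of 0 .. mapping.size() - 1.
bool isOneToOneMapping(const std::vector<std::int64_t> &mapping) {
  std::vector<bool> seen(mapping.size(), false);
  for (std::int64_t dim : mapping) {
    if (dim < 0 || static_cast<std::size_t>(dim) >= mapping.size())
      return false;
    std::size_t index = static_cast<std::size_t>(dim);
    if (seen[index])
      return false;
    seen[index] = true;
  }
  return true;
}

// Places the id of virtual dimension i at concrete dimension mapping[i].
// An ill-formed mapping is ignored and the ids are returned unchanged.
std::vector<std::int64_t>
remapThreadIds(const std::vector<std::int64_t> &virtualIds,
               const std::vector<std::int64_t> &mapping) {
  if (mapping.size() != virtualIds.size() || !isOneToOneMapping(mapping))
    return virtualIds;
  std::vector<std::int64_t> concreteIds(virtualIds.size());
  for (std::size_t i = 0; i < virtualIds.size(); ++i)
    concreteIds[static_cast<std::size_t>(mapping[i])] = virtualIds[i];
  return concreteIds;
}
```

A different backend could equally choose to reject an ill-formed mapping instead of ignoring it. The commit message allows either.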
Matthias Springer
5d50f51c97 [mlir][bufferization][NFC] Add error handling to getBuffer
This is in preparation of adding memory space support.

Differential Revision: https://reviews.llvm.org/D128277
2022-06-27 13:48:01 +02:00
Matthias Springer
3ff93f838e [mlir][SCF][bufferize][NFC] Bufferize scf.for terminator separately
This allows for better type inference during bufferization and is in preparation of supporting memory spaces.

Differential Revision: https://reviews.llvm.org/D128422
2022-06-27 13:26:32 +02:00