15 Commits

Author SHA1 Message Date
Nikhil Kalra
0ad6ac8c53
[NFC][MLIR] Fix: alloca promotion for AllocationOpInterface (#97672)
The std::optional returned by buildPromotedAlloc was directly
dereferenced and assumed to be non-null, even though the documentation
for AllocationOpInterface indicates that std::nullopt is a legal value
if buffer stack promotion is not supported (and is the default value
supplied by the TableGen interface file). This patch removes the direct
dereference so that the optional can be null-checked prior to use.
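
The pattern at issue can be illustrated with a minimal, standalone C++
sketch. The names `buildPromotedAllocSketch` and `promoteOrKeep` are
hypothetical stand-ins and not the MLIR API; the only point is that the
optional is tested before it is dereferenced.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for an interface method that may decline to build a
// stack allocation. Returning std::nullopt means "promotion not supported".
std::optional<std::string> buildPromotedAllocSketch(bool supportsPromotion) {
  if (!supportsPromotion)
    return std::nullopt;
  return std::string("alloca");
}

// Checks the optional before use instead of dereferencing it blindly.
// Falls back to the original allocation name when promotion is unavailable.
std::string promoteOrKeep(const std::string &original, bool supportsPromotion) {
  assert(!original.empty() && "expected a named allocation");
  std::optional<std::string> promoted =
      buildPromotedAllocSketch(supportsPromotion);
  if (!promoted)
    return original; // Nothing to promote; keep the original allocation.
  return *promoted;  // Safe: the optional is known to hold a value here.
}
```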

Co-authored-by: Nikhil Kalra <nkalra@apple.com>
2024-07-04 08:49:33 +02:00
Rafael Ubal
a42a2ca19b
Avoid buffer hoisting from parallel loops (#90735)
This change corrects an invalid behavior in pass
`--buffer-loop-hoisting`. The pass is in charge of extracting buffer
allocations (e.g., `memref.alloca`) from loop regions (e.g., `scf.for`)
when possible. This works OK for loops with sequential execution
semantics. However, a buffer allocated in the body of a parallel loop
may be concurrently accessed by multiple threads to store their local
data. Extracting such a buffer from the loop causes all threads to
wrongly share the same memory region.

In the following example, dimension 1 of the input tensor is reversed.
Dimension 0 is traversed with a parallel loop.

```
func.func @f(%input: memref<2x3xf32>) -> memref<2x3xf32> {
  %c0 = index.constant 0
  %c1 = index.constant 1
  %c2 = index.constant 2
  %c3 = index.constant 3

  %output = memref.alloc() : memref<2x3xf32>
  scf.parallel (%index) = (%c0) to (%c2) step (%c1) {
    // Create subviews for working input and output slices
    %input_slice = memref.subview %input[%index, 2][1, 3][1, -1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, -1], offset: ?>>
    %output_slice = memref.subview %output[%index, 0][1, 3][1, 1] : memref<2x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>>

    // Copy the input slice into this temporary buffer. This intermediate
    // copy is unnecessary, but is used for illustration purposes.
    %temp = memref.alloc() : memref<1x3xf32>
    memref.copy %input_slice, %temp : memref<1x3xf32, strided<[3, -1], offset: ?>> to memref<1x3xf32>

    // Copy temporary buffer into output slice
    memref.copy %temp, %output_slice : memref<1x3xf32> to memref<1x3xf32, strided<[3, 1], offset: ?>>
    scf.reduce
  }

  return %output : memref<2x3xf32>
}
```

The patch submitted here prevents `%temp = memref.alloc() :
memref<1x3xf32>` from being hoisted when the containing op is
`scf.parallel` or `scf.forall`. A new op trait called
`HasParallelRegion` is introduced and assigned to these two ops to
indicate that their regions have parallel execution semantics.
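
The guard can be pictured with a toy C++ model that does not use the
MLIR headers. `LoopOpSketch`, its `hasParallelRegion` flag, and
`hoistableDepth` are hypothetical names, and the assumption that
hoisting simply stops at the first parallel ancestor is a simplification
made for this sketch rather than a description of the pass internals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of an op that owns a loop region. `hasParallelRegion` plays the
// role of the trait described above; it is not the real MLIR trait class.
struct LoopOpSketch {
  std::string name;
  bool hasParallelRegion;
};

// Returns how many enclosing loops an allocation may be hoisted out of.
// `enclosing` is ordered innermost first. Hoisting stops at the first op
// whose region has parallel execution semantics.
std::size_t hoistableDepth(const std::vector<LoopOpSketch> &enclosing) {
  std::size_t depth = 0;
  for (const LoopOpSketch &op : enclosing) {
    assert(!op.name.empty() && "expected a named op");
    if (op.hasParallelRegion)
      break; // Each iteration needs its own buffer, so stay inside.
    ++depth;
  }
  return depth;
}
```

In this model, an allocation directly inside `scf.parallel` gets a depth
of zero and therefore stays where it is.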

@joker-eph @ftynse @nicolasvasilache @sabauma
2024-05-04 08:35:36 +02:00
Matthias Springer
dd450f08cf
[mlir][Interfaces][NFC] Move region loop detection to RegionBranchOpInterface (#77090)
`BufferPlacementTransformationBase::isLoop` checks if there is a loop in
the region branching graph of an operation. This algorithm is similar to
`isRegionReachable` in the `RegionBranchOpInterface`. To avoid duplicate
code, `isRegionReachable` is generalized, so that it can be used to
detect region loops. A helper function
`RegionBranchOpInterface::hasLoop` is added.

This change also turns a recursive implementation into an iterative one,
which is the preferred implementation strategy in LLVM.
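
The idea of worklist-based loop detection can be sketched in standalone
C++. The graph type `RegionGraphSketch` and both function bodies are
assumptions made for illustration; they model regions as plain indices
and are not the code added to `RegionBranchOpInterface`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Successor lists of a small region graph: successors[i] holds the indices
// of the regions that region i may branch to.
using RegionGraphSketch = std::vector<std::vector<std::size_t>>;

// Returns true if `target` can be reached from `start` by following at least
// one edge. Uses an explicit worklist rather than recursion.
bool isReachable(const RegionGraphSketch &successors, std::size_t start,
                 std::size_t target) {
  assert(start < successors.size() && "start region out of range");
  std::vector<bool> visited(successors.size(), false);
  std::vector<std::size_t> worklist(successors[start].begin(),
                                    successors[start].end());
  while (!worklist.empty()) {
    std::size_t current = worklist.back();
    worklist.pop_back();
    if (current == target)
      return true;
    if (visited[current])
      continue; // Already expanded; avoid revisiting.
    visited[current] = true;
    for (std::size_t next : successors[current])
      worklist.push_back(next);
  }
  return false;
}

// A region graph has a loop if some region can reach itself.
bool hasLoop(const RegionGraphSketch &successors) {
  for (std::size_t region = 0; region < successors.size(); ++region)
    if (isReachable(successors, region, region))
      return true;
  return false;
}
```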

Also move the `isLoop` to `BufferOptimizations.cpp`, so that we can
gradually retire `BufferPlacementTransformationBase`. (This is so that
proper error handling can be added to `BufferViewFlowAnalysis`.)
2024-01-07 13:49:29 +01:00
Xiaolei Shi
bcabaa5590 Add LLVM_MARK_AS_BITMASK_ENUM to HoistingKind enum
This revision adds LLVM_MARK_AS_BITMASK_ENUM to HoistingKind to avoid static_cast when performing bitwise operations.
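
The effect can be sketched in standalone C++ without the LLVM headers.
The enum `HoistingKindSketch`, its enumerator values, and the helper
`supports` are assumptions made for illustration; the hand-written
operators stand in for what marking an enum as a bitmask enum provides.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumed shape of the enum; the enumerator values are illustrative.
enum class HoistingKindSketch : std::uint8_t {
  None = 0,
  Loop = 1 << 0,
  Block = 1 << 1,
};

// Hand-written bitwise operators, so call sites need no static_cast.
constexpr HoistingKindSketch operator|(HoistingKindSketch a,
                                       HoistingKindSketch b) {
  using U = std::underlying_type_t<HoistingKindSketch>;
  return static_cast<HoistingKindSketch>(static_cast<U>(a) |
                                         static_cast<U>(b));
}

constexpr HoistingKindSketch operator&(HoistingKindSketch a,
                                       HoistingKindSketch b) {
  using U = std::underlying_type_t<HoistingKindSketch>;
  return static_cast<HoistingKindSketch>(static_cast<U>(a) &
                                         static_cast<U>(b));
}

// Returns true if the `supported` set contains any of the `wanted` kinds.
inline bool supports(HoistingKindSketch supported, HoistingKindSketch wanted) {
  assert(wanted != HoistingKindSketch::None && "query must name a kind");
  return (supported & wanted) != HoistingKindSketch::None;
}
```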

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D158580
2023-08-22 23:22:32 -07:00
Xiaolei Shi
55e3857931 Make buffer hoisting/promotion passes use AllocationOpInterface
This update implements the usage of AllocationOpInterface in the buffer hoisting/promotion passes. Two interface methods, namely `getHoistingKind` and `buildPromotedAlloc`, have been added. The former indicates which kind of hoisting (loop, block) an allocation operation supports, while the latter builds a stack allocation operation for promotable allocations used by the promote-buffers-to-stack pass.

This update makes these passes functional for user-customized allocation operations.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D158398
2023-08-22 16:51:04 -07:00
Markus Böck
10ae8ae837 [mlir][NFC] Make ReturnLike trait imply RegionBranchTerminatorOpInterface
This implication already existed de facto, and there were plenty of users and wrapper functions written specifically to handle the "return-like or RegionBranchTerminatorOpInterface" case. These existed only because ODS lacked the necessary features until recently.

With the new capabilities of traits, we can make `ReturnLike` imply `RegionBranchTerminatorOpInterface` and auto generate proper definitions for its methods.
Various occurrences and wrapper methods used for `isa<RegionBranchTerminatorOpInterface>() || hasTrait<ReturnLike>()` have all been removed.

Differential Revision: https://reviews.llvm.org/D157402
2023-08-08 22:11:39 +02:00
Matthias Springer
98770ecd76 [mlir][bufferization] Add buffer_loop_hoisting transform op
This op hoists buffer allocations out of loops.

Differential Revision: https://reviews.llvm.org/D155289
2023-07-14 17:09:38 +02:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this change does not include the work needed to
support a functional cast/isa call for them.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Maya Amrami
ace6072bca [mlir] PromoteBuffersToStackPass - Copy attributes of original AllocOp
Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D143185
2023-02-16 17:06:45 +02:00
Michele Scuttari
67d0d7ac0a
[MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Review: https://reviews.llvm.org/D132838
2022-08-31 12:28:45 +02:00
Michele Scuttari
039b969b32
Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.
2022-08-30 22:21:55 +02:00
Michele Scuttari
2be8af8f0e
[MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Review: https://reviews.llvm.org/D132838
2022-08-30 21:56:31 +02:00
Jacques Pienaar
136d746ec7 [mlir] Flip accessors to prefixed form (NFC)
Another mechanical sweep to keep diff small for flip to _Prefixed.
2022-07-10 21:19:11 -07:00
Benjamin Kramer
b70366c9c4 [mlir][BufferOptimization] Use datalayout instead of a flag to find index size
This has the additional advantage of supporting more types.

Differential Revision: https://reviews.llvm.org/D118348
2022-01-27 13:50:29 +01:00
River Riddle
0e9a4a3b65 [mlir] Move the Buffer related source files out of Transforms/
Transforms/ should only contain dialect-independent transformations,
and these files are a much better fit for the bufferization dialect anyways.

Differential Revision: https://reviews.llvm.org/D117839
2022-01-24 19:25:52 -08:00