llvm-project

Author	SHA1	Message	Date
Matthias Springer	10056c821a	[mlir][SCF] `scf.parallel`: Make reductions part of the terminator (#75314) This commit makes reductions part of the terminator. Instead of `scf.yield`, `scf.reduce` now terminates the body of `scf.parallel` ops. `scf.reduce` may contain an arbitrary number of reductions, with one region per reduction. Example: ```mlir %init = arith.constant 0.0 : f32 %r:2 = scf.parallel (%iv) = (%lb) to (%ub) step (%step) init (%init, %init) -> f32, f32 { %elem_to_reduce1 = load %buffer1[%iv] : memref<100xf32> %elem_to_reduce2 = load %buffer2[%iv] : memref<100xf32> scf.reduce(%elem_to_reduce1, %elem_to_reduce2 : f32, f32) { ^bb0(%lhs : f32, %rhs: f32): %res = arith.addf %lhs, %rhs : f32 scf.reduce.return %res : f32 }, { ^bb0(%lhs : f32, %rhs: f32): %res = arith.mulf %lhs, %rhs : f32 scf.reduce.return %res : f32 } } ``` `scf.reduce` operations can no longer be interleaved with other ops in the body of `scf.parallel`. This simplifies the op and makes it possible to assign the `RecursiveMemoryEffects` trait to `scf.reduce`. (This was not possible before because the op was not a terminator, causing the op to be DCE'd.)	2023-12-20 11:06:27 +09:00
Martin Erhart	ba727ac219	[mlir][bufferization][scf] Implement BufferDeallocationOpInterface for scf.reduce.return (#66886) This is necessary to run the new buffer deallocation pipeline as part of the sparse compiler pipeline.	2023-09-20 14:19:13 +02:00
Martin Erhart	66aa9a2517	[mlir][bufferization] Implement BufferDeallocationopInterface for scf.forall.in_parallel (#66351) The scf.forall.in_parallel terminator operation has a nested graph region with the NoTerminator trait. Such regions are not supported by the default implementations. Therefore, this commit adds a specialized implementation for this operation which only covers the case where the nested region is empty. This is because after bufferization, ops like tensor.parallel_insert_slice have already been converted to memref operations that reside only in the scf.forall, so the nested region of scf.forall.in_parallel ends up empty.	2023-09-14 16:20:24 +02:00
Martin Erhart	ccb16acd46	Revert "[mlir][bufferization] Implement BufferDeallocationopInterface for scf.forall.in_parallel" This reverts commit 1356e853d47723c1be6eee2368d95c514a1816d1. This caused problems in downstream projects. We are reverting to give them more time for integration.	2023-09-13 13:53:47 +00:00
Martin Erhart	1356e853d4	[mlir][bufferization] Implement BufferDeallocationopInterface for scf.forall.in_parallel The scf.forall.in_parallel terminator operation has a nested graph region with the NoTerminator trait. Such regions are not supported by the default implementations. Therefore, this commit adds a specialized implementation for this operation which only covers the case where the nested region is empty. This is because after bufferization, ops like tensor.parallel_insert_slice have already been converted to memref operations that reside only in the scf.forall, so the nested region of scf.forall.in_parallel ends up empty. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D158979	2023-09-13 09:30:24 +00:00

5 Commits