9 Commits

Author SHA1 Message Date
Dmitriy Smirnov
bb4696ce30
[mlir][linalg] Fix for bias handling for Winograd (#110331)
This PR makes the winograd.output_transform op a destination-style op and
fixes the handling of pre-existing data in its output argument (i.e. data
possibly pre-initialized with bias, which was previously discarded).

---------

Signed-off-by: Dmitriy Smirnov <dmitriy.smirnov@arm.com>
2024-10-11 09:39:19 +01:00
Thomas Preud'homme
326287fd5b
Add missing FillOp to winograd lowering (#108181)
Winograd lowering involves a number of matmul and batch_matmul ops, which
are currently passed a tensor.empty result as the out parameter and
therefore have undefined behaviour. This commit adds the necessary linalg.fill.

---------

Co-authored-by: Max191 <44243577+Max191@users.noreply.github.com>
2024-09-13 15:48:17 +01:00
Hsiangkai Wang
c4bf949171
[mlir][linalg] Implement TilingInterface for winograd operators (#96184)
In order to support conv2d input data of arbitrary size, implement
TilingInterface for winograd operations. Before converting winograd
operations into nested loops with matrix multiplication, tile the input of
conv2d into the supported size first.

Add a transform operation structured.decompose_winograd_op to decompose
winograd operations. Before applying the transform op, use
tile_using_for to tile the input data into supported size. The test case
shows how to tile and decompose winograd operations.
2024-08-16 16:22:02 +01:00
Adrian Kuegel
7b08c2774c
[mlir][Linalg] Remove unused header include.
There seems to be no direct usage of any tosa utils.
2024-07-18 06:35:42 +00:00
Hsiangkai Wang
27ee33d136
[mlir][linalg] Decompose winograd operators (#96183)
Convert Linalg winograd_filter_transform, winograd_input_transform, and
winograd_output_transform into nested loops with matrix multiplication
with constant transform matrices.

Support several configurations of Winograd Conv2D, including F(2, 3),
F(4, 3) and F(2, 5). These configurations show that the implementation
can support different kernel sizes (3 and 5) and different output sizes
(2 and 4). Besides the symmetric kernel sizes 3x3 and 5x5, this patch also
supports 1x3, 3x1, 1x5, and 5x1 kernels.
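
To make the F(m, r) notation concrete, here is a minimal sketch of the tile arithmetic, assuming the standard minimal-filtering relations from the cited paper: an output tile of size m and a kernel of size r need an input tile of size m + r - 1. The helper names are hypothetical and are not taken from the patch.

```python
# Hypothetical helpers illustrating F(m, r) tile arithmetic; not part of the patch.

def input_tile_size(m: int, r: int) -> int:
    """Input tile edge needed to produce m outputs with an r-tap kernel."""
    return m + r - 1


def multiplications_2d(m: int, r: int) -> tuple[int, int]:
    """Return (winograd, direct) multiplication counts per m x m output tile.

    Winograd needs one multiplication per element of the transformed tile;
    direct convolution needs r*r multiplications per output element.
    """
    alpha = input_tile_size(m, r)
    return alpha * alpha, (m * m) * (r * r)


# The three configurations named in the commit message.
CONFIGS = [(2, 3), (4, 3), (2, 5)]
SUMMARY = {cfg: (input_tile_size(*cfg), multiplications_2d(*cfg)) for cfg in CONFIGS}
```

For a 1xr or rx1 kernel the same relation applies along the single non-unit dimension only.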

The implementation is based on the paper Fast Algorithms for
Convolutional Neural Networks (https://arxiv.org/abs/1509.09308).

Reviewers: ftynse, Max191, GeorgeARM, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin

Reviewed By: ftynse, Max191

Pull Request: https://github.com/llvm/llvm-project/pull/96183
2024-07-18 06:04:53 +01:00
Pranav Kant
9c1861bd5d
[mlir][NFC] Remove unused includes (#98557)
Adding dep to TosaDialect increases binary size unnecessarily
2024-07-11 14:54:04 -07:00
Hsiangkai Wang
d9c26b9d56
[mlir][linalg] Add transform operator for Winograd Conv2D algorithm (#96182)
Add a transform operation structured.winograd_conv2d to convert
linalg.conv_2d_nhwc_fhwc to Linalg winograd operations.

Reviewers: ftynse, Max191, GeorgeARM, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin

Reviewed By: ftynse, Max191

Pull Request: https://github.com/llvm/llvm-project/pull/96182
2024-07-11 14:45:36 +01:00
Benjamin Kramer
34c544e1cc
[mlir][linalg] Remove unused #includes. NFC.
2024-07-10 22:01:22 +02:00
Hsiangkai Wang
7d246e84a4
[mlir][linalg] Implement Conv2D using Winograd Conv2D algorithm (#96181)
Define high-level winograd operators and convert conv_2d_nhwc_fhwc into
winograd operators. According to the Winograd Conv2D algorithm, we need
three transform operators for input, filter, and output transformation.

The formula of the Winograd Conv2D algorithm is

Y = A^T x [(G x g x G^T) @ (B^T x d x B)] x A

filter transform: G x g x G^T
input transform: B^T x d x B
output transform: A^T x y x A
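
As an illustration of the formula above, the following is a minimal pure-Python sketch for a single-channel F(2x2, 3x3) tile. It assumes the F(2, 3) transform matrices given in the cited paper and treats `@` as an element-wise product, which holds for one channel; the function names are hypothetical and this is not the MLIR implementation.

```python
# Single-channel F(2x2, 3x3) Winograd sketch; matrices follow the cited paper.

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def hadamard(a, b):
    """Element-wise product of two equally sized matrices."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]   # B^T
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]      # G
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                                # A^T


def winograd_f2x2_3x3(d, g):
    """Y = A^T [(G g G^T) * (B^T d B)] A for a 4x4 input d and 3x3 filter g."""
    u = matmul(matmul(G, g), transpose(G))      # filter transform, 4x4
    v = matmul(matmul(BT, d), transpose(BT))    # input transform, 4x4
    y = hadamard(u, v)                          # element-wise multiply
    return matmul(matmul(AT, y), transpose(AT))  # output transform, 2x2


def direct_correlation(d, g):
    """Reference 2x2 output computed the direct way (no kernel flip)."""
    return [[sum(d[i + r][j + s] * g[r][s] for r in range(3) for s in range(3))
             for j in range(2)] for i in range(2)]


d = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
g = [[1, 0, -1], [2, 1, 0], [0, 3, 1]]
fast, ref = winograd_f2x2_3x3(d, g), direct_correlation(d, g)
assert all(abs(fast[i][j] - ref[i][j]) < 1e-9 for i in range(2) for j in range(2))
```

With multiple input channels, the element-wise product becomes a per-position reduction over channels, which is why it can be expressed as a batched matrix multiplication.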

The implementation is based on the paper Fast Algorithms for
Convolutional Neural Networks (https://arxiv.org/abs/1509.09308).

Reviewers: stellaraccident, ftynse, Max191, GeorgeARM, cxy-1993, nicolasvasilache, MaheshRavishankar, dcaballe, rengolin

Reviewed By: ftynse, Max191, stellaraccident

Pull Request: https://github.com/llvm/llvm-project/pull/96181
2024-07-10 07:30:45 +01:00