Add the aggressiveReduceConstant argument to the
TosaLayerwiseConstantFoldPass. When set, the constant
optimizations are always performed on the reduce ops
(i.e. without considering the number of users of the
reduce operation's input).
Replace reduce operations whose input argument is a
constant tensor with a constant tensor.
As the argument of the reduce operation is a constant tensor
and has only a single user, we can calculate the resulting
constant tensor at compile time and replace the operation
with this smaller tensor.
This optimization has been implemented for:
tosa.reduce_sum
tosa.reduce_prod
tosa.reduce_any
tosa.reduce_all
tosa.reduce_max
tosa.reduce_min
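The folding of the reduce ops listed above can be illustrated outside of MLIR. The following is only a minimal sketch, not the code from the patch: `ConstTensor` and `foldReduce` are hypothetical names, the tensor is assumed to be stored row-major, and the reduced axis is kept with size 1 so that the result has the same rank as the input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a dense constant tensor (row-major layout).
struct ConstTensor {
  std::vector<int64_t> shape;
  std::vector<int32_t> values;
};

// Reduce `in` along `axis` ahead of time. `combine` is the binary
// operation of the reduction (+ for a sum, * for a product, and so on).
// The reduced axis keeps size 1, so the rank does not change.
ConstTensor foldReduce(const ConstTensor &in, std::size_t axis,
                       const std::function<int32_t(int32_t, int32_t)> &combine) {
  assert(axis < in.shape.size() && "axis out of range");
  const int64_t n = in.shape[axis];
  assert(n > 0 && "cannot reduce an empty axis");

  // Number of elements before and after the reduced axis.
  int64_t outer = 1, inner = 1;
  for (std::size_t i = 0; i < axis; ++i)
    outer *= in.shape[i];
  for (std::size_t i = axis + 1; i < in.shape.size(); ++i)
    inner *= in.shape[i];
  assert(static_cast<int64_t>(in.values.size()) == outer * n * inner);

  ConstTensor out;
  out.shape = in.shape;
  out.shape[axis] = 1;
  out.values.resize(static_cast<std::size_t>(outer * inner));
  for (int64_t o = 0; o < outer; ++o) {
    for (int64_t i = 0; i < inner; ++i) {
      // Start from the first element along the axis, then combine the rest.
      int32_t acc = in.values[(o * n) * inner + i];
      for (int64_t k = 1; k < n; ++k)
        acc = combine(acc, in.values[(o * n + k) * inner + i]);
      out.values[o * inner + i] = acc;
    }
  }
  return out;
}
```

The result holds `outer * inner` elements instead of `outer * n * inner`, which is why replacing a single-user constant with the folded one reduces the memory needed.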
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D154832
Add constant fold for tosa.reciprocal, which can be applied if the input is a dense constant tensor. The reciprocal is computed for every element and the result is a tensor with the same dimensions as the input tensor.
As the input tensor might require a lot of memory and the folding might double the required memory, a heuristic decides when to actually apply the folding. Currently, the operation will be replaced only if the input constant is a splat (i.e. requires little memory) or has a single user (similar to the already existing fold for constant transposes). This keeps the additionally required space low.
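The heuristic described above can be sketched in plain C++. This is an illustration under stated assumptions, not the code from the patch: `ConstInfo`, `shouldFoldReciprocal` and `foldReciprocal` are hypothetical names, and the constant is modelled as a flat vector of floats.

```cpp
#include <vector>

// Hypothetical summary of what the heuristic looks at for a constant input.
struct ConstInfo {
  bool isSplat;      // all elements are equal, so storage is tiny
  unsigned numUsers; // number of operations consuming the constant
};

// Fold only if the extra memory stays low: either the input is a splat,
// or the folded result can take the place of the input's only user.
bool shouldFoldReciprocal(const ConstInfo &c) {
  return c.isSplat || c.numUsers == 1;
}

// Element-wise reciprocal; the result has as many elements as the input.
std::vector<float> foldReciprocal(const std::vector<float> &in) {
  std::vector<float> out;
  out.reserve(in.size());
  for (float v : in)
    out.push_back(1.0f / v);
  return out;
}
```

A non-splat constant with two or more users is left alone, because both the original and the folded tensor would then have to be kept.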
Differential Revision: https://reviews.llvm.org/D150578
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.
Reviewed By: mehdi_amini, rriddle
Differential Review: https://reviews.llvm.org/D132838
Now that C++17 is enabled in LLVM, a lot of the TODOs and patterns to emulate C++17 features can be eliminated.
The steps I have taken were essentially:
```
git grep C++17
git grep c++17
git grep "initializer_list<int>"
```
and then addressing the comments and patterns that turned up.
Most of the changes boiled down to just using fold expressions rather than initializer_list.
While doing this I also discovered that Clang by default restricts the depth of fold expressions to 256 elements. I specifically hit this with `TestDialect` in `addOperations`. Because of that, I opted not to replace it with a fold expression and instead added a comment documenting the issue.
If any other functions may be called with more than 256 elements in the future we might have to revert other parts as well.
I don't think this is a common occurrence besides the `TestDialect` however. If need be, this could potentially be fixed via `mlir-tblgen` in the future.
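For readers unfamiliar with the two idioms, here is a small self-contained comparison. It is only a sketch: `Registry`, `OpA` and `OpB` are hypothetical names made up for this example and are not taken from the LLVM sources.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical registry that records one name per registered type.
struct Registry {
  std::vector<std::string> names;

  template <typename T>
  void add() {
    names.push_back(T::name());
  }

  // Pre-C++17 emulation: expand the pack inside a dummy initializer_list.
  template <typename... Ts>
  void addAllOld() {
    (void)std::initializer_list<int>{0, (add<Ts>(), 0)...};
  }

  // C++17: a fold expression over the comma operator.
  template <typename... Ts>
  void addAllNew() {
    (add<Ts>(), ...);
  }
};

struct OpA {
  static const char *name() { return "a"; }
};
struct OpB {
  static const char *name() { return "b"; }
};
```

Both variants call `add` once per type, left to right. The fold expression nests one level per pack element, which is where the depth limit mentioned above comes from, while the `initializer_list` form is a flat list of expressions.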
Differential Revision: https://reviews.llvm.org/D131323
Scope ops file to ops. Used canonicalization as grouping for canonicalization
patterns and folders (also considered OpTransforms but that felt too generic
and the former two are used together).
Reviewed By: silvas, rsuderman
Differential Revision: https://reviews.llvm.org/D130297
Transpose operations on constant data were getting folded during the
canonicalization process. This has a compile-time cost proportional to
the constant size. Move this fold to a separate pass so that it is
optional and such scenarios can be handled more flexibly.
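To see why the cost scales with the constant size, consider a minimal sketch of folding a 2-D transpose. `foldTranspose2D` is a hypothetical helper written for this note, not the MLIR implementation, and it assumes a row-major layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Transpose a rows x cols row-major constant into a cols x rows one.
// Every element is read and written exactly once, so the work (and the
// size of the new constant) grows linearly with the input.
std::vector<int32_t> foldTranspose2D(const std::vector<int32_t> &in,
                                     std::size_t rows, std::size_t cols) {
  assert(in.size() == rows * cols && "shape does not match data size");
  std::vector<int32_t> out(in.size());
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      out[c * rows + r] = in[r * cols + c];
  return out;
}
```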
Reviewed By: rsuderman, jpienaar, stellaraccident
Differential Revision: https://reviews.llvm.org/D124685