This is a major step along the way towards the new STEA design. While a great deal of this patch is simple renaming, there are several significant changes as well. I've done my best to ensure that this patch retains the previous behavior and error-conditions, even though those are at odds with the eventual intended semantics of the `dimToLvl` mapping. Since the majority of the compiler does not yet support non-permutations, I've also added explicit assertions in places that previously had implicitly assumed it was dealing with permutations.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D151505
This commit is part of the migration of towards the new STEA syntax/design. In particular, this commit includes the following changes:
* Renaming compiler-internal functions/methods:
* `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}`
* `Merger::{getDimLevelType => getLvlType}` (for consistency)
* `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods)
* Renaming external facets to match:
* the STEA parser and printer
* the C and Python bindings
* PyTACO
However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D150330
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.
Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.
Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.
Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443
Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.
Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
additional check:
https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
them to a pure state.
4. Some changes have been deleted for the following reasons:
- Some files had a variable also named cast
- Some files had not included a header file that defines the cast
functions
- Some files are definitions of the classes that have the casting
methods, so the code still refers to the method instead of the
function without adding a prefix or removing the method declaration
at the same time.
```
ninja -C $BUILD_DIR clang-tidy
run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
-header-filter=mlir/ mlir/* -fix
rm -rf $BUILD_DIR/tools/mlir/**/*.inc
git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
mlir/lib/**/IR/\
mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
mlir/test/lib/Dialect/Test/TestTypes.cpp\
mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
mlir/test/lib/Dialect/Test/TestAttributes.cpp\
mlir/unittests/TableGen/EnumsGenTest.cpp\
mlir/test/python/lib/PythonTestCAPI.cpp\
mlir/include/mlir/IR/
```
Differential Revision: https://reviews.llvm.org/D150123
These functions don't need a`PatternRewriter`, they only need an `OpBuilder`. And, the builder should be the first argument, before the `Location`, to match the style used everywhere else in MLIR.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D148059
This commit contains several code changes which are ultimately required for converting the varions `Merger` identifiers from typedefs to newtypes. The actual implementation of the newtypes themselves has been split off into separate commits, in hopes of simplifying the review process.
Depends On D146561
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D146684
Replace references to enumerate results with either result_pairs
(reference wrapper type) or structured bindings. I did not use
structured bindings everywhere as it wasn't clear to me it would
improve readability.
This is in preparation to the switch to zip semantics which won't
support non-const lvalue reference to elements:
https://reviews.llvm.org/D144503.
I chose to use values instead of const lvalue-refs because MLIR is
biased towards avoiding `const` local variables. This won't degrade
performance because currently `result_pair` is cheap to copy (size_t
+ iterator), and in the future, the enumerator iterator dereference
will return temporaries anyway.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D146006
This change does a bunch of renaming to clear up confusions in these files. In particular, this change:
* Renames variables and methods to clarify the "dim"/"lvl" distinction, and changes them to use the `Dimension`/`Level` types as appropriate.
* Introduces new typedefs
* `ExprId`, `LatPointId`, `LatSetId`: to clarify the interning design of the Merger.
* `LoopId`, `LoopOrd`: to clarify the distinction between arbitrary names for loop-variables, vs numeric identifiers based on the actual order of loop generation.
* `TensorId`
* (Future CLs will change these from typedefs to structs/classes, so that the typechecker can help avoid mixups.)
* Updates documentation to match the new terminology
* Adds additional assertions
* Adds `const` to local variables along the way
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D145756
The old "pointer/index" names often cause confusion since these names clash with names of unrelated things in MLIR; so this change rectifies this by changing everything to use "position/coordinate" terminology instead.
In addition to the basic terminology, there have also been various conventions for making certain distinctions like: (1) the overall storage for coordinates in the sparse-tensor, vs the particular collection of coordinates of a given element; and (2) particular coordinates given as a `Value` or `TypedValue<MemRefType>`, vs particular coordinates given as `ValueRange` or similar. I have striven to maintain these distinctions
as follows:
* "p/c" are used for individual position/coordinate values, when there is no risk of confusion. (Just like we use "d/l" to abbreviate "dim/lvl".)
* "pos/crd" are used for individual position/coordinate values, when a longer name is helpful to avoid ambiguity or to form compound names (e.g., "parentPos"). (Just like we use "dim/lvl" when we need a longer form of "d/l".)
I have also used these forms for a handful of compound names where the old name had been using a three-letter form previously, even though a longer form would be more appropriate. I've avoided renaming these to use a longer form purely for expediency sake, since changing them would require a cascade of other renamings. They should be updated to follow the new naming scheme, but that can be done in future patches.
* "coords" is used for the complete collection of crd values associated with a single element. In the runtime library this includes both `std::vector` and raw pointer representations. In the compiler, this is used specifically for buffer variables with C++ type `Value`, `TypedValue<MemRefType>`, etc.
The bare form "coords" is discouraged, since it fails to make the dim/lvl distinction; so the compound names "dimCoords/lvlCoords" should be used instead. (Though there may exist a rare few cases where is is appropriate to be intentionally ambiguous about what coordinate-space the coords live in; in which case the bare "coords" is appropriate.)
There is seldom the need for the pos variant of this notion. In most circumstances we use the term "cursor", since the same buffer is reused for a 'moving' pos-collection.
* "dcvs/lcvs" is used in the compiler as the `ValueRange` analogue of "dimCoords/lvlCoords". (The "vs" stands for "`Value`s".) I haven't found the need for it, but "pvs" would be the obvious name for a pos-`ValueRange`.
The old "ind"-vs-"ivs" naming scheme does not seem to have been sustained in more recent code, which instead prefers other mnemonics (e.g., adding "Buf" to the end of the names for `TypeValue<MemRefType>`). I have cleaned up a lot of these to follow the "coords"-vs-"cvs" naming scheme, though haven't done an exhaustive cleanup.
* "positions/coordinates" are used for larger collections of pos/crd values; in particular, these are used when referring to the complete sparse-tensor storage components.
I also prefer to use these unabbreviated names in the documentation, unless there is some specific reason why using the abbreviated forms helps resolve ambiguity.
In addition to making this terminology change, this change also does some cleanup along the way:
* correcting the dim/lvl terminology in certain places.
* adding `const` when it requires no other code changes.
* miscellaneous cleanup that was entailed in order to make the proper distinctions. Most of these are in CodegenUtils.{h,cpp}
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D144773
* `RewriterBase::mergeBlocks` is simplified: it is implemented in terms of `mergeBlockBefore`.
* The signature of `mergeBlockBefore` is consistent with other API (such as `inlineRegionBefore`): an overload for a `Block::iterator` is added.
* Additional safety checks are added to `mergeBlockBefore`: detect cases where the resulting IR could be invalid (no more `dropAllUses`) or partly unreachable (likely a case of incorrect API usage).
* Rename `mergeBlockBefore` to `inlineBlockBefore`.
Differential Revision: https://reviews.llvm.org/D144969
Rewrite a NewOp into a NewOp of a sorted COO tensor and a ConvertOp for
converting the sorted COO tensor to the destination tensor type.
Codegen a NewOp of a sorted COO tensor to use the new bulk reader API and sort
the elements only when the input is not sorted.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D144504
Eliminates the sort seems make the whole conversion slower (probably because loop rotation leads to bad locality).
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D144517
No need for a temp COO and sort even when converting dense -> CSC, we can instead rotate the loop to yield a ordered coordinates at beginning.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D144213
We will support symmetric MTX without expanding the data in the sparse tensor
storage.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D144059
* Flattening/simplifying some nested conditionals
* const-ifying some local variables
Depends On D143800
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D143949
This change adds a new `SparseTensorType` class for making the "dim" vs "lvl" distinction more overt, and for abstracting over the differences between sparse-tensors and dense-tensors. In addition, this change also adds new type aliases `Dimension`, `Level`, and `FieldIndex` to make code more self-documenting.
Although the diff is very large, the majority of the changes are mechanical in nature (e.g., changing types to use the new aliases, updating variable names to match, etc). Along the way I also made many variables `const` when they could be; the majority of which required only adding the keyword. A few places had conditional definitions of these variables, requiring actual code changes; however, that was only done when the overall change was extremely local and easy to extract. All these changes are included in the current patch only because it would be too onerous to split them off into a separate patch.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D143800
This adds the hint to a number of tensor allocations in codegens,
shaving off quite some time from e.g. reading in sparse matrices
due to zero-reallocation scheme. Note that we can probably provide
hints on all allocations, and refine the heuristics that use them
for general tensors.
Reviewed By: bixia
Differential Revision: https://reviews.llvm.org/D143309
Currently `TypedValue` can be constructed directly from `Value`, hiding
errors that could be caught at compile time. For example the following
will compile, but crash/assert at runtime:
```
void foo(TypedValue<IntegerType>);
void bar(TypedValue<FloatType> v) {
foo(v);
}
```
This change removes the constructors and replaces them with explicit
llvm casts.
Depends on D142852
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D142855
Currently, all the non-stable sorting algorithms are implemented via the
straightforward quick sort. This will be fixed in the following PR.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D142678
Previously, we rely on InsertOp to add values to the result, in the same way we
add values to a sparse tensor with compressed dimensions. We now direct store
values to the values buffer.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D141517
The patch adds operations to `BlockAndValueMapping` and renames it to `IRMapping`. When operations are cloned, old operations are mapped to the cloned operations. This allows mapping from an operation to a cloned operation. Example:
```
Operation *opWithRegion = ...
Operation *opInsideRegion = &opWithRegion->front().front();
IRMapping map
Operation *newOpWithRegion = opWithRegion->clone(map);
Operation *newOpInsideRegion = map.lookupOrNull(opInsideRegion);
```
Migration instructions:
All includes to `mlir/IR/BlockAndValueMapping.h` should be replaced with `mlir/IR/IRMapping.h`. All uses of `BlockAndValueMapping` need to be renamed to `IRMapping`.
Reviewed By: rriddle, mehdi_amini
Differential Revision: https://reviews.llvm.org/D139665
Previously, we use a temporary tensor with identity ordering. We now use a
temporary tensor with the destination dimension ordering, to enable the use of
sort_coo for sorting the tensor.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D141295
This patch fixes build failure due to -Wsign-compare in sparse2SparseRewrite(...) after https://reviews.llvm.org/D140871.
```
llvm-project/mlir/lib/Dialect/SparseTensor/Transforms/SparseTensorRewriting.cpp:842:32: error: comparison of integers of different signs: 'uint64_t' (aka 'unsigned long') and 'int64_t' (aka 'long') [-Werror,-Wsign-compare]
for (uint64_t i = 0; i < rank; i++) {
~ ^ ~~~~
1 error generated.
```
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D141104
When concat along dim 0, and all inputs/outputs are ordered with identity dimension ordering,
the concatenated coordinates will be yield in lexOrder, thus no need to sort.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D140228
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This commit updates how the `SparseTensorConversion` pass handles `NewOp`. It breaks up the underlying `openSparseTensor` function into two parts (`SparseTensorReader::create` and `SparseTensorReader::readSparseTensor`) so that the pass can inject code for constructing `lvlSizes` between those two parts. Migrating the construction of `lvlSizes` out of the runtime and into the pass is a necessary first step toward fully supporting non-permutations. (The alternative would be for the pass to generate a `FuncOp` for performing the construction and then passing that to the runtime; which doesn't seem to have any benefits over the design of this commit.) And since the pass now generates the code to call these two functions, this change also removes the `Action::kFromFile` value from the enum used by `_mlir_ciface_newSparseTensor`.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D138363
The attribute tells the operator to handle symmetric structures for 2D tensors.
By default, the operator assumes the input tensor is not symmetric.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D138230