32 Commits

Author SHA1 Message Date
Peter Hawkins
5cd4274772
[mlir python] Port in-tree dialects to nanobind. (#119924)
This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.

This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.

This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.

---------

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
2024-12-20 20:32:32 -08:00
Mateusz Sokół
a9746675a5
[MLIR][Python] Add encoding argument to tensor.empty Python function (#110656)
Hi @xurui1995 @makslevental,

I think in https://github.com/llvm/llvm-project/pull/103087 there's
unintended regression where user can no longer create sparse tensors
with `tensor.empty`.

Previously I could pass:
```python
out = tensor.empty(tensor_type, [])
```
where `tensor_type` contained `shape`, `dtype`, and `encoding`.

With the latest 
```python
tensor.empty(sizes: Sequence[Union[int, Value]], element_type: Type, *, loc=None, ip=None)
```
it's no longer possible.

I propose to add `encoding` argument which is passed to
`RankedTensorType.get(static_sizes, element_type, encoding)` (I updated
one of the tests to check it).
2024-10-01 16:48:00 -04:00
Yinying Li
a10d67f9fb
[mlir][sparse] Enable explicit and implicit value in sparse encoding (#88975)
1. Explicit value means the non-zero value in a sparse tensor. If
explicitVal is set, then all the non-zero values in the tensor have the
same explicit value. The default value Attribute() indicates that it is
not set.

2. Implicit value means the "zero" value in a sparse tensor. If
implicitVal is set, then the "zero" value in the tensor is equal to the
implicit value. For now, we only support `0` as the implicit value but
it could be extended in the future. The default value Attribute()
indicates that the implicit value is `0` (same type as the tensor
element type).

Example:

```
#CSR = #sparse_tensor.encoding<{
  map = (d0, d1) -> (d0 : dense, d1 : compressed),
  posWidth = 64,
  crdWidth = 64,
  explicitVal = 1 : i64,
  implicitVal = 0 : i64
}>
```

Note: this PR tests that implicitVal could be set to other values as
well. The following PR will add verifier and reject any value that's not
zero for implicitVal.
2024-04-24 16:20:25 -07:00
Peiming Liu
56d58295dd
[mlir][sparse] Introduce batch level format. (#83082) 2024-02-26 16:08:28 -08:00
Peiming Liu
429919e328
[mlir][sparse][pybind][CAPI] remove LevelType enum from CAPI, constru… (#81682)
…ct LevelType from LevelFormat and properties instead.

**Rationale**
We used to explicitly declare every possible combination between
`LevelFormat` and `LevelProperties`, and it now becomes difficult to
scale as more properties/level formats are going to be introduced.
2024-02-13 16:45:22 -08:00
Yinying Li
2a6b521b36
[mlir][sparse] Add more tests and verification for n:m (#81186)
1. Add python test for n out of m
2. Add more methods for python binding
3. Add verification for n:m and invalid encoding tests
4. Add e2e test for n:m

Previous PRs for n:m #80501 #79935
2024-02-09 14:34:36 -05:00
Yinying Li
e5924d6499
[mlir][sparse] Implement parsing n out of m (#79935)
1. Add parsing methods for block[n, m].
2. Encode n and m with the newly extended 64-bit LevelType enum.
3. Update 2:4 methods names/comments to n:m.
2024-02-08 14:38:42 -05:00
Yinying Li
cd481fa827
[mlir][sparse] Change LevelType enum to 64 bit (#80501)
1. C++ enum is set through enum class LevelType : uint_64.
2. C enum is set through typedef uint_64 level_type. It is due to the
limitations in Windows build: setting enum width to ui64 is not
supported in C.
2024-02-05 17:00:52 -05:00
Aart Bik
1944c4f76b
[mlir][sparse] rename DimLevelType to LevelType (#73561)
The "Dim" prefix is a legacy left-over that no longer makes sense, since
we have a very strict "Dimension" vs. "Level" definition for sparse
tensor types and their storage.
2023-11-27 14:27:52 -08:00
Yinying Li
b165650aee
[mlir][sparse] Return actual identity map instead of null map (#70365)
Changes:

1. For both dimToLvl and lvlToDim, always returns the actual map instead
of AffineMap() for identity map.
2. Updated custom builder for encoding to have default values.
3. Non-inferable lvlToDim will still return AffineMap() during
inference, so it will be caught by verifier.
2023-10-26 18:30:34 -04:00
Yinying Li
d4088e7d5f
[mlir][sparse] Populate lvlToDim (#68937)
Updates:
1. Infer lvlToDim from dimToLvl
2. Add more tests for block sparsity
3. Finish TODOs related to lvlToDim, including adding lvlToDim to python
binding

Verification of lvlToDim that user provides will be implemented in the
next PR.
2023-10-17 16:09:39 -04:00
Yinying Li
6280e23124
[mlir][sparse] Print new syntax (#68130)
Printing changes from `#sparse_tensor.encoding<{ lvlTypes = [
"compressed" ] }>` to `map = (d0) -> (d0 : compressed)`. Level
properties, ELL and slice are also supported.
2023-10-04 16:36:05 -04:00
Yinying Li
e2e429d994
[mlir][sparse] Migrate more tests to new syntax (#66309)
CSR:
`lvlTypes = [ "dense", "compressed" ]` to `map = (d0, d1) -> (d0 :
dense, d1 : compressed)`

CSC:
`lvlTypes = [ "dense", "compressed" ], dimToLvl = affine_map<(d0, d1) ->
(d1, d0)>` to `map = (d0, d1) -> (d1 : dense, d0 : compressed)`

This is an ongoing effort: #66146
2023-09-14 12:21:13 -04:00
Yinying Li
dbe1be9aa4
[mlir][sparse] Migrate tests to use new syntax (#66146)
lvlTypes = [ "compressed" ] to map = (d0) -> (d0 : compressed)
lvlTypes = [ "dense" ] to map = (d0) -> (d0 : dense)
2023-09-13 11:41:25 -04:00
wren romano
76647fce13 [mlir][sparse] Combining dimOrdering+higherOrdering fields into dimToLvl
This is a major step along the way towards the new STEA design.  While a great deal of this patch is simple renaming, there are several significant changes as well.  I've done my best to ensure that this patch retains the previous behavior and error-conditions, even though those are at odds with the eventual intended semantics of the `dimToLvl` mapping.  Since the majority of the compiler does not yet support non-permutations, I've also added explicit assertions in places that previously had implicitly assumed it was dealing with permutations.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151505
2023-05-30 15:19:50 -07:00
Tobias Hieta
f9008e6366
[NFC][Py Reformat] Reformat python files in mlir subdir
This is an ongoing series of commits that are reformatting our
Python code.

Reformatting is done with `black`.

If you end up having problems merging this commit because you
have made changes to a python file, the best way to handle that
is to run git checkout --ours <yourfile> and then reformat it
with black.

If you run into any problems, post to discourse about it and
we will try to help.

RFC Thread below:

https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style

Differential Revision: https://reviews.llvm.org/D150782
2023-05-26 08:05:40 +02:00
wren romano
a0615d020a [mlir][sparse] Renaming the STEA field dimLevelType to lvlTypes
This commit is part of the migration of towards the new STEA syntax/design.  In particular, this commit includes the following changes:
* Renaming compiler-internal functions/methods:
  * `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}`
  * `Merger::{getDimLevelType => getLvlType}` (for consistency)
  * `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods)
* Renaming external facets to match:
  * the STEA parser and printer
  * the C and Python bindings
  * PyTACO

However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D150330
2023-05-17 14:24:09 -07:00
wren romano
84cd51bb97 [mlir][sparse] Renaming "pointer/index" to "position/coordinate"
The old "pointer/index" names often cause confusion since these names clash with names of unrelated things in MLIR; so this change rectifies this by changing everything to use "position/coordinate" terminology instead.

In addition to the basic terminology, there have also been various conventions for making certain distinctions like: (1) the overall storage for coordinates in the sparse-tensor, vs the particular collection of coordinates of a given element; and (2) particular coordinates given as a `Value` or `TypedValue<MemRefType>`, vs particular coordinates given as `ValueRange` or similar.  I have striven to maintain these distinctions
as follows:

  * "p/c" are used for individual position/coordinate values, when there is no risk of confusion.  (Just like we use "d/l" to abbreviate "dim/lvl".)

  * "pos/crd" are used for individual position/coordinate values, when a longer name is helpful to avoid ambiguity or to form compound names (e.g., "parentPos").  (Just like we use "dim/lvl" when we need a longer form of "d/l".)

    I have also used these forms for a handful of compound names where the old name had been using a three-letter form previously, even though a longer form would be more appropriate.  I've avoided renaming these to use a longer form purely for expediency sake, since changing them would require a cascade of other renamings.  They should be updated to follow the new naming scheme, but that can be done in future patches.

  * "coords" is used for the complete collection of crd values associated with a single element.  In the runtime library this includes both `std::vector` and raw pointer representations.  In the compiler, this is used specifically for buffer variables with C++ type `Value`, `TypedValue<MemRefType>`, etc.

    The bare form "coords" is discouraged, since it fails to make the dim/lvl distinction; so the compound names "dimCoords/lvlCoords" should be used instead.  (Though there may exist a rare few cases where is is appropriate to be intentionally ambiguous about what coordinate-space the coords live in; in which case the bare "coords" is appropriate.)

    There is seldom the need for the pos variant of this notion.  In most circumstances we use the term "cursor", since the same buffer is reused for a 'moving' pos-collection.

  * "dcvs/lcvs" is used in the compiler as the `ValueRange` analogue of "dimCoords/lvlCoords".  (The "vs" stands for "`Value`s".)  I haven't found the need for it, but "pvs" would be the obvious name for a pos-`ValueRange`.

    The old "ind"-vs-"ivs" naming scheme does not seem to have been sustained in more recent code, which instead prefers other mnemonics (e.g., adding "Buf" to the end of the names for `TypeValue<MemRefType>`).  I have cleaned up a lot of these to follow the "coords"-vs-"cvs" naming scheme, though haven't done an exhaustive cleanup.

  * "positions/coordinates" are used for larger collections of pos/crd values; in particular, these are used when referring to the complete sparse-tensor storage components.

    I also prefer to use these unabbreviated names in the documentation, unless there is some specific reason why using the abbreviated forms helps resolve ambiguity.

In addition to making this terminology change, this change also does some cleanup along the way:
  * correcting the dim/lvl terminology in certain places.
  * adding `const` when it requires no other code changes.
  * miscellaneous cleanup that was entailed in order to make the proper distinctions.  Most of these are in CodegenUtils.{h,cpp}

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D144773
2023-03-06 12:23:33 -08:00
rkayaith
66645a03fc [mlir][python] Include anchor op in PassManager.parse
The pipeline string must now include the pass manager's anchor op. This
makes the parse API properly roundtrip the printed form of a pass
manager.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D136405
2022-11-03 11:49:48 -04:00
wren romano
794d347988 [mlir][sparse] Fixing bug in python test
This is a followup to D135004, to correct one of the tests that didn't get caught by the buildbot.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D135336
2022-10-05 18:06:22 -07:00
Aart Bik
c48e90877f [mlir][sparse] introduce a higher-order tensor mapping
This extension to the sparse tensor type system in MLIR
opens up a whole new set of sparse storage schemes, such as
block sparse storage (e.g. BCSR) and ELL (aka jagged diagonals).

This revision merely introduces the type extension and
initial documentation. The actual interpretation of the type
(reading in tensors, lowering to code, etc.) will follow.

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D135206
2022-10-05 09:40:51 -07:00
Aart Bik
0baec207ea [mlir][sparse][python] improve sparse encoding test
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D133971
2022-09-15 14:01:57 -07:00
wren romano
286248db2c [mlir][sparse] Moving integration tests that merely use the Python API
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D114192
2021-11-23 10:59:38 -08:00
Aart Bik
e9b1c974be [mlir][sparse] run less combinations of SpMM in test (to reduce runtime)
This revision also adds a few passes to the sparse compiler part to unify the transformation sequence with all other paths we currently use.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D111900
2021-10-15 16:04:01 -07:00
Alex Zinenko
8b58ab8ccd [mlir] Factor type reconciliation out of Standard-to-LLVM conversion
Conversion to the LLVM dialect is being refactored to be more progressive and
is now performed as a series of independent passes converting different
dialects. These passes may produce `unrealized_conversion_cast` operations that
represent pending conversions between built-in and LLVM dialect types.
Historically, a more monolithic Standard-to-LLVM conversion pass did not need
these casts as all operations were converted in one shot. Previous refactorings
have led to the requirement of running the Standard-to-LLVM conversion pass to
clean up `unrealized_conversion_cast`s even though the IR had no standard
operations in it. The pass must have been also run the last among all to-LLVM
passes, in contradiction with the partial conversion logic. Additionally, the
way it was set up could produce invalid operations by removing casts between
LLVM and built-in types even when the consumer did not accept the uncasted
type, or could lead to cryptic conversion errors (recursive application of the
rewrite pattern on `unrealized_conversion_cast` as a means to indicate failure
to eliminate casts).

In fact, the need to eliminate A->B->A `unrealized_conversion_cast`s is not
specific to to-LLVM conversions and can be factored out into a separate type
reconciliation pass, which is achieved in this commit. While the cast operation
itself has a folder pattern, it is insufficient in most conversion passes as
the folder only applies to the second cast. Without complex legality setup in
the conversion target, the conversion infra will either consider the cast
operations valid and not fold them (a separate canonicalization would be
necessary to trigger the folding), or consider the first cast invalid upon
generation and stop with error. The pattern provided by the reconciliation pass
applies to the first cast operation instead. Furthermore, having a separate
pass makes it clear when `unrealized_conversion_cast`s could not have been
eliminated since it is the only reason why this pass can fail.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D109507
2021-09-09 16:51:24 +02:00
Aart Bik
24ea94ad0c [mlir][sparse][python] migrate more code from boilerplate into proper numpy land
The boilerplate was setting up some arrays for testing. To fully illustrate
python - MLIR potential, however, this data should also come from numpy land.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108336
2021-08-20 09:18:17 -07:00
Aart Bik
19a906f372 [mlir][sparse][python] make imports more selective
Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D108055
2021-08-16 11:53:29 -07:00
Aart Bik
56d607006d [mlir][sparse][python] add an "exhaustive" sparse test using python
Using the python API to easily set up sparse kernels, this test
exhaustively builds, compilers, and runs SpMM for all annotations
on a sparse tensor, making sure every version generates the correct
result. This test also illustrates using the python API to set up
a sparse kernel and sparse compilation.

Reviewed By: bixia

Differential Revision: https://reviews.llvm.org/D107943
2021-08-12 11:13:04 -07:00
Aart Bik
58d12332a4 [mlir][sparse][capi][python] add sparse tensor passes
First set of "boilerplate" to get sparse tensor
passes available through CAPI and Python.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D102362
2021-05-12 16:40:50 -07:00
Stella Laurenzo
a2c8aebd8f [mlir][Python] Finish adding RankedTensorType support for encoding.
Differential Revision: https://reviews.llvm.org/D102184
2021-05-10 20:39:16 +00:00
Stella Laurenzo
f38633d1bb [mlir][Python] Re-export cext sparse_tensor module to the public namespace.
* This was left out of the previous commit accidentally.

Differential Revision: https://reviews.llvm.org/D102183
2021-05-10 18:08:29 +00:00
Stella Laurenzo
f13893f66a [mlir][Python] Upstream the PybindAdaptors.h helpers and use it to implement sparse_tensor.encoding.
* The PybindAdaptors.h file has been evolving across different sub-projects (npcomp, circt) and has been successfully used for out of tree python API interop/extensions and defining custom types.
* Since sparse_tensor.encoding is the first in-tree custom attribute we are supporting, it seemed like the right time to upstream this header and use it to define the attribute in a way that we can support for both in-tree and out-of-tree use (prior, I had not wanted to upstream dead code which was not used in-tree).
* Adapted the circt version of `mlir_type_subclass`, also providing an `mlir_attribute_subclass`. As we get a bit of mileage on this, I would like to transition the builtin types/attributes to this mechanism and delete the old in-tree only `PyConcreteType` and `PyConcreteAttribute` template helpers (which cannot work reliably out of tree as they depend on internals).
* Added support for defaulting the MlirContext if none is passed so that we can support the same idioms as in-tree versions.

There is quite a bit going on here and I can split it up if needed, but would prefer to keep the first use and the header together so sending out in one patch.

Differential Revision: https://reviews.llvm.org/D102144
2021-05-10 17:15:43 +00:00