7 Commits

Author SHA1 Message Date
Sandeep Dasgupta
81d7eef134
Sub-channel quantized type implementation (#120172)
This is an implementation for [RFC: Supporting Sub-Channel Quantization
in
MLIR](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694).

In order to make the review process easier, the PR has been divided into
the following commit labels:

1. **Add implementation for sub-channel type:** Includes the class
design for `UniformQuantizedSubChannelType`, printer/parser and bytecode
read/write support. The existing types (per-tensor and per-axis) are
unaltered.
2. **Add implementation for sub-channel type:** Lowering of
`quant.qcast` and `quant.dcast` operations to Linalg operations.
3. **Adding C/Python Apis:** We first define he C-APIs and build the
Python-APIs on top of those.
4. **Add pass to normalize generic ....:** This pass normalizes
sub-channel quantized types to per-tensor per-axis types, if possible.


A  design note:
- **Explicitly storing the `quantized_dimensions`, even when they can be
derived for ranked tensor.**
While it's possible to infer quantized dimensions from the static shape
of the scales (or zero-points) tensor for ranked
data tensors
([ref](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694/3)
for background), there are cases where this can lead to ambiguity and
issues with round-tripping.

```
Consider the example: tensor<2x4x!quant.uniform<i8:f32:{0:2, 0:2}, {{s00:z00, s01:z01}}>>
```

The shape of the scales tensor is [1, 2], which might suggest that only
axis 1 is quantized. While this inference is technically correct, as the
block size for axis 0 is a degenerate case (equal to the dimension
size), it can cause problems with round-tripping. Therefore, even for
ranked tensors, we are explicitly storing the quantized dimensions.
Suggestions welcome!


PS: I understand that the upcoming holidays may impact your schedule, so
please take your time with the review. There's no rush.
2025-03-23 07:37:55 -05:00
Peter Hawkins
5cd4274772
[mlir python] Port in-tree dialects to nanobind. (#119924)
This is a companion to #118583, although it can be landed independently
because since #117922 dialects do not have to use the same Python
binding framework as the Python core code.

This PR ports all of the in-tree dialect and pass extensions to
nanobind, with the exception of those that remain for testing pybind11
support.

This PR also:
* removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This
was overlooked in a previous PR and it is duplicated in Diagnostics.h.

---------

Co-authored-by: Jacques Pienaar <jpienaar@google.com>
2024-12-20 20:32:32 -08:00
annuasd
47ef5c4b7f
[mlir][Bindings] Fix missing return value of functions and incorrect type hint in pyi. (#116731)
The zero points of UniformQuantizedPerAxisType should be List[int].
And there are two methods missing return value.

Co-authored-by: 牛奕博 <niuyibo@niuyibodeMacBook-Pro.local>
2024-11-19 15:24:39 -06:00
Mehdi Amini
285a229f20 [MLIR] Apply clang-tidy fixes for misc-include-cleaner (NFC) 2023-11-12 20:35:46 -08:00
Kazu Hirata
410480e32b Ensure newlines at the end of files (NFC) 2022-01-06 23:44:02 -08:00
Alex Zinenko
95ddbed9b7 [mlir] Split out Python bindings for dialects into separate libs
Historically, the bindings for the Linalg dialect were included into the
"core" bindings library because they depended on the C++ implementation
of the "core" bindings. The other dialects followed the pattern. Now
that this dependency is gone, split out each dialect into a separate
Python extension library.

Depends On D116649, D116605

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D116662
2022-01-06 10:31:14 +01:00
Alex Zinenko
66d4090d9b [mlir] Introduce Python bindings for the quantization dialect
So far, only the custom dialect types are exposed.

The build and packaging is same as for Linalg and SparseTensor, and in
need of refactoring that is beyond the scope of this patch.

Reviewed By: stellaraccident

Differential Revision: https://reviews.llvm.org/D116605
2022-01-05 16:26:31 +01:00