llvm-project

Author	SHA1	Message	Date
Sandeep Dasgupta	81d7eef134	Sub-channel quantized type implementation (#120172) This is an implementation for [RFC: Supporting Sub-Channel Quantization in MLIR](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694). In order to make the review process easier, the PR has been divided into the following commit labels: 1. Add implementation for sub-channel type: Includes the class design for `UniformQuantizedSubChannelType`, printer/parser and bytecode read/write support. The existing types (per-tensor and per-axis) are unaltered. 2. Add implementation for sub-channel type: Lowering of `quant.qcast` and `quant.dcast` operations to Linalg operations. 3. Adding C/Python APIs: We first define the C-APIs and build the Python-APIs on top of those. 4. Add pass to normalize generic ....: This pass normalizes sub-channel quantized types to per-tensor or per-axis types, if possible. A design note: - Explicitly storing the `quantized_dimensions`, even when they can be derived for ranked tensors. While it's possible to infer quantized dimensions from the static shape of the scales (or zero-points) tensor for ranked data tensors ([ref](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694/3) for background), there are cases where this can lead to ambiguity and issues with round-tripping. ``` Consider the example: tensor<2x4x!quant.uniform<i8:f32:{0:2, 1:2}, {{s00:z00, s01:z01}}>> ``` The shape of the scales tensor is [1, 2], which might suggest that only axis 1 is quantized. While this inference is technically correct, as the block size for axis 0 is a degenerate case (equal to the dimension size), it can cause problems with round-tripping. Therefore, even for ranked tensors, we are explicitly storing the quantized dimensions. Suggestions welcome! PS: I understand that the upcoming holidays may impact your schedule, so please take your time with the review. There's no rush.	2025-03-23 07:37:55 -05:00
Peter Hawkins	5cd4274772	[mlir python] Port in-tree dialects to nanobind. (#119924) This is a companion to #118583, although it can be landed independently because, since #117922, dialects do not have to use the same Python binding framework as the Python core code. This PR ports all of the in-tree dialect and pass extensions to nanobind, with the exception of those that remain for testing pybind11 support. This PR also: * removes CollectDiagnosticsToStringScope from NanobindAdaptors.h. This was overlooked in a previous PR and it is duplicated in Diagnostics.h. --------- Co-authored-by: Jacques Pienaar <jpienaar@google.com>	2024-12-20 20:32:32 -08:00
annuasd	47ef5c4b7f	[mlir][Bindings] Fix missing return value of functions and incorrect type hint in pyi. (#116731) The zero points of UniformQuantizedPerAxisType should be List[int]. Two methods were also missing return values. Co-authored-by: 牛奕博 <niuyibo@niuyibodeMacBook-Pro.local>	2024-11-19 15:24:39 -06:00
Mehdi Amini	285a229f20	[MLIR] Apply clang-tidy fixes for misc-include-cleaner (NFC)	2023-11-12 20:35:46 -08:00
Kazu Hirata	410480e32b	Ensure newlines at the end of files (NFC)	2022-01-06 23:44:02 -08:00
Alex Zinenko	95ddbed9b7	[mlir] Split out Python bindings for dialects into separate libs Historically, the bindings for the Linalg dialect were included in the "core" bindings library because they depended on the C++ implementation of the "core" bindings. The other dialects followed the pattern. Now that this dependency is gone, split out each dialect into a separate Python extension library. Depends On D116649, D116605 Reviewed By: stellaraccident Differential Revision: https://reviews.llvm.org/D116662	2022-01-06 10:31:14 +01:00
Alex Zinenko	66d4090d9b	[mlir] Introduce Python bindings for the quantization dialect So far, only the custom dialect types are exposed. The build and packaging are the same as for Linalg and SparseTensor, and in need of refactoring that is beyond the scope of this patch. Reviewed By: stellaraccident Differential Revision: https://reviews.llvm.org/D116605	2022-01-05 16:26:31 +01:00

7 Commits
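
The design note in commit 81d7eef134 can be illustrated with a small sketch. It is plain Python, not the MLIR API: the helper names `scales_shape` and `infer_block_sizes` are hypothetical, and the code assumes that an axis without an explicit block size is covered by a single block spanning the whole axis. For a 2x4 tensor with a block size of 2 on both axis 0 and axis 1, the scales tensor has shape [1, 2]; inferring the block sizes back from that shape recovers only axis 1, so the degenerate entry for axis 0 is lost on a round trip.

```python
def scales_shape(data_shape, block_sizes):
    """Shape of the scales tensor for a given data shape.

    block_sizes maps axis -> block size. Assumption: an axis that is not
    listed is treated as one block spanning the whole axis.
    """
    shape = []
    for axis, dim in enumerate(data_shape):
        block = block_sizes.get(axis, dim)
        if dim % block != 0:
            raise ValueError(f"axis {axis}: {dim} is not divisible by {block}")
        shape.append(dim // block)
    return shape


def infer_block_sizes(data_shape, scales):
    """Recover block sizes from the scales shape alone.

    Axes with a single block are indistinguishable from unquantized axes,
    so they are omitted from the result.
    """
    return {
        axis: dim // n
        for axis, (dim, n) in enumerate(zip(data_shape, scales))
        if n != 1
    }


data_shape = [2, 4]
explicit = {0: 2, 1: 2}  # block size 2 on axis 0 (degenerate) and on axis 1

shape = scales_shape(data_shape, explicit)
inferred = infer_block_sizes(data_shape, shape)

print(shape)
print(inferred)
print(inferred == explicit)
```

Both `{0: 2, 1: 2}` and `{1: 2}` produce the same scales shape here, which is the ambiguity that storing the quantized dimensions explicitly avoids.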