llvm-project

Author	SHA1	Message	Date
Sandeep Dasgupta	baacd1287b	Fix printing of `mlirUniformQuantizedSubChannelTypeGetNumBlockSizes` in 32-bit machine. (#133763 ) Fixes the issue reported in https://github.com/llvm/llvm-project/pull/120172#issuecomment-2763212827 cc @mgorny	2025-03-31 16:45:43 -04:00
Sandeep Dasgupta	81d7eef134	Sub-channel quantized type implementation (#120172 ) This is an implementation for [RFC: Supporting Sub-Channel Quantization in MLIR](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694). In order to make the review process easier, the PR has been divided into the following commit labels: 1. Add implementation for sub-channel type: Includes the class design for `UniformQuantizedSubChannelType`, printer/parser and bytecode read/write support. The existing types (per-tensor and per-axis) are unaltered. 2. Add implementation for sub-channel type: Lowering of `quant.qcast` and `quant.dcast` operations to Linalg operations. 3. Adding C/Python Apis: We first define he C-APIs and build the Python-APIs on top of those. 4. Add pass to normalize generic ....: This pass normalizes sub-channel quantized types to per-tensor per-axis types, if possible. A design note: - Explicitly storing the `quantized_dimensions`, even when they can be derived for ranked tensor. While it's possible to infer quantized dimensions from the static shape of the scales (or zero-points) tensor for ranked data tensors ([ref](https://discourse.llvm.org/t/rfc-supporting-sub-channel-quantization-in-mlir/82694/3) for background), there are cases where this can lead to ambiguity and issues with round-tripping. ``` Consider the example: tensor<2x4x!quant.uniform<i8:f32:{0:2, 0:2}, {{s00:z00, s01:z01}}>> ``` The shape of the scales tensor is [1, 2], which might suggest that only axis 1 is quantized. While this inference is technically correct, as the block size for axis 0 is a degenerate case (equal to the dimension size), it can cause problems with round-tripping. Therefore, even for ranked tensors, we are explicitly storing the quantized dimensions. Suggestions welcome! PS: I understand that the upcoming holidays may impact your schedule, so please take your time with the review. There's no rush.	2025-03-23 07:37:55 -05:00
Tom Eccles	5d91f79fce	[mlir] Fix -Wstrict-prototypes warning These warnings prevent compilation using clang and -DLLVM_ENABLE_WERROR=On. Differential revision: https://reviews.llvm.org/D139322	2022-12-12 12:04:58 +00:00
Alex Zinenko	9bcf13bf3e	[mlir] Introduce C API for the Quantization dialect types Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D116546	2022-01-05 16:20:29 +01:00

4 Commits