llvm-project

Author	SHA1	Message	Date
Christopher Bate	ced2fc7819	[mlir][bufferization] Fix OneShotBufferize when `defaultMemorySpaceFn` is used (#91524 ) As described in issue llvm/llvm-project#91518, a previous PR llvm/llvm-project#78484 introduced the `defaultMemorySpaceFn` into bufferization options, allowing one to inform OneShotBufferize that it should use a specified function to derive the memory space attribute from the encoding attribute attached to tensor types. However, introducing this feature exposed unhandled edge cases, examples of which are introduced by this change in the new test under `test/Dialect/Bufferization/Transforms/one-shot-bufferize-encodings.mlir`. Fixing the inconsistencies introduced by `defaultMemorySpaceFn` is pretty simple. This change: - Updates the `bufferization.to_memref` and `bufferization.to_tensor` operations to explicitly include operand and destination types, whereas previously they relied on type inference to deduce the tensor types. Since the type inference cannot recover the correct tensor encoding/memory space, the operand and result types must be explicitly included. This is a small assembly format change, but it touches a large number of test files. - Makes minor updates to other bufferization functions to handle the changes in building the above ops. - Updates bufferization of `tensor.from_elements` to handle memory space. Integration/upgrade guide: In downstream projects, if you have tests or MLIR files that explicitly use `bufferization.to_tensor` or `bufferization.to_memref`, then update them to the new assembly format as follows: ``` %1 = bufferization.to_memref %0 : memref<10xf32> %2 = bufferization.to_tensor %1 : memref<10xf32> ``` becomes ``` %1 = bufferization.to_memref %0 : tensor<10xf32> to memref<10xf32> %2 = bufferization.to_tensor %1 : memref<10xf32> to tensor<10xf32> ```	2024-11-26 09:45:57 -07:00
Peiming Liu	fc9f1d49aa	[mlir][sparse] use a consistent order between [dis]assembleOp and storage layout. (#84079 )	2024-03-06 09:57:41 -08:00
Yinying Li	e5924d6499	[mlir][sparse] Implement parsing n out of m (#79935 ) 1. Add parsing methods for block[n, m]. 2. Encode n and m with the newly extended 64-bit LevelType enum. 3. Update 2:4 methods names/comments to n:m.	2024-02-08 14:38:42 -05:00
Aart Bik	41a07e668c	[mlir][sparse] recognize NVidia 2:4 type for matmul (#76758 ) This removes the temporary DENSE24 attribute and replaces it with proper recognition of dense to 24 conversion. The compression will be performed on the device prior to performing the matrix mult. Note that we no longer need to start with the linalg version, we can lift this to the proper named linalg op. Also renames some files into more consistent names.	2024-01-02 14:44:24 -08:00
Yinying Li	c5a67e16b6	[mlir][sparse] Use variable instead of inlining sparse encoding (#72561 ) Example: #CSR = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed), }> // CHECK: #[[$CSR.]] = #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 : compressed) }> // CHECK-LABEL: func private @sparse_csr( // CHECK-SAME: tensor<?x?xf32, #[[$CSR]]*>) func.func private @sparse_csr(tensor<?x?xf32, #CSR>)	2023-11-16 19:30:21 -05:00
Peiming Liu	06a65ce500	[mlir][sparse] schedule sparse kernels in a separate pass from sparsification. (#72423 )	2023-11-15 12:16:05 -08:00
Aart Bik	5f32bcfbae	[mlir][sparse][gpu] re-enable all GPU libgen tests (#72185 ) Previous change no longer properly used the GPU libgen pass (even though most tests still passed falling back to CPU). This revision puts the proper pass order into place. Also bit of a cleanup of CPU codegen vs. libgen setup.	2023-11-14 09:06:15 -08:00
Aart Bik	a4eadd7fb6	[mlir][sparse][gpu] add GPU BSR SDDMM check test (#71491 ) also minor edits in other GPU check tests	2023-11-06 22:36:25 -08:00
Peiming Liu	6ca47eb49d	[mlir][sparse] rename sparse_tensor.(un)pack to sparse_tensor.(dis)assemble (#67717 ) Pack/Unpack are overridden in many other places; rename the operations to avoid confusion.	2023-09-28 11:01:10 -07:00
Yinying Li	79b9d41bd7	[mlir][sparse] Generalize sparse encoding in check tests (#67476 ) For all the mlir tests (except for roundtrip_coding.mlir), change the check test to use general form of encoding `#sparse_tensor.encoding<{{{.*}}}>` instead of actual encoding such as `#sparse_tensor.encoding<{ lvlTypes = [ "compressed", "singleton" ] }>`.	2023-09-26 16:56:06 -04:00
Aart Bik	3e4a8c2c7d	[mlir][sparse] remove most bufferization.alloc_tensor ops from sparse (#66847 ) The only ones left need actual deprecation in bufferization module.	2023-09-20 09:51:08 -07:00
Aart Bik	619a888dd8	[mlir][sparse][gpu] free all buffers allocated for spGEMM (#66813 ) Yup, a bit of an oversight ;-)	2023-09-19 14:33:12 -07:00
Yinying Li	3dc621124f	[mlir][sparse] Migrate tests to use new syntax (#66543 ) COO `lvlTypes = [ "compressed_nu", "singleton" ]` to `map = (d0, d1) -> (d0 : compressed(nonunique), d1 : singleton)` `lvlTypes = [ "compressed_nu_no", "singleton_no" ]` to `map = (d0, d1) -> (d0 : compressed(nonunique, nonordered), d1 : singleton(nonordered))` SortedCOO `lvlTypes = [ "compressed_nu", "singleton" ]` to `map = (d0, d1) -> (d0 : compressed(nonunique), d1 : singleton)` BCOO `lvlTypes = [ "dense", "compressed_hi_nu", "singleton" ]` to `map = (d0, d1, d2) -> (d0 : dense, d1 : compressed(nonunique, high), d2 : singleton)` BCSR `lvlTypes = [ "compressed", "compressed", "dense", "dense" ], dimToLvl = affine_map<(d0, d1) -> (d0 floordiv 2, d1 floordiv 3, d0 mod 2, d1 mod 3)>` to `map = ( i, j ) -> ( i floordiv 2 : compressed, j floordiv 3 : compressed, i mod 2 : dense, j mod 3 : dense )` Tensor and other supported formats(e.g. CCC, CDC, CCCC) Currently, ELL and slice are not supported yet in the new syntax and the CHECK tests will be updated once printing is set to output the new syntax. Previous PRs: #66146, #66309, #66443	2023-09-15 16:12:20 -04:00
Yinying Li	e2e429d994	[mlir][sparse] Migrate more tests to new syntax (#66309 ) CSR: `lvlTypes = [ "dense", "compressed" ]` to `map = (d0, d1) -> (d0 : dense, d1 : compressed)` CSC: `lvlTypes = [ "dense", "compressed" ], dimToLvl = affine_map<(d0, d1) -> (d1, d0)>` to `map = (d0, d1) -> (d1 : dense, d0 : compressed)` This is an ongoing effort: #66146	2023-09-14 12:21:13 -04:00
Yinying Li	51ebecf309	[mlir][sparse] Changed sparsity properties to use _ instead of - Example: compressed-no -> compressed_no Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158567	2023-08-23 17:00:27 +00:00
Aart Bik	289f7231f9	[mlir][sparse][gpu] minor code cleanup for sparse gpu ops Consistent order of ops and related methods. Also, renamed SpGEMMGetSizeOp to SpMatGetSizeOp since this is a general utility for sparse matrices, not specific to GEMM ops only. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D157922	2023-08-14 15:08:57 -07:00
Aart Bik	7a8dcf2ec7	[mlir][sparse][gpu] add CHECK test to spGEMM libgen Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D157675	2023-08-11 09:56:13 -07:00
Kun Wu	1e491c425b	[mlir][sparse][gpu] add 2:4 spmm prune_and_check flag Differential Revision: https://reviews.llvm.org/D155909	2023-08-01 18:24:18 +00:00
K-Wu	e37fc3cc39	[mlir][sparse][gpu] Impl 2:4 SpMM rewrite for linalg op w/ DENSE24 attr Differential Revision: https://reviews.llvm.org/D154772	2023-07-10 22:36:57 +00:00
Aart Bik	03125e6894	[mlir][sparse][gpu] fix missing dealloc This dealloc was incorrectly removed in https://reviews.llvm.org/D153173 Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D154564	2023-07-06 09:48:19 -07:00
Kun Wu	be2dd22b8f	[mlir][sparse][gpu] reuse CUDA environment handle throughout instance lifetime Differential Revision: https://reviews.llvm.org/D153173	2023-06-30 21:52:34 +00:00
Aart Bik	f14c8eb595	[mlir][sparse][gpu] refine SDDMM pattern for cuSPARSE Old pattern was missing some cases (e.g. swapping the arguments) but it also allowed too many cases (e.g. non-empty "absent" or different arguments for add/mul). This fixes the issues. Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D153487	2023-06-21 18:31:55 -07:00
Kun Wu	9167dd46ba	[mlir][sparse][gpu] recognizing sddmm pattern in GPU libgen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151582	2023-06-15 23:48:11 +00:00
Kun Wu	97f4c22b3a	[mlir][sparse][gpu] unify dnmat and dnvec handle and ops Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152465	2023-06-09 17:16:48 +00:00
Kun Wu	8ed59c53de	[mlir][sparse][gpu] add sm8.0+ tensor core 2:4 sparsity support Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151775	2023-06-06 23:13:21 +00:00
Aart Bik	22caafc9f3	[mlir][sparse][gpu] end to end test for matmul (1) minor bug fix in copy back [always nice to run stuff ;-)] (2) run with and without lib (even though some fall back to CPU) Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D151507	2023-05-25 16:10:22 -07:00
Aart Bik	bcb698bfdc	[mlir][sparse][gpu] various cuSparse refinements (1) keep all cuSparse ops on single stream without wait() in right order (2) use more type precise memref types for COO (3) use ToTensor on resulting memref (even though it folds away again) Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D151404	2023-05-24 22:32:52 -07:00
Aart Bik	b75d6a40f1	[mlir][sparse][gpu] recognize SpMM cuSparse during sparsification Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D150715	2023-05-19 17:22:59 -07:00
wren romano	7f5fb90bbb	[mlir][sparse] Fixing GPU tests (followup to D150330) The GPU tests weren't updated when rebasing D150330, so this patch fixes that. Reviewed By: anlunx Differential Revision: https://reviews.llvm.org/D150822	2023-05-17 15:29:54 -07:00
wren romano	a0615d020a	[mlir][sparse] Renaming the STEA field `dimLevelType` to `lvlTypes` This commit is part of the migration towards the new STEA syntax/design. In particular, this commit includes the following changes: * Renaming compiler-internal functions/methods: * `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}` * `Merger::{getDimLevelType => getLvlType}` (for consistency) * `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods) * Renaming external facets to match: * the STEA parser and printer * the C and Python bindings * PyTACO However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D150330	2023-05-17 14:24:09 -07:00
Aart Bik	ee42e23614	[mlir][sparse][gpu] first implementation of the GPU libgen approach The sparse compiler now has two prototype strategies for GPU acceleration: * CUDA codegen: this converts sparsified code to CUDA threads * CUDA libgen: this converts pre-sparsified code to cuSPARSE library calls This revision introduces the first steps required for the second approach. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D150170	2023-05-15 08:49:38 -07:00
Aart Bik	86888e420c	[mlir][sparse][gpu] generate proper memcpy in/out host and device The host registration is a convenient way to get CUDA kernels running, but it may be slow and does not work for all buffers (like global constants). This revision uses the proper alloc copy dealloc chains for buffers, using asynchronous chains to increase overlap. The host registration mechanism is kept under a flag for the output, just for experimentation purposes while this project ramps up. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148682	2023-04-21 09:30:42 -07:00
Aart Bik	4889214a48	[mlir][sparse][gpu] generate single module, unique kernel names This fixes a TODO in the first version. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D148406	2023-04-15 17:25:36 -07:00
Aart Bik	19466ebc7f	[mlir][sparse][gpu] a first prototype sparse GPU code generator This implements a proof-of-concept GPU code generator to the sparse compiler pipeline, currently only capable of generating CUDA threads for outermost parallel loops. The objective, obviously, is to grow this concept to a full-blown GPU code generator, capable of the right combination of code generation as well as exploiting idiomatic kernels or vector specific libraries (think cuSparse). Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D147483	2023-04-05 11:32:06 -07:00

34 Commits
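
The `lvlTypes` to `map` syntax migration described in the #66543 and #66309 entries can be sketched as a small text rewrite. This is only an illustrative sketch, not the tooling used in those PRs: the function name `lvl_types_to_map` and its `dim_order` parameter are hypothetical, the spelling table contains only the pairs quoted in the log, and the helper assumes `dimToLvl` is a plain permutation (so the BCSR `floordiv`/`mod` case is out of scope).

```python
# Old-to-new level type spellings, taken only from the examples quoted in the log.
OLD_TO_NEW = {
    "dense": "dense",
    "compressed": "compressed",
    "singleton": "singleton",
    "compressed_nu": "compressed(nonunique)",
    "compressed_nu_no": "compressed(nonunique, nonordered)",
    "singleton_no": "singleton(nonordered)",
    "compressed_hi_nu": "compressed(nonunique, high)",
}


def lvl_types_to_map(lvl_types, dim_order=None):
    """Render old-style lvlTypes as a new-style `map = ...` string.

    dim_order[i] is the dimension stored at level i (hypothetical parameter);
    None means the identity order.
    """
    n = len(lvl_types)
    order = list(range(n)) if dim_order is None else list(dim_order)
    if sorted(order) != list(range(n)):
        raise ValueError("dim_order must be a permutation of 0..n-1")
    dims = ", ".join(f"d{i}" for i in range(n))
    lvls = ", ".join(
        f"d{d} : {OLD_TO_NEW[t]}" for d, t in zip(order, lvl_types)
    )
    return f"map = ({dims}) -> ({lvls})"


# CSR, CSC (levels store d1 then d0), and COO from the log entries.
print(lvl_types_to_map(["dense", "compressed"]))
print(lvl_types_to_map(["dense", "compressed"], dim_order=[1, 0]))
print(lvl_types_to_map(["compressed_nu", "singleton"]))
```

An unknown spelling raises `KeyError` rather than being guessed, since the log does not list every old level type.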