llvm-project

Author	SHA1	Message	Date
Aart Bik	306f4c306a	[mlir][sparse] implement non-permutation MapRef encoding (#69406 ) This enables reading block sparse from file using libgen! (and soon also direct IR codegen)	2023-10-18 13:01:12 -07:00
Yinying Li	d4088e7d5f	[mlir][sparse] Populate lvlToDim (#68937 ) Updates: 1. Infer lvlToDim from dimToLvl 2. Add more tests for block sparsity 3. Finish TODOs related to lvlToDim, including adding lvlToDim to python binding Verification of lvlToDim that user provides will be implemented in the next PR.	2023-10-17 16:09:39 -04:00
Peiming Liu	f248d0b28d	[mlir][sparse] implement sparse_tensor.reorder_coo (#68916 ) As a side effect of the change, it also unifies the convertOp implementation between lib/codegen path.	2023-10-12 13:22:45 -07:00
Peiming Liu	0083f8338c	[mlir][sparse] renaming sparse_tensor.sort_coo to sparse_tensor.sort (#68161 ) Rationale: the operation does not always sort COO tensors (also used for sparse_tensor.compress for example).	2023-10-03 16:28:25 -07:00
Yinying Li	d2e8517912	[mlir][sparse] Update Enum name for CompressedWithHigh (#67845 ) Change CompressedWithHigh to LooseCompressed.	2023-10-02 11:06:40 -04:00
Aart Bik	3231a365c1	[mlir][sparse][gpu] add CSC to libgen GPU sparsification using cuSparse (#67713 ) Add CSC, but also adds BSR as a future format. Coming soon!	2023-09-28 11:47:22 -07:00
Peiming Liu	6ca47eb49d	[mlir][sparse] rename sparse_tensor.(un)pack to sparse_tensor.(dis)as… (#67717 ) …semble Pack/Unpack are overridden in many other places, rename the operations to avoid confusion.	2023-09-28 11:01:10 -07:00
Cullen Rhodes	9816edc9f3	[mlir][vector] add result type to vector.extract assembly format (#66499 ) The vector.extract assembly format currently only contains the source type, for example: %1 = vector.extract %0[1] : vector<3x7x8xf32> it's not immediately obvious if this is the source or result type. This patch improves the assembly format to make this clearer, so the above becomes: %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>	2023-09-28 11:11:16 +01:00
Yinying Li	256ac4619b	[mlir][sparse] Change tests to use new syntax for ELL and slice (#67569 ) Examples: 1. `#ELL = #sparse_tensor.encoding<{ lvlTypes = [ "dense", "dense", "compressed" ], dimToLvl = affine_map<(i,j)[c] -> (c4i, i, j)> }>` to `#ELL = #sparse_tensor.encoding<{ map = [s0](d0, d1) -> (d0 * (s0 * 4) : dense, d0 : dense, d1 : compressed) }>` 2. `#CSR_SLICE = #sparse_tensor.encoding<{ lvlTypes = [ "dense", "compressed" ], dimSlices = [ (1, 4, 1), (1, 4, 2) ] }>` to `#CSR_SLICE = #sparse_tensor.encoding<{ map = (d0 : #sparse_tensor<slice(1, 4, 1)>, d1 : #sparse_tensor<slice(1, 4, 2)>) -> (d0 : dense, d1 : compressed) }>`	2023-09-27 19:40:52 -04:00
Yinying Li	d374a78545	[mlir][sparse] Treat high and 2OutOf4 as level formats (#67203 ) In the new syntax, we will parse loose_compressed as CompressedWithHigh and block2_4 as TwoOutOfFour level format. Currently, we support unique and order as level properties.	2023-09-25 11:04:55 -04:00
Peiming Liu	bfa3bc4378	[mlir][sparse] unifies sparse_tensor.sort_coo/sort into one operation. (#66722 ) The use cases of the two operations are largely overlapped, let's simplify it and only use one of them.	2023-09-19 17:02:32 -07:00
Peiming Liu	4176ce61f1	[mlir][sparse] fix logical error when generating sort_coo. (#66690 ) To fix issue: https://github.com/llvm/llvm-project/issues/66664	2023-09-18 15:26:01 -07:00
Yinying Li	3dc621124f	[mlir][sparse] Migrate tests to use new syntax (#66543 ) COO `lvlTypes = [ "compressed_nu", "singleton" ]` to `map = (d0, d1) -> (d0 : compressed(nonunique), d1 : singleton)` `lvlTypes = [ "compressed_nu_no", "singleton_no" ]` to `map = (d0, d1) -> (d0 : compressed(nonunique, nonordered), d1 : singleton(nonordered))` SortedCOO `lvlTypes = [ "compressed_nu", "singleton" ]` to `map = (d0, d1) -> (d0 : compressed(nonunique), d1 : singleton)` BCOO `lvlTypes = [ "dense", "compressed_hi_nu", "singleton" ]` to `map = (d0, d1, d2) -> (d0 : dense, d1 : compressed(nonunique, high), d2 : singleton)` BCSR `lvlTypes = [ "compressed", "compressed", "dense", "dense" ], dimToLvl = affine_map<(d0, d1) -> (d0 floordiv 2, d1 floordiv 3, d0 mod 2, d1 mod 3)>` to `map = ( i, j ) -> ( i floordiv 2 : compressed, j floordiv 3 : compressed, i mod 2 : dense, j mod 3 : dense )` Tensor and other supported formats(e.g. CCC, CDC, CCCC) Currently, ELL and slice are not supported yet in the new syntax and the CHECK tests will be updated once printing is set to output the new syntax. Previous PRs: #66146, #66309, #66443	2023-09-15 16:12:20 -04:00
Aart Bik	d2e787d5d7	[mlir][sparse][tensor] replace bufferization with empty tensor (#66450 ) Rationale: A bufferization.alloc_tensor can be directly replaced with tensor.empty since these are more or less semantically equivalent. The latter is considered a bit more "pure" with respect to SSA semantics.	2023-09-15 11:45:42 -07:00
Yinying Li	2a07f0fd40	[mlir][sparse] Migrate more tests to use new syntax (#66443 ) Dense `lvlTypes = [ "dense", "dense" ]` to `map = (d0, d1) -> (d0 : dense, d1 : dense)` `lvlTypes = [ "dense", "dense" ], dimToLvl = affine_map<(i,j) -> (j,i)>` to `map = (d0, d1) -> (d1 : dense, d0 : dense)` DCSR `lvlTypes = [ "compressed", "compressed" ]` to `map = (d0, d1) -> (d0 : compressed, d1 : compressed)` DCSC `lvlTypes = [ "compressed", "compressed" ], dimToLvl = affine_map<(i,j) -> (j,i)>` to `map = (d0, d1) -> (d1 : compressed, d0 : compressed)` Block Row `lvlTypes = [ "compressed", "dense" ]` to `map = (d0, d1) -> (d0 : compressed, d1 : dense)` Block Column `lvlTypes = [ "compressed", "dense" ], dimToLvl = affine_map<(i,j) -> (j,i)>` to `map = (d0, d1) -> (d1 : compressed, d0 : dense)` This is an ongoing effort: #66146, #66309	2023-09-14 23:19:57 +00:00
Fabian Mora	5093413a50	[mlir][gpu][NVPTX] Enable NVIDIA GPU JIT compilation path (#66220 ) This patch adds an NVPTX compilation path that enables JIT compilation on NVIDIA targets. The following modifications were performed: 1. Adding a format field to the GPU object attribute, allowing the translation attribute to use the correct runtime function to load the module. Likewise, a dictionary attribute was added to add any possible extra options. 2. Adding the `createObject` method to `GPUTargetAttrInterface`; this method returns a GPU object from a binary string. 3. Adding the function `mgpuModuleLoadJIT`, which is only available for NVIDIA GPUs, as there is no equivalent for AMD. 4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify the format to use during testing.	2023-09-14 18:00:27 -04:00
Yinying Li	e2e429d994	[mlir][sparse] Migrate more tests to new syntax (#66309 ) CSR: `lvlTypes = [ "dense", "compressed" ]` to `map = (d0, d1) -> (d0 : dense, d1 : compressed)` CSC: `lvlTypes = [ "dense", "compressed" ], dimToLvl = affine_map<(d0, d1) -> (d1, d0)>` to `map = (d0, d1) -> (d1 : dense, d0 : compressed)` This is an ongoing effort: #66146	2023-09-14 12:21:13 -04:00
Aart Bik	0f65df732c	[mlir][sparse] remove the MLIR PyTACO tests (#66302 ) Rationale: This test was really fun to compare the MLIR sparsifier with TACO using the PyTACO format. However, the underlying mechanism is rapidly growing outdated with our recent developments. Rather than maintaining the old code, we are moving toward the newer, better approaches. So if you are sad this is gone, stay tuned, something better is coming!	2023-09-13 15:54:49 -07:00
Aart Bik	9918d2556c	[mlir][sparse] remove sparse output python example (#66298 ) Rationale: This was actually just a pure "string based" test with very little actual python usage. The output sparse tensor was handled via the deprecated convertFromMLIRSparseTensor method.	2023-09-13 15:11:35 -07:00
Peiming Liu	098f46dce3	[sparse] allow unpack op to return 0-ranked tensor type. (#66269 ) Many frontends canonicalize scalars into 0-ranked tensors; this change will hopefully make the operation easier to use for those cases.	2023-09-13 11:33:01 -07:00
frgossen	1cddbf8cf5	Revert `Add host-supports-nvptx requirement to lit tests` (#66102 and #66129 ) (#66225 )	2023-09-13 12:20:38 -04:00
Yinying Li	dbe1be9aa4	[mlir][sparse] Migrate tests to use new syntax (#66146 ) lvlTypes = [ "compressed" ] to map = (d0) -> (d0 : compressed) lvlTypes = [ "dense" ] to map = (d0) -> (d0 : dense)	2023-09-13 11:41:25 -04:00
Peiming Liu	64df1c08d0	[sparse] allow unpack op to return any integer type. (#66161 )	2023-09-12 17:27:51 -07:00
frgossen	1c5161911c	Add host-supports-nvptx requirement to lit tests (#66129 )	2023-09-12 15:18:29 -04:00
frgossen	a3b894287f	Add host-supports-nvptx requirement to lit tests (#66102 )	2023-09-12 12:21:36 -04:00
Mehdi Amini	6f5ebfb987	Fix MLIR integration test that requires ARM SVE to reproduce Fix-forward for a9f30097586e914e074111d966c1408e82d04a8d	2023-09-09 15:29:00 -07:00
Mehdi Amini	a9f3009758	Switch MLIR to use the internal LIT shell by default (#65415 )	2023-09-09 13:51:27 -07:00
Fabian Mora	119c489cc1	Reland [mlir][test][gpu] Migrate CUDA tests to the TargetAttr compilation workflow (llvm#65768) The revert happened due to a build bot failure that threw 'CUDA_ERROR_UNSUPPORTED_PTX_VERSION'. The failure's root cause was a pass using "+ptx76" for compilation and an old CUDA driver on the bot. This commit relands the patch with "+ptx60". Original Gh PR: #65768 Original commit message: Migrate tests referencing `gpu-to-cubin` to the new compilation workflow using `TargetAttrs`. The `test-lower-to-nvvm` pass pipeline was modified to use the new compilation workflow to simplify the introduction of future tests. The `createLowerGpuOpsToNVVMOpsPass` function was removed, as it didn't allow for passing all options available in the `ConvertGpuOpsToNVVMOp` pass.	2023-09-09 12:45:21 +00:00
Fabian Mora	2c596ea951	Revert "[mlir][test][gpu] Migrate CUDA tests to the TargetAttr compilation workflow (#65768 ) (#65848 ) This reverts commit d21b67293be15f8a89378e4785d70cc037866406.	2023-09-09 07:14:19 -04:00
Fabian Mora	d21b67293b	[mlir][test][gpu] Migrate CUDA tests to the TargetAttr compilation workflow (#65768 ) Migrate tests referencing `gpu-to-cubin` to the new compilation workflow using `TargetAttrs`. The `test-lower-to-nvvm` pass pipeline was modified to use the new compilation workflow to simplify the introduction of future tests. The `createLowerGpuOpsToNVVMOpsPass` function was removed, as it didn't allow for passing all options available in the `ConvertGpuOpsToNVVMOp` pass.	2023-09-09 07:03:38 -04:00
Aart Bik	b86d3cbc12	[mlir][sparse] complete various FIXMEs in sparse support lib Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D159245	2023-08-30 21:30:25 -07:00
Peiming Liu	22e8d5b428	[mlir][sparse] Support strided convolution on dense level. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D159020	2023-08-30 20:00:50 +00:00
Peiming Liu	07bd5f20bc	[mlir][sparse] Support strided convolution on compressed level. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158912	2023-08-30 19:37:50 +00:00
Peiming Liu	96e1914aa2	[mlir][sparse] fix crash when generating convolution kernel with sparse input in DCCD format. Reviewed By: aartbik, anlunx Differential Revision: https://reviews.llvm.org/D159170	2023-08-30 17:49:36 +00:00
Yinying Li	51ebecf309	[mlir][sparse] Changed sparsity properties to use _ instead of - Example: compressed-no -> compressed_no Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158567	2023-08-23 17:00:27 +00:00
Peiming Liu	8c8aecdca9	[mlir][sparse] Supporting (non)uniqueness in SparseTensorStorage::lexDiff. Fix copied from https://reviews.llvm.org/D156946 but with a legit test case that triggers the bug. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158578	2023-08-23 03:48:53 +00:00
Peiming Liu	6ca0b27298	[mlir][sparse] more complicated test for dual sparse convolution kernel. Reviewed By: anlunx Differential Revision: https://reviews.llvm.org/D158443	2023-08-21 18:48:01 +00:00
Andrzej Warzynski	51eaee3b42	[mlir][SparseTensor] Fix test regression Fix a regression caused by https://reviews.llvm.org/D158012. Failing bot: * https://lab.llvm.org/buildbot/#/builders/179/builds/7122 Note that both `RUN` lines in the affected file were previously tested with similar configuration (_with_ and _without_ vectorisation). This change restores that, though the new setting (from D158012) is used, i.e. * with direct IR generation, `enable-runtime-library=true`. This is sufficient to make the test pass and allows us to investigate the root cause offline. Issue reported here: https://github.com/llvm/llvm-project/issues/64727	2023-08-16 09:37:07 +00:00
Aart Bik	30c1866dec	[mlir][sparse][gpu] enable SpGEMM on GPU for libgen path Direct IR supports pack, but libgen path did not until this was added in https://reviews.llvm.org/D158012 Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D158020	2023-08-15 17:16:37 -07:00
Peiming Liu	fa6726e27b	[mlir][sparse] supports sparse_tensor.pack on libgen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158012	2023-08-15 20:20:54 +00:00
Benjamin Maxwell	f36e909da0	[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors Reland of the original patch after updating the Python binding tests, a few CUDA/GPU MLIR tests, and ensuring the assembly format is round-trippable. This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM. The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors. To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example: vector.print punctuation <comma> lowers to llvm.call @printComma() : () -> () The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines). Reviewed By: awarzynski, c-rhodes, aartbik Differential Revision: https://reviews.llvm.org/D156519	2023-08-11 09:29:54 +00:00
Andrzej Warzynski	25396e1352	[mlir][test] Fix typo in a test Remove unnecessary `"` that prevent correct `RUN` line expansion. Introduced in: https://reviews.llvm.org/D156625 Bot failure: https://lab.llvm.org/buildbot/#/builders/61/builds/47437	2023-08-11 09:37:08 +01:00
Andrzej Warzynski	23e5130ebf	[mlir][test] Reland: Refactor SparseTensor CPU integration tests CHANGES SINCE THE ORIGINAL VERSION ---------------------------------- The default test set-up was extracted from * SparseTensor/CPU/lit.local.cfg. and duplicated in all tests. This is to support downstream users that don't use these local LIT config files. SUMMARY OF CHANGES ------------------ This patch aims to reduce test duplication. This is a direct follow-up of: 1. https://reviews.llvm.org/D155403 (test duplication), and 2. https://reviews.llvm.org/D155405 (code re-use), All SVE/VLA tests are now enabled _conditionally_ and refactored to use `mlir-cpu-runner` rather than `lli`. The former helps with test duplication and the latter with code re-use. A few additional refactoring changes are included. 1. To reduce verbosity, long runtime library names like: %mlir_native_utils_lib_dir/libmlir_c_runner_utils%shlibext are replaced with: %mlir_c_runner_utils 2. In order to keep the code and the comments in sync, and to maintain consistency across the tests, the following: enable-runtime-library=true is swapped with (and vice-versa): enable-runtime-library=false Note that this change won't affect test coverage. Only a few tests required such an update. 3. A VLS vectorization `RUN` line is added in tests where there was a VLA/VLS `RUN` line, but no VLS `RUN` line (with a few exceptions of tests that only contained one `RUN` line to begin with). 4. A few test variables are renamed/added. Most notable example: * `%{options}` --> `%{sparse_compiler_opts}` TEST RUNTIME IMPROVEMENT ------------------------ TL;DR: This change improves test execution time by ~25%.
At the moment, the following `llvm-lit` invocation takes ~7.30s on my AArch64 workstation (with SVE): llvm-lit <llvm-project>/mlir/test/Integration/Dialect/SparseTensor/CPU/ This timing doesn't change no matter what the value of the following CMake variable is (that should disable some tests): MLIR_RUN_ARM_SVE_TESTS With this patch, the execution time will indeed depend on the value of the above CMake variable: * with `MLIR_RUN_ARM_SVE_TESTS=true` the timing remains intact, * with `MLIR_RUN_ARM_SVE_TESTS=false` the timing drops to ~5.40s (~25% improvement). This is expected: * on average there are 4 `RUN` lines per test, * _without this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line would in most cases duplicate the 3rd `RUN` line, * _with this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line becomes empty. PATCH SIZE ---------- While rather large and touching many files, most changes in this patch are rather mechanical. All test configurations have been preserved and only in a handful of cases were new `RUN` lines added. Differential Revision: https://reviews.llvm.org/D156625	2023-08-11 08:16:01 +00:00
Aart Bik	76a80a0808	[mlir][sparse][gpu] sparsifier GPU libgen for SpGEMM in cuSparse With working integration end-to-end test Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D157652	2023-08-10 14:52:16 -07:00
Mehdi Amini	1b272d21c8	Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors" This reverts commit 490dae26cb3bee2e8401e4c2a7ad3e0996be67d0. Bot is broken, seems like there is a problem of ambiguity in the parser.	2023-08-09 19:37:01 -07:00
Benjamin Maxwell	490dae26cb	[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors Reland of the original patch after updating the Python binding tests and a few CUDA/GPU MLIR tests. This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM. The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors. To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example: vector.print <comma> lowers to llvm.call @printComma() : () -> () The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines). Reviewed By: awarzynski, c-rhodes, aartbik Differential Revision: https://reviews.llvm.org/D156519	2023-08-09 11:47:18 +00:00
Aart Bik	5a1f87f9fc	Revert "[mlir][test] Refactor SparseTensor CPU integration tests" This reverts commit e77e891d8953b487f5f06bf69225a61ef537f766. Differential Revision: https://reviews.llvm.org/D156947	2023-08-02 15:46:41 -07:00
Andrzej Warzynski	e77e891d89	[mlir][test] Refactor SparseTensor CPU integration tests SUMMARY OF CHANGES ------------------ This patch aims to reduce test duplication and to improve code re-use in SparseTensor integration tests for CPU. This is a direct follow-up of: 1. https://reviews.llvm.org/D155403 (test duplication), and 2. https://reviews.llvm.org/D155405 (code re-use), The key logic for this patch is implemented in: * SparseTensor/CPU/lit.local.cfg. Essentially, the set-up that used to be repeated across all test files has been extracted into a common LIT configuration file. This makes code re-use straightforward. All SVE/VLA tests are now enabled _conditionally_ and refactored to use `mlir-cpu-runner` rather than `lli`. The former helps with test duplication and the latter with code re-use. A few additional refactoring changes are included. 1. To reduce verbosity, long runtime library names like: %mlir_native_utils_lib_dir/libmlir_c_runner_utils%shlibext are replaced with: %mlir_c_runner_utils 2. In order to keep the code and the comments in sync, and to maintain consistency across the tests, the following: enable-runtime-library=true is swapped with (and vice-versa): enable-runtime-library=false Note that this change won't affect test coverage. Only a few tests required such an update. 3. A VLS vectorization `RUN` line is added in tests where there was a VLA/VLS `RUN` line, but no VLS `RUN` line (with a few exceptions of tests that only contained one `RUN` line to begin with). 4. A few test variables are renamed/added. Most notable example: * `%{options}` --> `%{sparse_compiler_opts}` TEST RUNTIME IMPROVEMENT ------------------------ TL;DR: This change improves test execution time by ~25%.
At the moment, the following `llvm-lit` invocation takes ~7.30s on my AArch64 workstation (with SVE): llvm-lit <llvm-project>/mlir/test/Integration/Dialect/SparseTensor/CPU/ This timing doesn't change no matter what the value of the following CMake variable is (that should disable some tests): MLIR_RUN_ARM_SVE_TESTS With this patch, the execution time will indeed depend on the value of the above CMake variable: * with `MLIR_RUN_ARM_SVE_TESTS=true` the timing remains intact, * with `MLIR_RUN_ARM_SVE_TESTS=false` the timing drops to ~5.40s (~25% improvement). This is expected: * on average there are 4 `RUN` lines per test, * _without this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line would in most cases duplicate the 3rd `RUN` line, * _with this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line becomes empty. PATCH SIZE ---------- While rather large and touching many files, most changes in this patch are rather mechanical. All test configurations have been preserved and only in a handful of cases were new `RUN` lines added. Differential Revision: https://reviews.llvm.org/D156625	2023-08-02 20:21:50 +00:00
K-Wu	cfa82f7783	[mlir][sparse][gpu] introduce flag that controls host to device copy strategies (regular dma default) Differential Revision: https://reviews.llvm.org/D155352	2023-08-01 22:30:40 +00:00
Kun Wu	1e491c425b	[mlir][sparse][gpu] add 2:4 spmm prune_and_check flag Differential Revision: https://reviews.llvm.org/D155909	2023-08-01 18:24:18 +00:00

399 Commits