llvm-project

Author	SHA1	Message	Date
Aart Bik	b86d3cbc12	[mlir][sparse] complete various FIXMEs in sparse support lib Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D159245	2023-08-30 21:30:25 -07:00
Peiming Liu	22e8d5b428	[mlir][sparse] Support strided convolution on dense level. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D159020	2023-08-30 20:00:50 +00:00
Peiming Liu	07bd5f20bc	[mlir][sparse] Support strided convolution on compressed level. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158912	2023-08-30 19:37:50 +00:00
Peiming Liu	96e1914aa2	[mlir][sparse] fix crash when generating convolution kernel with sparse input in DCCD format. Reviewed By: aartbik, anlunx Differential Revision: https://reviews.llvm.org/D159170	2023-08-30 17:49:36 +00:00
Yinying Li	51ebecf309	[mlir][sparse] Changed sparsity properties to use _ instead of - Example: compressed-no -> compressed_no Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158567	2023-08-23 17:00:27 +00:00
Peiming Liu	8c8aecdca9	[mlir][sparse] Supporting (non)uniqueness in SparseTensorStorage::lexDiff. Fix copied from https://reviews.llvm.org/D156946 but with a legit test case that triggers the bug. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158578	2023-08-23 03:48:53 +00:00
Peiming Liu	6ca0b27298	[mlir][sparse] more complicated test for dual sparse convolution kernel. Reviewed By: anlunx Differential Revision: https://reviews.llvm.org/D158443	2023-08-21 18:48:01 +00:00
Andrzej Warzynski	51eaee3b42	[mlir][SparseTensor] Fix test regression Fix a regression caused by https://reviews.llvm.org/D158012. Failing bot: * https://lab.llvm.org/buildbot/#/builders/179/builds/7122 Note that both `RUN` lines in the affected file were previously tested with similar configuration (_with_ and _without_ vectorisation). This change restores that, though the new setting (from D158012) is used, i.e. * with direct IR generation, `enable-runtime-library=true`. This is sufficient to make the test pass and allows us to investigate the root cause offline. Issue reported here: https://github.com/llvm/llvm-project/issues/64727	2023-08-16 09:37:07 +00:00
Aart Bik	30c1866dec	[mlir][sparse][gpu] enable SpGEMM on GPU for libgen path Direct IR supports pack, but libgen path did not until this was added in https://reviews.llvm.org/D158012 Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D158020	2023-08-15 17:16:37 -07:00
Peiming Liu	fa6726e27b	[mlir][sparse] supports sparse_tensor.pack on libgen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D158012	2023-08-15 20:20:54 +00:00
Benjamin Maxwell	f36e909da0	[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors Reland of the original patch after updating the Python binding tests, a few CUDA/GPU MLIR tests, and ensuring the assembly format is round-trippable. This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM. The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors. To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example: vector.print punctuation <comma> lowers to llvm.call @printComma() : () -> () The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines). Reviewed By: awarzynski, c-rhodes, aartbik Differential Revision: https://reviews.llvm.org/D156519	2023-08-11 09:29:54 +00:00
Andrzej Warzynski	25396e1352	[mlir][test] Fix typo in a test Remove unnecessary `"` that prevent correct `RUN` line expansion. Introduced in: https://reviews.llvm.org/D156625 Bot failure: https://lab.llvm.org/buildbot/#/builders/61/builds/47437	2023-08-11 09:37:08 +01:00
Andrzej Warzynski	23e5130ebf	[mlir][test] Reland: Refactor SparseTensor CPU integration tests CHANGES SINCE THE ORIGINAL VERSION ---------------------------------- The default test set-up was extracted from * SparseTensor/CPU/lit.local.cfg. and duplicated in all tests. This is to support downstream users that don't use these local LIT config files. SUMMARY OF CHANGES ------------------ This patch aims to reduce test duplication. This is a direct follow-up of: 1. https://reviews.llvm.org/D155403 (test duplication), and 2. https://reviews.llvm.org/D155405 (code re-use), All SVE/VLA tests are now enabled _conditionally_ and refactored to use `mlir-cpu-runner` rather than `lli`. The former helps with test duplication and the latter with code re-use. A few additional refactoring changes are included. 1. To reduce verbosity, long runtime library names like: %mlir_native_utils_lib_dir/libmlir_c_runner_utils%shlibext are replaced with: %mlir_c_runner_utils 2. In order to keep the code and the comments in sync, and to maintain consistency across the tests, the following: enable-runtime-library=true is swapped with (and vice-versa): enable-runtime-library=false Note that this change won't affect test coverage. Only a few tests required such an update. 3. A VLS vectorization `RUN` line is added in tests where there was a VLA/VLS `RUN` line, but no VLS `RUN` line (with a few exceptions of tests that only contained one `RUN` line to begin with). 4. A few test variables are renamed/added. Most notable example: * %{options} --> %{sparse_compiler_opts} TEST RUNTIME IMPROVEMENT ------------------------ TL;DR: This change improves test execution time by ~25%. 
At the moment, the following `llvm-lit` invocation takes ~7.30s on my AArch64 workstation (with SVE): llvm-lit <llvm-project>/mlir/test/Integration/Dialect/SparseTensor/CPU/ This timing doesn't change no matter what the value of the following CMake variable is (that should disable some tests): MLIR_RUN_ARM_SVE_TESTS With this patch, the execution time will indeed depend on the value of the above CMake variable: * with `MLIR_RUN_ARM_SVE_TESTS=true` the timing remains intact, * with `MLIR_RUN_ARM_SVE_TESTS=false` the timing drops to ~5.40s (~25% improvement). This is expected: * on average there are 4 `RUN` lines per test, * _without this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line would in most cases duplicate the 3rd `RUN` line, * _with this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line becomes empty. PATCH SIZE ---------- While rather large and touching many files, most changes in this patch are rather mechanical. All test configurations have been preserved and only in a handful of cases were new `RUN` lines added. Differential Revision: https://reviews.llvm.org/D156625	2023-08-11 08:16:01 +00:00
Aart Bik	76a80a0808	[mlir][sparse][gpu] sparsifier GPU libgen for SpGEMM in cuSparse With working integration end-to-end test Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D157652	2023-08-10 14:52:16 -07:00
Mehdi Amini	1b272d21c8	Revert "[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors" This reverts commit 490dae26cb3bee2e8401e4c2a7ad3e0996be67d0. Bot is broken, seems like there is a problem of ambiguity in the parser.	2023-08-09 19:37:01 -07:00
Benjamin Maxwell	490dae26cb	[mlir][VectorOps] Use SCF for vector.print and allow scalable vectors Reland of the original patch after updating the Python binding tests and a few CUDA/GPU MLIR tests. This patch splits the lowering of vector.print into first converting an n-D print into a loop of scalar prints of the elements, then a second pass that converts those scalar prints into the runtime calls. The former is done in VectorToSCF and the latter in VectorToLLVM. The main reason for this is to allow printing scalable vector types, which are not possible to fully unroll at compile time, though this also avoids fully unrolling very large vectors. To allow VectorToSCF to add the necessary punctuation between vectors and elements, a "punctuation" attribute has been added to vector.print. This abstracts calling the runtime functions such as printNewline(), without leaking the LLVM details into the higher abstraction levels. For example: vector.print <comma> lowers to llvm.call @printComma() : () -> () The output format and runtime functions remain the same, which avoids the need to alter a large number of tests (aside from the pipelines). Reviewed By: awarzynski, c-rhodes, aartbik Differential Revision: https://reviews.llvm.org/D156519	2023-08-09 11:47:18 +00:00
Aart Bik	5a1f87f9fc	Revert "[mlir][test] Refactor SparseTensor CPU integration tests" This reverts commit e77e891d8953b487f5f06bf69225a61ef537f766. Differential Revision: https://reviews.llvm.org/D156947	2023-08-02 15:46:41 -07:00
Andrzej Warzynski	e77e891d89	[mlir][test] Refactor SparseTensor CPU integration tests SUMMARY OF CHANGES ------------------ This patch aims to reduce test duplication and to improve code re-use in SparseTensor integration tests for CPU. This is a direct follow-up of: 1. https://reviews.llvm.org/D155403 (test duplication), and 2. https://reviews.llvm.org/D155405 (code re-use), The key logic for this patch is implemented in: * SparseTensor/CPU/lit.local.cfg. Essentially, the set-up that used to be repeated across all test files has been extracted into a common LIT configuration file. This makes code re-use straightforward. All SVE/VLA tests are now enabled _conditionally_ and refactored to use `mlir-cpu-runner` rather than `lli`. The former helps with test duplication and the latter with code re-use. A few additional refactoring changes are included. 1. To reduce verbosity, long runtime library names like: %mlir_native_utils_lib_dir/libmlir_c_runner_utils%shlibext are replaced with: %mlir_c_runner_utils 2. In order to keep the code and the comments in sync, and to maintain consistency across the tests, the following: enable-runtime-library=true is swapped with (and vice-versa): enable-runtime-library=false Note that this change won't affect test coverage. Only a few tests required such an update. 3. A VLS vectorization `RUN` line is added in tests where there was a VLA/VLS `RUN` line, but no VLS `RUN` line (with a few exceptions of tests that only contained one `RUN` line to begin with). 4. A few test variables are renamed/added. Most notable example: * %{options} --> %{sparse_compiler_opts} TEST RUNTIME IMPROVEMENT ------------------------ TL;DR: This change improves test execution time by ~25%. 
At the moment, the following `llvm-lit` invocation takes ~7.30s on my AArch64 workstation (with SVE): llvm-lit <llvm-project>/mlir/test/Integration/Dialect/SparseTensor/CPU/ This timing doesn't change no matter what the value of the following CMake variable is (that should disable some tests): MLIR_RUN_ARM_SVE_TESTS With this patch, the execution time will indeed depend on the value of the above CMake variable: * with `MLIR_RUN_ARM_SVE_TESTS=true` the timing remains intact, * with `MLIR_RUN_ARM_SVE_TESTS=false` the timing drops to ~5.40s (~25% improvement). This is expected: * on average there are 4 `RUN` lines per test, * _without this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line would in most cases duplicate the 3rd `RUN` line, * _with this change_ (and with `MLIR_RUN_ARM_SVE_TESTS=false`) the 4th `RUN` line becomes empty. PATCH SIZE ---------- While rather large and touching many files, most changes in this patch are rather mechanical. All test configurations have been preserved and only in a handful of cases were new `RUN` lines added. Differential Revision: https://reviews.llvm.org/D156625	2023-08-02 20:21:50 +00:00
K-Wu	cfa82f7783	[mlir][sparse][gpu] introduce flag that controls host to device copy strategies (regular dma default) Differential Revision: https://reviews.llvm.org/D155352	2023-08-01 22:30:40 +00:00
Kun Wu	1e491c425b	[mlir][sparse][gpu] add 2:4 spmm prune_and_check flag Differential Revision: https://reviews.llvm.org/D155909	2023-08-01 18:24:18 +00:00
Andrzej Warzynski	e62f366b01	[mlir] Update SVE integration tests to use mlir-cpu-runner With the recent addition of "-mattr" and "-march" to the list of options supported by mlir-cpu-runner [1], the SVE integration tests can be updated to use mlir-cpu-runner instead of lli. This will allow better code re-use and more consistency. This patch updates 2 tests to demonstrate the new logic. The remaining tests will be updated in the follow-up patches. [1] https://reviews.llvm.org/D146917 Depends on D155403 Differential Revision: https://reviews.llvm.org/D155405	2023-07-19 08:29:17 +00:00
Andrzej Warzynski	aa9a10ac1d	[mlir][SparseTensor][ArmSVE] Conditionally disable SVE RUN line This patch updates one SparseTensor integration test so that the VLA vectorisation is run conditionally based on the value of the MLIR_RUN_ARM_SME_TESTS CMake variable. This change opens the path to reduce the duplication of RUN lines in "mlir/test/Integration/Dialect/SparseTensor/CPU/". ATM, there are usually 2 RUN lines to test vectorization in SparseTensor integration tests: * one for VLS vectorisation, * one for VLA vectorisation whenever that's available and which reduces to VLS vectorisation when VLA is not supported. When VLA is not available, VLS vectorisation is verified twice. This duplication should be avoided - integration tests are relatively expensive to run. This patch makes sure that the 2nd vectorisation RUN line becomes: ``` if (SVE integration tests are enabled) run VLA vectorisation else return ``` This logic is implemented using LIT's (relatively new) conditional substitution [1]. It enables us to guarantee that all RUN lines are unique and that the VLA vectorisation is only enabled when supported. This patch updates only 1 test to set-up and to demonstrate the logic. Subsequent patches will update the remaining tests. [1] https://www.llvm.org/docs/TestingGuide.html Differential Revision: https://reviews.llvm.org/D155403	2023-07-18 06:59:08 +00:00
Kun Wu	d46bad7b55	[mlir][sparse][gpu] add the 2:4 spmm integration test from linalg Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D155351	2023-07-15 06:01:03 +00:00
Aart Bik	4df01dc270	[mlir][sparse][gpu][nvidia] add pruning step and check to 2:4 matrix multiplication (1) without the check, the results may silently be wrong, so check is needed (2) add pruning step to guarantee 2:4 property Note, in the longer run, we may want to split out the pruning step somehow, or make it optional. Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D155320	2023-07-14 12:08:13 -07:00
Aart Bik	f6f817d0d7	[mlir][sparse][gpu] minor improvements in 2:4 example Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D155244	2023-07-13 16:20:27 -07:00
Guray Ozen	22a32f7d9c	[mlir][gpu] Add dump-ptx option When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have a simple way to do it. This work adds dump-ptx to gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155166	2023-07-13 21:14:57 +02:00
Peiming Liu	fc5d8fce7d	[mlir][sparse] support dual sparse convolution. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152601	2023-07-10 16:49:32 +00:00
Kun Wu	be2dd22b8f	[mlir][sparse][gpu] reuse CUDA environment handle throughout instance lifetime Differential Revision: https://reviews.llvm.org/D153173	2023-06-30 21:52:34 +00:00
Peiming Liu	a63d6a0014	[mlir][sparse] make UnpackOp return the actual filled length of unpacked memory This might simplify frontend implementation by avoiding recomputation for the same value. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D154244	2023-06-30 21:35:15 +00:00
Peiming Liu	e7df82816b	[mlir][sparse] rewrite arith::SelectOp to semiring operations to sparsify it. Reviewed By: aartbik, K-Wu Differential Revision: https://reviews.llvm.org/D153397	2023-06-21 21:22:18 +00:00
Aart Bik	cdbdf93bf0	[mlir][sparse][gpu] extend SDDMM gpu test Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D153378	2023-06-20 16:12:12 -07:00
Kun Wu	632ccc538c	[mlir][sparse][gpu] remove tuple as one of the spmm_buffer_size output type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D153188	2023-06-19 15:57:50 +00:00
Kun Wu	9167dd46ba	[mlir][sparse][gpu] recognizing sddmm pattern in GPU libgen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151582	2023-06-15 23:48:11 +00:00
Kun Wu	b1c683f5c4	[mlir][sparse][gpu] enable sm80+ sparsity integration test only when explicitly set Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152966	2023-06-15 17:44:38 +00:00
Peiming Liu	faf7cd97d0	[mlir][sparse] merger extension to support sparsifying arith::CmpI/CmpF operation Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152761	2023-06-15 17:26:50 +00:00
Kun Wu	8f3fcbc687	[mlir][sparse][GPU] add 2:4 integration test Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152287	2023-06-13 02:10:26 +00:00
Aart Bik	80fe3168b5	[mlir][sparse] add support for direct prod/and/min/max reductions We recently fixed a bug in "sparsifying" such reductions, since it incorrectly changed this into reductions over stored elements only, which only works for add/sub/or/xor. However, we still want to be able to "sparsify" the reductions even in the general case, and this is a first step by rewriting them into a custom reduction that feeds in the implicit zeros. NOTE HOWEVER, that in the long run we want to do this better and feed in any implicit zero only ONCE for efficiency. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D152580	2023-06-12 09:27:47 -07:00
Aart Bik	e2167d89db	[mlir][sparse] refine absent branch feeding into custom op Document better that unary/binary may only feed to the output or the input of a custom reduction (not even a regular reduction since it may have "no value"!). Also fixes a bug when present branch is empty and feeds into custom reduction. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D152224	2023-06-06 09:57:15 -07:00
Peiming Liu	23dc96bbe4	[mlir][sparse] fix crashes when using custom reduce with unary operation. The test case is directly copied from https://reviews.llvm.org/D152179 authored by @aartbik Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152204	2023-06-05 23:41:26 +00:00
Peiming Liu	e7b4c93f5e	[mlir][sparse] fix crash when using sparse_tensor::UnaryOp and ReduceOp. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152048	2023-06-03 01:19:05 +00:00
Aart Bik	6a38c772d4	[mlir][sparse] fixed bug with unary op, dense output Note that by sparse compiler convention, dense output is zeroed out when not set, so complement results in zeros where elements were present. Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D152046	2023-06-02 18:15:33 -07:00
Peiming Liu	ce6f8c5afe	[mlir][sparse] fix various bugs to support sparse pooling Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151776	2023-06-02 17:34:47 +00:00
Aart Bik	378f1885e3	[mlir][sparse] enhance sparse reduction support Formerly, we accepted and/prod reductions as a standard reduction but these change the semantics after sparsification by not looking at implicit zeros. Therefore, we only accept standard reductions that are insensitive to implicit vs. explicit zeros, and leave the more complex reductions to the sparse_tensor.reduce custom reduction implementation. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151929	2023-06-01 16:30:21 -07:00
Peiming Liu	54ac02dd16	[mlir][sparse] fix crashes when generating conv_2d_nchw_fchw with Compressed Dense Compressed Dense sparse encoding. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151773	2023-05-31 18:06:01 +00:00
wren romano	540d5e0ce6	[mlir][sparse] Updating STEA parser/printer to use the name "dimSlices" Depends On D151505 Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151513	2023-05-30 15:50:07 -07:00
wren romano	76647fce13	[mlir][sparse] Combining `dimOrdering`+`higherOrdering` fields into `dimToLvl` This is a major step along the way towards the new STEA design. While a great deal of this patch is simple renaming, there are several significant changes as well. I've done my best to ensure that this patch retains the previous behavior and error-conditions, even though those are at odds with the eventual intended semantics of the `dimToLvl` mapping. Since the majority of the compiler does not yet support non-permutations, I've also added explicit assertions in places that previously had implicitly assumed it was dealing with permutations. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151505	2023-05-30 15:19:50 -07:00
Peiming Liu	db7f639b90	[mlir][sparse] fix a crash when generating sparse convolution with nchw input Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151744	2023-05-30 20:16:54 +00:00
Tobias Hieta	f9008e6366	[NFC][Py Reformat] Reformat python files in mlir subdir This is an ongoing series of commits that are reformatting our Python code. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Differential Revision: https://reviews.llvm.org/D150782	2023-05-26 08:05:40 +02:00
Aart Bik	22caafc9f3	[mlir][sparse][gpu] end to end test for matmul (1) minor bug fix in copy back [always nice to run stuff ;-)] (2) run with and without lib (even though some fall back to CPU) Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D151507	2023-05-25 16:10:22 -07:00
Peiming Liu	f7b8b005ff	[mlir][sparse] fix bugs when computing the memory size when lowering pack op. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151481	2023-05-25 19:19:52 +00:00


369 Commits