This removes the temporary DENSE24 attribute and replaces it with proper
recognition of dense to 24 conversion. The compressionh will be
performed on the device prior to performing the matrix mult. Note that
we no longer need to start with the linalg version, we can lift this to
the proper named linalg op. Also renames some files into more consistent
names.
Previous change no longer properly used the GPU libgen pass (even though
most tests still passed falling back to CPU). This revision puts the
proper pass order into place. Also bit of a cleanup of CPU codegen vs.
libgen setup.
For all the mlir tests (except for roundtrip_coding.mlir), change the
check test to use general form of encoding
`#sparse_tensor.encoding<{{{.*}}}>` instead of actual encoding such as
`#sparse_tensor.encoding<{ lvlTypes = [ "compressed", "singleton" ] }>`.
Consistent order of ops and related methods.
Also, renamed SpGEMMGetSizeOp to SpMatGetSizeOp
since this is a general utility for sparse matrices,
not specific to GEMM ops only.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D157922
Old pattern was missing some cases (e.g. swapping the arguments)
but it also allowed too many cases (e.g. non-empty "absent" or
different arguments for add/mul). This fixes the issues.
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D153487
(1) minor bug fix in copy back [always nice to run stuff ;-)]
(2) run with and without lib (even though some fall back to CPU)
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D151507
(1) keep all cuSparse ops on single stream without wait() in right order
(2) use more type precise memref types for COO
(3) use ToTensor on resulting memref (even though it folds away again)
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D151404
The GPU tests weren't updated when rebasing D150330, so this patch fixes that.
Reviewed By: anlunx
Differential Revision: https://reviews.llvm.org/D150822
This commit is part of the migration of towards the new STEA syntax/design. In particular, this commit includes the following changes:
* Renaming compiler-internal functions/methods:
* `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}`
* `Merger::{getDimLevelType => getLvlType}` (for consistency)
* `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods)
* Renaming external facets to match:
* the STEA parser and printer
* the C and Python bindings
* PyTACO
However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D150330
The sparse compiler now has two prototype strategies for GPU acceleration:
* CUDA codegen: this converts sparsified code to CUDA threads
* CUDA libgen: this converts pre-sparsified code to cuSPARSE library calls
This revision introduces the first steps required for the second approach.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D150170
The host registration is a convenient way to get CUDA kernels
running, but it may be slow and does not work for all buffer
(like global constants). This revision uses the proper alloc
copy dealloc chains for buffers, using asynchronous chains
to increase overlap. The host registration mechanism is
kept under a flag for the output, just for experimentation
purposes while this project ramps up.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D148682
This implements a proof-of-concept GPU code generator
to the sparse compiler pipeline, currently only capable
of generating CUDA threads for outermost parallel loops.
The objective, obviously, is to grow this concept
to a full blown GPU code generator, capable of the
right combinaton of code generation as well as exploiting
idiomatic kernels or vector specific libraries (think cuSparse).
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D147483