llvm-project

Author	SHA1	Message	Date
Aart Bik	4df01dc270	[mlir][sparse][gpu][nvidia] add pruning step and check to 2:4 matrix multiplication (1) without the check, the results may silently be wrong, so the check is needed; (2) add pruning step to guarantee the 2:4 property. Note: in the longer run, we may want to split out the pruning step somehow, or make it optional. Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D155320	2023-07-14 12:08:13 -07:00
Aart Bik	97678cec1b	[mlir][sparse][gpu] remove zero init memset avoids quite a big memory fill for each setup Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D155251	2023-07-13 18:22:21 -07:00
Aart Bik	86eff489e7	[mlir][sparse][gpu] force 16-byte alignment on data structs for cuSparseLt Also makes some minor consistency edits in the cuSparseLt wrapper lib. Reviewed By: Peiming, K-Wu Differential Revision: https://reviews.llvm.org/D155139	2023-07-13 10:45:15 -07:00
Adrian Kuegel	f250fbcbbb	[mlir] Apply ClangTidy fix (NFC) The return statement is redundant.	2023-07-10 11:46:32 +02:00
Aart Bik	03125e6894	[mlir][sparse][gpu] fix missing dealloc This dealloc was incorrectly removed in https://reviews.llvm.org/D153173 Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D154564	2023-07-06 09:48:19 -07:00
Kun Wu	be2dd22b8f	[mlir][sparse][gpu] reuse CUDA environment handle throughout instance lifetime Differential Revision: https://reviews.llvm.org/D153173	2023-06-30 21:52:34 +00:00
Kun Wu	7a3ebba9cb	[mlir][sparse][gpu] Add explaining string to three static_assert stmts Differential Revision: https://reviews.llvm.org/D154243	2023-06-30 14:10:45 -05:00
Kun Wu	632ccc538c	[mlir][sparse][gpu] remove tuple as one of the spmm_buffer_size output type Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D153188	2023-06-19 15:57:50 +00:00
Kun Wu	9167dd46ba	[mlir][sparse][gpu] recognizing sddmm pattern in GPU libgen path Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151582	2023-06-15 23:48:11 +00:00
Kun Wu	ac30f48e37	[mlir][sparse][gpu] fix various cusparseLt bugs Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D152489	2023-06-12 23:48:49 +00:00
Navdeep Katel	18cc07aa07	[MLIR][GPU] Add 16-bit version of cudaMemset in cudaRuntimeWrappers Add 16-bit version of cudaMemset in cudaRuntimeWrappers and update the GPU to LLVM lowering. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D151642	2023-06-08 17:33:26 +05:30
Aart Bik	50db4789a8	[mlir][sparse][gpu] refined build setup for cusparse Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D152387	2023-06-07 11:09:22 -07:00
Kun Wu	8ed59c53de	[mlir][sparse][gpu] add sm8.0+ tensor core 2:4 sparsity support Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D151775	2023-06-06 23:13:21 +00:00
Aart Bik	9fc02a7a08	[mlir][sparse][gpu] add AoS COO support to cuSPARSE Even though this feature was deprecated in release 11.2, any library before this version still supports the feature, which is why we are making it available under a macro. Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D152290	2023-06-06 12:32:46 -07:00
Kun Wu	7e44f0736a	[mlir][gpu][sparse] fix broken type in cusparseCreateCsr Differential Revision: https://reviews.llvm.org/D151912	2023-06-01 18:06:09 +00:00
Kun Wu	be6c532005	[mlir][sparse][gpu] fixing broken literal names in cuda runner macros Differential Revision: https://reviews.llvm.org/D151910	2023-06-01 17:52:58 +00:00
Kun Wu	cc402de0b1	[mlir][sparse][gpu] add result type to spmv and spmm gpu libgen path Differential Revision: https://reviews.llvm.org/D151592	2023-06-01 17:17:40 +00:00
Aart Bik	752c04777f	[mlir][sparse][gpu] fix merge conflict Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D151619	2023-05-27 13:42:20 -07:00
Kun Wu	cf44847b4d	[mlir][gpu][sparse] adding cusparse sddmm support Differential Revision: https://reviews.llvm.org/D151279	2023-05-27 20:01:41 +00:00
Aart Bik	74e29d3715	[mlir][sparse][gpu] fix merge conflict Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151574	2023-05-26 11:00:20 -07:00
Kun Wu	235fbe792b	[mlir][sparse][gpu] adding transpose support to spmm and spmv Reviewed By: aartbik, wrengr Differential Revision: https://reviews.llvm.org/D151259	2023-05-26 17:07:09 +00:00
Aart Bik	bcb698bfdc	[mlir][sparse][gpu] various cuSparse refinements (1) keep all cuSparse ops on single stream without wait() in right order; (2) use more type precise memref types for COO; (3) use ToTensor on resulting memref (even though it folds away again). Reviewed By: K-Wu Differential Revision: https://reviews.llvm.org/D151404	2023-05-24 22:32:52 -07:00
Aart Bik	4ebd836d9e	[mlir][sparse][gpu] fix F32 bug for SpMV and SpMM The alpha/beta variables, residing on the host, should have the 32-bit or 64-bit width of the result type. It was formerly always passed as double. Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151255	2023-05-23 17:36:03 -07:00
Aart Bik	a8e1f80f8b	[mlir][sparse][gpu] derive type of cuSparse op This no longer assumes just F64 output. Note, however, that it will be cleaner to carry the data type in the corresponding operation (rather than tracking operands). That will also allow for mixed type cases, where operands and result type are different. This will be done in a follow-up revision where the result type is carried by the SpMV/SpMM op itself (and friends). Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D151005	2023-05-19 17:07:52 -07:00
Aart Bik	981cf1678d	[mlir][sparse][gpu] add SpMM to GPU ops dialect Reviewed By: ThomasRaoux, K-Wu Differential Revision: https://reviews.llvm.org/D150618	2023-05-19 12:46:11 -07:00
Aart Bik	b700a90cc0	[mlir][gpu][sparse] add gpu ops for sparse matrix computations This revision extends the GPU dialect with ops that can be lowered to host-oriented sparse matrix library calls (in this case cuSparse focused, although the ops could be generalized to support more GPUs in principle). This will allow the "sparse compiler pipeline" to accelerate sparse operations (see follow-up revisions with examples of this). For some background, see https://discourse.llvm.org/t/sparse-compiler-and-gpu-code-generation/69786/2 Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D150152	2023-05-12 10:44:36 -07:00
max	8f7c8a6ea7	Add gpu::HostUnregisterOp Without explicitly unregistering you will get `'cuMemHostRegister(ptr, sizeBytes, 0)' failed with 'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'` in CUDA (for example) after repeated runs (e.g., during benchmarking the same kernel). Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D147277	2023-04-06 15:07:12 -05:00
Mehdi Amini	6b7e6ea489	Revert "Fix CUDA runtime wrapper for GPU mem alloc/free to async" This reverts commit b4117fede20b8c649320ad37364ae208baa0d0e7. This broke one of the MLIR bots; a test is failing.	2022-04-12 06:50:27 +00:00
Uday Bondhugula	b4117fede2	Fix CUDA runtime wrapper for GPU mem alloc/free to async Switch CUDA runtime wrapper for GPU mem alloc/free to async. The semantics of the GPU dialect ops (gpu.alloc/dealloc) and the wrappers they are lowered to (gpu-to-llvm) were for the async versions -- however, these were being incorrectly mapped to cuMemAlloc/cuMemFree instead of cuMemAllocAsync/cuMemFreeAsync. Reviewed By: csigg Differential Revision: https://reviews.llvm.org/D123482	2022-04-12 09:04:02 +05:30
Krzysztof Drewniak	c5803ee4fa	[MLIR][GPU] Remove call to cudaSetDevice(), which no longer exists Differential Revision: https://reviews.llvm.org/D120085	2022-02-17 21:38:05 +00:00
Krzysztof Drewniak	84718d37db	[MLIR][GPU] Add gpu.set_default_device op This op is added to allow MLIR code running on multi-GPU systems to select the GPU they want to execute operations on when no GPU is otherwise specified. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D119883	2022-02-17 21:30:09 +00:00
Nicolas Vasilache	012c0cc7c3	[mlir] NFC - Avoid unused symbol in opt mode.	2021-10-14 11:26:33 +00:00
Loren Maggiore	361458b1ce	[mlir] create gpu memset op Create a gpu memset op and corresponding CUDA and ROCm wrappers. Reviewed By: herhut, lorenrose1013 Differential Revision: https://reviews.llvm.org/D107548	2021-09-04 08:13:04 +02:00
Aart Bik	b9f87e24f2	[mlir] add missing include, fix broken build Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D108873	2021-08-28 09:36:38 -07:00
Uday Bondhugula	4edc9e2acf	[MLIR][GPU] Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support. This method is the only one in the CUDA runtime wrappers library that creates a dependence on libLLVMSupport due to its use of SmallVector and ArrayRef. The code can be written just as easily and compactly without those ADTs. The dependence on LLVMSupport adds a significant amount of additional complexity for external things that want to link this library in (either statically or as a shared object) since libLLVMSupport includes numerous other objects that are sensitive to C++ compiler version and ABI. Differential Revision: https://reviews.llvm.org/D108684	2021-08-28 11:37:55 +05:30
Christian Sigg	f69d5a7fc7	[mlir] Initialize CUDA context lazily. So we can remove the ignore-warning pragma again. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D97864	2021-03-04 13:07:56 +01:00
Christian Sigg	b6ac26fce5	[mlir] Silence -Wglobal-constructors error in CudaRuntimeWrapper.cpp Until I have a better solution with dynamic initialization, to get the nvidia build bot green again.	2021-03-03 13:48:03 +01:00
Christian Sigg	9d7be77bf9	[mlir] Move cuda tests Move test inputs to test/Integration directory. Move runtime wrappers to ExecutionEngine. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D97463	2021-03-03 13:16:51 +01:00

38 Commits