67 Commits

Author SHA1 Message Date
Aart Bik
bcb698bfdc [mlir][sparse][gpu] various cuSparse refinements
(1) keep all cuSparse ops on a single stream, in the right order, without wait()
(2) use more type-precise memref types for COO
(3) use ToTensor on resulting memref (even though it folds away again)

Reviewed By: K-Wu

Differential Revision: https://reviews.llvm.org/D151404
2023-05-24 22:32:52 -07:00
Aart Bik
4ebd836d9e [mlir][sparse][gpu] fix F32 bug for SpMV and SpMM
The alpha/beta variables, residing on the host, should have the
32-bit or 64-bit width of the result type. They were formerly always
passed as double.
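
A minimal sketch of why the width matters, written as plain host-side C++ with no cuSPARSE calls. `ElemType`, `HostScalar`, `makeHostScalar`, and `readAsF32` are hypothetical names introduced here for illustration and are not taken from the revision. The idea is that a library reading the scalar through a pointer as a 32-bit float only sees the intended value if the host stored it at that width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the result element type (not an MLIR type).
enum class ElemType { F32, F64 };

// Host-side storage large enough for either scalar width.
struct HostScalar {
  alignas(8) unsigned char bytes[8];
  std::size_t width; // number of valid bytes: 4 for F32, 8 for F64
};

// Store `value` at the width of the result type rather than always as double.
HostScalar makeHostScalar(double value, ElemType type) {
  HostScalar s{};
  if (type == ElemType::F32) {
    float f = static_cast<float>(value);
    std::memcpy(s.bytes, &f, sizeof(f));
    s.width = sizeof(f);
  } else {
    std::memcpy(s.bytes, &value, sizeof(value));
    s.width = sizeof(value);
  }
  return s;
}

// Models a consumer that interprets the pointed-to scalar as a 32-bit float.
float readAsF32(const void *p) {
  assert(p != nullptr);
  float f;
  std::memcpy(&f, p, sizeof(f));
  return f;
}
```

With this sketch, `readAsF32(makeHostScalar(1.5, ElemType::F32).bytes)` yields `1.5f`, whereas reading the first four bytes of a scalar stored as a double does not.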

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D151255
2023-05-23 17:36:03 -07:00
Aart Bik
a8e1f80f8b [mlir][sparse][gpu] derive type of cuSparse op
This no longer assumes just F64 output.

Note, however, that it will be cleaner to carry the data type in the corresponding operation (rather than tracking operands). That will also allow for mixed-type cases, where the operand and result types differ.

This will be done in a follow-up revision where the result type is carried by the SpMV/SpMM op itself (and friends).

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D151005
2023-05-19 17:07:52 -07:00
Aart Bik
981cf1678d [mlir][sparse][gpu] add SpMM to GPU ops dialect
Reviewed By: ThomasRaoux, K-Wu

Differential Revision: https://reviews.llvm.org/D150618
2023-05-19 12:46:11 -07:00
Aart Bik
b700a90cc0 [mlir][gpu][sparse] add gpu ops for sparse matrix computations
This revision extends the GPU dialect with ops that can be lowered to
host-oriented sparse matrix library calls (in this case cuSparse-focused,
although the ops could in principle be generalized to support more GPUs).
This will allow the "sparse compiler pipeline" to accelerate sparse operations
(see follow-up revisions for examples of this).

For some background, see:

https://discourse.llvm.org/t/sparse-compiler-and-gpu-code-generation/69786/2

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D150152
2023-05-12 10:44:36 -07:00
max
8f7c8a6ea7 Add gpu::HostUnregisterOp
Without explicitly unregistering you will get

```
'cuMemHostRegister(ptr, sizeBytes, 0)' failed with 'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'
```

in CUDA (for example) after repeated runs (e.g., when benchmarking the same kernel).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D147277
2023-04-06 15:07:12 -05:00
Mehdi Amini
6b7e6ea489 Revert "Fix CUDA runtime wrapper for GPU mem alloc/free to async"
This reverts commit b4117fede20b8c649320ad37364ae208baa0d0e7.
This broke one of the MLIR bots; a test is failing.
2022-04-12 06:50:27 +00:00
Uday Bondhugula
b4117fede2 Fix CUDA runtime wrapper for GPU mem alloc/free to async
Switch the CUDA runtime wrappers for GPU mem alloc/free to async. The
semantics of the GPU dialect ops (gpu.alloc/dealloc) and of the wrappers they
are lowered to (gpu-to-llvm) were those of the async versions; however, they
were being incorrectly mapped to cuMemAlloc/cuMemFree instead of
cuMemAllocAsync/cuMemFreeAsync.

Reviewed By: csigg

Differential Revision: https://reviews.llvm.org/D123482
2022-04-12 09:04:02 +05:30
Krzysztof Drewniak
c5803ee4fa [MLIR][GPU] Remove call to cudaSetDevice(), which no longer exists
Differential Revision: https://reviews.llvm.org/D120085
2022-02-17 21:38:05 +00:00
Krzysztof Drewniak
84718d37db [MLIR][GPU] Add gpu.set_default_device op
This op is added to allow MLIR code running on multi-GPU systems to
select the GPU it wants to execute operations on when no GPU is
otherwise specified.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D119883
2022-02-17 21:30:09 +00:00
Nicolas Vasilache
012c0cc7c3 [mlir] NFC - Avoid unused symbol in opt mode.
2021-10-14 11:26:33 +00:00
Loren Maggiore
361458b1ce [mlir] create gpu memset op
Create a gpu memset op and corresponding CUDA and ROCm wrappers.

Reviewed By: herhut, lorenrose1013

Differential Revision: https://reviews.llvm.org/D107548
2021-09-04 08:13:04 +02:00
Aart Bik
b9f87e24f2 [mlir] add missing include, fix broken build
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D108873
2021-08-28 09:36:38 -07:00
Uday Bondhugula
4edc9e2acf [MLIR][GPU] Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support
Drop mgpuMemHostRegisterMemRef's dependence on LLVM Support. This
method is the only one in CUDA runtime wrappers library that creates
a dependence on libLLVMSupport due to its use of SmallVector and
ArrayRef. The code can be written just as easily and compactly without those ADTs.
The dependence on LLVMSupport adds a significant amount of additional
complexity for external things that want to link this library in (either
statically or as a shared object) since libLLVMSupport includes numerous
other objects that are sensitive to C++ compiler version and ABI.
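
As an illustration of how stride bookkeeping for a memref can be done without SmallVector or ArrayRef, here is a minimal sketch that uses only raw pointers and the C++ standard headers. The function name `densePackedSizeBytes` and its signature are hypothetical and are not the wrapper's actual interface. It assumes a row-major, densely packed layout and reports `-1` for anything else.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the byte size of a densely packed, row-major
// buffer described by `sizes` and `strides` (each of length `rank`), or -1
// if the given strides do not match the dense layout.
int64_t densePackedSizeBytes(int64_t rank, const int64_t *sizes,
                             const int64_t *strides,
                             int64_t elementSizeBytes) {
  assert(rank >= 0 && elementSizeBytes > 0);
  int64_t runningStride = 1;
  // Walk from the innermost dimension outwards; in a dense layout the stride
  // of each dimension equals the product of all inner dimension sizes.
  for (int64_t i = rank - 1; i >= 0; --i) {
    if (strides[i] != runningStride)
      return -1;
    runningStride *= sizes[i];
  }
  // After the loop, runningStride is the total number of elements.
  return runningStride * elementSizeBytes;
}
```

For a 2x3x4 buffer of 4-byte elements with strides 12, 4, 1, this returns 96. No temporary container is needed because the expected stride is carried in a single running variable.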

Differential Revision: https://reviews.llvm.org/D108684
2021-08-28 11:37:55 +05:30
Christian Sigg
f69d5a7fc7 [mlir] Initialize CUDA context lazily.
So we can remove the ignore-warning pragma again.
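
One common way to get lazy initialization without a global constructor is a function-local static, sketched below. This is an assumption about the general technique, not a copy of the revision's code. `FakeContext`, `createContext`, `getContext`, and `initCount` are hypothetical names, and no CUDA API is called.

```cpp
#include <cassert>

// Hypothetical stand-in for a driver context handle.
struct FakeContext {
  int id;
};

// Counts how often initialization runs. It is constant-initialized, so it
// does not itself require a global constructor.
static int initCount = 0;

// Stand-in for the expensive setup that should not run at program load time.
static FakeContext createContext() {
  ++initCount;
  return FakeContext{42};
}

// The function-local static is initialized on the first call only, and the
// initialization is thread-safe since C++11. Nothing runs before main().
FakeContext &getContext() {
  static FakeContext ctx = createContext();
  return ctx;
}
```

Because the static lives inside the function, `-Wglobal-constructors` has nothing to flag, and callers that never touch the context never pay for creating it.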

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D97864
2021-03-04 13:07:56 +01:00
Christian Sigg
b6ac26fce5 [mlir] Silence -Wglobal-constructors error in CudaRuntimeWrapper.cpp
This is a stopgap to get the NVIDIA build bot green again until I have
a better solution based on dynamic initialization.
2021-03-03 13:48:03 +01:00
Christian Sigg
9d7be77bf9 [mlir] Move cuda tests
Move test inputs to test/Integration directory.
Move runtime wrappers to ExecutionEngine.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97463
2021-03-03 13:16:51 +01:00