21 Commits

Author SHA1 Message Date
Mehdi Amini
6883343843 [mlir] Guard NVPTX backend initialization on it being configured (NFC)
This just fixes a build failure in some new configurations.
2023-11-03 22:23:01 -07:00
Rohan Yadav
71bdd2c238 mlir/lib/Dialect/GPU/Transforms: improve context management in SerializeToCubin (#65779)
This commit adjusts the CUDA context management in the SerializeToCubin
pass. In particular, it uses the device 0 primary context instead of
creating a new CUDA context on each invocation of SerializeToCubin. This
yields very large improvements in compile time, especially if an
application (like a JIT compiler) is calling SerializeToCubin
repeatedly.

Differential Revision: https://reviews.llvm.org/D159487

Co-authored-by: Rohan Yadav <rohany@cs.stanford.edu>
2023-10-20 23:05:10 +05:30
Nicolas Vasilache
7c4e8c6a27 [mlir] Disentangle dialect and extension registrations.
This revision avoids the registration of dialect extensions in Pass::getDependentDialects.

Such registration of extensions can be dangerous because `DialectRegistry::isSubsetOf` is
always guaranteed to return false for extensions (i.e. there is no mechanism to track
whether a lambda is already in the list of already registered extensions).
When the context is already in a multi-threaded mode, this is guaranteed to assert.

Arguably a more structured registration mechanism for extensions with a unique ExtensionID
could be envisioned in the future.

In the process of cleaning this up, multiple usage inconsistencies surfaced around the
registration of translation extensions that this revision also cleans up.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D157703
2023-08-22 00:40:09 +00:00
Ingo Müller
616eb0b2c4 [mlir][gpu] Fix error message on unknown CUDA error code.
This patch fixes the output of the error message that is printed when
the CUDA library cannot identify the error code. In that case, no error
message is provided by the library, and the previous implementation just
printed the content of a randomly initialized pointer. This patch
initializes the pointer to nullptr and only prints the content if that
has changed.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D156791
2023-08-11 08:04:58 +00:00
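
The pattern described in the commit above (initialize the pointer to nullptr, then print only if the library changed it) can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `lookupErrorString` is a hypothetical stand-in for the driver's error-string lookup, and `describeError` and the fallback text are likewise assumptions made for the example.

```cpp
#include <string>

// Hypothetical stand-in for the driver's error-string lookup: it sets *str
// only for codes it recognizes and leaves it untouched otherwise.
static void lookupErrorString(int code, const char **str) {
  if (code == 1)
    *str = "invalid value";
}

// Returns a printable description of an error code. The pointer starts as
// nullptr so that "the library did not provide a message" is detectable.
std::string describeError(int code) {
  const char *name = nullptr;
  lookupErrorString(code, &name);
  if (!name)
    return "<unknown error code " + std::to_string(code) + ">";
  return name;
}
```

Without the nullptr initialization, the unknown-code branch would read an indeterminate pointer, which matches the symptom the commit message describes.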
Mehdi Amini
5e8a1164f2 Revert "[mlir][gpu] Fallback to JIT compilation" "[mlir][gpu] Increase default SM version from 35 to 50" and "[mlir][gpu] Improving Cubin Serialization with ptxas Compiler"
This reverts commit 2e0e00ed841951e358a85a871647be9b3a622f51
and reverts commit a6eb40692c795a9cc29266779ceca2e304141114
and reverts commit 585cbe3f639783bf0307b47504acbd205f135310.

15 tests are broken on the mlir-nvidia buildbot:

'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_SOURCE'
'cuModuleGetFunction(&function, module, name)' failed with 'CUDA_ERROR_INVALID_HANDLE'
'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, smem, stream, params, extra)' failed with 'CUDA_ERROR_INVALID_HANDLE'
'cuModuleUnload(module)' failed with 'CUDA_ERROR_INVALID_HANDLE'
2023-07-24 10:23:15 -07:00
Guray Ozen
a6eb40692c [mlir][gpu] Increase default SM version from 35 to 50
The current default SM version is 35, which was deprecated a long time ago. D155563 introduced compilation with ptxas, and using sm_35 causes failures on the buildbot. This change increases the default SM version to 50.

Differential Revision: https://reviews.llvm.org/D156098
2023-07-24 15:11:30 +02:00
Guray Ozen
2e0e00ed84 [mlir][gpu] Fallback to JIT compilation
A recent change introduced compilation with the ptxas compiler. That change is important for being able to use different versions of the ptxas compiler without changing the compiler.

It causes some failures on the buildbot. This change adds a fallback mechanism to JIT compilation, which is the original path.

Differential Revision: https://reviews.llvm.org/D156096
2023-07-24 15:11:05 +02:00
Guray Ozen
585cbe3f63 [mlir][gpu] Improving Cubin Serialization with ptxas Compiler
This work improves how we compile the generated PTX code using the `ptxas` compiler. Currently, we rely on the driver's JIT API to compile the PTX code. However, this approach has some limitations. It doesn't always produce the same binary output as the ptxas compiler, leading to potential inconsistencies in the generated cubin files.

This work introduces a significant improvement by directly utilizing the ptxas compiler for PTX compilation. By doing so, we can achieve more consistent and reliable results in generating cubin files. Key Benefits:
- Using the ptxas compiler directly ensures that the cubin files generated during the build process remain consistent with CUDA compilation using `nvcc` or `clang`.
- Another advantage of this work is that it allows developers to experiment with different ptxas compilers without the need to change the compiler. Performance varies among ptxas compiler versions, so one can easily try different ones.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155563
2023-07-24 12:29:53 +02:00
Krzysztof Drewniak
db647f5bd8 [mlir][GPU] Initialize LLVM exactly once during GPU compiles
No matter how one constructs their SerializeTo* pass, we want to
ensure that the LLVM initialization code runs once and only once. This
commit adds a static once_flag to ensure that.

I've run into mysterious segfaults when calling MLIR GPU compiles from
multiple threads, and this commit is a potential fix for the issue.

Reviewed By: fmorac

Differential Revision: https://reviews.llvm.org/D155226
2023-07-14 19:10:52 +00:00
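
The static once_flag approach mentioned in the commit above can be sketched in standard C++ as follows. This is only an illustration under stated assumptions: `initializeBackendOnce`, `initCount`, and `runConcurrentInit` are hypothetical names, and the counter stands in for the real LLVM initialization calls, which are not shown here.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the LLVM initialization work; it only counts
// how many times the guarded body runs.
static std::atomic<int> initCount{0};

// May be called from any thread, any number of times. The lambda passed to
// std::call_once runs exactly once for the lifetime of the static flag.
static void initializeBackendOnce() {
  static std::once_flag flag;
  std::call_once(flag, [] { ++initCount; });
}

// Calls the initializer from several threads and returns how many times the
// guarded body has actually run so far.
int runConcurrentInit(int numThreads) {
  std::vector<std::thread> threads;
  for (int i = 0; i < numThreads; ++i)
    threads.emplace_back(initializeBackendOnce);
  for (auto &t : threads)
    t.join();
  return initCount.load();
}
```

Because the flag is a function-local static, every caller shares it regardless of how the enclosing object was constructed, which is the property the commit message asks for.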
Guray Ozen
22a32f7d9c [mlir][gpu] Add dump-ptx option
When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have a simple way to do it.

This work adds dump-ptx to gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155166
2023-07-13 21:14:57 +02:00
Vinayaka Bandishti
01c755ff80 Make optimize llvm common to both gpu-to-hsaco/cubin
Before serializing, LLVM optimizations were only run on the path to
hsaco, and not cubin. Define opt-level for the `gpu-to-cubin` pass as well,
and move the call that optimizes LLVM to a common place.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D151554
2023-06-05 10:32:51 +05:30
Artem Belevich
d4ba4c6af7 Revert unintentionally committed "Use nvptxcompile library."
This reverts commit 5f66348e59aa7ce5e5780a972b3875268c45d57c.
2023-03-17 14:23:42 -07:00
Artem Belevich
5f66348e59 Use nvptxcompile library.
Differential Revision: https://reviews.llvm.org/D145527
2023-03-17 14:08:53 -07:00
Ivan Radanov Ivanov
e01c7f092f [MLIR] Revert default NVIDIA GPU version
Because integration tests are failing, revert the mlir::SerializeToCubinPass defaults to the old ones (changed in https://reviews.llvm.org/D134153)

Reviewed By: akuegel

Differential Revision: https://reviews.llvm.org/D134414
2022-09-22 10:19:38 +02:00
Ivan Radanov Ivanov
f9211330f6 [MLIR] Set default NVIDIA GPU version
2022-09-21 18:10:59 -04:00
Ivan Radanov Ivanov
2f7a774ed7 [MLIR] Add a create function for mlir::SerializeToCubinPass
Differential Revision: https://reviews.llvm.org/D134153
2022-09-21 18:02:59 -04:00
Jeff Niu
b7f93c2809 [mlir] (NFC) run clang-format on all files
2022-07-14 13:32:13 -07:00
Mogball
d7ef488bb6 [mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352
2022-06-09 22:49:03 +00:00
River Riddle
1269f96d2e [mlir] Add MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID to SerializeToCubinPass
This pass is defined in an anonymous namespace and requires an explicit TypeID
2022-04-04 14:28:10 -07:00
Mehdi Amini
b5e22e6d42 Migrate MLIR test passes to the new registration API
Make sure they all define getArgument()/getDescription().

Depends On D104421

Differential Revision: https://reviews.llvm.org/D104426
2021-06-16 23:42:17 +00:00
Christian Sigg
2224221fb3 [mlir] Add NVVM to CUBIN conversion to mlir-opt
If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt.

The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner.

Depends On D98279

Reviewed By: herhut, rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D98203
2021-03-11 10:07:11 +01:00