12 Commits

Author SHA1 Message Date
Ingo Müller
099045a045
[mlir][nvvm] Expose MLIR_NVPTXCOMPILER_ENABLED in mlir-config.h. (#84007)
This is another follow-up of #83004, which made the same change for
`MLIR_CUDA_CONVERSIONS_ENABLED`. As the previous PR, this PR commit
exposes mentioned CMake variable through `mlir-config.h` and uses the
macro that is introduced with the same name. This replaces the macro
`MLIR_NVPTXCOMPILER_ENABLED`, which the CMake files previously defined
manually.
2024-03-06 14:14:53 +01:00
Ingo Müller
d70254a623
[mlir][nvvm] Add missing include to llvm-config.h. (#83998)
This is another follow-up of #83004. `NVVM/Target.cpp` uses the macro
`MLIR_NVPTXCOMPILER_ENABLED`, which is defined in `llvm-config.h` but
did not include that file, yielding a warning when compiled with
`-Wundef`. This PR adds the include.

~~This is another follow-up of #83004, which made the same change for
`MLIR_CUDA_CONVERSIONS_ENABLED`. As the previous PR, this PR commit
exposes mentioned CMake variable through `mlir-config.h` and uses the
macro that is introduced with the same name. This replaces the macro
`MLIR_NVPTXCOMPILER_ENABLED`, which the CMake files previously defined
manually.~~
2024-03-06 10:13:12 +01:00
Ingo Müller
5f2097dbed
[mlir] Expose MLIR_CUDA_CONVERSIONS_ENABLED in mlir-config.h. (#83004)
That macro was not defined in some cases and thus yielded warnings if
compiled with `-Wundef`. In particular, they were not defined in the
BUILD files, so the GPU targets were broken when built with Bazel. This
commit exposes mentioned CMake variable through mlir-config.h and uses
the macro that is introduced with the same name. This replaces the macro
MLIR_CUDA_CONVERSIONS_ENABLED, which the CMake files previously defined
manually.
2024-02-28 14:48:40 +01:00
Mehdi Amini
332a504985 Apply clang-tidy fixes for readability-container-size-empty in Target.cpp (NFC) 2024-02-15 16:02:41 -08:00
Mehdi Amini
867678dd81 Apply clang-tidy fixes for llvm-qualified-auto in Target.cpp (NFC) 2024-02-15 16:02:41 -08:00
Mehdi Amini
d9dadfda85
Refactor ModuleToObject to offer more flexibility to subclass (NFC)
Some specific implementation of the offload may want more customization, and
even avoid using LLVM in-tree to dispatch the ISA translation to a custom
solution. This refactoring makes it possible for such implementation to work
without even configuring the target backend in LLVM.

Reviewers: fabianmcg

Reviewed By: fabianmcg

Pull Request: https://github.com/llvm/llvm-project/pull/71165
2023-11-03 13:41:45 -07:00
Fabian Mora
5093413a50
[mlir][gpu][NVPTX] Enable NVIDIA GPU JIT compilation path (#66220)
This patch adds an NVPTX compilation path that enables JIT compilation
on NVIDIA targets. The following modifications were performed:
1. Adding a format field to the GPU object attribute, allowing the
translation attribute to use the correct runtime function to load the
module. Likewise, a dictionary attribute was added to add any possible
extra options.

2. Adding the `createObject` method to `GPUTargetAttrInterface`; this
method returns a GPU object from a binary string.

3. Adding the function `mgpuModuleLoadJIT`, which is only available for
NVIDIA GPUs, as there is no equivalent for AMD.

4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify
the format to use during testing.
2023-09-14 18:00:27 -04:00
Fabian Mora
c16adb0dcb
[mlir][Target][NVPTX] Add fatbin support to NVPTX compilation. (#65398)
Currently, the NVPTX tool compilation path only calls `ptxas`; thus, the
GPU running the binary must be an exact match of the arch of the target,
or else the runtime throws an error due to the arch mismatch.

This patch adds a call to `fatbinary`, creating a fat binary with the
cubin object and the PTX code, allowing the driver to JIT the PTX at
runtime if there's an arch mismatch.
2023-09-07 07:44:41 -04:00
Adrian Kuegel
658a4fdb26 [mlir] Apply ClangTidy fix (NFC)
Prefer to use .empty() instead of checking size().
2023-09-05 10:16:55 +02:00
Nicolas Vasilache
7c4e8c6a27 [mlir] Disentangle dialect and extension registrations.
This revision avoids the registration of dialect extensions in Pass::getDependentDialects.

Such registration of extensions can be dangerous because `DialectRegistry::isSubsetOf` is
always guaranteed to return false for extensions (i.e. there is no mechanism to track
whether a lambda is already in the list of already registered extensions).
When the context is already in a multi-threaded mode, this is guaranteed to assert.

Arguably a more structured registration mechanism for extensions with a unique ExtensionID
could be envisioned in the future.

In the process of cleaning this up, multiple usage inconsistencies surfaced around the
registration of translation extensions that this revision also cleans up.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D157703
2023-08-22 00:40:09 +00:00
Fabian Mora
d923f4c2d8 [mlir][NVVM|ROCDL] Explicitly construct the return type in loadBitcodeFiles in (NVVM|ROCDL)Target
Fix a build failure in GCC 7 caused by the function:
`std::optional<SmallVector<std::unique_ptr<llvm::Module>>> loadBitcodeFiles` ,
in `NVVMTarget` & `ROCDLTarget`.
The failure is caused because GCC fails to use the move constructor in
`std::optional` for constructing the return value, which prompts a call to the
deleted copy constructor in `std::unique_ptr`, resulting in a failure.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D157804
2023-08-13 00:25:30 +00:00
Fabian Mora
211c9752c8 [mlir][NVVM] Adds the NVVM target attribute.
**For an explanation of these patches see D154153.**

Commit message:
This patch adds the NVVM target attribute for serializing GPU modules into
strings containing cubin.

Depends on D154113 and D154100 and D154097

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D154117
2023-08-08 19:21:36 +00:00