This is a follow up of #83004, which made the same change for
`MLIR_CUDA_CONVERSIONS_ENABLED`. As the previous PR, this PR commit
exposes mentioned CMake variable through `mlir-config.h` and uses the
macro that is introduced with the same name. This replaces the macro
`MLIR_ROCM_CONVERSIONS_ENABLED`, which the CMake files previously
defined manually.
Some specific implementation of the offload may want more customization, and
even avoid using LLVM in-tree to dispatch the ISA translation to a custom
solution. This refactoring makes it possible for such implementation to work
without even configuring the target backend in LLVM.
Reviewers: fabianmcg
Reviewed By: fabianmcg
Pull Request: https://github.com/llvm/llvm-project/pull/71165
SerializetToHsaco, as currently implemented, leaks the file descriptor
of the .hsaco temporary file, which causes issues in long-running
parallel compilation setups.
See also https://github.com/ROCmSoftwarePlatform/rocMLIR/pull/1257
This patch adds an NVPTX compilation path that enables JIT compilation
on NVIDIA targets. The following modifications were performed:
1. Adding a format field to the GPU object attribute, allowing the
translation attribute to use the correct runtime function to load the
module. Likewise, a dictionary attribute was added to add any possible
extra options.
2. Adding the `createObject` method to `GPUTargetAttrInterface`; this
method returns a GPU object from a binary string.
3. Adding the function `mgpuModuleLoadJIT`, which is only available for
NVIDIA GPUs, as there is no equivalent for AMD.
4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify
the format to use during testing.
This revision avoids the registration of dialect extensions in Pass::getDependentDialects.
Such registration of extensions can be dangerous because `DialectRegistry::isSubsetOf` is
always guaranteed to return false for extensions (i.e. there is no mechanism to track
whether a lambda is already in the list of already registered extensions).
When the context is already in a multi-threaded mode, this is guaranteed to assert.
Arguably a more structured registration mechanism for extensions with a unique ExtensionID
could be envisioned in the future.
In the process of cleaning this up, multiple usage inconsistencies surfaced around the
registration of translation extensions that this revision also cleans up.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D157703
Fix a build failure in GCC 7 caused by the function:
`std::optional<SmallVector<std::unique_ptr<llvm::Module>>> loadBitcodeFiles` ,
in `NVVMTarget` & `ROCDLTarget`.
The failure is caused because GCC fails to use the move constructor in
`std::optional` for constructing the return value, which prompts a call to the
deleted copy constructor in `std::unique_ptr`, resulting in a failure.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D157804
**For an explanation of these patches see D154153.**
Commit message:
This patch adds the ROCDL target attribute for serializing GPU modules into
strings containing HSAco.
Depends on D154117
Differential Revision: https://reviews.llvm.org/D154129
**For an explanation of these patches see D154153.**
Commit message:
This patch adds the ROCDL target attribute for serializing GPU modules into
strings containing HSAco.
Depends on D154117
Reviewed By: mehdi_amini, krzysz00
Differential Revision: https://reviews.llvm.org/D154129