Previous change no longer properly used the GPU libgen pass (even though
most tests still passed falling back to CPU). This revision puts the
proper pass order into place. Also bit of a cleanup of CPU codegen vs.
libgen setup.
The flag seems to be doing practically the same thing for zero cost and
pinned dma. In addition, the register host is not truly the right zero
cost mechanism according to Thomas. So we are simplifying the setup for
now, until we have a better definition for what to implement and test.
https://github.com/llvm/llvm-project/issues/64316
This patch adds an NVPTX compilation path that enables JIT compilation
on NVIDIA targets. The following modifications were performed:
1. Adding a format field to the GPU object attribute, allowing the
translation attribute to use the correct runtime function to load the
module. Likewise, a dictionary attribute was added to add any possible
extra options.
2. Adding the `createObject` method to `GPUTargetAttrInterface`; this
method returns a GPU object from a binary string.
3. Adding the function `mgpuModuleLoadJIT`, which is only available for
NVIDIA GPUs, as there is no equivalent for AMD.
4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify
the format to use during testing.
(1) minor bug fix in copy back [always nice to run stuff ;-)]
(2) run with and without lib (even though some fall back to CPU)
Reviewed By: wrengr
Differential Revision: https://reviews.llvm.org/D151507
The GPU tests weren't updated when rebasing D150330, so this patch fixes that.
Reviewed By: anlunx
Differential Revision: https://reviews.llvm.org/D150822