21 Commits

Author SHA1 Message Date
River Riddle
5a7b919409 [mlir][NFC] Rename StandardToLLVM to FuncToLLVM
The current StandardToLLVM conversion patterns only really handle
the Func dialect. The pass itself adds patterns for Arithmetic/CFToLLVM, but
those should be/will be split out in a followup. This commit focuses solely
on being an NFC rename.

Aside from the directory change, the pattern and pass creation API have been renamed:
 * populateStdToLLVMFuncOpConversionPattern -> populateFuncToLLVMFuncOpConversionPattern
 * populateStdToLLVMConversionPatterns -> populateFuncToLLVMConversionPatterns
 * createLowerToLLVMPass -> createConvertFuncToLLVMPass

Differential Revision: https://reviews.llvm.org/D120778
2022-03-07 11:25:23 -08:00
River Riddle
ace01605e0 [mlir] Split out a new ControlFlow dialect from Standard
This dialect is intended to model lower level/branch based control-flow constructs. The initial set
of operations are: AssertOp, BranchOp, CondBranchOp, SwitchOp; all split out from the current
standard dialect.

See https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061

Differential Revision: https://reviews.llvm.org/D118966
2022-02-06 14:51:16 -08:00
Mogball
aae5125550 [mlir] Replace StrEnumAttr -> EnumAttr in core dialects
Removes uses of `StrEnumAttr` in core dialects

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D117514
2022-01-18 17:15:00 +00:00
Krzysztof Drewniak
e1da62910e [MLIR][GPU] Define gpu.printf op and its lowerings
- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments
- Define the lowering of gpu.pirntf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered

This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.

And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels

This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.

In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110448
2021-12-09 15:54:31 +00:00
Krzysztof Drewniak
f849640a0c [MLIR] Make the ROCM integration tests runnable
- Move the #define s to the GPU Transform library from GPU Ops so that
SerializeToHsaco is non-trivially compiled

- Add required includes to SerializeToHsaco

- Move MCSubtargetInfo creation to the correct point in the
compilation process

- Change mlir in ROCM tests to account for renamed/moved ops

Differential Revision: https://reviews.llvm.org/D114184
2021-11-19 17:09:53 +00:00
Krzysztof Drewniak
fb1a06aa13 [MLIR][GPU] Add target arguments to SerializeToHsaco
Compiling code for AMD GPUs requires knowledge of which chipset is
being targeted, especially if the code uses chipset-specific
intrinsics (which is the case in a downstream convolution generator).
This commit adds `target`, `chipset` and `features` arguments to the
SerializeToHsaco constructor to enable passing in this required
information.

It also amends the ROCm integration tests to pass in the target
chipset, which is set to the chipset of the first GPU on the system
executing the tests.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114107
2021-11-18 16:28:44 +00:00
thomasraoux
eacd6e1ebe [mlir][GPUtoNVVM] Relax restriction on wmma op lowering
Allow lowering of wmma ops with 64bits indexes. Change the default
version of the test to use default layout.

Differential Revision: https://reviews.llvm.org/D112479
2021-10-27 21:31:55 -07:00
Mogball
a54f4eae0e [MLIR] Replace std ops with arith dialect ops
Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the
`arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
2021-10-13 03:07:03 +00:00
Vladislav Vinogradov
505afd1e64 [mlir] Clean up boolean flags usage in LIT tests
* Call `llvm_canonicalize_cmake_booleans` for all CMake options,
  which are propagated to `lit.local.cfg` files.
* Use Python native boolean values instead of strings for such options.

This fixes the cases, when CMake variables have values other than `ON` (like `TRUE`).
This might happen due to IDE integration or due to CMake preset usage.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D110073
2021-10-12 11:44:48 +03:00
thomasraoux
750799b7bc [mlir][NFC] Don't outline kernel in MMA integration tests
This matches better how other gpu integration tests are done.

Differential Revision: https://reviews.llvm.org/D103099
2021-05-27 09:43:54 -07:00
thomasraoux
b44007bec2 [mlir][gpu] Relax restriction on MMA store op to allow chain of mma ops.
In order to allow large matmul operations using the MMA ops we need to chain
operations this is not possible unless "DOp" and "COp" type have matching
layout so remove the "DOp" layout and force accumulator and result type to
match.
Added a test for the case where the MMA value is accumulated.

Differential Revision: https://reviews.llvm.org/D103023
2021-05-27 09:13:51 -07:00
thomasraoux
dae9038611 [mlir] Lower sm version for TensorCore intergration tests
Those tests only require sm70, this allows to run those integration
tests on more hardware.

Differential Revision: https://reviews.llvm.org/D103049
2021-05-24 14:45:24 -07:00
Navdeep Kumar
e552fa28da [MLIR][GPU] Add CUDA Tensor core WMMA test
Add a test case to test the complete execution of WMMA ops on a Nvidia
GPU with tensor cores. These tests are enabled under
MLIR_RUN_CUDA_TENSOR_CORE_TESTS.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D95334
2021-05-22 16:19:36 +05:30
Eugene Zhulenev
8a316b00d6 [mlir] Convert async dialect passes from function passes to op agnostic passes
Differential Revision: https://reviews.llvm.org/D100401
2021-04-13 11:46:00 -07:00
thomasraoux
3587728ed5 [mlir] Fix cuda integration test failure 2021-03-19 10:33:55 -07:00
Christian Sigg
a825fb2c07 [mlir] Remove mlir-rocm-runner
This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396.

I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though.

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D98447
2021-03-19 00:24:10 -07:00
thomasraoux
1a572f4509 [mlir] Add vector op support to cuda-runner including vector.print
Differential Revision: https://reviews.llvm.org/D97346
2021-03-18 13:03:08 -07:00
Alex Zinenko
7aa6f3aa0c [mlir] fix integration tests post e2310704d890ad252aeb1ca28b4b84d29514b1d1
The commit in question moved some ops across dialects but did not update
some of the target-specific integration tests that use these ops,
presumably because the corresponding target hardware was not available.
Fix these tests.
2021-03-15 14:41:27 +01:00
Christian Sigg
1ef544d4a9 [mlir] Remove mlir-cuda-runner
Change CUDA integration tests to use mlir-opt + mlir-cpu-runner instead.

Depends On D98203

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D98396
2021-03-12 14:06:43 +01:00
Christian Sigg
2224221fb3 [mlir] Add NVVM to CUBIN conversion to mlir-opt
If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt.

The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner.

Depends On D98279

Reviewed By: herhut, rriddle, mehdi_amini

Differential Revision: https://reviews.llvm.org/D98203
2021-03-11 10:07:11 +01:00
Christian Sigg
9d7be77bf9 [mlir] Move cuda tests
Move test inputs to test/Integration directory.
Move runtime wrappers to ExecutionEngine.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D97463
2021-03-03 13:16:51 +01:00