llvm-project

Author	SHA1	Message	Date
Thomas Raoux	7efdc117b1	[mlir][nvvm] Add lowering of gpu.printf to nvvm When converting to nvvm lowering gpu.printf to vprintf allows us to support printing when running on cuda. Differential Revision: https://reviews.llvm.org/D141049	2023-01-06 17:29:30 +00:00
Ivan Butygin	befd167050	[mlir][gpu] Fix cuda integration tests https://reviews.llvm.org/D138758 has added `uniform` flag to gpu reduce ops, update integration tests. Differential Revision: https://reviews.llvm.org/D140014	2022-12-14 14:01:00 +01:00
Navdeep Katel	3d35546cd1	Support `transpose` mode for `gpu.subgroup` WMMA ops Add support for loading, computing, and storing `gpu.subgroup` WMMA ops in transpose mode as well. Update the GPU to NVVM lowerings to support `transpose` mode and update integration tests as well. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D139021	2022-12-05 22:37:02 +05:30
rkayaith	13bd410962	[mlir][Pass] Include anchor op in -pass-pipeline In D134622 the printed form of a pass manager is changed to include the name of the op that the pass manager is anchored on. This updates the `-pass-pipeline` argument format to include the anchor op as well, so that the printed form of a pipeline can be directly passed to `-pass-pipeline`. In most cases this requires updating `-pass-pipeline='pipeline'` to `-pass-pipeline='builtin.module(pipeline)'`. This also fixes an outdated assert that prevented running a `PassManager` anchored on `'any'`. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D134900	2022-11-03 11:36:12 -04:00
rkayaith	1c0f541a4d	[mlir] Don't mix -pass-pipeline with other pass options These are test updates required for D135745, which disallows mixing `-pass-pipeline` and the individual `-pass-name` options. Reviewed By: rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D135746	2022-11-02 12:10:51 -04:00
Christian Sigg	0f2ec35691	[MLIR] Switch lit tests to %mlir_lib_dir and %mlir_src_dir replacements. The old replacements will be removed soon: - `%linalg_test_lib_dir` - `%cuda_wrapper_library_dir` - `%spirv_wrapper_library_dir` - `%vulkan_wrapper_library_dir` - `%mlir_runner_utils_dir` - `%mlir_integration_test_dir` Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D133270	2022-09-06 12:34:14 +02:00
Krzysztof Drewniak	c2fc8d9b95	[mlir][GPU] Allow bare pointer memrefs when calling GPU kernels In the ROCm runtime (and probably CUDA as well), all kernel arguments are aligned. Therefore, enable using bare pointers for memref arguments to kernels when these memrefs have static shape and a trivial layout. This is a substantial optimization to launching kernels that use memrefs with known, static sizes, since it causes the kernel launch packet to no longer include information already known to the kernel, which can enable packing the kernel launch arguments into launch packets instead of having to allocate an entire separate structure to hold unneeded memref information. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D130716	2022-08-02 20:58:34 +00:00
Krzysztof Drewniak	d6ef3d20b4	[mlir] Remove VectorToROCDL Between issues such as https://github.com/llvm/llvm-project/issues/56323, the fact that this lowering (unlike the code in amdgpu-to-rocdl) does not correctly set up bounds checks (and thus will cause page faults on reads that might need to be padded instead), and that fixing these problems would, essentially, involve replicating amdgpu-to-rocdl, remove --vector-to-rocdl for being broken. In addition, the lowering does not support many aspects of transfer_{read,write}, like supervectors, and may not work correctly in their presence. We (the MLIR-based convolution generator at AMD) do not use this conversion pass, nor are we aware of any other clients. Migration strategies: - Use VectorToLLVM - If buffer ops are particularly needed in your application, use amdgpu.raw_buffer_{load,store} A VectorToAMDGPU pass may be introduced in the future. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D129308	2022-07-12 15:21:22 +00:00
Stella Stamenova	d4555698f8	[mlir] Fix the names of exported functions The names of the functions that are supposed to be exported do not match the implementations. This is due in part to `cac7aabbd8`. This change makes the implementations and declarations match and adds a couple missing declarations. The new names follow the pattern of the existing `verify` functions where the prefix is maintained as `_mlir_ciface_` but the suffix follows the new naming convention. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D124891	2022-05-05 13:46:15 -07:00
River Riddle	87db8e4439	[mlir][NFC] Update textual references of `func` to `func.func` in Integration tests The special case parsing of `func` operations is being removed.	2022-04-20 22:17:29 -07:00
River Riddle	5a7b919409	[mlir][NFC] Rename StandardToLLVM to FuncToLLVM The current StandardToLLVM conversion patterns only really handle the Func dialect. The pass itself adds patterns for Arithmetic/CFToLLVM, but those should be/will be split out in a followup. This commit focuses solely on being an NFC rename. Aside from the directory change, the pattern and pass creation API have been renamed: * populateStdToLLVMFuncOpConversionPattern -> populateFuncToLLVMFuncOpConversionPattern * populateStdToLLVMConversionPatterns -> populateFuncToLLVMConversionPatterns * createLowerToLLVMPass -> createConvertFuncToLLVMPass Differential Revision: https://reviews.llvm.org/D120778	2022-03-07 11:25:23 -08:00
River Riddle	ace01605e0	[mlir] Split out a new ControlFlow dialect from Standard This dialect is intended to model lower level/branch based control-flow constructs. The initial set of operations are: AssertOp, BranchOp, CondBranchOp, SwitchOp; all split out from the current standard dialect. See https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061 Differential Revision: https://reviews.llvm.org/D118966	2022-02-06 14:51:16 -08:00
Mogball	aae5125550	[mlir] Replace StrEnumAttr -> EnumAttr in core dialects Removes uses of `StrEnumAttr` in core dialects Reviewed By: mehdi_amini, rriddle Differential Revision: https://reviews.llvm.org/D117514	2022-01-18 17:15:00 +00:00
Krzysztof Drewniak	e1da62910e	[MLIR][GPU] Define gpu.printf op and its lowerings - Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments - Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP. - Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle. And: [MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support. In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D110448	2021-12-09 15:54:31 +00:00
Krzysztof Drewniak	f849640a0c	[MLIR] Make the ROCM integration tests runnable - Move the #defines to the GPU Transform library from GPU Ops so that SerializeToHsaco is non-trivially compiled - Add required includes to SerializeToHsaco - Move MCSubtargetInfo creation to the correct point in the compilation process - Change mlir in ROCM tests to account for renamed/moved ops Differential Revision: https://reviews.llvm.org/D114184	2021-11-19 17:09:53 +00:00
Krzysztof Drewniak	fb1a06aa13	[MLIR][GPU] Add target arguments to SerializeToHsaco Compiling code for AMD GPUs requires knowledge of which chipset is being targeted, especially if the code uses chipset-specific intrinsics (which is the case in a downstream convolution generator). This commit adds `target`, `chipset` and `features` arguments to the SerializeToHsaco constructor to enable passing in this required information. It also amends the ROCm integration tests to pass in the target chipset, which is set to the chipset of the first GPU on the system executing the tests. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114107	2021-11-18 16:28:44 +00:00
thomasraoux	eacd6e1ebe	[mlir][GPUtoNVVM] Relax restriction on wmma op lowering Allow lowering of wmma ops with 64bits indexes. Change the default version of the test to use default layout. Differential Revision: https://reviews.llvm.org/D112479	2021-10-27 21:31:55 -07:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
Vladislav Vinogradov	505afd1e64	[mlir] Clean up boolean flags usage in LIT tests * Call `llvm_canonicalize_cmake_booleans` for all CMake options, which are propagated to `lit.local.cfg` files. * Use Python native boolean values instead of strings for such options. This fixes the cases, when CMake variables have values other than `ON` (like `TRUE`). This might happen due to IDE integration or due to CMake preset usage. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D110073	2021-10-12 11:44:48 +03:00
thomasraoux	750799b7bc	[mlir][NFC] Don't outline kernel in MMA integration tests This matches better how other gpu integration tests are done. Differential Revision: https://reviews.llvm.org/D103099	2021-05-27 09:43:54 -07:00
thomasraoux	b44007bec2	[mlir][gpu] Relax restriction on MMA store op to allow chain of mma ops. In order to allow large matmul operations using the MMA ops, we need to chain operations. This is not possible unless the "DOp" and "COp" types have matching layouts, so remove the "DOp" layout and force the accumulator and result types to match. Added a test for the case where the MMA value is accumulated. Differential Revision: https://reviews.llvm.org/D103023	2021-05-27 09:13:51 -07:00
thomasraoux	dae9038611	[mlir] Lower sm version for TensorCore integration tests Those tests only require sm70; lowering the version allows running those integration tests on more hardware. Differential Revision: https://reviews.llvm.org/D103049	2021-05-24 14:45:24 -07:00
Navdeep Kumar	e552fa28da	[MLIR][GPU] Add CUDA Tensor core WMMA test Add a test case to test the complete execution of WMMA ops on a Nvidia GPU with tensor cores. These tests are enabled under MLIR_RUN_CUDA_TENSOR_CORE_TESTS. Reviewed By: bondhugula Differential Revision: https://reviews.llvm.org/D95334	2021-05-22 16:19:36 +05:30
Eugene Zhulenev	8a316b00d6	[mlir] Convert async dialect passes from function passes to op agnostic passes Differential Revision: https://reviews.llvm.org/D100401	2021-04-13 11:46:00 -07:00
thomasraoux	3587728ed5	[mlir] Fix cuda integration test failure	2021-03-19 10:33:55 -07:00
Christian Sigg	a825fb2c07	[mlir] Remove mlir-rocm-runner This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396. I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though. Reviewed By: whchung Differential Revision: https://reviews.llvm.org/D98447	2021-03-19 00:24:10 -07:00
thomasraoux	1a572f4509	[mlir] Add vector op support to cuda-runner including vector.print Differential Revision: https://reviews.llvm.org/D97346	2021-03-18 13:03:08 -07:00
Alex Zinenko	7aa6f3aa0c	[mlir] fix integration tests post e2310704d890ad252aeb1ca28b4b84d29514b1d1 The commit in question moved some ops across dialects but did not update some of the target-specific integration tests that use these ops, presumably because the corresponding target hardware was not available. Fix these tests.	2021-03-15 14:41:27 +01:00
Christian Sigg	1ef544d4a9	[mlir] Remove mlir-cuda-runner Change CUDA integration tests to use mlir-opt + mlir-cpu-runner instead. Depends On D98203 Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D98396	2021-03-12 14:06:43 +01:00
Christian Sigg	2224221fb3	[mlir] Add NVVM to CUBIN conversion to mlir-opt If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt. The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner. Depends On D98279 Reviewed By: herhut, rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D98203	2021-03-11 10:07:11 +01:00
Christian Sigg	9d7be77bf9	[mlir] Move cuda tests Move test inputs to test/Integration directory. Move runtime wrappers to ExecutionEngine. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D97463	2021-03-03 13:16:51 +01:00

31 Commits