llvm-project

Author	SHA1	Message	Date
Jakub Kuderski	c0345b4648	[mlir][gpu] Add subgroup_reduce to shuffle lowering (#76530) This supports both the scalar and the vector multi-reduction cases.	2024-01-02 16:14:22 -05:00
Jakub Kuderski	2af186f9bd	[mlir][gpu] Add patterns to break down subgroup reduce (#76271) The new patterns break down subgroup reduce ops with vector values into a sequence of subgroup reductions that fit the native shuffle size. The maximum/native shuffle size is parametrized. The overall goal is to be able to perform multi-element reductions with a sequence of `gpu.shuffle` ops.	2023-12-28 14:39:46 -05:00
Guray Ozen	5caae72d1a	[mlir][gpu] Productize `test-lower-to-nvvm` as `gpu-lower-to-nvvm` (#75775) The `test-lower-to-nvvm` pipeline serves as the common and proper pipeline for nvvm+host compilation, and it's used across our CUDA integration tests. This PR renames the `test-lower-to-nvvm` pipeline to `gpu-lower-to-nvvm` and moves it within `InitAllPasses.h`. The aim is to call it from Python and to have a standardized compilation process for nvvm.	2023-12-19 08:40:46 +01:00
Guray Ozen	f8058a37ae	[mlir] Fix nvvm integration tests build error (#70113) #69934 broke integration tests that rely on the kernel-bare-ptr-calling-convention and host-bare-ptr-calling-convention flags. This PR brings these flags. Also the kernel-index-bitwidth flag is removed, as kernel pointer size depends on the host. Separating host (64-bit) and kernel (32-bit) is not viable.	2023-10-24 22:32:46 +02:00
Guray Ozen	8875f78b5d	[mlir] Change default NVVM compilation to `fatbin` from `bin` (#70052) Change the NVVM assembly to `fatbin` as it is executable for multiple architectures. Using `bin` caused test errors at runtime in the test systems.	2023-10-24 16:56:15 +02:00
Guray Ozen	ba8ae9866b	[MLIR] Fixes NVGPU Integration Test Passes Ordering (#69934) The `test-lower-to-nvvm` pipeline, designed for the NVGPU dialect within GPU kernels, plays an important role in compiling integration tests. This PR restructures the passes and cleans up the code. It also fixes the order of pipelines. This fix is needed for #69913	2023-10-24 15:56:47 +02:00
Guray Ozen	afe400620f	[MLIR] Use `test-lower-to-nvvm` for sm_90 Integration Tests on GitHub (#68184) This PR enables `test-lower-to-nvvm` pass pipeline for the integration tests for NVIDIA sm_90 architecture. This PR adjusts `test-lower-to-nvvm` pass in two ways: 1) Calls `createConvertNVGPUToNVVMPass` before the outlining process. This particular pass is responsible for generating both device and host code. On the host, it calls the CUDA driver to build the TMA descriptor (`cuTensorMap`). 2) Integrates the `createConvertNVVMToLLVMPass` to generate PTXs for NVVM Ops.	2023-10-04 09:50:48 +02:00
Guray Ozen	9d54ae862a	[mlir] Add `opt-level` to `test-lower-to-nvvm` Pipeline (#68183) This PR adds the `opt-level` parameter to control code optimization for NVIDIA GPU targets in the `test-lower-to-nvvm` pipeline.	2023-10-04 09:25:53 +02:00
Fabian Mora	5093413a50	[mlir][gpu][NVPTX] Enable NVIDIA GPU JIT compilation path (#66220) This patch adds an NVPTX compilation path that enables JIT compilation on NVIDIA targets. The following modifications were performed: 1. Adding a format field to the GPU object attribute, allowing the translation attribute to use the correct runtime function to load the module. Likewise, a dictionary attribute was added to add any possible extra options. 2. Adding the `createObject` method to `GPUTargetAttrInterface`; this method returns a GPU object from a binary string. 3. Adding the function `mgpuModuleLoadJIT`, which is only available for NVIDIA GPUs, as there is no equivalent for AMD. 4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify the format to use during testing.	2023-09-14 18:00:27 -04:00
Krzysztof Drewniak	df852599f3	[mlir] Split up VectorToLLVM pass Currently, the VectorToLLVM patterns are built into a library along with the corresponding pass, which also pulls in all the platform-specific vector dialects (like AMXDialect) to apply all the vector to LLVM conversions. This causes dependency bloat when writing libraries - for example the GPU to LLVM passes, which use the vector to LLVM patterns, don't need the X86Vector dialect to be present at all. This commit partitions the library into VectorToLLVM and VectorToLLVMPass, where the latter pulls in all the other vector transformations. Reviewed By: nicolasvasilache, mehdi_amini Differential Revision: https://reviews.llvm.org/D158287	2023-09-13 16:09:56 +00:00
Fabian Mora	1828deb752	[mlir][gpu] Deprecate gpu::Serialization* passes. (#65857) Deprecate the `gpu-to-cubin` & `gpu-to-hsaco` passes in favor of the `TargetAttr` workflow. This patch removes remaining upstream uses of the aforementioned passes, including the option to use them in `mlir-opt`. A future patch will remove these passes entirely. The passes can be re-enabled in `mlir-opt` by adding the CMake flag: `-DMLIR_ENABLE_DEPRECATED_GPU_SERIALIZATION=1`.	2023-09-11 16:32:15 -04:00
Fabian Mora	119c489cc1	Reland [mlir][test][gpu] Migrate CUDA tests to the TargetAttr compilation workflow (llvm#65768) The revert happened due to a build bot failure that threw 'CUDA_ERROR_UNSUPPORTED_PTX_VERSION'. The failure's root cause was a pass using "+ptx76" for compilation and an old CUDA driver on the bot. This commit relands the patch with "+ptx60". Original Gh PR: #65768 Original commit message: Migrate tests referencing `gpu-to-cubin` to the new compilation workflow using `TargetAttrs`. The `test-lower-to-nvvm` pass pipeline was modified to use the new compilation workflow to simplify the introduction of future tests. The `createLowerGpuOpsToNVVMOpsPass` function was removed, as it didn't allow for passing all options available in the `ConvertGpuOpsToNVVMOp` pass.	2023-09-09 12:45:21 +00:00
Fabian Mora	2c596ea951	Revert "[mlir][test][gpu] Migrate CUDA tests to the TargetAttr compilation workflow (#65768)" (#65848) This reverts commit d21b67293be15f8a89378e4785d70cc037866406.	2023-09-09 07:14:19 -04:00
Fabian Mora	d21b67293b	[mlir][test][gpu] Migrate CUDA tests to the TargetAttr compilation workflow (#65768) Migrate tests referencing `gpu-to-cubin` to the new compilation workflow using `TargetAttrs`. The `test-lower-to-nvvm` pass pipeline was modified to use the new compilation workflow to simplify the introduction of future tests. The `createLowerGpuOpsToNVVMOpsPass` function was removed, as it didn't allow for passing all options available in the `ConvertGpuOpsToNVVMOp` pass.	2023-09-09 07:03:38 -04:00
Mehdi Amini	c8bcc48af6	Cleanup CMake dependencies from unnecessary libraries in mlir/test/lib/Dialect/GPU/CMakeLists.txt (NFC)	2023-07-24 17:58:25 -07:00
Mehdi Amini	07102909c2	Revert "[mlir][gpu][transforms] Only depend on ExecutionEngine if MLIR_ENABLE_CUDA_RUNNER is true" This reverts commit 68b7d3fffd7e8ebc40fdcb0acdcf2e88a93ea5c3. The mlir-nvidia bot is broken.	2023-07-24 17:21:37 -07:00
Nicolas Vasilache	68b7d3fffd	[mlir][gpu][transforms] Only depend on ExecutionEngine if MLIR_ENABLE_CUDA_RUNNER is true This fixes a compilation bug where we would try to depend on ExecutionEngine but it wasn't actually built.	2023-07-25 01:26:03 +02:00
Mehdi Amini	5e8a1164f2	Revert "[mlir][gpu] Fallback to JIT compilation" "[mlir][gpu] Increase default SM version from 35 to 50" and "[mlir][gpu] Improving Cubin Serialization with ptxas Compiler" This reverts commit 2e0e00ed841951e358a85a871647be9b3a622f51 and reverts commit a6eb40692c795a9cc29266779ceca2e304141114 and reverts commit 585cbe3f639783bf0307b47504acbd205f135310. 15 tests are broken on the mlir-nvidia buildbot: 'cuModuleLoadData(&module, data)' failed with 'CUDA_ERROR_INVALID_SOURCE' 'cuModuleGetFunction(&function, module, name)' failed with 'CUDA_ERROR_INVALID_HANDLE' 'cuLaunchKernel(function, gridX, gridY, gridZ, blockX, blockY, blockZ, smem, stream, params, extra)' failed with 'CUDA_ERROR_INVALID_HANDLE' 'cuModuleUnload(module)' failed with 'CUDA_ERROR_INVALID_HANDLE'	2023-07-24 10:23:15 -07:00
Guray Ozen	a6eb40692c	[mlir][gpu] Increase default SM version from 35 to 50 The current default SM version is 35, which was deprecated long ago. D155563 introduced ptxas compilation, and using sm_35 causes failures on the buildbot. This change increases the default SM version to 50. Differential Revision: https://reviews.llvm.org/D156098	2023-07-24 15:11:30 +02:00
Guray Ozen	585cbe3f63	[mlir][gpu] Improving Cubin Serialization with ptxas Compiler This work improves how we compile the generated PTX code using the `ptxas` compiler. Currently, we rely on the driver's JIT API to compile the PTX code. However, this approach has some limitations. It doesn't always produce the same binary output as the ptxas compiler, leading to potential inconsistencies in the generated cubin files. This work introduces a significant improvement by directly utilizing the ptxas compiler for PTX compilation. By doing so, we can achieve more consistent and reliable results in generating cubin files. Key Benefits: - Using the ptxas compiler directly ensures that the cubin files generated during the build process remain consistent with CUDA compilation using `nvcc` or `clang`. - Another advantage of this work is that it allows developers to experiment with different ptxas compilers without the need to change the compiler. Performance varies among ptxas compiler versions, so one can easily try different ptxas compilers. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155563	2023-07-24 12:29:53 +02:00
Nicolas Vasilache	582e1d58bd	[mlir][test] Fix linking error post test-lower-to-nvvm This fixes builds for 7e78ecfe10ea9071234de8d385b87d338d280266 (both cmake and bazel) as well as trim unnecessary dependencies. This is achieved by moving the functionality to test/lib/GPU which is a more natural landing pad.	2023-07-17 18:43:32 +02:00
Nicolas Vasilache	d661b4b575	[mlir][test] Fix linking error post test-lower-to-nvvm	2023-07-17 18:43:32 +02:00
Fabian Mora	dd16cd731d	[mlir][gpu] Add a pattern for transforming gpu.global_id to thread + blockId * blockDim This patch implements a rewrite pattern for transforming gpu.global_id x to gpu.thread_id + gpu.block_id * gpu.block_dim. Reviewed By: makslevental Differential Revision: https://reviews.llvm.org/D148978	2023-05-25 20:24:38 +00:00
Matthias Springer	61223c49dd	[mlir][GPU] Rename MLIRGPUOps CMake target to MLIRGPUDialect This is for consistency with other dialects. Differential Revision: https://reviews.llvm.org/D150659	2023-05-16 14:25:08 +02:00
Matthias Springer	4c48f016ef	[mlir][Affine][NFC] Wrap dialect in "affine" namespace This cleanup aligns the affine dialect with all the other dialects. Differential Revision: https://reviews.llvm.org/D148687	2023-04-20 11:19:21 +09:00
Jakub Kuderski	abc362a107	[mlir][arith] Change dialect name from Arithmetic to Arith Suggested by @lattner in https://discourse.llvm.org/t/rfc-define-precise-arith-semantics/65507/22. Tested with: `ninja check-mlir check-mlir-integration check-mlir-mlir-spirv-cpu-runner check-mlir-mlir-vulkan-runner check-mlir-examples` and `bazel build --config=generic_clang @llvm-project//mlir:all`. Reviewed By: lattner, Mogball, rriddle, jpienaar, mehdi_amini Differential Revision: https://reviews.llvm.org/D134762	2022-09-29 11:23:28 -04:00
Alex Zinenko	8b68da2c7d	[mlir] move SCF headers to SCF/{IR,Transforms} respectively This aligns the SCF dialect file layout with the majority of the dialects. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D128049	2022-06-20 10:18:01 +02:00
Mogball	e16d13322b	[mlir] (NFC) Clean up bazel and CMake target names All dialect targets in bazel have been named Dialect and all dialect targets in CMake have been named MLIRDialect.	2022-06-13 16:24:15 +00:00
Mogball	d7ef488bb6	[mlir][gpu] Move GPU headers into IR/ and Transforms/ Depends on D127350 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127352	2022-06-09 22:49:03 +00:00
Christian Sigg	bcf3d52486	[MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass. Reviewed By: bondhugula, herhut Differential Revision: https://reviews.llvm.org/D126199	2022-05-30 09:20:48 +02:00
River Riddle	5e50dd048e	[mlir] Rework the implementation of TypeID This commit restructures how TypeID is implemented to ideally avoid the current problems related to shared libraries. This is done by changing the "implicit" fallback path to use the name of the type, instead of using a static template variable (which breaks shared libraries). The major downside to this is that it adds some additional initialization costs for the implicit path. Given the use of type names for uniqueness in the fallback, we also no longer allow types defined in anonymous namespaces to have an implicit TypeID. To simplify defining an ID for these classes, a new `MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID` macro was added to allow for explicitly defining a TypeID directly on an internal class. To help identify when types are using the fallback, `-debug-only=typeid` can be used to log which types are using implicit ids. This change generally only requires changes to the test passes, which are all defined in anonymous namespaces, and thus can't use the fallback any longer. Differential Revision: https://reviews.llvm.org/D122775	2022-04-04 13:52:26 -07:00
River Riddle	87d6bf3728	[mlir][test] Generalize a bunch of FuncOp based passes to run on any operation/interfaces A lot of test passes are currently anchored on FuncOp, but this dependency is generally just historical. A majority of these test passes can run on any operation, or can operate on a specific interface (FunctionOpInterface/SymbolOpInterface). This allows for greatly reducing the API dependency on FuncOp, which is slated to be moved out of the Builtin dialect. Differential Revision: https://reviews.llvm.org/D121191	2022-03-08 12:25:32 -08:00
River Riddle	1f971e23f0	[mlir] Trim a huge number of unnecessary dependencies on the Func dialect The Func dialect has a large number of legacy dependencies carried over from the old Standard dialect, which was pervasive and contained a large number of varied operations. With the split of the standard dialect and its demise, a lot of lingering dead dependencies have survived to the Func dialect. This commit removes a large majority of them, greatly reducing the dependence surface area of the Func dialect.	2022-03-01 12:10:04 -08:00
River Riddle	23aa5a7446	[mlir] Rename the Standard dialect to the Func dialect The last remaining operations in the standard dialect all revolve around FuncOp/function related constructs. This patch simply handles the initial renaming (which by itself is already huge), but there are a large number of cleanups unlocked/necessary afterwards: * Removing a bunch of unnecessary dependencies on Func * Cleaning up the From/ToStandard conversion passes * Preparing for the move of FuncOp to the Func dialect See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061 Differential Revision: https://reviews.llvm.org/D120624	2022-03-01 12:10:04 -08:00
Mehdi Amini	be0a7e9f27	Adjust "end namespace" comment in MLIR to match new agreed coding style See D115115 and this mailing list discussion: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html Differential Revision: https://reviews.llvm.org/D115309	2021-12-08 06:05:26 +00:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
Uday Bondhugula	4acf3807e3	[MLIR] Split out GPU ops library from Transforms Split out GPU ops library from GPU transforms. This allows libraries to depend on GPU Ops without needing/building its transforms. Differential Revision: https://reviews.llvm.org/D105472	2021-07-07 11:26:49 +05:30
Mehdi Amini	b5e22e6d42	Migrate MLIR test passes to the new registration API Make sure they all define getArgument()/getDescription(). Depends On D104421 Differential Revision: https://reviews.llvm.org/D104426	2021-06-16 23:42:17 +00:00
River Riddle	3fef2d26a3	[mlir][NFC] Move passes in test/lib/Transforms/ to a directory that mirrors what they test test/lib/Transforms/ has bitrot and become somewhat of a dumping grounds for testing pretty much any part of the project. This revision cleans this up, and moves the files within to a directory that reflects what is actually being tested. Differential Revision: https://reviews.llvm.org/D102456	2021-05-14 10:28:11 -07:00

39 Commits