llvm-project

Author	SHA1	Message	Date
Mehdi Amini	e4853be2f1	Apply clang-tidy fixes for performance-for-range-copy to MLIR (NFC)	2022-01-02 22:19:56 +00:00
Mehdi Amini	1fc096af1e	Apply clang-tidy fixes for performance-unnecessary-value-param to MLIR (NFC) Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D116250	2022-01-02 01:45:18 +00:00
Mehdi Amini	02b6fb218e	Fix clang-tidy issues in mlir/ (NFC) Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D115956	2021-12-20 20:25:01 +00:00
Diego Caballero	32fe1a8a25	[mlir][GPU] Extend GPU kernel outlining to generate DL specification This patch extends the GPU kernel outlining pass so that it can take in an optional data layout specification that will be attached to the GPU module operation generated. If the data layout specification is not provided the default data layout is used instead. Reviewed By: herhut, mehdi_amini Differential Revision: https://reviews.llvm.org/D115722	2021-12-16 11:35:53 +00:00
Krzysztof Drewniak	e1da62910e	[MLIR][GPU] Define gpu.printf op and its lowerings - Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments - Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP. - Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle. And: [MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support. In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D110448	2021-12-09 15:54:31 +00:00
Mehdi Amini	be0a7e9f27	Adjust "end namespace" comment in MLIR to match new agreed coding style See D115115 and this mailing list discussion: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html Differential Revision: https://reviews.llvm.org/D115309	2021-12-08 06:05:26 +00:00
Krzysztof Drewniak	a6f53afbcb	[MLIR][GPU] Link in device libraries during HSA compilation if needed To perform some operations, such as sin() or printf(), code compiled for AMD GPUs must be linked to a series of device libraries. This commit adds support for linking in these libraries. However, since these device libraries are delivered as LLVM bitcode, raising the possibility of version incompatibilities, this commit only links in libraries when the functions from those libraries are called by the code being compiled. This code also sets the math flags to their most conservative values, as MLIR doesn't have a `-ffast-math` equivalent. Depends on D114114 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114117	2021-11-19 22:29:37 +00:00
rdzhabarov	d729f4c38f	[mlir] Bug fix. Stream must outlive the pass manager. Reviewed By: Chia-hungDuan Differential Revision: https://reviews.llvm.org/D114277	2021-11-19 21:45:43 +00:00
Krzysztof Drewniak	20f79f8caa	[MLIR][GPU] Make the path to ROCm a runtime option Our current build assumes that the path to ROCm we find at build time will be the path at which ROCm is located when the built code is executed. This commit adds a --rocm-path option to SerializeToHsaco, and removes the HIP dependency that the SerializeToHsaco previously had. Depends on D114113 (though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107) Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114114	2021-11-19 20:51:54 +00:00
Krzysztof Drewniak	bd22554af0	[MLIR][GPU] Run generic LLVM optimizations when serializing (on AMD) - Adds hooks that allow SerializeTo* passes to arbitrarily transform the produced LLVM Module before it is passed to the code generation passes. - Uses these hooks within the SerializeToHsaco pass in order to run LLVM optimizations and to set the optimization level on the TargetMachine. - Adds an optLevel parameter to SerializeToHsaco Future work may include moving much of what's been added to SerializeToHsaco to SerializeToBlob, but that would require confirmation from the NVVM backend maintainers that it would be appropriate to do so. Depends on D114107 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114113	2021-11-19 19:21:24 +00:00
Krzysztof Drewniak	f849640a0c	[MLIR] Make the ROCM integration tests runnable - Move the #define s to the GPU Transform library from GPU Ops so that SerializeToHsaco is non-trivially compiled - Add required includes to SerializeToHsaco - Move MCSubtargetInfo creation to the correct point in the compilation process - Change mlir in ROCM tests to account for renamed/moved ops Differential Revision: https://reviews.llvm.org/D114184	2021-11-19 17:09:53 +00:00
Krzysztof Drewniak	fb1a06aa13	[MLIR][GPU] Add target arguments to SerializeToHsaco Compiling code for AMD GPUs requires knowledge of which chipset is being targeted, especially if the code uses chipset-specific intrinsics (which is the case in a downstream convolution generator). This commit adds `target`, `chipset` and `features` arguments to the SerializeToHsaco constructor to enable passing in this required information. It also amends the ROCm integration tests to pass in the target chipset, which is set to the chipset of the first GPU on the system executing the tests. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D114107	2021-11-18 16:28:44 +00:00
Vladislav Vinogradov	e41ebbecf9	[mlir][RFC] Refactor layout representation in MemRefType The change is based on the proposal from the following discussion: https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968 * Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute` (`AffineMapAttr` implements this interface). * Store layout as a single generic `MemRefLayoutAttr`. This change removes the affine map composition feature and related API. Actually, while the `MemRefType` itself supported it, almost none of the upstream can work with more than 1 affine map in `MemRefType`. The introduced `MemRefLayoutAttr` allows to re-implement this feature in a more stable way - via separate attribute class. Also the interface allows to use different layout representations rather than affine maps. For example, the described "stride + offset" form, which is currently supported in ASM parser only, can now be expressed as separate attribute. Reviewed By: ftynse, bondhugula Differential Revision: https://reviews.llvm.org/D111553	2021-10-19 12:31:15 +03:00
Mogball	a54f4eae0e	[MLIR] Replace std ops with arith dialect ops Precursor: https://reviews.llvm.org/D110200 Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of operations in the codebase and in tests. Reviewed By: rriddle, jpienaar Differential Revision: https://reviews.llvm.org/D110797	2021-10-13 03:07:03 +00:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Uday Bondhugula	08b63db8bb	[MLIR][GPU] Add GPU launch op support for dynamic shared memory Add support for dynamic shared memory for GPU launch ops: add an optional operand to gpu.launch and gpu.launch_func ops to specify the amount of "dynamic" shared memory to use. Update lowerings to connect this operand to the GPU runtime. Differential Revision: https://reviews.llvm.org/D110800	2021-10-01 16:46:07 +05:30
Mehdi Amini	b5e22e6d42	Migrate MLIR test passes to the new registration API Make sure they all define getArgument()/getDescription(). Depends On D104421 Differential Revision: https://reviews.llvm.org/D104426	2021-06-16 23:42:17 +00:00
Christian Sigg	0b21371e12	[mlir] Support pre-existing tokens in 'gpu-async-region' Allow gpu ops implementing the async interface to already be async when running the GpuAsyncRegionPass. That pass threads a 'current token' through a block with ops implementing the gpu async interface. After this change, existing async ops (returning a !gpu.async.token) set the current token. Existing synchronous `gpu.wait` ops reset the current token. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D103396	2021-06-10 08:43:45 +02:00
Chris Lattner	92a79dbe91	[Core] Add Twine support for StringAttr and Identifier. NFC. This is both more efficient and more ergonomic than going through an std::string, e.g. when using llvm::utostr and in string concat cases. Unfortunately we can't just overload ::get(). This causes an ambiguity because both twine and stringref implicitly convert from std::string. Differential Revision: https://reviews.llvm.org/D103754	2021-06-08 09:47:07 -07:00
Philipp Krones	c2f819af73	[MC] Refactor MCObjectFileInfo initialization and allow targets to create MCObjectFileInfo This makes it possible for targets to define their own MCObjectFileInfo. This MCObjectFileInfo is then used to determine things like section alignment. This is a follow up to D101462 and prepares for the RISCV backend defining the text section alignment depending on the enabled extensions. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101921	2021-05-23 14:15:23 -07:00
Nicolas Vasilache	8eb18a0f3e	[mlir][Standard] NFC - Drop remaining EDSC usage Drop the remaining EDSC subdirectories and update all uses. Differential Revision: https://reviews.llvm.org/D102911	2021-05-21 10:40:39 +00:00
Nicolas Vasilache	e3cf7c88c4	[mlir][MemRef] NFC - Drop MemRef EDSC usage Drop the MemRef dialect EDSC subdirectory and update all uses. Differential Revision: https://reviews.llvm.org/D102868	2021-05-20 20:13:58 +00:00
Nicolas Vasilache	84a880e1e2	[mlir][SCF] NFC - Drop SCF EDSC usage Drop the SCF dialect EDSC subdirectory and update all uses. Differential Revision: https://reviews.llvm.org/D102780	2021-05-19 15:52:14 +00:00
Christian Sigg	a0d019fc89	[mlir] Add support for ops with regions in 'gpu-async-region' rewriter. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D101757	2021-05-06 13:21:28 +02:00
Philipp Krones	632ebc4ab4	[MC] Untangle MCContext and MCObjectFileInfo This untangles the MCContext and the MCObjectFileInfo. There is a circular dependency between MCContext and MCObjectFileInfo. Currently this dependency also exists during construction: You can't construct a MOFI without a MCContext without constructing the MCContext with a dummy version of that MOFI first. This removes this dependency during construction. In a perfect world, MCObjectFileInfo wouldn't depend on MCContext at all, but only be stored in the MCContext, like other MC information. This is future work. This also shifts/adds more information to the MCContext making it more available to the different targets. Namely: - TargetTriple - ObjectFileType - SubtargetInfo Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D101462	2021-05-05 10:03:02 -07:00
River Riddle	4efb7754e0	[mlir][NFC] Add a using directive for llvm::SetVector Differential Revision: https://reviews.llvm.org/D100436	2021-04-15 16:09:34 -07:00
Chris Lattner	dc4e913be9	[PatternMatch] Big mechanical rename OwningRewritePatternList -> RewritePatternSet and insert -> add. NFC This doesn't change APIs, this just cleans up the many in-tree uses of these names to use the new preferred names. We'll keep the old names around for a couple weeks to help transitions. Differential Revision: https://reviews.llvm.org/D99127	2021-03-22 17:20:50 -07:00
Chris Lattner	3a506b31a3	Change OwningRewritePatternList to carry an MLIRContext with it. This updates the codebase to pass the context when creating an instance of OwningRewritePatternList, and starts removing extraneous MLIRContext parameters. There are many many more to be removed. Differential Revision: https://reviews.llvm.org/D99028	2021-03-21 10:06:31 -07:00
Christian Sigg	a825fb2c07	[mlir] Remove mlir-rocm-runner This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396. I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot though. Reviewed By: whchung Differential Revision: https://reviews.llvm.org/D98447	2021-03-19 00:24:10 -07:00
Vladislav Vinogradov	fee9054232	[mlir][ODS] Support specialized Attribute class for Enums Add a feature to `EnumAttr` definition to generate specialized Attribute class for the particular enumeration. This class will inherit `StringAttr` or `IntegerAttr` and will override `classof` and `getValue` methods. With this class the enumeration predicate can be checked with simple RTTI calls (`isa`, `dyn_cast`) and it will return the typed enumeration directly instead of raw string/integer. Based on the following discussion: https://llvm.discourse.group/t/rfc-add-enum-attribute-decorator-class/2252 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D97836	2021-03-17 16:44:24 +03:00
Julian Gross	e2310704d8	[MLIR] Create memref dialect and move dialect-specific ops from std. Create the memref dialect and move dialect-specific ops from std dialect to this dialect. Moved ops: AllocOp -> MemRef_AllocOp AllocaOp -> MemRef_AllocaOp AssumeAlignmentOp -> MemRef_AssumeAlignmentOp DeallocOp -> MemRef_DeallocOp DimOp -> MemRef_DimOp MemRefCastOp -> MemRef_CastOp MemRefReinterpretCastOp -> MemRef_ReinterpretCastOp GetGlobalMemRefOp -> MemRef_GetGlobalOp GlobalMemRefOp -> MemRef_GlobalOp LoadOp -> MemRef_LoadOp PrefetchOp -> MemRef_PrefetchOp ReshapeOp -> MemRef_ReshapeOp StoreOp -> MemRef_StoreOp SubViewOp -> MemRef_SubViewOp TransposeOp -> MemRef_TransposeOp TensorLoadOp -> MemRef_TensorLoadOp TensorStoreOp -> MemRef_TensorStoreOp TensorToMemRefOp -> MemRef_BufferCastOp ViewOp -> MemRef_ViewOp The roadmap to split the memref dialect from std is discussed here: https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667 Differential Revision: https://reviews.llvm.org/D98041	2021-03-15 11:14:09 +01:00
Mehdi Amini	e1364f1068	Replace use of OperationState with builder::create in GPU Kernel Outlining (NFC) OperationState is a low level API that is rarely indicated, the builder API convenient wrapper is preferred when possible.	2021-03-12 00:14:02 +00:00
Christian Sigg	2224221fb3	[mlir] Add NVVM to CUBIN conversion to mlir-opt If MLIR_CUDA_RUNNER_ENABLED, register a 'gpu-to-cubin' conversion pass to mlir-opt. The next step is to switch CUDA integration tests from mlir-cuda-runner to mlir-opt + mlir-cpu-runner and remove mlir-cuda-runner. Depends On D98279 Reviewed By: herhut, rriddle, mehdi_amini Differential Revision: https://reviews.llvm.org/D98203	2021-03-11 10:07:11 +01:00
Christian Sigg	6a291ed0f0	[mlir] Remove unnecessary copying of pass options I missed a comment in D98279 that you don't need to copy pass options. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D98366	2021-03-10 21:55:28 +01:00
Christian Sigg	4d295cf5b5	[mlir] Add base class for GpuKernelToBlobPass Instead of configuring kernel-to-cubin/rocdl lowering through callbacks, introduce a base class that target-specific passes can derive from. Put the base class in GPU/Transforms, according to the discussion in D98203. The mlir-cuda-runner will go away shortly, and the mlir-rocdl-runner as well at some point. I therefore kept the existing code path working and will remove it in a separate step. Depends On D98168 Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D98279	2021-03-10 12:14:43 +01:00
Christian Sigg	f03826f896	Pass GPU events instead of streams across async regions. Lower !gpu.async.tokens returned from async.execute regions to events instead of streams. Make !gpu.async.token returned from !async.execute single-use. This allows creating one event per use and destroying them without leaking or ref-counting. Technically we only need this for stream/event-based lowering. I kept the code separate from the rest of the gpu-async-region pass so that we can make this optional or move to a separate pass as needed. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D96965	2021-02-25 13:18:18 +01:00
Alexander Belyaev	a89035d750	Revert "[MLIR] Create memref dialect and move several dialect-specific ops from std." This commit introduced a cyclic dependency: Memref dialect depends on Standard because it used ConstantIndexOp. Std depends on the MemRef dialect in its EDSC/Intrinsics.h Working on a fix. This reverts commit 8aa6c3765b924d86f623d452777eb76b83bf2787.	2021-02-18 12:49:52 +01:00
Julian Gross	8aa6c3765b	[MLIR] Create memref dialect and move several dialect-specific ops from std. Create the memref dialect and move several dialect-specific ops without dependencies to other ops from std dialect to this dialect. Moved ops: AllocOp -> MemRef_AllocOp AllocaOp -> MemRef_AllocaOp DeallocOp -> MemRef_DeallocOp MemRefCastOp -> MemRef_CastOp GetGlobalMemRefOp -> MemRef_GetGlobalOp GlobalMemRefOp -> MemRef_GlobalOp PrefetchOp -> MemRef_PrefetchOp ReshapeOp -> MemRef_ReshapeOp StoreOp -> MemRef_StoreOp TransposeOp -> MemRef_TransposeOp ViewOp -> MemRef_ViewOp The roadmap to split the memref dialect from std is discussed here: https://llvm.discourse.group/t/rfc-split-the-memref-dialect-from-std/2667 Differential Revision: https://reviews.llvm.org/D96425	2021-02-18 11:29:39 +01:00
River Riddle	fe7c0d90b2	[mlir][IR] Remove the concept of `OperationProperties` These properties were useful for a few things before traits had a better integration story, but don't really carry their weight well these days. Most of these properties are already checked via traits in most of the code. It is better to align the system around traits, and improve the performance/cost of traits in general. Differential Revision: https://reviews.llvm.org/D96088	2021-02-09 12:00:15 -08:00
Tres Popp	c2c83e97c3	Revert "Revert "Reorder MLIRContext location in BuiltinAttributes.h"" This reverts commit 511dd4f4383b1c2873beac4dbea2df302f1f9d0c along with a couple fixes. Original message: Now the context is the first, rather than the last input. This better matches the rest of the infrastructure and makes it easier to move these types to being declaratively specified. Phabricator: https://reviews.llvm.org/D96111	2021-02-08 10:39:58 +01:00
Tres Popp	511dd4f438	Revert "Reorder MLIRContext location in BuiltinAttributes.h" This reverts commit 7827753f9810e846fb702f3e8dcff0bfb37344e1.	2021-02-08 09:32:42 +01:00
Tres Popp	7827753f98	Reorder MLIRContext location in BuiltinAttributes.h Now the context is the first, rather than the last input. This better matches the rest of the infrastructure and makes it easier to move these types to being declaratively specified. Differential Revision: https://reviews.llvm.org/D96111	2021-02-08 09:28:09 +01:00
River Riddle	e21adfa32d	[mlir] Mark LogicalResult as LLVM_NODISCARD This makes ignoring a result explicit by the user, and helps to prevent accidental errors with dropped results. Marking LogicalResult as no discard was always the intention from the beginning, but got lost along the way. Differential Revision: https://reviews.llvm.org/D95841	2021-02-04 15:10:10 -08:00
Christian Sigg	4c372a35cd	[mlir] Make GpuAsyncRegion pass depend on async dialect. Do not cache gpu.async.token type so that the pass can be created before the GPU dialect is registered. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D94397	2021-01-11 14:43:07 +01:00
River Riddle	fc5cf50e89	[mlir] Remove the MutableDictionaryAttr class This class used to serve a few useful purposes: * Allowed containing a null DictionaryAttr * Provided some simple mutable API around a DictionaryAttr The first of which is no longer an issue now that there is much better caching support for attributes in general, and a cache in the context for empty dictionaries. The second results in more trouble than it's worth because it mutates the internal dictionary on every action, leading to a potentially large number of dictionary copies. NamedAttrList is a much better alternative for the second use case, and should be modified as needed to better fit its usage as a DictionaryAttrBuilder. Differential Revision: https://reviews.llvm.org/D93442	2020-12-17 17:18:42 -08:00
River Riddle	1b97cdf885	[mlir][IR][NFC] Move context/location parameters of builtin Type::get methods to the start of the parameter list This better matches the rest of the infrastructure, is much simpler, and makes it easier to move these types to being declaratively specified. Differential Revision: https://reviews.llvm.org/D93432	2020-12-17 13:01:36 -08:00
Christian Sigg	a79b26db0e	[mlir] Fix for gpu-async-region pass. - the !gpu.async.token is the second result of 'gpu.alloc async', not the first. - async.execute construction takes operand types not yet wrapped in !async.value. - fix typo Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D93156	2020-12-16 19:08:10 +01:00
Christian Sigg	1ffc1aaa09	[mlir] Use mlir::OpState::operator->() to get to methods of mlir::Operation. This is a preparation step to remove those methods from OpState. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D93098	2020-12-13 09:58:16 +01:00
Christian Sigg	0bf4a82a5a	[mlir] Use mlir::OpState::operator->() to get to methods of mlir::Operation. This is a preparation step to remove the corresponding methods from OpState. Reviewed By: silvas, rriddle Differential Revision: https://reviews.llvm.org/D92878	2020-12-09 12:11:32 +01:00
Christian Sigg	d9adde5ae2	[mlir][gpu] Move gpu.wait ops from async.execute regions to its dependencies. This can prevent unnecessary host synchronization. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D90346	2020-12-03 08:52:28 +01:00

123 Commits