llvm-project

Author	SHA1	Message	Date
Matthias Springer	b23c8225e8	[mlir][NFC] Clean up builder usage around constants/non-foldable ops * Use `create` instead of `createOrFold` for constant ops. Constants cannot be folded any further. * Use `create` instead of `createOrFold` for ops that do not have a folder. * Use C++ op builders that take an `int` instead of creating a `ConstantIndexOp`. * Create `tensor::DimOp` instead of `linalg::createOrFoldDimOp` when it is certain that the operand is a tensor. Differential Revision: https://reviews.llvm.org/D154196	2023-06-30 13:56:42 +02:00
Krzysztof Drewniak	73eecc9ca4	[mlir] Convert 8-bit float types to i8 Whereas LLVM currently doesn't have any types for 8-bit floats, and whereas existing 8-bit float APIs (for instance, the AMDGCN intrinsics) take such floats as (packed) bytes, translate the MLIR 8-bit float types to i8 during LLVM lowering. In order to not special-case arith.constant for bitcasting constants to their integer form, amend the MLIR to LLVM translator to turn 8-bit float constants into i8 constants with the same value (by use of APFloat's bitcast method). This change can be reverted once LLVM has 8-bit float types. Reviewed By: gysit Differential Revision: https://reviews.llvm.org/D153160	2023-06-26 17:42:00 +00:00
Giuseppe Rossini	20c66a0c66	[AMDGPU] Add basic support for gfx11xx This patch fixes a minor issue in AMDGPUToROCDL to add gfx11 support in MLIR Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D152450	2023-06-12 17:06:36 +00:00
Tres Popp	5550c82189	[mlir] Move casting calls from methods to function calls The MLIR classes Type/Attribute/Operation/Op/Value support cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast functionality in addition to defining methods with the same name. This change begins the migration of uses of the method to the corresponding function call as has been decided as more consistent. Note that there still exist classes that only define methods directly, such as AffineExpr, and this does not include work currently to support a functional cast/isa call. Caveats include: - This clang-tidy script probably has more problems. - This only touches C++ code, so nothing that is being generated. Context: - https://mlir.llvm.org/deprecation/ at "Use the free function variants for dyn_cast/cast/isa/…" - Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443 Implementation: This first patch was created with the following steps. The intention is to only do automated changes at first, so I waste less time if it's reverted, and so the first mass change is more clear as an example to other teams that will need to follow similar steps. Steps are described per line, as comments are removed by git: 0. Retrieve the change from the following to build clang-tidy with an additional check: https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check 1. Build clang-tidy 2. Run clang-tidy over your entire codebase while disabling all checks and enabling the one relevant one. Run on all header files also. 3. Delete .inc files that were also modified, so the next build rebuilds them to a pure state. 4. 
Some changes have been deleted for the following reasons: - Some files had a variable also named cast - Some files had not included a header file that defines the cast functions - Some files are definitions of the classes that have the casting methods, so the code still refers to the method instead of the function without adding a prefix or removing the method declaration at the same time. ``` ninja -C $BUILD_DIR clang-tidy run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\ -header-filter=mlir/ mlir/* -fix rm -rf $BUILD_DIR/tools/mlir/**/*.inc git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\ mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\ mlir/lib/**/IR/\ mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\ mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\ mlir/test/lib/Dialect/Test/TestTypes.cpp\ mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\ mlir/test/lib/Dialect/Test/TestAttributes.cpp\ mlir/unittests/TableGen/EnumsGenTest.cpp\ mlir/test/python/lib/PythonTestCAPI.cpp\ mlir/include/mlir/IR/ ``` Differential Revision: https://reviews.llvm.org/D150123	2023-05-12 11:21:25 +02:00
Krzysztof Drewniak	cc4703745f	[mlir][AMDGPU] Add emulation pass for atomics on AMDGPU targets Not all AMDGPU targets support all atomic operations. For example, there are no atomic floating-point adds on the gfx10 series. Add a pass to emulate these operations using a compare-and-swap loop, by analogy to the generic atomicrmw rewrite in MemrefToLLVM. This pass is named generally, as in the future we may have a memref-to-amdgpu that translates constructs like atomicrmw fmax (which doesn't generally exist in LLVM) to the relevant intrinsics, which may themselves require emulation. Since the AMDGPU dialect now has a pass that operates on it, the dialect's directory structure is reorganized to match other similarly complex dialects. The pass should be run before amdgpu-to-rocdl if desired. This commit also adds f64 support to atomic_fmax. Depends on D148722 Reviewed By: nirvedhmeshram Differential Revision: https://reviews.llvm.org/D148724	2023-05-03 21:18:48 +00:00
Krzysztof Drewniak	98c1104d41	[mlir][AMDGPU] Define atomic compare-and-swap for raw buffers This commit adds the buffer cmpswap intrinsic to the ROCDL dialect and its corresponding AMDGPU dialect wrappers. Reviewed By: nirvedhmeshram Differential Revision: https://reviews.llvm.org/D148722	2023-05-03 21:11:20 +00:00
giuseros	82ac02e4a8	Add scalar support for amdgpu.raw_buffer_{load,store} Introduce the possibility to load/store scalars via amdgpu.raw_buffer_{load,store} Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D146413	2023-03-20 20:19:20 +00:00
Jakub Kuderski	8c258fda1f	[ADT][mlir][NFCI] Do not use non-const lvalue-refs with enumerate Replace references to enumerate results with either result_pairs (reference wrapper type) or structured bindings. I did not use structured bindings everywhere as it wasn't clear to me it would improve readability. This is in preparation to the switch to zip semantics which won't support non-const lvalue reference to elements: https://reviews.llvm.org/D144503. I chose to use values instead of const lvalue-refs because MLIR is biased towards avoiding `const` local variables. This won't degrade performance because currently `result_pair` is cheap to copy (size_t + iterator), and in the future, the enumerator iterator dereference will return temporaries anyway. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D146006	2023-03-15 10:43:56 -04:00
Manupa Karunaratne	584f64365a	[MLIR][AMDGPU][ROCDL] Adding raw.buffer.atomic.fmax/smax/umin support This commit adds support for atomic fmax/smax/umin support for AMDGPU dialect and the dependent dialects to allow such a lowering. Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D144097	2023-02-28 16:58:35 +00:00
Krzysztof Drewniak	22f0c7a451	[mlir][AMDGPU] 8-bit float usage in the AMDGPU dialect Upcoming AMD hardware will include functions that accept 8-bit floats. Specifically, there are MFMA instructions that accept 8-bit floats, either using the same or mixed formats. This patch adds MLIR wrappers for these intrinsics and explicitly adds support for 8-bit floats in the gpu-to-rocdl conversion by way of amdgpu-to-rocdl. Since LLVM does not have f8 types, when targeting LLVM for compilation on an AMD GPU, both f8 types used on AMD hardware (f8E5M2FNUZ and f8E4M3FNUZ) are rewritten to i8. This patch also relaxes the restriction that the types of both source operands to an amdgpu.mfma instruction match exactly, as this is not necessarily required for the bf8 (f8E5M2FNUZ) and fp8 (f8E4M3FNUZ) instructions. In addition, since the buffer_{load,store} operations maintain a whitelist of permitted types, we add the relevant f8 types to that list. This patch does not add any implementations of arithmetic operations for f8 types. Reviewed By: jakeh-gc Differential Revision: https://reviews.llvm.org/D143956	2023-02-15 16:46:08 +00:00
Kazu Hirata	0a81ace004	[mlir] Use std::optional instead of llvm::Optional (NFC) This patch replaces (llvm::\|)Optional< with std::optional<. I'll post a separate patch to remove #include "llvm/ADT/Optional.h". This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-14 01:25:58 -08:00
Kazu Hirata	a1fe1f5f77	[mlir] Add #include <optional> (NFC) This patch adds #include <optional> to those files containing llvm::Optional<...> or Optional<...>. I'll post a separate patch to actually replace llvm::Optional with std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2023-01-13 21:05:06 -08:00
Kazu Hirata	1a36588ec6	[mlir] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-03 18:50:27 -08:00
Aliia Khasanova	399638f98c	Merge kDynamicSize and kDynamicSentinel into one constant. resolve conflicts Differential Revision: https://reviews.llvm.org/D138282	2022-11-21 13:01:26 +00:00
Kazu Hirata	430cbd5401	[mlir] Fix a warning This patch fixes: mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp:128:10: warning: variable ‘llvm2xI32’ set but not used [-Wunused-but-set-variable] The last use of llvm2xI32 was removed on July 6, 2022 in commit 63295622491a31eaccb6c534ba5caa836beb843f.	2022-10-23 10:11:20 -07:00
Krzysztof Drewniak	c55b41d519	[mlir][AMDGPU] Define amdgpu.mfma operator The amdgpu.mfma operator is a wrapper around the Matrix Fused Multiply Add (MFMA) instructions on some AMD GPUs (the CDNA-based MI-* cards). This interface allows for selecting the operation to be performed by specifying the dimensions of the multiplication to be performed and any additional attributes (such as whether to use reduced-precision floating-point math) that are needed to select the relevant mfma instruction and set its parameters. Reviewed By: ThomasRaoux, nirvedhmeshram Differential Revision: https://reviews.llvm.org/D132956	2022-08-31 21:06:12 +00:00
Michele Scuttari	67d0d7ac0a	[MLIR] Update pass declarations to new autogenerated files The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838	2022-08-31 12:28:45 +02:00
Michele Scuttari	039b969b32	Revert "[MLIR] Update pass declarations to new autogenerated files" This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.	2022-08-30 22:21:55 +02:00
Michele Scuttari	2be8af8f0e	[MLIR] Update pass declarations to new autogenerated files The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure. Reviewed By: mehdi_amini, rriddle Differential Review: https://reviews.llvm.org/D132838	2022-08-30 21:56:31 +02:00
Jeff Niu	0af643f3ce	[mlir][LLVMIR] (NFC) Add convenience builders for ConstantOp And clean up some of the user code	2022-08-09 15:34:36 -04:00
Krzysztof Drewniak	6329562249	[mlir][AMDGPU] Explicitly truncate memory addresses in buffer ops As a precaution, truncate memory addresses passed to kernels to 48 bits, since bits 48-63 of the buffer descriptor are used for the stride field and, on gfx10, to control swizzling. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D131016	2022-08-04 19:42:33 +00:00
Krzysztof Drewniak	bc61cc9a2d	[mlir][AMDGPU] Add lds_barrier op The lds_barrier op allows workgroups to wait at a barrier for operations to/from their local data store (LDS) to complete without incurring the performance penalties of a full memory fence. Reviewed By: nirvedhmeshram Differential Revision: https://reviews.llvm.org/D129522	2022-07-14 20:45:26 +00:00
Krzysztof Drewniak	db590549a9	[mlir][AMDGPU] Use the correct values for OOB_SELECT on gfx10 Differential Revision: https://reviews.llvm.org/D129320	2022-07-07 21:23:38 +00:00
Krzysztof Drewniak	cab44c515c	[mlir][AMDGPU] Add --chipset option to AMDGPUToROCDL Because the buffer descriptor structure (the V#) has no backwards-compatibility guarantees, and since said guarantees have been violated in practice (see https://github.com/llvm/llvm-project/issues/56323 ), and since the `targetIsRDNA` attribute isn't something that higher-level clients can set in general, make the lowering of the amdgpu dialect to rocdl take a --chipset option. Note that this option is a string because adding a parser for the Chipset struct to llvm::cl wasn't working out. Reviewed By: herhut Differential Revision: https://reviews.llvm.org/D129228	2022-07-07 14:58:13 +00:00
Kazu Hirata	037f09959a	[mlir] Don't use Optional::hasValue (NFC)	2022-06-20 11:22:37 -07:00
Jacques Pienaar	8df54a6a03	[mlir] Update accessors to prefixed form (NFC) Follow up from flipping dialects to both, flip accessor used to prefixed variant ahead to flipping from _Both to _Prefixed. This just flips to the accessors introduced in the preceding change which are just prefixed forms of the existing accessor changed from. Mechanical change using helper script https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.	2022-06-18 17:53:22 -07:00
Krzysztof Drewniak	f1f05a91ca	[MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics By analogy with the NVGPU dialect, introduce an AMDGPU dialect for AMD-specific intrinsic wrappers. The dialect initially includes wrappers around the raw buffer intrinsics. On AMD GPUs, a memref can be converted to a "buffer descriptor" that allows more precise control of memory access, such as by allowing for out of bounds loads/stores to be replaced by 0/ignored without adding additional conditional logic, which is important for performance. The repository currently contains a limited conversion from transfer_read/transfer_write to Mubuf intrinsics, which are an older, deprecated intrinsic for the same functionality. The new amdgpu.raw_buffer_* ops allow these operations to be used explicitly and for including metadata such as whether the target chipset is an RDNA chip or not (which impacts the interpretation of some bits in the buffer descriptor), while still maintaining an MLIR-like interface. (This change also exposes the floating-point atomic add intrinsic.) Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D122765	2022-05-10 14:59:58 +00:00
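
Note on cc4703745f: the message describes emulating unsupported atomics with a compare-and-swap loop. The following is a minimal host-side sketch of that general idea in plain C++, not the pass itself and not an MLIR or ROCDL API. The helper name `emulatedAtomicFAdd` is hypothetical, and the sketch assumes a 32-bit float stored in a `std::atomic<std::uint32_t>` word.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper for illustration only (not part of MLIR or ROCDL):
// emulate an atomic floating-point add using only an integer
// compare-and-swap on the 32-bit word that holds the float's bits.
// Returns the value that was stored before the add, like atomicrmw does.
float emulatedAtomicFAdd(std::atomic<std::uint32_t> &word, float operand) {
  // Read the current bit pattern once; the loop refreshes it on failure.
  std::uint32_t oldBits = word.load(std::memory_order_relaxed);
  for (;;) {
    // Reinterpret the bits as a float without violating aliasing rules.
    float oldValue;
    std::memcpy(&oldValue, &oldBits, sizeof(float));

    // Compute the desired new value and convert it back to raw bits.
    float newValue = oldValue + operand;
    std::uint32_t newBits;
    std::memcpy(&newBits, &newValue, sizeof(float));

    // Publish newBits only if the word still holds oldBits. On failure,
    // compare_exchange_weak writes the current contents into oldBits,
    // so the next iteration retries against the fresh value.
    if (word.compare_exchange_weak(oldBits, newBits,
                                   std::memory_order_relaxed))
      return oldValue;
  }
}
```

The comparison is done on the integer bit pattern rather than on the float value, so a stored NaN cannot cause the loop to spin forever; whether the actual pass makes the same choice is not stated in the commit message.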

27 Commits