27 Commits

Author SHA1 Message Date
Matthias Springer
b23c8225e8 [mlir][NFC] Clean up builder usage around constants/non-foldable ops
* Use `create` instead of `createOrFold` for constant ops. Constants cannot be folded any further.
* Use `create` instead of `createOrFold` for ops that do not have a folder.
* Use C++ op builders that take an `int` instead of creating a `ConstantIndexOp`.
* Create `tensor::DimOp` instead of `linalg::createOrFoldDimOp` when it is certain that the operand is a tensor.

Differential Revision: https://reviews.llvm.org/D154196
2023-06-30 13:56:42 +02:00
Krzysztof Drewniak
73eecc9ca4 [mlir] Convert 8-bit float types to i8
Whereas LLVM currently doesn't have any types for 8-bit floats, and
whereas existing 8-bit float APIs (for instance, the AMDGCN
intrinsics) take such floats as (packed) bytes, translate the MLIR
8-bit float types to i8 during LLVM lowering.

In order to not special-case arith.constant for bitcasting constants
to their integer form, amend the MLIR to LLVM translator to turn 8-bit
float constants into i8 constants with the same value (by use of
APFloat's bitcast method).

This change can be reverted once LLVM has 8-bit float types.

Reviewed By: gysit

Differential Revision: https://reviews.llvm.org/D153160
2023-06-26 17:42:00 +00:00
Giuseppe Rossini
20c66a0c66 [AMDGPU] Add basic support for gfx11xx
This patch fixes a minor issue in AMDGPUToROCDL to add gfx11 support in MLIR

Reviewed By: krzysz00

Differential Revision: https://reviews.llvm.org/D152450
2023-06-12 17:06:36 +00:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Krzysztof Drewniak
cc4703745f [mlir][AMDGPU] Add emulation pass for atomics on AMDGPU targets
Not all AMDGPU targets support all atomic operations. For example,
there are not atomic floating-point adds on the gfx10 series. Add a
pass to emulate these operations using a compare-and-swap loop, by
analogy to the generic atomicrmw rewrite in MemrefToLLVM.

This pass is named generally, as in the future we may have a
memref-to-amdgpu that translates constructs like atomicrmw fmax (which
doesn't generally exist in LLVM) to the relevant intrinsics, which may
themselves require emulation.

Since the AMDGPU dialect now has a pass that operates on it, the
dialect's directory structure is reorganized to match other similarly
complex dialects.

The pass should be run before amdgpu-to-rocdl if desired.

This commit also adds f64 support to atomic_fmax.

Depends on D148722

Reviewed By: nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D148724
2023-05-03 21:18:48 +00:00
Krzysztof Drewniak
98c1104d41 [mlir][AMDGPU] Define atomic compare-and-swap for raw buffers
This commit adds the buffer cmpswap intrinsic to the ROCDL dialect and
its corresponding AMDGPU dialect wrappers.

Reviewed By: nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D148722
2023-05-03 21:11:20 +00:00
giuseros
82ac02e4a8 Add scalar support for amdgpu.raw_buffer_{load,store}
Introduce the possibility to load/store scalars via amdgpu.raw_buffer_{load,store}

Reviewed By: krzysz00

Differential Revision: https://reviews.llvm.org/D146413
2023-03-20 20:19:20 +00:00
Jakub Kuderski
8c258fda1f [ADT][mlir][NFCI] Do not use non-const lvalue-refs with enumerate
Replace references to enumerate results with either result_pairs
(reference wrapper type) or structured bindings. I did not use
structured bindings everywhere as it wasn't clear to me it would
improve readability.

This is in preparation to the switch to zip semantics which won't
support non-const lvalue reference to elements:
https://reviews.llvm.org/D144503.

I chose to use values instead of const lvalue-refs because MLIR is
biased towards avoiding `const` local variables. This won't degrade
performance because currently `result_pair` is cheap to copy (size_t
+ iterator), and in the future, the enumerator iterator dereference
will return temporaries anyway.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D146006
2023-03-15 10:43:56 -04:00
Manupa Karunaratne
584f64365a [MLIR][AMDGPU][ROCDL] Adding raw.buffer.atomic.fmax/smax/umin support
This commit adds support for atomic fmax/smax/umin support
for AMDGPU dialect and the dependent dialects to allow such
a lowering.

Reviewed By: krzysz00

Differential Revision: https://reviews.llvm.org/D144097
2023-02-28 16:58:35 +00:00
Krzysztof Drewniak
22f0c7a451 [mlir][AMDGPU] 8-bit float usage in the AMDGPU dialect
Upcoming AMD hardware will include functions that accept 8-bit floats.
Specifically, there are MFMA instructions that accept 8-bit floats,
either using the same or mixed formats. This patch adds MLIR wrappers
for these intrinsics and explicitly adds support for 8-bit floats in
the gpu-to-rocdl conversion by way of amdgpu-to-rocdl.

Since LLVM does not have f8 types, when targeting LLVM for compilation
on an AMD GPU, both f8 types used on AMD hardware (f8E5M2FNUZ and
f8E4M3FNUZ) are rewritten to i8.

This patch also relaxes the restriction that the types of both source
operands to a amdgpu.mfma instructions match exactly, as this is not
necessarily required for the bf8 (f8E5M2FNUZ) and fp8 (f8E4M3FNUZ)
instructions. In addition, since the buffer_{load,store} operations
maintain a whitelist of permitted types, we add the relevant f8 types
to that list.

This patch does not add any implementations of arithmetic operations
for f8 types.

Reviewed By: jakeh-gc

Differential Revision: https://reviews.llvm.org/D143956
2023-02-15 16:46:08 +00:00
Kazu Hirata
0a81ace004 [mlir] Use std::optional instead of llvm::Optional (NFC)
This patch replaces (llvm::|)Optional< with std::optional<.  I'll post
a separate patch to remove #include "llvm/ADT/Optional.h".

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-14 01:25:58 -08:00
Kazu Hirata
a1fe1f5f77 [mlir] Add #include <optional> (NFC)
This patch adds #include <optional> to those files containing
llvm::Optional<...> or Optional<...>.

I'll post a separate patch to actually replace llvm::Optional with
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2023-01-13 21:05:06 -08:00
Kazu Hirata
1a36588ec6 [mlir] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-03 18:50:27 -08:00
Aliia Khasanova
399638f98c Merge kDynamicSize and kDynamicSentinel into one constant.
resolve conflicts

Differential Revision: https://reviews.llvm.org/D138282
2022-11-21 13:01:26 +00:00
Kazu Hirata
430cbd5401 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp:128:10: warning:
  variable ‘llvm2xI32’ set but not used [-Wunused-but-set-variable]

The last use of llvm2xI32 was removed on July 6, 2022 in commit
63295622491a31eaccb6c534ba5caa836beb843f.
2022-10-23 10:11:20 -07:00
Krzysztof Drewniak
c55b41d519 [mlir][AMDGPU] Define amdgpu.mfma operator
The amdgpu.mfma operator is a wrapper around the Matrix Fused Multiply
Add (MFMA) instructions on some AMD GPUs (the CDNA-based MI-* cards).

This interface allows for selecting the operation to be performed by
specifying the dimensions of the multiplication to be performed and
any additional attributes (such as whether to use reduced-precision
floating-point math) that are needed to select the relevant mfma
instruction and set its parameters.

Reviewed By: ThomasRaoux, nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D132956
2022-08-31 21:06:12 +00:00
Michele Scuttari
67d0d7ac0a
[MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Review: https://reviews.llvm.org/D132838
2022-08-31 12:28:45 +02:00
Michele Scuttari
039b969b32
Revert "[MLIR] Update pass declarations to new autogenerated files"
This reverts commit 2be8af8f0e0780901213b6fd3013a5268ddc3359.
2022-08-30 22:21:55 +02:00
Michele Scuttari
2be8af8f0e
[MLIR] Update pass declarations to new autogenerated files
The patch introduces the required changes to update the pass declarations and definitions to use the new autogenerated files and allow dropping the old infrastructure.

Reviewed By: mehdi_amini, rriddle

Differential Review: https://reviews.llvm.org/D132838
2022-08-30 21:56:31 +02:00
Jeff Niu
0af643f3ce [mlir][LLVMIR] (NFC) Add convenience builders for ConstantOp
And clean up some of the user code
2022-08-09 15:34:36 -04:00
Krzysztof Drewniak
6329562249 [mlir][AMDGPU] Explicitly truncate memory addresses in buffer ops
As a percaution, truncate memory addresses passed to kernels to 48 bits,
since bits 48-63 of the buffer descriptor are used for the stride field
and, on gfx10, to control swizzling.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D131016
2022-08-04 19:42:33 +00:00
Krzysztof Drewniak
bc61cc9a2d [mlir][AMDGPU] Add lds_barrier op
The lds_barrier op allows workgroups to wait at a barrier for
operations to/from their local data store (LDS) to complete without
incurring the performance penalties of a full memory fence.

Reviewed By: nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D129522
2022-07-14 20:45:26 +00:00
Krzysztof Drewniak
db590549a9 [mlir][AMDGPU] Use the correct values for OOB_SELECT on gfx10
Differential Revision: https://reviews.llvm.org/D129320
2022-07-07 21:23:38 +00:00
Krzysztof Drewniak
cab44c515c [mlir][AMDGPU] Add --chipset option to AMDGPUToROCDL
Because the buffer descriptor structure (the V#) has no backwards-compatibility
guarentees, and since said guarantees have been violated in practice
(see https://github.com/llvm/llvm-project/issues/56323 ), and since
the `targetIsRDNA` attribute isn't something that higher-level clients can set
in general, make the lowering of the amdgpu dialect to rocdl take a --chipset
option.

Note that this option is a string because adding a parser for the Chipset
struct to llvm::cl wasn't working out.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D129228
2022-07-07 14:58:13 +00:00
Kazu Hirata
037f09959a [mlir] Don't use Optional::hasValue (NFC) 2022-06-20 11:22:37 -07:00
Jacques Pienaar
8df54a6a03 [mlir] Update accessors to prefixed form (NFC)
Follow up from flipping dialects to both, flip accessor used to prefixed
variant ahead to flipping from _Both to _Prefixed. This just flips to
the accessors introduced in the preceding change which are just prefixed
forms of the existing accessor changed from.

Mechanical change using helper script
https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.
2022-06-18 17:53:22 -07:00
Krzysztof Drewniak
f1f05a91ca [MLIR][AMDGPU] Add AMDGPU dialect, wrappers around raw buffer intrinsics
By analogy with the NVGPU dialect, introduce an AMDGPU dialect for
AMD-specific intrinsic wrappers.

The dialect initially includes wrappers around the raw buffer intrinsics.

On AMD GPUs, a memref can be converted to a "buffer descriptor" that
allows more precise control of memory access, such as by allowing for
out of bounds loads/stores to be replaced by 0/ignored without adding
additional conditional logic, which is important for performance.

The repository currently contains a limited conversion from
transfer_read/transfer_write to Mubuf intrinsics, which are an older,
deprecated intrinsic for the same functionality.

The new amdgpu.raw_buffer_* ops allow these operations to be used
explicitly and for including metadata such as whether the target
chipset is an RDNA chip or not (which impacts the interpretation of
some bits in the buffer descriptor), while still maintaining an
MLIR-like interface.

(This change also exposes the floating-point atomic add intrinsic.)

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D122765
2022-05-10 14:59:58 +00:00