13 Commits

Author SHA1 Message Date
Thomas Preud'homme
36cf982d6c
[MLIR] Add missing MLIRFuncDialect dep to MLIRAMDGPUTransforms (#84550)
This fixes the following failure when doing a clean build (in particular
no .ninja* lying around) of lib/libMLIRAMDGPUTransforms.a only:
```
In file included from mlir/lib/Dialect/AMDGPU/Transforms/OptimizeSharedMemory.cpp:21:
mlir/include/mlir/Dialect/Func/IR/FuncOps.h:29:10: fatal error: mlir/Dialect/Func/IR/FuncOps.h.inc: No such file or directory
```
2024-03-11 23:08:56 +00:00
erman-gurses
87c0260f45
[AMDGPU] Add parameterization for optimized shared memory variables (#82508)
- This PR adds parameterization for shared memory variables that are
used for optimization: `sharedMemoryLineSizeBytes` and
`defaultVectorSizeBits.`
- The default values are set to 128 for both variables since it gives
zero bank conflicts.
2024-02-27 23:28:12 -05:00
erman-gurses
04381c106f
[MLIR][AMDGPU]Add refactoring for shared-mem optimization (#81791)
Addressing the issues in this PR:
https://github.com/llvm/llvm-project/pull/81550
2024-02-15 13:53:15 -05:00
erman-gurses
29d1aca05c
[AMDGPU][MLIR]Add shmem-optimization as an op using transform dialect (#81550)
This PR adds functionality to use shared memory optimization as an op
using transform dialect.
2024-02-13 17:42:04 -08:00
erman-gurses
3f37df5b71
[reland][mlir][amdgpu] Shared memory access optimization pass (#79164)
- Reland: https://github.com/llvm/llvm-project/pull/75627

- Reproduced then fixed the build issue
2024-01-25 07:44:45 -08:00
Mehdi Amini
e611a4cf80
Revert "[mlir][amdgpu] Shared memory access optimization pass" (#78822)
Reverts llvm/llvm-project#75627 ; it broke the bot:
https://lab.llvm.org/buildbot/#/builders/61/builds/53218
2024-01-19 16:41:43 -08:00
erman-gurses
b7360fbe8c
[mlir][amdgpu] Shared memory access optimization pass (#75627)
It implements transformation to optimize accesses to shared memory.

Reference: https://reviews.llvm.org/D127457

_This change adds a transformation and pass to the NvGPU dialect that
attempts to optimize reads/writes from a memref representing GPU shared
memory in order to avoid bank conflicts. Given a value representing a
shared memory memref, it traverses all reads/writes within the parent op
and, subject to suitable conditions, rewrites all last dimension index
values such that element locations in the final (col) dimension are
given by newColIdx = col % vecSize + perm[row](col / vecSize, row)
where perm is a permutation function indexed by row and vecSize
is the vector access size in elements (currently assumes 128bit
vectorized accesses, but this can be made a parameter). This specific
transformation can help optimize typical distributed & vectorized
accesses
common to loading matrix multiplication operands to/from shared memory._
2024-01-19 15:44:45 -08:00
Krzysztof Drewniak
ba6d7a0f25 [mlir][AMDGPU] Add gfx941 to buffer atomics emulation
Reviewed By: fmorac

Differential Revision: https://reviews.llvm.org/D152299
2023-09-13 16:07:07 +00:00
Daniil Dudkin
8a6e54c9b3
[mlir][arith] Rename operations: maxfmaximumf, minfminimumf (#65800)
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.

This commit addresses Task 1.2 of the mentioned RFC. By renaming these operations, we align their names with LLVM intrinsics that have corresponding semantics.
2023-09-11 22:02:19 -07:00
Mehdi Amini
363b655920 Finish renaming getOperandSegmentSizeAttr() from operand_segment_sizes to operandSegmentSizes
This renaming started with the native ODS support for properties, this is completing it.

A mass automated textual rename seems safe for most codebases.
Drop also the ods prefix to keep the accessors the same as they were before
this change:
 properties.odsOperandSegmentSizes
reverts back to:
 properties.operandSegementSizes

The ODS prefix was creating divergence between all the places and make it harder to
be consistent.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D157173
2023-08-09 19:37:01 -07:00
Matthias Springer
71d50c890b [mlir][IR] Improve listener notifications for ops without results
`RewriterBase::Listener::notifyOperationReplaced` notifies observers that an op is about to be replaced with a range of values. This notification is not very useful for ops without results, because it does not specify the replacement op (and it cannot be deduced from the replacement values). It provides no additional information over the `notifyOperationRemoved` notification.

This revision adds an additional notification when a rewriter replaces an op with another op. By default, this notification triggers the original "op replaced with values" notification, so there is no functional change for existing code.

This new API is useful for the transform dialect, which needs to track op replacements. (Updated in a subsequent revision.)

Also includes minor documentation improvements.

Differential Revision: https://reviews.llvm.org/D152814
2023-06-14 08:51:14 +02:00
Tres Popp
5550c82189 [mlir] Move casting calls from methods to function calls
The MLIR classes Type/Attribute/Operation/Op/Value support
cast/dyn_cast/isa/dyn_cast_or_null functionality through llvm's doCast
functionality in addition to defining methods with the same name.
This change begins the migration of uses of the method to the
corresponding function call as has been decided as more consistent.

Note that there still exist classes that only define methods directly,
such as AffineExpr, and this does not include work currently to support
a functional cast/isa call.

Caveats include:
- This clang-tidy script probably has more problems.
- This only touches C++ code, so nothing that is being generated.

Context:
- https://mlir.llvm.org/deprecation/ at "Use the free function variants
  for dyn_cast/cast/isa/…"
- Original discussion at https://discourse.llvm.org/t/preferred-casting-style-going-forward/68443

Implementation:
This first patch was created with the following steps. The intention is
to only do automated changes at first, so I waste less time if it's
reverted, and so the first mass change is more clear as an example to
other teams that will need to follow similar steps.

Steps are described per line, as comments are removed by git:
0. Retrieve the change from the following to build clang-tidy with an
   additional check:
   https://github.com/llvm/llvm-project/compare/main...tpopp:llvm-project:tidy-cast-check
1. Build clang-tidy
2. Run clang-tidy over your entire codebase while disabling all checks
   and enabling the one relevant one. Run on all header files also.
3. Delete .inc files that were also modified, so the next build rebuilds
   them to a pure state.
4. Some changes have been deleted for the following reasons:
   - Some files had a variable also named cast
   - Some files had not included a header file that defines the cast
     functions
   - Some files are definitions of the classes that have the casting
     methods, so the code still refers to the method instead of the
     function without adding a prefix or removing the method declaration
     at the same time.

```
ninja -C $BUILD_DIR clang-tidy

run-clang-tidy -clang-tidy-binary=$BUILD_DIR/bin/clang-tidy -checks='-*,misc-cast-functions'\
               -header-filter=mlir/ mlir/* -fix

rm -rf $BUILD_DIR/tools/mlir/**/*.inc

git restore mlir/lib/IR mlir/lib/Dialect/DLTI/DLTI.cpp\
            mlir/lib/Dialect/Complex/IR/ComplexDialect.cpp\
            mlir/lib/**/IR/\
            mlir/lib/Dialect/SparseTensor/Transforms/SparseVectorization.cpp\
            mlir/lib/Dialect/Vector/Transforms/LowerVectorMultiReduction.cpp\
            mlir/test/lib/Dialect/Test/TestTypes.cpp\
            mlir/test/lib/Dialect/Transform/TestTransformDialectExtension.cpp\
            mlir/test/lib/Dialect/Test/TestAttributes.cpp\
            mlir/unittests/TableGen/EnumsGenTest.cpp\
            mlir/test/python/lib/PythonTestCAPI.cpp\
            mlir/include/mlir/IR/
```

Differential Revision: https://reviews.llvm.org/D150123
2023-05-12 11:21:25 +02:00
Krzysztof Drewniak
cc4703745f [mlir][AMDGPU] Add emulation pass for atomics on AMDGPU targets
Not all AMDGPU targets support all atomic operations. For example,
there are not atomic floating-point adds on the gfx10 series. Add a
pass to emulate these operations using a compare-and-swap loop, by
analogy to the generic atomicrmw rewrite in MemrefToLLVM.

This pass is named generally, as in the future we may have a
memref-to-amdgpu that translates constructs like atomicrmw fmax (which
doesn't generally exist in LLVM) to the relevant intrinsics, which may
themselves require emulation.

Since the AMDGPU dialect now has a pass that operates on it, the
dialect's directory structure is reorganized to match other similarly
complex dialects.

The pass should be run before amdgpu-to-rocdl if desired.

This commit also adds f64 support to atomic_fmax.

Depends on D148722

Reviewed By: nirvedhmeshram

Differential Revision: https://reviews.llvm.org/D148724
2023-05-03 21:18:48 +00:00