293 Commits

Author SHA1 Message Date
Mogball
d7ef488bb6 [mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352
2022-06-09 22:49:03 +00:00
Mogball
7bdd3722f2 [mlir][gpu] Change ParalellLoopMappingAttr to AttrDef
It was a StructAttr. Also adds a FieldParser for AffineMap.

Depends on D127348

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127350
2022-06-09 22:23:21 +00:00
Christian Sigg
bcf3d52486 [MLIR][GPU] Expose GpuParallelLoopMapping as non-test pass.
Reviewed By: bondhugula, herhut

Differential Revision: https://reviews.llvm.org/D126199
2022-05-30 09:20:48 +02:00
Mehdi Amini
59c3be748f Apply clang-tidy fixes for performance-move-const-arg in SerializeToHsaco.cpp (NFC) 2022-05-16 13:58:49 +00:00
Arnab Dutta
16219f8c94 [MLIR][GPU] Add canonicalizer for gpu.memcpy
Erase gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.

Reviewed By: bondhugula, csigg

Differential Revision: https://reviews.llvm.org/D124257
2022-05-14 19:01:04 +05:30
Christian Sigg
0e3d1ca54a [MLIR][GPU] NFC: simplify kernel operand accessor implementations.
Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D125112
2022-05-14 14:14:42 +02:00
Daniil Dudkin
70c463efc8 [mlir][NFC] Fix GpuKernelOutliningPass copy constructor warnings
1. Call copy constructor of the base class
2. Assign value of the option directly

Reviewed By: dcaballe, rriddle

Differential Revision: https://reviews.llvm.org/D125101
2022-05-12 11:41:18 +03:00
Thomas Raoux
15bcc36eed [mlir][gpu] Move async copy ops to NVGPU and add caching hints
Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.

Differential Revision: https://reviews.llvm.org/D125244
2022-05-10 22:30:24 +00:00
Nikita Popov
03ab30686d [MLIR] Split off MLIRExecutionEngineUtils to fix libMLIR.so build (PR54242)
Building libMLIR.so currently fails with:

> /usr/bin/ld: /tmp/ccNzulEA.ltrans39.ltrans.o: in function `(anonymous namespace)::SerializeToHsacoPass::optimizeLlvm(llvm::Module&, llvm::TargetMachine&)':
> /builddir/build/BUILD/llvm-project-15.0.0.src/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp:328: undefined reference to `mlir::makeOptimizingTransformer(unsigned int, unsigned int, llvm::TargetMachine*)'

This is because MLIRGPUTransforms depends on MLIRExecutionEngine in
61bb2e4ea8/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp (L328),
but MLIRExecutionEngine is marked as excluded from libMLIR.so.

However, this code doesn't require the full execution engine: It
only performs middle-end optimization, and does not need any of
the JIT/codegen infrastructure. As such, split off a separate
library MLIRExecutionEngineUtils, which only contains that part
and is not excluded from libMLIR.so.

Fixes https://github.com/llvm/llvm-project/issues/54242.

Differential Revision: https://reviews.llvm.org/D125214
2022-05-10 10:17:52 +02:00
Chris Lattner
d85eb4e2d6 [AsmParser] Introduce a new "Argument" abstraction + supporting logic
MLIR has a common pattern for "arguments" that uses syntax
like `%x : i32 {attrs} loc("sourceloc")` which is implemented
in adhoc ways throughout the codebase.  The approach this uses
is verbose (because it is implemented with parallel arrays) and
inconsistent (e.g. lots of things drop source location info).

Solve this by introducing OpAsmParser::Argument and make addRegion
(which sets up BlockArguments for the region) take it.  Convert the
world to propagating this down.  This means that we correctly
capture and propagate source location information in a lot more
cases (e.g. see the affine.for testcase example), and it also
simplifies much code.

Differential Revision: https://reviews.llvm.org/D124649
2022-04-29 12:19:34 -07:00
Chris Lattner
5dedf911de [AsmParser] Rework logic around "region argument parsing"
The asm parser had a notional distinction between parsing an
operand (like "%foo" or "%4#3") and parsing a region argument
(which isn't supposed to allow a result number like #3).

Unfortunately the implementation has two problems:

1) It didn't actually check for the result number and reject
   it.  parseRegionArgument and parseOperand were identical.
2) It had a lot of machinery built up around it that paralleled
   operand parsing.  This also was functionally identical, but
   also had some subtle differences (e.g. the parseOptional
   stuff had a different result type).

I thought about just removing all of this, but decided that the
missing error checking was important, so I reimplemented it with
a `allowResultNumber` flag on parseOperand.  This keeps the
codepaths unified and adds the missing error checks.

Differential Revision: https://reviews.llvm.org/D124470
2022-04-28 11:12:44 -07:00
Vitaly Buka
6e1ac68a0c [mlir] Don't iterate mutable user list
executeOp.operandsMutable().append(asyncTokens) in
addAsyncDependencyAfter can resize and invalidate iterators.

Fixes reports like https://reviews.llvm.org/P8286

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D124577
2022-04-28 08:59:55 -07:00
Chris Lattner
31c8abc3f1 [AsmParser/Printer] Rework sourceloc support for function arguments.
When Location tracking support for block arguments was added, we
discussed various approaches to threading support for this through
function-like argument parsing.  At the time, we added a parallel array
of locations that could hold this.  It turns out that that approach was
verbose and error prone, roughly no one adopted it.

This patch takes a different approach, adding an optional source
locator to the UnresolvedOperand class.  This fits much more naturally
into the standard structure we use for representing locators, and gives
all the function like dialects locator support for free (e.g. see the
test adding an example for the LLVM dialect).

Differential Revision: https://reviews.llvm.org/D124188
2022-04-21 12:43:36 -07:00
Fangrui Song
ae46b3e01f Revert D121279 "[MLIR][GPU] Add canonicalizer for gpu.memcpy"
This reverts commit 12f55cac69d8978d1c433756a8b2114bf9ed1e1b.

Causes miscompile. Will follow up with a reproduce.
2022-04-21 08:55:13 -07:00
Uday Bondhugula
f47a38f517 Add async dependencies support for gpu.launch op
Add async dependencies support for gpu.launch op: this allows specifying
a list of async tokens ("streams") as dependencies for the launch.

Update the GPU kernel outlining pass lowering to propagate async
dependencies from gpu.launch to gpu.launch_func op. Previously, a new
stream was being created and destroyed for a kernel launch. The async
deps support allows the kernel launch to be serialized on an existing
stream.

Differential Revision: https://reviews.llvm.org/D123499
2022-04-21 16:25:59 +05:30
Uday Bondhugula
d7565de6cc [MLIR] NFC. Drop trailing white space in GPU async ops print
NFC. Drop trailing end of line white space in GPU async ops' printer
whenever the list of async deps is empty.

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D123754
2022-04-20 17:56:53 +05:30
Arnab Dutta
12f55cac69 [MLIR][GPU] Add canonicalizer for gpu.memcpy
Fold away gpu.memcpy op when only uses of dest are
the memcpy op in question, its allocation and deallocation
ops.

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D121279
2022-04-19 17:54:00 +05:30
River Riddle
58ceae9561 [mlir:NFC] Remove the forward declaration of FuncOp in the mlir namespace
FuncOp has been moved to the `func` namespace for a little over a month, the
using directive can be dropped now.
2022-04-18 12:01:55 -07:00
Mehdi Amini
21b251624b Apply clang-tidy fixes for readability-identifier-naming in GPUDialect.cpp (NFC) 2022-04-18 18:15:30 +00:00
Arnab Dutta
392d55c1e2 [MLIR][GPU] Add canonicalization patterns for folding simple gpu.wait ops.
* Fold away redundant %t = gpu.wait async + gpu.wait [%t] pairs.

* Fold away %t = gpu.wait async ... ops when %t has no uses.

* Fold away gpu.wait [] ops.

* In case of %t1 = gpu.wait async [%t0], replace all uses of %t1
  with %t0.

Differential Revision: https://reviews.llvm.org/D121878
2022-04-14 12:30:55 +05:30
River Riddle
1269f96d2e [mlir] Add MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID to SerializeToCubinPass
This pass is defined in an anonymous namespace and requires an explicit TypeID
2022-04-04 14:28:10 -07:00
River Riddle
5e50dd048e [mlir] Rework the implementation of TypeID
This commit restructures how TypeID is implemented to ideally avoid
the current problems related to shared libraries. This is done by changing
the "implicit" fallback path to use the name of the type, instead of using
a static template variable (which breaks shared libraries). The major downside to this
is that it adds some additional initialization costs for the implicit path. Given the
use of type names for uniqueness in the fallback, we also no longer allow types
defined in anonymous namespaces to have an implicit TypeID. To simplify defining
an ID for these classes, a new `MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID` macro
was added to allow for explicitly defining a TypeID directly on an internal class.

To help identify when types are using the fallback, `-debug-only=typeid` can be
used to log which types are using implicit ids.

This change generally only requires changes to the test passes, which are all defined
in anonymous namespaces, and thus can't use the fallback any longer.

Differential Revision: https://reviews.llvm.org/D122775
2022-04-04 13:52:26 -07:00
Thomas Raoux
d77f483640 [mlir][gpu] Relax restriction on mma load/store op
Those ops can support more complex layout as long as the most inner
dimension is contiguous.

Differential Revision: https://reviews.llvm.org/D122452
2022-03-25 04:03:40 +00:00
Markus Böck
e13d23bc6c [mlir] Rename OpAsmParser::OperandType to OpAsmParser::UnresolvedOperand
I am not sure about the meaning of Type in the name (was it meant be interpreted as Kind?), and given the importance and meaning of Type in the context of MLIR, its probably better to rename it. Given the comment in the source code, the suggestion in the GitHub issue and the final discussions in the review, this patch renames the OperandType to UnresolvedOperand.

Fixes https://github.com/llvm/llvm-project/issues/54446

Differential Revision: https://reviews.llvm.org/D122142
2022-03-21 21:42:13 +01:00
River Riddle
4a3460a791 [mlir:FunctionOpInterface] Rename the "type" attribute to "function_type"
This removes any potential confusion with the `getType` accessors
which correspond to SSA results of an operation, and makes it
clear what the intent is (i.e. to represent the type of the function).

Differential Revision: https://reviews.llvm.org/D121762
2022-03-16 17:07:04 -07:00
River Riddle
3655069234 [mlir] Move the Builtin FuncOp to the Func dialect
This commit moves FuncOp out of the builtin dialect, and into the Func
dialect. This move has been planned in some capacity from the moment
we made FuncOp an operation (years ago). This commit handles the
functional aspects of the move, but various aspects are left untouched
to ease migration: func::FuncOp is re-exported into mlir to reduce
the actual API churn, the assembly format still accepts the unqualified
`func`. These temporary measures will remain for a little while to
simplify migration before being removed.

Differential Revision: https://reviews.llvm.org/D121266
2022-03-16 17:07:03 -07:00
River Riddle
bbfec2a1b0 [mlir] Remove the deprecated ODS Op verifier/parser/printer code blocks
These have been deprecated for ~1 month now and can be removed.

Differential Revision: https://reviews.llvm.org/D121090
2022-03-15 01:17:30 -07:00
Chia-hung Duan
ed645f6336 [mlir] Support verification order (3/3)
In this CL, update the function name of verifier according to the
behavior. If a verifier needs to access the region then it'll be updated
to `verifyRegions`.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D120373
2022-03-11 01:16:28 +00:00
Mehdi Amini
b389d68e52 Revert "Fix link of libmlir.so by adding ExecutionEngine as dependency to GPUTransforms"
This reverts commit b743850b736e4a89378be8bed61c1b3489b56d19.

This didn't produce the expected result.
2022-03-08 20:40:36 +00:00
Mehdi Amini
b743850b73 Fix link of libmlir.so by adding ExecutionEngine as dependency to GPUTransforms
This feels like a layering violation, but it fixes the build.

Fixes #54242

tools/mlir/lib/Dialect/GPU/CMakeFiles/obj.MLIRGPUTransforms.dir/Transforms/SerializeToHsaco.cpp.o:SerializeToHsaco.cpp:function (anonymous namespace)::SerializeToHsacoPass::optimizeLlvm(llvm::Module&, llvm::TargetMachine&):
error: undefined reference to 'mlir::makeOptimizingTransformer(unsigned int, unsigned int, llvm::TargetMachine*)'
2022-03-08 20:33:03 +00:00
River Riddle
9eaff42360 [mlir][NFC] Move Parser.h to Parser/
There is no reason for this file to be at the top-level, and
its current placement predates the Parser/ folder's existence.

Differential Revision: https://reviews.llvm.org/D121024
2022-03-07 01:05:38 -08:00
Krzysztof Drewniak
4e817b3fa3 [MLIR][AMDGPU] Fix typo and add comment to SerializeToHsaco
Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D120943
2022-03-04 17:15:11 +00:00
Krzysztof Drewniak
d7f9220bb6 [MLIR] [AMDGPU] Use correct flags when building SerializeToHsaco
The SerializeToHsaco pass does not depend on ROCm being available on
the build system - it only requires ROCm to be present at runtime.
However, the CMake file that built it tested for
MLIR_ENABLE_ROCM_RUNNER , which implies that ROCm is currently
available and is used to control building ROCm integration tests.

Referencing MLIR_ENABLE_ROCM_RUNNER instead of
MLIR_ENABLE_ROCM_CONVERSIONS in the SerializeToHsaco build therefore
causes problems for clients who wish to build projects that depend on
this pass on a system without an AMD GPU present.

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D120663
2022-03-03 21:44:26 +00:00
River Riddle
1f971e23f0 [mlir] Trim a huge number of unnecessary dependencies on the Func dialect
The Func has a large number of legacy dependencies carried over from the old
Standard dialect, which was pervasive and contained a large number of varied
operations. With the split of the standard dialect and its demise, a lot of lingering
dead dependencies have survived to the Func dialect. This commit removes a
large majority of then, greatly reducing the dependence surface area of the
Func dialect.
2022-03-01 12:10:04 -08:00
River Riddle
23aa5a7446 [mlir] Rename the Standard dialect to the Func dialect
The last remaining operations in the standard dialect all revolve around
FuncOp/function related constructs. This patch simply handles the initial
renaming (which by itself is already huge), but there are a large number
of cleanups unlocked/necessary afterwards:

* Removing a bunch of unnecessary dependencies on Func
* Cleaning up the From/ToStandard conversion passes
* Preparing for the move of FuncOp to the Func dialect

See the discussion at https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061

Differential Revision: https://reviews.llvm.org/D120624
2022-03-01 12:10:04 -08:00
Ivan Butygin
d271fc04d5 [mlir][gpu] Split ops sinking from gpu-kernel-outlining pass into separate pass
Previously `gpu-kernel-outlining` pass was also doing index computation sinking into gpu.launch before actual outlining.
Split ops sinking from `gpu-kernel-outlining` pass into separate pass, so users can use theirs own sinking pass before outlining.
To achieve old behavior users will need to call both passes: `-gpu-launch-sink-index-computations -gpu-kernel-outlining`.

Differential Revision: https://reviews.llvm.org/D119932
2022-02-17 10:34:20 +03:00
Shao-Ce SUN
2aed07e96c [NFC][MC] remove unused argument MCRegisterInfo in MCCodeEmitter
Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 13:10:09 +08:00
Shao-Ce SUN
9cc49c1951 Revert "[NFC][MC] remove unused argument MCRegisterInfo in MCCodeEmitter"
This reverts commit fe25c06cc5bdc2ef9427309f8ec1434aad69dc7a.
2022-02-16 11:57:49 +08:00
Shao-Ce SUN
fe25c06cc5 [NFC][MC] remove unused argument MCRegisterInfo in MCCodeEmitter
For ten years, it seems that `MCRegisterInfo` is not used by any target.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846
2022-02-16 11:47:17 +08:00
Krzysztof Drewniak
1aa71944cf [MLIR][GPU] Add missing include to SerilazeToHsaco
Differential Revision: https://reviews.llvm.org/D119852
2022-02-15 17:11:33 +00:00
Krzysztof Drewniak
cc15141794 [MLIR] Link SerializeToHsaco dependencies to correct MLIR library
Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D119774
2022-02-15 16:31:10 +00:00
Ivan Butygin
a2e2fbba17 [mlir][gpu] sinkOperationsIntoLaunchOp: Add user hook for isSinkingBeneficiary
Differential Revision: https://reviews.llvm.org/D119632
2022-02-15 16:50:49 +03:00
Akshay Baviskar
f1efac7f08 Add verifier for gpu.alloc op
Add verifier for gpu.alloc op to verify if the dimension operand counts
and symbol operand counts are same as their memref counterparts.

Differential Revision: https://reviews.llvm.org/D117427
2022-02-15 15:57:58 +05:30
Sameer Sahasrabuddhe
d8f99bb6e0 [AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216
2022-02-11 22:51:56 +05:30
Thomas Raoux
5ab04bc068 [mlir][gpu] Add device side async copy operations
Add new operations to the gpu dialect to represent device side
asynchronous copies. This also add the lowering of those operations to
nvvm dialect.
Those ops are meant to be low level and map directly to llvm dialects
like nvvm or rocdl.

We can further add higher level of abstraction by building on top of
those operations.
This has been discuss here:
https://discourse.llvm.org/t/modeling-gpu-async-copy-ampere-feature/4924

Differential Revision: https://reviews.llvm.org/D119191
2022-02-10 17:25:59 -08:00
Krzysztof Drewniak
1ce314ce6b [MLIR][GPU][lld] Use LLD bundled in ROCm, removing workaround
Having clarified that executing the SerializeToHsaco pass can
depend on a ROCm installation, switch from calling lld as a library to
using the copy of lld guaranteed to be included in a ROCm install.

This removes the workaround introduced in D119277

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D119463
2022-02-10 19:37:30 +00:00
Krzysztof Drewniak
c37b3e4108 [MLIR][GPU] Add now-required include to SerializeToHsaco
Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D119455
2022-02-10 18:36:38 +00:00
Matthias Springer
69f7647158 [mlir][GPU] Add ShuffleOp builder for constant offset/width
Differential Revision: https://reviews.llvm.org/D119345
2022-02-10 02:55:44 +09:00
Alexandre Ganea
1e661e583d [MLIR] Temporary workaround for calling the LLD ELF driver as-a-lib
This fixes the situation described in https://github.com/llvm/llvm-project/issues/53475 with a repro exposed by https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction

This is purposely just a workaround to unblock users. This could be transplanted to the release/14.x branch if need be. A proper fix will later be provided in https://reviews.llvm.org/D119049.

Differential Revision: https://reviews.llvm.org/D119277
2022-02-08 19:12:15 -05:00
River Riddle
2418cd92c0 [mlir] Update uses of parser/printer ODS op field to hasCustomAssemblyFormat
The parser/printer fields are deprecated and in the process of being removed.
2022-02-07 19:03:58 -08:00