Retry landing https://github.com/llvm/llvm-project/pull/153373
## Major changes from previous attempt
- remove the test in CAPI because no existing tests in CAPI deal with
sanitizer exemptions
- update `mlir/docs/Dialects/GPU.md` to reflect the new behavior: load
GPU binary in global ctors, instead of loading them at call site.
- skip the test on Aarch64 since we have an issue with initialization there
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
The LLVM dialect no longer has its own vector types. It uses
`mlir::VectorType` everywhere. Remove
`LLVM::getFixedVectorType/getScalableVectorType` and use
`VectorType::get` instead. This commit addresses a
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
on the PR that deleted the LLVM vector types.
The LLVM dialect no longer has its own vector types. It uses
`mlir::VectorType` everywhere. Remove `LLVM::getVectorElementType` and
use `cast<VectorType>(ty).getElementType()` instead. This commit
addresses a
[comment](https://github.com/llvm/llvm-project/pull/133286#discussion_r2022192500)
on the PR that deleted the LLVM vector types.
Also improve vector type constraints by specifying the
`mlir::VectorType` C++ class, so that explicit casts to `VectorType` can
be avoided in some places.
Since #125690, the MLIR vector type supports `!llvm.ptr` as an element
type. The only remaining element type for `LLVMFixedVectorType` is now
`LLVMPPCFP128Type`.
This commit turns `LLVMPPCFP128Type` into a proper FP type (by
implementing `FloatTypeInterface`), so that the MLIR vector type accepts
it as an element type. This makes `LLVMFixedVectorType` obsolete.
`LLVMScalableVectorType` is also obsolete. This commit deletes
`LLVMFixedVectorType` and `LLVMScalableVectorType`.
Note for LLVM integration: Use `VectorType` instead of
`LLVMFixedVectorType` and `LLVMScalableVectorType`.
There are cases in SPIR-V shaders where values need to be yielded from
the selection region to make valid MLIR. For example (part of the SPIR-V
shader decompiled to GLSL):
```
bool _115
if (_107)
{
// ...
float _200 = fma(...);
// ...
_115 = _200 < _174;
}
else
{
_115 = _107;
}
bool _123;
if (_115)
{
// ...
float _213 = fma(...);
// ...
_123 = _213 < _174;
}
else
{
_123 = _115;
}
````
This patch extends `mlir.selection` so it can return values.
`mlir.merge` is used as a "yield" operation. This allows to maintain a
compatibility with code that does not yield any values, as well as, to
maintain an assumption that `mlir.merge` is the only operation in the
merge block of the selection region.
This patch introduces a use for the new `getBlockArgsPairs` to avoid
having to manually list each applicable clause.
Also, the `numClauseBlockArgs()` function is introduced, which
simplifies the implementation of the interface's verifier and enables
better memory handling within `getBlockArgsPairs`.
This patch makes additions to the `BlockArgOpenMPOpInterface` to
simplify its use by letting it handle the matching between operands and
their associated entry block arguments. Most significantly, the
following is now possible:
```c++
SmallVector<std::pair<Value, BlockArgument>> pairs;
cast<BlockArgOpenMPOpInterface>(op).getBlockArgsPairs(pairs);
for (auto [var, arg] : pairs) {
// var points to the operand (outside value) and arg points to the entry
// block argument associated to that value.
}
```
This is achieved by making the interface define and use `getXyzVars()`
methods, which by default return empty `OperandRange`s and are overriden
by getters automatically produced for the `Variadic<...> $xyz_vars`
tablegen argument of the corresponding clause. These definitions can
then be simplified, since they no longer need to manually define
`numXyzBlockArgs` functions as a result.
A side-effect of this is that all ops implementing this interface will
now publicly define `getXyzVars()` functions for all entry block
argument-generating clauses, even if they don't actually accept all
clauses. However, these would just return empty ranges, so it shouldn't
cause issues.
This change uncovered some incorrect definitions of class declarations
related to the `ReductionClauseInterface`, and the `OpenMP_DetachClause`
incorrectly implementing the `BlockArgOpenMPOpInterface`, so these
issues are also addressed.
This PR https://github.com/llvm/llvm-project/pull/123902 broke python
bindings for `tensor.pack`/`unpack`. This PR fixes that. It also
1. adds convenience wrappers for pack/unpack
2. cleans up matmul-like ops in the linalg bindings
3. fixes linalg docs missing pack/unpack
With the removal of mlir-vulkan-runner (as part of #73457) in
e7e3c45bc70904e24e2b3221ac8521e67eb84668, mlir-cpu-runner is now the
only runner for all CPU and GPU targets, and the "cpu" name has been
misleading for some time already. This commit renames it to mlir-runner.
This patch adds the `host_eval` clause to the `omp.target` operation.
Additionally, it updates its op verifier to make sure all uses of block
arguments defined by this clause fall within one of the few cases where
they are allowed.
MLIR to LLVM IR translation fails on translation of this clause with a
not-yet-implemented error.
OpenACC data clause operations previously required that the variable
operand implemented PointerLikeType interface. This was a reasonable
constraint because the dialects currently mixed with `acc` do use
pointers to represent variables. However, this forces the "pointer"
abstraction to be exposed too early and some cases are not cleanly
representable through this approach (more specifically FIR's `fix.box`
abstraction).
Thus, relax this by allowing a variable to be a type which implements
either `PointerLikeType` interface or `MappableType` interface.
This patch simplifies the representation of OpenMP loop wrapper
operations by introducing the `NoTerminator` trait and updating
accordingly the verifier for the `LoopWrapperInterface`.
Since loop wrappers are already limited to having exactly one region
containing exactly one block, and this block can only hold a single
`omp.loop_nest` or loop wrapper and an `omp.terminator` that does not
return any values, it makes sense to simplify the representation of loop
wrappers by removing the terminator.
There is an extensive list of Lit tests that needed updating to remove
the `omp.terminator`s adding some noise to this patch, but actual
changes are limited to the definition of the `omp.wsloop`, `omp.simd`,
`omp.distribute` and `omp.taskloop` loop wrapper ops, Flang lowering for
those, `LoopWrapperInterface::verifyImpl()`, SCF to OpenMP conversion
and OpenMP dialect documentation.
This patch adds general information on the proposed approach to unify
the handling and representation of clauses that define entry block
arguments attached to operations that accept them.
This patch updates the OpenMP dialect top-level documentation to
describe the operand structures, when they can be used and how they are
automatically generated.
This patch describes the loop wrapper approach to represent
loop-associated constructs in the OpenMP MLIR dialect and documents
current limitations and ongoing efforts.
This patch creates a handwritten main documentation page for the OpenMP
dialect linking to the ODS-generated one as a sub-section.
This new page can be extended to better describe overall design
decisions of the dialect rather than relying exclusively on
documentation generated automatically from ODS descriptions. After some
investigation, there seem to be a few main ways we could structure
dialect documentation to allow the introduction of possibly extensive
handwritten text.
- Create a top-level OpenMPDialect.td file that includes the
auto-generated one. This is what the `acc` dialect currently does, but
it results in the addition of two equal TOCs. It would be possible to
move the `include` before all handwritten sections so that the page
would have a single TOC, but I believe moving general descriptions to
the end of the document would hurt readability. Also keeping the section
order without introducing a second TOC would mean the TOC would be
inserted somewhere halfway through the page, which isn't useful.
- Create an OpenMPDialect directory with an _index.md including the
auto-generated documentation. This is a different way of reproducing the
same issues described above, which is what is currently done for the
`linalg` dialect. The multiple TOC issue there is avoided by only
including automatically-generated documentation for operations (i.e.
`mlir-tblgen -gen-op-doc`) rather than for dialects (i.e. `mlir-tblgen
-gen-dialect-doc`). That approach would make it impossible to generate
all of the documentation without adding new tablegen backends for
`DialectAttr`, `DialectType` and `EnumAttrInfo` definitions or making
the TOC optional through a command line option.
- Create an OpenMPDialect directory with an _index.md that does not
include the auto-generated documentation. Instead, link to another
document in that directory that includes it. This is the approach taken
here, and it circumvents all these issues without having to make any
changes to tablegen backends.
Adds a few notes on scalable vectors in the docs for the Vector dialect.
This is mostly "repeating" things from LLVM's LangRef.
Additionally:
* Adds a few basic tests with scalable vectors (those should've been
added long time ago),
* Updates a comment in "TypeConverter.cpp" (the current comment is
out-of-date),
* Includes small formatting edits in Vector.md.
**NOTE** Depends on #101813 - only review the top commit
The link has been "broken" since #73792 that updated
"## DeeperDive" to "## LLVM Lowering Tradeoffs".
This patch fixes the MD link for the affected sub-section:
* Before: [deeper dive section](#DeeperDive)
* After: [LLVM Lowering Tradeoffs](#llvm-lowering-tradeoffs)
I've also rephrased the surrounding comment a
bit - to better match the updated section name.
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.
This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.
This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.
Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)
Works towards issue #98272.
This patch removes the last vestiges of the old gpu serialization
pipeline. To compile GPU code use target attributes instead.
See [Compilation overview | 'gpu' Dialect - MLIR
docs](https://mlir.llvm.org/docs/Dialects/GPU/#compilation-overview) for
additional information on the target attributes compilation pipeline
that replaced the old serialization pipeline.
This commit adds `emitc.size_t`, `emitc.ssize_t` and `emitc.ptrdiff_t`
types to the EmitC dialect. These are used to map `index` types to C/C++
types with an explicit signedness, and are emitted in C/C++ as `size_t`,
`ssize_t` and `ptrdiff_t`.
The `noinline`, `alwaysinline`, and `optnone` function attributes are
already being used in MLIR code for the LLVM inlining interface and in
some SPIR-V lowering, despite residing in the passthrough dictionary,
which is intended as exactly that -- a pass through MLIR -- and not to
model any actual semantics being handled in MLIR itself.
Promote the `noinline`, `alwaysinline`, and `optnone` attributes out of
the passthrough dictionary on `llvm.func` into first class unit
attributes, updating the import and export accordingly.
Add a verifier to `llvm.func` that checks that these attributes are not
set in an incompatible way according to the LLVM specification.
Update the LLVM dialect inlining interface to use the first class
attributes to check whether inlining is possible.
Some docs were emitted into the wrong location (Polynomial/ instead of
Dialect/). Furthermore, `-gen-dialect-docs` subsumes
`-gen-attr/typedef-docs` so the latter are not required.
Add a top-level entry that includes both other files in a proper order.