Move the documentation of the ownership-based buffer deallocation pass
to a separate file. Also improve the documentation a bit and insert a
figure that explains the `bufferization.dealloc` op (copied from the
tutorial at the LLVM Dev Summit 2023).
`attr-dict` currently prints any attribute (inherent or discardable)
that does not occur elsewhere in the assembly format. `prop-dict` on the
other hand, always prints and parses all properties (including inherent
attributes stored as properties) as part of the property dictionary,
regardless of whether the properties or attributes are used elsewhere.
(with the exception of default-valued attributes implemented recently in
https://github.com/llvm/llvm-project/pull/87970).
This PR changes the behavior of `prop-dict` to only print and parse
attributes and properties that do not occur elsewhere in the assembly
format. This is achieved by 1) adding used attributes and properties to
the elision list when printing and 2) using a custom version of
`setPropertiesFromAttr` called `setPropertiesFromParsedAttr` that is
sensitive to the assembly format and auto-generated by ODS.
The current and new behavior of `prop-dict` and `attr-dict` were also
documented.
Happens to also fix https://github.com/llvm/llvm-project/issues/88506
Adds two new CMake functions to query the host system:
* `check_hwcap`,
* `check_emulator`.
Together, these functions are used to check whether a given set of MLIR
integration tests require an emulator. If yes, then the corresponding
CMake var that defies the required emulator executable is also checked.
`check_hwcap` relies on ELF_HWCAP for discovering CPU features from
userspace on Linux systems. This is the recommended approach for Arm
CPUs running on Linux as outlined in this blog post:
* https://community.arm.com/arm-community-blogs/b/operating-systems-blog/posts/runtime-detection-of-cpu-features-on-an-armv8-a-cpu
Other operating systems (e.g. Android) and CPU architectures will
most likely require some other approach. Right now these new hooks are
only used for SVE and SME integration tests.
This relands #86489 with the following changes:
* Replaced:
`set(hwcap_test_file ${CMAKE_BINARY_DIR}/${CMAKE_FILES_DIRECTORY}/hwcap_check.c)`
with:
`set(hwcap_test_file ${CMAKE_BINARY_DIR}/temp/hwcap_check.c)`
The former would trigger an infinite loop when running `ninja`
(after the initial CMake configuration).
* Fixed commit msg. Previous one was taken from the initial GH PR
commit rather than the final re-worked solution (missed this when
merging via GH UI).
* A couple more NFCs/tweaks.
References to headings need to be preceded with a slash. Also,
references to headings on the same page do not need to contain the name
of the document (omitting the document name means if the name changes
the links will still be valid).
I double checked the links by building [the
website](https://github.com/llvm/mlir-www):
```shell
./mlir-www-helper.sh --install-docs ../llvm-project website
cd website && hugo serve
```
Integration tests for ArmSME require an emulator (there's no hardware
available). Make sure that CMake complains if `MLIR_RUN_ARM_SME_TESTS`
is set while `ARM_EMULATOR_EXECUTABLE` is empty.
I'm also adding a note in the docs for future reference.
Revise the printing functionality of TimingManager to accommodate
various output formats. At present, TimingManager is limited to
outputting data solely in plain text format. To overcome this
limitation, I have introduced an abstract class that serves as the
foundation for printing. This approach allows users to implement
additional output formats by extending this abstract class. As part of
this update, I have integrated support for JSON as a new output format,
enhancing the ease of parsing for subsequent processing scripts.
When importing from LLVM IR the data layout of all pointer types
contains an index bitwidth that should be used for index computations.
This revision adds a getter to the DataLayout that provides access to
the already stored bitwidth. The function returns an optional since only
pointer-like types have an index bitwidth. Querying the bitwidth of a
non-pointer type returns std::nullopt.
The new function works for the built-in Index type and, using a type
interface, for the LLVMPointerType.
Transform interfaces are implemented, direction or via extensions, in
libraries belonging to multiple other dialects. Those dialects don't
need to depend on the non-interface part of the transform dialect, which
includes the growing number of ops and transitive dependency footprint.
Split out the interfaces into a separate library. This in turn requires
flipping the dependency from the interface on the dialect that has crept
in because both co-existed in one library. The interface shouldn't
depend on the transform dialect either.
As a consequence of splitting, the capability of the interpreter to
automatically walk the payload IR to identify payload ops of a certain
kind based on the type used for the entry point symbol argument is
disabled. This is a good move by itself as it simplifies the interpreter
logic. This functionality can be trivially replaced by a
`transform.structured.match` operation.
I double checked the links by building [the
website](https://github.com/llvm/mlir-www):
```
$ mlir-www-helper.sh --install-docs ../llvm-project website
$ cd website && hugo serve
```
- Fixed OpenACC's spec link format
- Add missed `OpenACCPasses.md` into Passes.md
- Add missed `MyExtensionCh4.md` into Ch4.md of tutorial of transform
As part of the renaming the Standard dialect to Func dialect, *support*
for the `func.constant` operation was added to the emitter. However, the
emitter cannot emit function types. Hence the emission for a snippet
like
```
%0 = func.constant @myfn : (f32) -> f32
func.func private @myfn(%arg0: f32) -> f32 {
return %arg0 : f32
}
```
failes with `func.mlir:1:6: error: cannot emit type '(f32) -> f32'`.
This removes `func.constant` from the emitter.
It seems the `Traits.md` file was turned into `Traits/_index.md` in
https://reviews.llvm.org/D153291, causing links to `Traits.md` to no
longer work (instead, `Traits` needs to be used).
Use the "main" transform-interpreter pass instead of the test pass.
This, along with the previously introduced debug extension, now allow
tutorials to no longer depend on test passes and extensions.
Even when `private-function-dynamic-ownership` is set, ownership should
never be passed to the callee. This can lead to double deallocs (#77096)
or use-after-free in the caller because ownership is currently passed
regardless of whether there are any further uses of the buffer in the
caller or not.
Note: This is consistent with the fact that ownership is never passed to
nested regions.
This commit fixes#77096.
* Split out `MeshDialect.h` form `MeshOps.h` that defines the dialect
class. Reduces include clutter if you care only about the dialect and
not the ops.
* Expose functions `getMesh` and `collectiveProcessGroupSize`. There
functions are useful for outside users of the dialect.
* Remove unused code.
* Remove examples and tests of mesh.shard attribute in tensor encoding.
Per the decision that Spmdization would be performed on sharding
annotations and there will be no tensors with sharding specified in the
type.
For more info see this RFC comment:
https://discourse.llvm.org/t/rfc-sharding-framework-design-for-device-mesh/73533/81
This commit renames 4 pattern rewriter API functions:
* `updateRootInPlace` -> `modifyOpInPlace`
* `startRootUpdate` -> `startOpModification`
* `finalizeRootUpdate` -> `finalizeOpModification`
* `cancelRootUpdate` -> `cancelOpModification`
The term "root" is a misnomer. The root is the op that a rewrite pattern
matches against
(https://mlir.llvm.org/docs/PatternRewriter/#root-operation-name-optional).
A rewriter must be notified of all in-place op modifications, not just
in-place modifications of the root
(https://mlir.llvm.org/docs/PatternRewriter/#pattern-rewriter). The old
function names were confusing and have contributed to various broken
rewrite patterns.
Note: The new function names use the term "modify" instead of "update"
for consistency with the `RewriterBase::Listener` terminology
(`notifyOperationModified`).
Introduce a new extension for simple print-debugging of the transform
dialect scripts. The initial version of this extension consists of two
ops that are printing the payload objects associated with transform
dialect values. Similar ops were already available in the test extenion
and several downstream projects, and were extensively used for testing.
Clarify what kind of IR modifications are allowed. Also improve the
documentation of the greedy rewrite driver entry points.
Addressing comments in #76219.
After PR#75548, the OpenACC documentation on the MLIR website has a few
issues. This change corrects them:
- Renames OpenACC.md to OpenACCDialect.md so that links remain
unchanged. In its current state, the links to
https://mlir.llvm.org/docs/Dialects/OpenACCDialect/ no longer work.
- Since the old OpenACCDialect.md (the one with operation definitions)
is being included in the new file, rename the old file to prevent name
ambiguity.
- A header is needed in the .md file, otherwise the index on website is
not properly created.
- Add a new section before including the operations .md file because
otherwise the separation is not clear.
This document captures the design philosophy of the acc dialect. It also
shares the rationale behind the design and implementation of various
operations - and ties that back to the dialect design goals.
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
This PR improves the documentation for the `gpu-lower-to-nvvm-pipeline`
(as it was remaning item for #75775)
- Changes pipeline `gpu-lower-to-nvvm` -> `gpu-lower-to-nvvm-pipeline`
- Adds a section in GPU Dialect in website. It clarifies the pipeline's
functionality in lowering primary dialects to NVVM targets.
Replace (in tests and docs):
%forall, %tiled = transform.structured.tile_using_forall
with (updated order of return handles):
%tiled, %forall = transform.structured.tile_using_forall
Similar change is applied to (in the TD tutorial):
transform.structured.fuse_into_containing_op
This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.
Existing implementation of the LLVM type converter for LLVM structs
containing incompatible types was attempting to change identifiers of
the struct in case of name clash post-conversion (all identified structs
have different names post-conversion since one cannot change the body of
the struct once initialized). Beyond a trivial error of not updating the
counter in renaming, this approach was broken for recursive structs that
can't be made aware of the renaming and would use the pre-existing
struct with clashing name instead.
For example, given
`!llvm.struct<"_Converted.foo", (struct<"_Converted.foo">, f32)>`
the following type
`!llvm.struct<"foo", (struct<"foo", index>)>`
would incorrectly convert to
```
!llvm.struct<"_Converted_1.foo",
(struct<"_Converted.foo",
(struct<"_Converted.foo">, f32)>)>
```
Remove this incorrect renaming and simply refuse to convert types if it
would lead to identifier clashes for structs with different bodies.
Document the expectation that such generated names are reserved and must
not be present in the input IR of the converter. If we ever actually
need to use handle such cases, this can be achieved by temporarily
renaming structs with reserved identifiers to an unreserved name and
back in a pre/post-processing pass that does _not_ use the type
conversion infra.
Data layout queries may be issued for types whose size exceeds the range
of 32-bit integer as well as for types that don't have a size known at
compile time, such as scalable vectors. Use best practices from LLVM IR
and adopt `llvm::TypeSize` for size-related queries and `uint64_t` for
alignment-related queries.
See #72678.
MLIR can't really be const-correct (it would need a `ConstValue` class
alongside the `Value` class really, like `ArrayRef` and
`MutableArrayRef`). This is however making is more consistent: method
that are directly modifying the Value shouldn't be marked const.
This renames the `emitc.call` op to `emitc.call_opaque` as the existing
call op does not refer to the callee by symbol. The rename allows to
introduce a new call op alongside with a future `emitc.func` op to model
and facilitate functions and function calls.