Blocks without a terminator are not handled correctly by
`Block::without_terminator`: the last operation is excluded, even when
it is not a terminator. With this commit, only terminators are excluded.
If the last operation is unregistered, it is included for safety.
`Operation::clone` does not clone the properties of unregistered ops.
This patch modifies the property initialization for unregistered ops to
initialize properties as attributes.
fixes#151640
---------
Signed-off-by: Boyana Norris <brnorris03@gmail.com>
Add convention for lexer if the last file is contained in the first,
then the first is used for error reporting. This requires that these two
overlap to make it easy to find the corresponding spots. Enables going
from
```
within split at mlir/test/IR/invalid.mlir::10 offset :6:9: error: reference to an undefined block
```
to
```
mlir/test/IR/invalid.mlir:15:9: error: reference to an undefined block
```
This does change the split to not produce always null terminated buffers
and tools that need it, need to do so themselves (which is mostly by copying -
this may have little actual impact as previously this was a copy too).
Erasing the operation to which the current insertion point is set,
leaves the insertion point in an invalid state. This commit resets the
insertion point to the following operation.
Also adjust the insertion point when inlining a block.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This PR adds a feature that attaches a listener to all RewritePatterns that
logs information about the modified operations.
When the MLIR test suite is run, these debug outputs can
be filtered and combined into an index linking operations to the
patterns that insert, modify, or replace them. This index is intended to
be used to create a website that allows one to look up patterns from an
operation name.
The debug logs emitted can be viewed with --debug-only=generate-pattern-catalog,
and the lit config is modified to do this when the env var MLIR_GENERATE_PATTERN_CATALOG is set.
Example usage:
```
mkdir build && cd build
cmake -G Ninja ../llvm \
-DLLVM_ENABLE_PROJECTS="mlir" \
-DLLVM_TARGETS_TO_BUILD="host" \
-DCMAKE_BUILD_TYPE=DEBUG
ninja -j 24 check-mlir
MLIR_GENERATE_PATTERN_CATALOG=1 bin/llvm-lit -j 24 -v -a tools/mlir/test | grep 'pattern-logging-listener' | sed 's/^# | [pattern-logging-listener] //g' | sort | uniq > pattern_catalog.txt
```
Sample pattern catalog output (that fits in a gist):
https://gist.github.com/j2kun/02d1ab8d31c10d71027724984c89905a
---------
Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
This PR fixes a use-after-free error that happens when `DistinctAttr`
instances are created within a `PassManager` running with crash recovery
enabled. The root cause is that `DistinctAttr` storage is allocated in a
thread_local allocator, which is destroyed when the crash recovery
thread joins, invalidating the storage.
Moreover, even without crash reproduction disabling multithreading on
the context will destroy the context's thread pool, and in turn delete
the threadlocal storage. This means a call to
`ctx->disableMulthithreading()` breaks the IR.
This PR replaces the thread local allocator with a synchronised
allocator that's shared between threads. This persists the lifetime of
allocated DistinctAttr storage instances to the lifetime of the context.
### Problem Details:
The `DistinctAttributeAllocator` uses a
`ThreadLocalCache<BumpPtrAllocator>` for lock-free allocation of
`DistinctAttr` storage in a multithreaded context. The issue occurs when
a `PassManager` is run with crash recovery (`runWithCrashRecovery`), the
pass pipeline is executed on a temporary thread spawned by
`llvm::CrashRecoveryContext`. Any `DistinctAttr`s created during this
execution have their storage allocated in the thread_local cache of this
temporary thread. When the thread joins, the thread_local storage is
destroyed, freeing the `DistinctAttr`s' memory. If this attribute is
accessed later, e.g. when printing, it results in a use-after-free.
As mentioned previously, this is also seen after creating some
`DistinctAttr`s and then calling `ctx->disableMulthithreading()`.
### Solution
`DistinctAttrStorageAllocator` uses a synchronised, shared allocator
instead of one wrapped in a `ThreadLocalCache`. The former is what
stores the allocator in transient thread_local storage.
### Testing:
A C++ unit test has been added to validate this fix. (I was previously
reproducing this failure with `mlir-opt` but I can no longer do so and I
am unsure why.)
-----
Note: This is a 2nd attempt at my previous PR
https://github.com/llvm/llvm-project/pull/128566 that was reverted in
https://github.com/llvm/llvm-project/pull/133000. I believe I've
addressed the TSAN and race condition concerns.
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow for `ShapedType::isStaticDim` to be used in
ADT functions without having to wrap it in a lambda performing the
negation.
Also add the new functions to C and Python bindings.
Currently AffineExpr Add has ability to optimize `"s1 + (s1 // c * -c)"
to "s1 % c"`,
but can not optimize `"(s0 + s1) + (s1 // c * -c)"`.
This patch provide an opportunity to do this simplification, let it can
be simplified to `"s0 + s1 % c"`.
ArrayRef(std::nullopt) just got deprecated. This patch does the same
to MutableArrayRef(std::nullopt). Since there are only a couple of
uses, this patch does migration and deprecation at the same time.
[MLIR] Fix circular dependency introduced in In
https://github.com/llvm/llvm-project/pull/144897. This PR is to break
the dependency. by moving StateStack to IR folder
This commit resolves a circular dependency issue between mlir/Support
and mlir/IR:
- Move StateStack.h and StateStack.cpp from Support to IR folder
- Update CMakeLists.txt files to reflect the new locations
- Update Bazel BUILD file to maintain correct dependencies
- Update includes in affected files (flang, Target/LLVMIR)
The circular dependency was caused by StateStack.h depending on
IR/Visitors.h
while other IR files depended on Support. Moving StateStack to IR
eliminates
this cycle while maintaining proper separation of concerns.
This patch is adding the ability to print a uint8_t/int8_t as an int
instead of a char and demonstrate support for int8_t/uin8_t properties.
Fix#144993
This patch enhances `MemRefType::areTrailingDimsContiguous` to also
handle memrefs with dynamic dimensions.
The implementation itself is based on a new member function
`MemRefType::getMaxCollapsableTrailingDims` that return the maximum
number of trailing dimensions that can be collapsed - trivially all
dimensions for memrefs with identity layout, or by examining the memref
strides stopping at discontiguous or statically unknown strides.
Add a new API to access all blobs that are stored in the blob manager.
The main purpose (as of now) is to allow users of dialect resources to
iterate over all blobs, especially when the blobs are no longer used in
IR (e.g. the operation that uses the blob is deleted) and thus cannot be
easily accessed without manual tracking of keys.
This patch adds the `PtrLikeTypeInterface` type interface to identify
pointer-like types. This interface is defined as:
```
A ptr-like type represents an object storing a memory address. This object
is constituted by:
- A memory address called the base pointer. This pointer is treated as a
bag of bits without any assumed structure. The bit-width of the base
pointer must be a compile-time constant. However, the bit-width may remain
opaque or unavailable during transformations that do not depend on the
base pointer. Finally, it is considered indivisible in the sense that as
a `PtrLikeTypeInterface` value, it has no metadata.
- Optional metadata about the pointer. For example, the size of the memory
region associated with the pointer.
Furthermore, all ptr-like types have two properties:
- The memory space associated with the address held by the pointer.
- An optional element type. If the element type is not specified, the
pointer is considered opaque.
```
This patch adds this interface to `!ptr.ptr` and the `memref` type.
Furthermore, this patch adds necessary ops and type to handle casting
between `!ptr.ptr` and ptr-like types.
First, it defines the `!ptr.ptr_metadata` type. An opaque type to
represent the metadata of a ptr-like type. The rationale behind adding
this type, is that at high-level the metadata of a type like `memref`
cannot be specified, as its structure is tied to its lowering.
The `ptr.get_metadata` operation was added to extract the opaque pointer
metadata. The concrete structure of the metadata is only known when the
op is lowered.
Finally, this patch adds the `ptr.from_ptr` and `ptr.to_ptr` operations.
Allowing to cast back and forth between `!ptr.ptr` and ptr-like types.
```mlir
func.func @func(%mr: memref<f32, #ptr.generic_space>) -> memref<f32, #ptr.generic_space> {
%ptr = ptr.to_ptr %mr : memref<f32, #ptr.generic_space> -> !ptr.ptr<#ptr.generic_space>
%mda = ptr.get_metadata %mr : memref<f32, #ptr.generic_space>
%res = ptr.from_ptr %ptr metadata %mda : !ptr.ptr<#ptr.generic_space> -> memref<f32, #ptr.generic_space>
return %res : memref<f32, #ptr.generic_space>
}
```
It's future work to replace and remove the `bare-ptr-convention` through
the use of these ops.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
This patch simplifies code by removing the values from
insert/try_emplace. Note that default values inserted by try_emplace
are immediately overrideen in all these cases.
This PR adds `arith.scaling_truncf` and `arith.scaling_extf` operations
which supports the block quantization following OCP MXFP specs listed
here
https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf
OCP MXFP Spec comes with reference implementation here
https://github.com/microsoft/microxcaling/tree/main
Interesting piece of reference code is this method `_quantize_mx`
7bc41952de/mx/mx_ops.py (L173).
Both `arith.scaling_truncf` and `arith.scaling_extf` are designed to be
an elementwise operation. Please see description about them in
`ArithOps.td` file for more details.
Internally,
`arith.scaling_truncf` does the
`arith.truncf(arith.divf(input/(2^scale)))`. `scale` should have
necessary broadcast, clamping, normalization and NaN propagation done
before callling into `arith.scaling_truncf`.
`arith.scaling_extf` does the `arith.mulf(2^scale, input)` after taking
care of necessary data type conversions.
CC: @krzysz00 @dhernandez0 @bjacob @pashu123 @MaheshRavishankar
@tgymnich
---------
Co-authored-by: Prashant Kumar <pk5561@gmail.com>
Co-authored-by: Krzysztof Drewniak <Krzysztof.Drewniak@amd.com>
- try_emplace(Key) is shorter than insert({Key, nullptr}).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
We already have hasOneUse. Like llvm::Value we provide helper methods to
query the number of uses of a Value. Add unittests for Value, because
that was missing.
---------
Co-authored-by: Michael Maitland <michaelmaitland@meta.com>
The original implementation will create two intermediate AffineMap in
the context, calling this compose function with different values
multiple times will occupy a lot of memory.
To improve the performance, we can call the AffineExpr::replace
directly, so we don't need to store all combinations of values in the
context.
The forms of the MemRef builder that took an integer argument instead of
an attribute have been deprecated for years now, and have almost no
upstream uses (the remaining ones are handled in this PR). Therefore,
remove them.
This commit refactors the getStridesAndOffet() method on MemRefType to
just call `MemRefLayoutAttrInterface::getStridesAndOffset(shape,
strides& offset&)`, allowing downstream users and future layouts (ex, a
potential contiguous layout) to implement it without needing to patch
BuiltinTypes or without needing them to conform their affine maps to the
canonical strided form.