92 Commits

Author SHA1 Message Date
Mehdi Amini
62b29d9f76
[MLIR] Adopt LDBG() debug macro in BytecodeWriter.cpp (NFC) (#154642) 2025-08-20 22:45:39 +00:00
Hank
c075fb8c37
[MLIR] Fix duplicated attribute nodes in MLIR bytecode deserialization (#151267)
Fixes #150163 

MLIR bytecode does not preserve alias definitions, so each attribute
encountered during deserialization is treated as a new one. This can
generate duplicate `DISubprogram` nodes during deserialization.

The patch adds a `StringMap` cache that records attributes and fetches
them when encountered again.
2025-08-20 13:03:26 +00:00
Kazu Hirata
b3b8a097fe
[mlir] Use *Map::try_emplace (NFC) (#143341)
- try_emplace(Key) is shorter than insert({Key, nullptr}).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
2025-06-09 07:18:26 -07:00
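For readers unfamiliar with the idiom described in the entry above, here is a minimal sketch using `std::unordered_map::try_emplace` from the C++17 standard library as a stand-in for the LLVM map types. The helper name `getOrAssignID` is hypothetical and is not taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper: assigns a dense ID to each distinct name and returns
// the existing ID if the name was already numbered.
unsigned getOrAssignID(std::unordered_map<std::string, unsigned> &ids,
                       const std::string &name) {
  // try_emplace(key) value-initializes the mapped value (0 here) only when
  // the key is new; `inserted` reports whether that happened.
  auto [it, inserted] = ids.try_emplace(name);
  if (inserted)
    it->second = static_cast<unsigned>(ids.size()) - 1; // overwrite on insertion
  return it->second;
}
```

Compared with `insert({name, 0})`, no placeholder value has to be spelled out at the call site.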
Michael Maitland
7454098a9e
[mlir][Value] Add getNumUses, hasNUses, and hasNUsesOrMore to Value (#142084)
We already have hasOneUse. Like llvm::Value we provide helper methods to
query the number of uses of a Value. Add unittests for Value, because
that was missing.

---------

Co-authored-by: Michael Maitland <michaelmaitland@meta.com>
2025-05-30 00:39:45 -04:00
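The three helpers named in the entry above can be illustrated with a toy type whose uses form a forward-only list. This is only a sketch: `ToyValue` is hypothetical and is not the MLIR `Value` class, but it shows why the `hasN...` queries can stop early instead of counting every use.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <iterator>

// Hypothetical stand-in for a value whose uses can only be walked in order.
struct ToyValue {
  std::forward_list<int> uses; // each element represents one use

  // Walks the whole list, so the cost is linear in the number of uses.
  std::size_t getNumUses() const {
    return static_cast<std::size_t>(std::distance(uses.begin(), uses.end()));
  }

  // True iff there are exactly n uses; visits at most n uses.
  bool hasNUses(std::size_t n) const {
    auto it = uses.begin();
    for (; n != 0 && it != uses.end(); --n)
      ++it;
    return n == 0 && it == uses.end();
  }

  // True iff there are at least n uses; visits at most n uses.
  bool hasNUsesOrMore(std::size_t n) const {
    auto it = uses.begin();
    for (; n != 0 && it != uses.end(); --n)
      ++it;
    return n == 0;
  }
};
```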
Kazu Hirata
52e3f3d68c
[mlir] Use llvm::make_first_range (NFC) (#135900) 2025-04-15 23:17:33 -07:00
Kazu Hirata
eb7f51485e
[mlir] Use llvm::append_range (NFC) (#135722) 2025-04-14 22:22:04 -07:00
Han-Chung Wang
66b0b0466b
[MLIR][NFC] Fix incomplete boundary comments. (#133516)
I observed that we have the boundary comments in the codebase like:

```
//===----------------------------------------------------------------------===//
// ...
//===----------------------------------------------------------------------===//
```

I also observed that there are incomplete boundary comments. The
revision is generated by a script that completes the boundary comments.

```
//===----------------------------------------------------------------------===//
// ...

...
```

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-03-31 09:29:54 -07:00
Nikhil Kalra
f3a29906aa
[mlir] BytecodeWriter: invoke reserveExtraSpace (#126953)
Update `BytecodeWriter` to invoke `reserveExtraSpace` on the stream
before writing to it. This will give clients implementing custom output
streams the opportunity to allocate an appropriately sized buffer for
the write.
2025-02-12 14:17:30 -08:00
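The size-hint pattern described in the entry above might look roughly like the following. All class and function names here (`ToyStream`, `ToyBufferStream`, `writeAll`) are hypothetical; only the hook name `reserveExtraSpace` comes from the commit message, and the real stream interface may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical minimal stream interface with an optional size hint.
class ToyStream {
public:
  virtual ~ToyStream() = default;
  // Default behaviour: ignore the hint.
  virtual void reserveExtraSpace(std::uint64_t /*extraSize*/) {}
  virtual void write(const std::string &bytes) = 0;
};

// A custom stream that uses the hint to allocate its buffer once up front.
class ToyBufferStream : public ToyStream {
public:
  void reserveExtraSpace(std::uint64_t extraSize) override {
    buffer.reserve(buffer.size() + static_cast<std::size_t>(extraSize));
  }
  void write(const std::string &bytes) override { buffer += bytes; }
  std::string buffer;
};

// Hypothetical writer: announces the payload size before writing it.
void writeAll(ToyStream &os, const std::string &payload) {
  os.reserveExtraSpace(payload.size());
  os.write(payload);
}
```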
Karim Nosseir
7fa57cd430
[MLIR] Add move constructor to BytecodeWriterConfig (#126130)
The config is currently not movable and because there are constructors
the default move won't be generated, which prevents it from being moved.
Also, it is not copyable because of the unique_ptr. This PR adds a move
constructor to allow moving it.
2025-02-06 21:30:55 -08:00
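A sketch of the situation described in the entry above, using a hypothetical `ToyConfig` class rather than the real `BytecodeWriterConfig`. It assumes the implicit move operations are suppressed by a user-declared destructor, which is the usual case for a class holding a `unique_ptr` to an implementation struct; the actual class may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical pimpl-style config class.
class ToyConfig {
public:
  explicit ToyConfig(std::string producer)
      : impl(std::make_unique<Impl>(Impl{std::move(producer)})) {}
  // A user-declared destructor suppresses the implicit move operations...
  ~ToyConfig() = default;
  // ...so they are declared explicitly to make the type movable.
  ToyConfig(ToyConfig &&) = default;
  ToyConfig &operator=(ToyConfig &&) = default;

  const std::string &getProducer() const { return impl->producer; }

private:
  struct Impl {
    std::string producer;
  };
  std::unique_ptr<Impl> impl; // makes the class non-copyable
};

static_assert(std::is_move_constructible_v<ToyConfig>, "movable");
static_assert(!std::is_copy_constructible_v<ToyConfig>, "not copyable");
```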
Kazu Hirata
4f4e2abb1a
[mlir] Migrate away from PointerUnion::{is,get} (NFC) (#122591)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2025-01-11 13:16:43 -08:00
Wang Qiang
b77e40265c
[llvm][NFC] Fix typos: replace “avaliable” with “available” across various files (#114524)
This pull request corrects multiple occurrences of the typo "avaliable"
to "available" across the LLVM and Clang codebase. These changes improve
the clarity and accuracy of comments and documentation. Specific
modifications are in the following files:

1. clang-tools-extra/clang-tidy/readability/FunctionCognitiveComplexityCheck.cpp:
Updated comments in readability checks for cognitive complexity.
2. llvm/include/llvm/ExecutionEngine/Orc/ExecutionUtils.h: Corrected
documentation for JITDylib responsibilities.
3. llvm/include/llvm/Target/TargetMacroFusion.td: Fixed descriptions for
FusionPredicate variables.
4. llvm/lib/CodeGen/SafeStack.cpp: Improved comments on DominatorTree
availability.
5. llvm/lib/Target/RISCV/RISCVSchedSiFive7.td: Enhanced resource usage
descriptions for vector units.
6. llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp: Updated invariant
description in shift-detect idiom logic.
7. llvm/test/MC/ARM/mve-fp-registers.s: Amended ARM MVE register
availability notes.
8. mlir/lib/Bytecode/Reader/BytecodeReader.cpp: Adjusted forward
reference descriptions for bytecode reader operations.

These changes have no impact on code functionality, focusing solely on
documentation clarity.

Co-authored-by: wangqiang <wangqiang1@kylinos.cn>
2024-11-01 13:25:04 +00:00
Kazu Hirata
85c97c1cec
[Bytecode] Avoid repeated hash lookups (NFC) (#108320) 2024-09-12 00:51:38 -07:00
Kevin Gleason
d1578848e9
Add logging for emit functions in BytecodeWriter.cpp (#99558)
Recently there was a change to materializing unrealized conversion
casts, which inserted conversions that previously did not exist during
legalization (https://github.com/llvm/llvm-project/pull/97903). After
these casts are inserted and then washed away once the transformation
completes, the use-list ordering of an op changed in some
cases: `my.add %arg0(use1), %arg0(use2) --> my.add %arg0(use2),
%arg0(use1)`. This subtly changes the bytecode emitted, since this is
considered a custom use-list.

When investigating why the bytecode had changed, I added the following
logging, which helped track down the difference; in my case it showed
extra bytes with "use-list section". Running with
`-debug-only=mlir-bytecode-writer` emits logs like the following,
detailing the source of written bytes:

```
emitBytes(4b)	bytecode header
emitVarInt(6)	bytecode version
emitByte(13)	bytecode version
emitBytes(17b)	bytecode producer
emitByte(0)	null terminator
emitVarInt(2)	dialects count
...
emitByte(5)	dialect version
emitVarInt(4)	op names count
emitByte(9)	op names count
emitVarInt(0)	dialect number
...
emitVarInt(2)	dialect writer
emitByte(5)	dialect writer
emitVarInt(9259963783827161088)	dialect APInt
...
emitVarInt(3)	attr/type offset
emitByte(7)	attr/type offset
emitByte(3)	section code
emitVarInt(18)	section size
...
```

Note: this uses string constants and `StringLiteral`. I'm not sure if
these are washed away during compilation / OK to have around for
debugging, or if there's a better way to do this. The alternative was
adding many braces and `LLVM_DEBUG` calls at each callsite, but this felt
more error prone / likely to miss some callsites.
2024-07-20 12:57:50 -05:00
Hideto Ueno
c0084c36ed
[mlir][BytecodeReader] Const qualify *SectionReader, NFC (#99376)
`StringSectionReader`, `ResourceSectionReader` and
`PropertiesSectionReader` are immutable after `initialize` so this PR
adds const to their parsing functions and references in `AttrTypeReader`
and `DialectReader`.
2024-07-18 19:45:36 +09:00
Jacques Pienaar
f1ac7725e4
[mlir] Remove bytecode reader & writer header from interface. (#98920)
Flagged some additional headers missing in process.

Inspired by #98676
2024-07-15 16:09:22 -07:00
Jeff Niu
af7ee51a90
[mlir][bytecode] Fix external resource bytecode parsing (#97650)
The key was being dropped for external resources because they aren't
present in the dialect resource name mapper.
2024-07-03 15:32:45 -07:00
Ramkumar Ramachandra
db791b278a
mlir/LogicalResult: move into llvm (#97309)
This patch is part of a project to move the Presburger library into
LLVM.
2024-07-02 10:42:33 +01:00
Jeff Niu
fb771fe315
[mlir] Slightly optimize bytecode op numbering (#88310)
If the bytecode encoding supports properties, then the dictionary
attribute is always the raw dictionary attribute of the operation,
regardless of what it contains. Otherwise, get the dictionary attribute
from the op: if the op does not have properties, then it returns the raw
dictionary, otherwise it returns the combined inherent and discardable
attributes.
2024-04-10 23:34:48 +02:00
Matteo Franciolini
da092e8808
Fix bytecode roundtrip of unregistered ops (#82932)
When roundtripping to bytecode an unregistered operation name that does
not contain any '.' separator, the bytecode writer will emit an op
encoding without a proper opName. In this case, the string just becomes
a possibly unknown dialect name. At parsing, this dialect name is
used as a proper operation name.

However, when the unregistered operation name coincidentally matches
that of a dialect, the parser would fail. That means we can't roundtrip
an unregistered op with a name that matches one of the registered
dialect names. For example,

```
"index"() : () -> ()
```

can be emitted but cannot be parsed, because its name is coincidentally
the same as that of the Index dialect. The patch removes such
inconsistency.

This patch specifically fixes the bytecode roundtrip of
`mlir/test/IR/parser.mlir`.
2024-02-25 16:18:42 -08:00
Matteo Franciolini
5375cbfb62
Fix pipeline-invalid.mlir bytecode roundtrip test (#82366)
If an op was not contained in a region when it was written to bytecode,
we don't have an initialized valueScope with forward references to
define.
2024-02-20 21:40:36 -08:00
Alex Zinenko
5ed11e767c [mlir] don't use magic numbers in IRNumbering.cpp
Bytecode versions have named constants that should be used instead of
magic numbers.
2024-01-04 09:49:34 +00:00
Alex Zinenko
985bb3a20a [mlir] fix bytecode writer after c1eab57673ef3eb28
The change in c1eab57 fixed the
behavior of `getDiscardableAttrDictionary` for ops that are not using
properties to only return discardable attributes. Bytecode writer was
relying on the wrong behavior and would assume all attributes are
discardable, without appropriate testing. Fix that and add a test.
2024-01-04 09:49:34 +00:00
Kazu Hirata
88d319a29f [mlir] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 22:58:30 -08:00
Matteo Franciolini
4488f4933e
[mlir][bytecode] Add bytecode writer config API to skip serialization of resources (#71991)
When serializing to bytecode, users can select the option to elide
resources from the bytecode file. This will instruct the bytecode writer
to serialize only the key and resource kind, while skipping
serialization of the data buffer. At parsing, the IR is built in memory
with valid (but empty) resource handlers.
2023-11-13 12:59:30 -06:00
Matteo Franciolini
7ad9e9dcf5
[mlir][bytecode] Implements back deployment capability for MLIR dialects (#70724)
When emitting bytecode, clients can specify a target dialect version to
emit in `BytecodeWriterConfig`. This exposes a target dialect version to
the DialectBytecodeWriter, which can be queried by name and used to
back-deploy attributes, types, and properties.
2023-10-31 15:41:29 -07:00
Mehdi Amini
5e458f5aef Apply clang-tidy fixes for llvm-qualified-auto in IRNumbering.cpp (NFC) 2023-10-21 17:31:37 -07:00
Mogball
2b5134f1b7 [mlir] Fix bytecode reading of resource sections
This partially reverts #66380, specifically the assertion that the
underlying buffer of an EncodingReader is aligned to any required
alignments for resource sections. Resources know their own alignment and
pad their buffers
accordingly, but the bytecode reader doesn't know that ahead of time.
Consequently, it cannot give the resource EncodingReader a base buffer
aligned to the maximum required alignment.

A simple example from the test fails without this:

```mlir
module @TestDialectResources attributes {
  bytecode.test = dense_resource<resource> : tensor<4xi32>
} {}
{-#
  dialect_resources: {
    builtin: {
      resource: "0x2000000001000000020000000300000004000000",
      resource_2: "0x2000000001000000020000000300000004000000"
    }
  }
#-}
```
2023-09-29 18:39:56 -07:00
Christian Sigg
1c8c365de2
[mlir][bytecode] Check that bytecode source buffer is sufficiently aligned. (#66380)
Before this change, the `ByteCode` test failed on CentOS 7 with
devtoolset-9, because strings happen to be only 8 byte aligned. In
general though, strings have no alignment guarantee.

Increase resource alignment in test to 32 bytes. 
Adjust test to sufficiently align buffer.
Add test to check error when buffer is insufficiently aligned.
2023-09-17 13:46:01 +02:00
Adrian Kuegel
7d6fb14057 [mlir] Apply ClangTidy fix (NFC)
redundant get() call on smart pointer.
2023-08-07 15:45:36 +02:00
Mehdi Amini
2ef44aa443 [MLIR][Bytecode] Add missing field initializer in constructor initializer list
Leaving this field uninitialized could lead to crashes when it diverges from the
IRNumbering phase.

Differential Revision: https://reviews.llvm.org/D156965
2023-08-02 23:31:01 -07:00
Matteo Franciolini
bff6a4292f Expose callbacks for encoding of types/attributes
[mlir] Expose a mechanism to provide a callback for encoding types and attributes in MLIR bytecode.

Two callbacks are exposed, respectively, to the BytecodeWriterConfig and to the ParserConfig. At bytecode parsing/printing, clients have the ability to specify a callback to be used to optionally read/write the encoding. On failure, fallback path will execute the default parsers and printers for the dialect.

Testing shows how to leverage this functionality to support back-deployment and backward-compatibility usecases when roundtripping to bytecode a client dialect with type/attributes dependencies on upstream.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D153383
2023-07-28 16:45:42 -07:00
Mehdi Amini
b86a13211f Revert "Expose callbacks for encoding of types/attributes"
This reverts commit b299ec16661f653df66cdaf161cdc5441bc9803c.

The authorship information was incorrect.
2023-07-28 16:45:42 -07:00
Mehdi Amini
b299ec1666 Expose callbacks for encoding of types/attributes
[mlir] Expose a mechanism to provide a callback for encoding types and attributes in MLIR bytecode.

Two callbacks are exposed, respectively, to the BytecodeWriterConfig and to the ParserConfig. At bytecode parsing/printing, clients have the ability to specify a callback to be used to optionally read/write the encoding. On failure, fallback path will execute the default parsers and printers for the dialect.

Testing shows how to leverage this functionality to support back-deployment and backward-compatibility usecases when roundtripping to bytecode a client dialect with type/attributes dependencies on upstream.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D153383
2023-07-28 10:44:02 -07:00
River Riddle
0e6000f647 [mlir:bytecode] Only visit the all regions path if the op has regions
Zero region operations return true for both isBeforeAllRegions and
isAfterAllRegions when using WalkStage. The bytecode walk only
expects region holding operations in the after regions path, so
guard against that.
2023-07-25 16:06:33 -07:00
River Riddle
4af01bf956 [mlir:bytecode] Support lazy loading dynamically isolated regions
We currently only support lazy loading for regions that
statically implement the IsolatedFromAbove trait, but that
limits the number of operations that can be lazily loaded. This review
lifts that restriction by computing which operations have isolated
regions when numbering, allowing any operation to be lazily loaded
as long as it doesn't use values defined above.

Differential Revision: https://reviews.llvm.org/D156199
2023-07-25 15:55:34 -07:00
River Riddle
5ab6589551 [mlir:bytecode] Fix bytecode lazy loading for ops with multiple regions
We currently encode each region as a separate section, but
the reader expects all of the regions to be in the same section.
This updates the writer to match the behavior that the reader
expects.

Differential Revision: https://reviews.llvm.org/D156198
2023-07-25 15:55:34 -07:00
Mehdi Amini
9ea6b30ac2 Update ODS variadic segments "magic" attributes to use native Properties
The operand_segment_sizes and result_segment_sizes Attributes are now inlined
in the operation as native properties. We continue to support building an
Attribute on the fly for `getAttr("operand_segment_sizes")` and setting the
property from an attribute with `setAttr("operand_segment_sizes", attr)`.

A new bytecode version is introduced to support backward compatibility and
backdeployments.

Differential Revision: https://reviews.llvm.org/D155919
2023-07-24 18:16:58 -07:00
Mehdi Amini
a7cd64c9f1 Revert "Update ODS variadic segments "magic" attributes to use native Properties"
This reverts commit 20b93abca6516bbb23689c3777536fea04e46e14.

One python test is broken, WIP.
2023-07-24 12:27:42 -07:00
Mehdi Amini
20b93abca6 Update ODS variadic segments "magic" attributes to use native Properties
The operand_segment_sizes and result_segment_sizes Attributes are now inlined
in the operation as native properties. We continue to support building an
Attribute on the fly for `getAttr("operand_segment_sizes")` and setting the
property from an attribute with `setAttr("operand_segment_sizes", attr)`.

A new bytecode version is introduced to support backward compatibility and
backdeployments.

Differential Revision: https://reviews.llvm.org/D155919
2023-07-24 11:37:57 -07:00
Andrzej Warzynski
79c83e12c8 [mlir][VectorType] Allow arbitrary dimensions to be scalable
At the moment, only the trailing dimensions in the vector type can be
scalable, i.e. this is supported:

    vector<2x[4]xf32>

and this is not allowed:

    vector<[2]x4xf32>

This patch extends the vector type so that arbitrary dimensions can be
scalable. To this end, an array of bool values is added to every vector
type to denote whether the corresponding dimensions are scalable or not.
For example, for this vector:

  vector<[2]x[3]x4xf32>

the following array would be created:

  {true, true, false}.

Additionally, the current syntax:

  vector<[2x3]x4xf32>

is replaced with:

  vector<[2]x[3]x4xf32>

This is primarily to simplify parsing (this way, the parser can easily
process one dimension at a time rather than e.g. tracking whether
"scalable block" has been entered/left).

NOTE: The `isScalableDim` parameter of `VectorType` (introduced in this
patch) makes `numScalableDims` redundant. For the time being,
`numScalableDims` is preserved to facilitate the transition between the
two parameters. `numScalableDims` will be removed in one of the
subsequent patches.

This change is a part of a larger effort to enable scalable
vectorisation in Linalg. See this RFC for more context:
  * https://discourse.llvm.org/t/rfc-scalable-vectorisation-in-linalg/

Differential Revision: https://reviews.llvm.org/D153372
2023-06-27 19:21:59 +01:00
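The per-dimension representation described in the entry above (a shape plus a parallel array of bools) can be sketched as a small printer. `printToyVectorType` is a hypothetical helper, not the MLIR implementation; it only illustrates how the new one-bracket-per-dimension syntax follows from the bool array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical printer: renders a vector type from its shape and a parallel
// per-dimension "is scalable" array.
std::string printToyVectorType(const std::vector<std::int64_t> &shape,
                               const std::vector<bool> &scalableDims,
                               const std::string &elementType) {
  assert(shape.size() == scalableDims.size() && "one flag per dimension");
  std::string out = "vector<";
  for (std::size_t i = 0; i < shape.size(); ++i) {
    // Each scalable dimension gets its own brackets, so dimensions can be
    // handled one at a time without tracking a "scalable block".
    if (scalableDims[i])
      out += "[" + std::to_string(shape[i]) + "]";
    else
      out += std::to_string(shape[i]);
    out += "x";
  }
  return out + elementType + ">";
}
```

With shape `{2, 3, 4}` and flags `{true, true, false}` this yields `vector<[2]x[3]x4xf32>`, matching the example in the message.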
River Riddle
6ee1aba8ac [mlir][bytecode] Fix lazy loading of non-isolated regions
The bytecode reader currently assumes all regions can be lazy
loaded, which breaks reading any non-isolated region. This patch
fixes that by properly handling nested non-lazy regions, and only
considers isolated regions as lazy.

Differential Revision: https://reviews.llvm.org/D153795
2023-06-26 16:33:20 -07:00
Ulrich Weigand
bb0bbed610 Fix bytecode reader/writer on big-endian platforms
This makes the bytecode reader/writer work on big-endian platforms.
The only problem was related to encoding of multi-byte integers,
where both reader and writer code make implicit assumptions about
endianness of the host platform.

This fixes the current test failures on s390x, and in addition allows
removing the UNSUPPORTED markers from all other bytecode-related
test cases - they now also all pass on s390x.

Also adding a GTEST_SKIP to the MultiModuleWithResource unit test,
as this still fails due to an unrelated endian bug regarding
decoding of external resources.

Differential Revision: https://reviews.llvm.org/D153567

Reviewed By: mehdi_amini, jpienaar, rriddle
2023-06-23 09:22:55 +02:00
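The kind of host-independent integer encoding that the entry above refers to can be sketched as follows. This assumes a little-endian on-disk layout for fixed-width integers; the function names are hypothetical and the real reader and writer may be structured differently.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Serializes a 64-bit value least-significant byte first, using shifts
// rather than reinterpreting host memory, so the result is the same on
// little-endian and big-endian hosts.
std::array<std::uint8_t, 8> toLittleEndian(std::uint64_t value) {
  std::array<std::uint8_t, 8> bytes{};
  for (unsigned i = 0; i < 8; ++i)
    bytes[i] = static_cast<std::uint8_t>(value >> (8 * i));
  return bytes;
}

// Inverse of toLittleEndian, again independent of host byte order.
std::uint64_t fromLittleEndian(const std::array<std::uint8_t, 8> &bytes) {
  std::uint64_t value = 0;
  for (unsigned i = 0; i < 8; ++i)
    value |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
  return value;
}
```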
Mehdi Amini
9c1e55873e Use symbolic name for previous MLIR Bytecode versions
Reviewed By: jpienaar, burmako

Differential Revision: https://reviews.llvm.org/D151621
2023-06-06 01:19:56 -07:00
Kevin Gleason
0ee4875ddf [mlir][bytecode] Error if requested bytecode version is unsupported
Currently desired bytecode version is clamped to the maximum. This allows requesting bytecode versions that do not exist. We have added callsite validation for this in StableHLO to ensure we don't pass an invalid version number, probably better if this is managed upstream. If a user wants to use the current version, then omitting `setDesiredBytecodeVersion` is the best way to do that (as opposed to providing a large number).

Adding this check will also properly error on older version numbers as we increment the minimum supported version. Silently clamping on minimum version would likely lead to unintentional forward incompatibilities.

Separately, due to bytecode version being `int64_t` and using methods to read/write uints, we can generate payloads with invalid version numbers:

```
mlir-opt file.mlir --emit-bytecode --emit-bytecode-version=-1 | mlir-opt
<stdin>:0:0: error: bytecode version 18446744073709551615 is newer than the current version 5
```

This is fixed with version bounds checking as well.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D151838
2023-05-31 19:20:42 -07:00
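The bounds check described in the entry above can be sketched like this. The current version `5` is taken from the error message quoted in the commit; the minimum of `0` and the function name `validateVersion` are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

constexpr std::int64_t kMinSupportedVersion = 0; // assumption
constexpr std::int64_t kCurrentVersion = 5;      // from the quoted error

// Rejects out-of-range requests instead of clamping them. Checking the
// signed value first means -1 never reaches an unsigned write, where it
// would wrap around to 18446744073709551615.
std::optional<std::uint64_t> validateVersion(std::int64_t requested) {
  if (requested < kMinSupportedVersion || requested > kCurrentVersion)
    return std::nullopt;
  return static_cast<std::uint64_t>(requested);
}
```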
Haojian Wu
5217498dc8 [mlir][bazel] Port for 660f714e26999d266232a1fbb02712bb879bd34e 2023-05-27 08:05:19 +02:00
Jie Fu
5e8ed850d3 [mlir] Fix non-const lvalue reference to type 'uint64_t' cannot bind to type 'size_t' error (NFC)
/Users/jiefu/llvm-project/mlir/lib/Bytecode/Reader/BytecodeReader.cpp:1007:39: error: non-const lvalue reference to type 'uint64_t' (aka 'unsigned long long') cannot bind to a value of unrelated type 'size_t' (aka 'unsigned long')
    if (failed(propReader.parseVarInt(count)))
                                      ^~~~~
/Users/jiefu/llvm-project/mlir/lib/Bytecode/Reader/BytecodeReader.cpp:191:39: note: passing argument to parameter 'result' here
  LogicalResult parseVarInt(uint64_t &result) {
                                      ^
/Users/jiefu/llvm-project/mlir/lib/Bytecode/Reader/BytecodeReader.cpp:1020:44: error: non-const lvalue reference to type 'uint64_t' (aka 'unsigned long long') cannot bind to a value of unrelated type 'size_t' (aka 'unsigned long')
      if (failed(offsetsReader.parseVarInt(dataSize)) ||
                                           ^~~~~~~~
/Users/jiefu/llvm-project/mlir/lib/Bytecode/Reader/BytecodeReader.cpp:191:39: note: passing argument to parameter 'result' here
  LogicalResult parseVarInt(uint64_t &result) {
                                      ^
2 errors generated.
2023-05-27 09:53:10 +08:00
Mehdi Amini
660f714e26 [MLIR] Add native Bytecode support for properties
This is adding a new interface (`BytecodeOpInterface`) to allow operations to
opt-in skipping conversion to attribute and serializing properties to native
bytecode.

The scheme relies on a new section where properties are stored in sequence

  { size, serialize_properties }, ...

Operations store the index of their properties; a table of offsets is
built when the properties section is loaded for the first time.

This is a re-commit of 837d1ce0dc which conflicted with another patch upgrading
the bytecode and the collision wasn't properly resolved before.

Differential Revision: https://reviews.llvm.org/D151065
2023-05-26 17:45:01 -07:00
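The offset table mentioned in the entry above can be sketched for a section laid out as a sequence of `{ size, payload }` records. This is a simplification under stated assumptions: the size is a single byte here purely for brevity, and `buildOffsetTable` is a hypothetical name, not the reader's actual function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Scans a section of { size, payload } records once and records where each
// record starts, so that record i can later be found by index.
// Returns std::nullopt if a record runs past the end of the section.
std::optional<std::vector<std::size_t>>
buildOffsetTable(const std::vector<std::uint8_t> &section) {
  std::vector<std::size_t> offsets;
  std::size_t pos = 0;
  while (pos < section.size()) {
    std::size_t size = section[pos]; // assumption: one-byte size field
    if (size > section.size() - pos - 1)
      return std::nullopt; // truncated record
    offsets.push_back(pos);
    pos += 1 + size; // skip the size byte and the payload
  }
  return offsets;
}
```

An operation that stores index `i` can then locate its record at `offsets[i]` without rescanning the section.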
Mehdi Amini
bb9a0c736b Revert "[MLIR] Add native Bytecode support for properties"
This reverts commit ca5a12fd69d4acf70c08f797cbffd714dd548348
and follow-up fixes:

df34c288c428eb4b867c8075def48b3d1727d60b
07dc906883af660780cf6d0cc1044f7e74dab83e
ab80ad0095083fda062c23ac90df84c40b4332c8
837d1ce0dc8eec5b17255291b3462e6296cb369b

The first commit was incomplete and broken, I'll prepare a new version
later, in the meantime pull this work out of tree.
2023-05-25 21:02:31 -07:00
Mehdi Amini
df34c288c4 Fix MLIR Bytecode backward deployment
The condition for guarding the properties section was reversed.
2023-05-25 20:40:03 -07:00
Eugene Burmako
07dc906883 Fix MLIR back-deployment to version < 5 ; properties section should not be emitted.
This was an oversight in the development of bytecode version 5, which was
caught by downstream StableHLO compatibility tests.

Differential revision: https://reviews.llvm.org/D151531
2023-05-25 20:31:47 -07:00