llvm-project

Author	SHA1	Message	Date
Yang Bai	4eb1a07d7d	[mlir][vector] Support multi-dimensional vectors in VectorFromElementsLowering (#151175 ) This patch introduces a new unrolling-based approach for lowering multi-dimensional `vector.from_elements` operations. Implementation Details: 1. New Transform Pattern: Added `UnrollFromElements` that unrolls an N-D (N>=2) from_elements op into (N-1)-D from_elements ops along the outermost dimension. 2. Utility Functions: Added `unrollVectorOp` to reuse the unroll algo of vector.gather for vector.from_elements. 3. Integration: Added the unrolling pattern to the convert-vector-to-llvm pass as a temporary transformation. 4. Use direct LLVM dialect operations instead of intermediate vector.insert operations for efficiency in `VectorFromElementsLowering`. Example: ```mlir // unroll %v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32> => %poison_2d = ub.poison : vector<2x2xf32> %vec_1d_0 = vector.from_elements %e0, %e1 : vector<2xf32> %vec_2d_0 = vector.insert %vec_1d_0, %poison_2d [0] : vector<2xf32> into vector<2x2xf32> %vec_1d_1 = vector.from_elements %e2, %e3 : vector<2xf32> %result = vector.insert %vec_1d_1, %vec_2d_0 [1] : vector<2xf32> into vector<2x2xf32> // convert-vector-to-llvm %v = vector.from_elements %e0, %e1, %e2, %e3 : vector<2x2xf32> => %poison_2d = ub.poison : vector<2x2xf32> %poison_2d_cast = builtin.unrealized_conversion_cast %poison_2d : vector<2x2xf32> to !llvm.array<2 x vector<2xf32>> %poison_1d_0 = llvm.mlir.poison : vector<2xf32> %c0_0 = llvm.mlir.constant(0 : i64) : i64 %vec_1d_0_0 = llvm.insertelement %e0, %poison_1d_0[%c0_0 : i64] : vector<2xf32> %c1_0 = llvm.mlir.constant(1 : i64) : i64 %vec_1d_0_1 = llvm.insertelement %e1, %vec_1d_0_0[%c1_0 : i64] : vector<2xf32> %vec_2d_0 = llvm.insertvalue %vec_1d_0_1, %poison_2d_cast[0] : !llvm.array<2 x vector<2xf32>> %poison_1d_1 = llvm.mlir.poison : vector<2xf32> %c0_1 = llvm.mlir.constant(0 : i64) : i64 %vec_1d_1_0 = llvm.insertelement %e2, %poison_1d_1[%c0_1 : i64] : vector<2xf32> 
%c1_1 = llvm.mlir.constant(1 : i64) : i64 %vec_1d_1_1 = llvm.insertelement %e3, %vec_1d_1_0[%c1_1 : i64] : vector<2xf32> %vec_2d_1 = llvm.insertvalue %vec_1d_1_1, %vec_2d_0[1] : !llvm.array<2 x vector<2xf32>> %result = builtin.unrealized_conversion_cast %vec_2d_1 : !llvm.array<2 x vector<2xf32>> to vector<2x2xf32> ``` --------- Co-authored-by: Nicolas Vasilache <Nico.Vasilache@amd.com> Co-authored-by: Yang Bai <yangb@nvidia.com> Co-authored-by: James Newling <james.newling@gmail.com> Co-authored-by: Diego Caballero <dieg0ca6aller0@gmail.com>	2025-08-18 10:09:12 -07:00
Nishant Patel	4a9d038acd	[MLIR][XeGPU] Distribute load_nd/store_nd/prefetch_nd with offsets from Wg to Sg (#153432 ) This PR adds pattern to distribute the load/store/prefetch nd ops with offsets from workgroup to subgroup IR. This PR is part of the transition to move offsets from create_nd to load/store/prefetch nd ops. Create_nd PR : #152351	2025-08-18 09:45:29 -07:00
Jeremy Kun	c67d27dad0	[mlir][Presburger] NFC: return var index from IntegerRelation::addLocalFloorDiv (#153463 ) addLocalFloorDiv currently returns void and requires the caller to know that the newly added local variable is in a particular index. This commit returns the index of the newly added variable so that callers need not tie themselves to this implementation detail. I found one relevant callsite demonstrating this and updated it. I am using this API out of tree and wanted to make our out-of-tree code a bit more resilient to upstream changes.	2025-08-18 08:47:47 -07:00
Jacques Pienaar	4bf33958da	[mlir] Update builders to use new form. (#154132 ) Mechanically applied using clang-tidy.	2025-08-18 15:19:34 +00:00
Matthias Springer	f84aaa6eaa	[mlir][Transforms] Dialect conversion: Add flag to dump materialization kind (#119532 ) Add a debugging flag to the dialect conversion to dump the materialization kind. This flag is useful to find out whether a missing materialization rule is for source or target materializations. Also add missing test coverage for the `buildMaterializations` flag.	2025-08-18 13:25:18 +00:00
Chaitanya	4a3bf27c69	[OpenMP] Introduce omp.target_allocmem and omp.target_freemem omp dialect ops. (#145464 ) This PR introduces two new ops in omp dialect, omp.target_allocmem and omp.target_freemem. omp.target_allocmem: Allocates heap memory on device. Will be lowered to omp_target_alloc call in llvm. omp.target_freemem: Deallocates heap memory on device. Will be lowered to omp_target_free call in llvm. Example: %1 = omp.target_allocmem %device : i32, i64 omp.target_freemem %device, %1 : i32, i64 The work in this PR is C-P/inspired from @ivanradanov commit from coexecute implementation: [Add fir omp target alloc and free ops](`be860ac8ba`) [Lower omp_target_{alloc,free} to llvm](`6e2d584dc9`)	2025-08-18 18:15:11 +05:30
Mehdi Amini	cfe5975eaf	[MLIR] Fix SCF verifier crash (#153974 ) An operand of the nested yield op can be null and hasn't been verified yet when processing the enclosing operation. Using `getResultTypes()` will dereference this null Value and crash in the verifier.	2025-08-18 12:48:55 +02:00
Andrzej Warzyński	51b5a3e1a6	[MLIR] Add Egress dialects maintainers (#151721 ) As per https://discourse.llvm.org/t/mlir-project-maintainers/87189, this PR adds maintainers for the "egress" dialects. Compared to the original proposal, two changes are included: * The "mesh" dialect has been renamed to "shard" (https://discourse.llvm.org/t/mlir-mesh-cleanup-mesh/). * The "XeVM" dialect has been added (https://discourse.llvm.org/t/rfc-proposal-for-new-xevm-dialect/).	2025-08-18 10:34:44 +01:00
Mehdi Amini	16aa283344	[MLIR] Refactor the walkAndApplyPatterns driver to remove the recursion (#154037 ) This is in preparation of a follow-up change to stop traversing unreachable blocks. This is not NFC because of a subtlety of the early_inc. On a test case like: ``` scf.if %cond { "test.move_after_parent_op"() ({ "test.any_attr_of_i32_str"() {attr = 0 : i32} : () -> () }) : () -> () } ``` We recursively traverse the nested regions, and process an op when the region is done (post-order). We need to pre-increment the iterator before processing an operation in case it gets deleted. However we can do this before or after processing the nested region. This implementation does the latter.	2025-08-18 09:07:19 +00:00
Mehdi Amini	87e6fd161a	[MLIR] Erase unreachable blocks before applying patterns in the greedy rewriter (#153957 ) Operations like: %add = arith.addi %add, %add : i64 are legal in unreachable code. Unfortunately many patterns would be unsafe to apply on such IR and can lead to crashes or infinite loops. To avoid this we can remove unreachable blocks before attempting to apply patterns. We may have to do this also whenever the CFG is changed by a pattern, it is left up for future work right now. Fixes #153732	2025-08-18 10:59:43 +02:00
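The reachability idea behind erasing unreachable blocks can be sketched outside MLIR. The following is a minimal illustration, not the greedy rewriter's code: it assumes a CFG given as a successor list with block 0 as the entry, and `computeReachable` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Marks every block reachable from the entry block (index 0) by following
// successor edges. Blocks left unmarked are candidates for erasure.
std::vector<bool>
computeReachable(const std::vector<std::vector<std::size_t>> &successors) {
  std::vector<bool> reachable(successors.size(), false);
  if (successors.empty())
    return reachable;

  std::vector<std::size_t> worklist{0};
  reachable[0] = true;
  while (!worklist.empty()) {
    std::size_t block = worklist.back();
    worklist.pop_back();
    for (std::size_t succ : successors[block]) {
      assert(succ < successors.size() && "successor index out of range");
      if (!reachable[succ]) {
        reachable[succ] = true;
        worklist.push_back(succ);
      }
    }
  }
  return reachable;
}
```

A block that only branches to itself, like the self-referential `arith.addi` case in the commit message, is never marked because no path from the entry leads to it.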
Matthias Springer	ff68f7115c	[mlir][builtin] Make `unrealized_conversion_cast` inlineable (#139722 ) Until now, `builtin.unrealized_conversion_cast` ops could not be inlined by the Inliner pass.	2025-08-18 10:23:26 +02:00
Matthias Springer	f7b09ad700	[mlir][LLVM] `ArithToLLVM`: Add 1:N support for `arith.select` lowering (#153944 ) Add 1:N support for the `arith.select` lowering. Only cases where the entire true/false value is selected are supported.	2025-08-18 09:42:37 +02:00
Guray Ozen	5d300afa80	[MLIR][NVVM] Add support for multiple return values in `inline_ptx` (#153774 ) This PR adds the ability for `nvvm.inline_ptx` to return multiple values, matching the expected semantics in PTX while respecting LLVM’s constraints. LLVM’s `inline_asm` op does not natively support multiple returns — instead, it requires packing results into an LLVM `struct` and then extracting them. This PR implements automatic packing/unpacking so that multiple return values can be expressed naturally in MLIR without extra user boilerplate. Example MLIR: ``` %r1, %r2 = nvvm.inline_ptx "{ .reg .pred p; setp.ge.s32 p, $2, $3; selp.s32 $0, $2, $3, p; selp.s32 $1, $2, $3, !p; }" (%a, %b) : i32, i32 -> i32, i32 %r3 = llvm.add %r1, %r2 : i32 ``` Lowered LLVM IR: ``` %1 = llvm.inline_asm has_side_effects asm_dialect = att "{\0A\09 .reg .pred p;\0A\09 setp.ge.s32 p, $2, $3;\0A\09 selp.s32 $0, $2, $3, p;\0A\09 selp.s32 $1, $2, $3, !p;\0A\09}\0A", "=r,=r,r,r" %a, %b : (i32, i32) -> !llvm.struct<(i32, i32)> %2 = llvm.extractvalue %1[0] : !llvm.struct<(i32, i32)> %3 = llvm.extractvalue %1[1] : !llvm.struct<(i32, i32)> %4 = llvm.add %2, %3 : i32 ```	2025-08-18 08:37:55 +02:00
Shenghang Tsai	7610b13729	[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call (#153524 ) Retry landing https://github.com/llvm/llvm-project/pull/153373 ## Major changes from previous attempt - remove the test in CAPI because no existing tests in CAPI deal with sanitizer exemptions - update `mlir/docs/Dialects/GPU.md` to reflect the new behavior: load GPU binary in global ctors, instead of loading them at call site. - skip the test on Aarch64 since we have an issue with initialization there --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>	2025-08-17 23:07:24 +02:00
Veera	e1aa415220	[mlir][InferIntRangeCommon] Fix Division by Zero Crash (#151637 ) Fixes #131273 Adds a check to avoid division when max value of denominator is zero.	2025-08-17 10:56:34 -07:00
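The kind of guard described here can be sketched for an unsigned range division. This is an illustration under assumptions, not the code in `InferIntRangeCommon`: `UnsignedRange` and `inferDivURange` are hypothetical names, and treating a zero divisor as "nothing to infer" is a choice made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Closed interval [min, max] of unsigned values (hypothetical helper type).
struct UnsignedRange {
  uint64_t min;
  uint64_t max;
};

// Computes a range for lhs / rhs. Returns false when the divisor can only
// be zero, so no division by rhs.max is ever attempted in that case.
bool inferDivURange(UnsignedRange lhs, UnsignedRange rhs,
                    UnsignedRange &result) {
  assert(lhs.min <= lhs.max && rhs.min <= rhs.max && "malformed range");
  if (rhs.max == 0)
    return false;
  // Ignore a zero divisor at the low end when computing the upper bound.
  uint64_t smallestDivisor = rhs.min == 0 ? 1 : rhs.min;
  result.min = lhs.min / rhs.max;
  result.max = lhs.max / smallestDivisor;
  return true;
}
```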
Erik Davis	a66d8f62e6	[mlir][doc] fixup code block (#153977 ) This fixes a small typo in the toy tutorial. A code block was not correctly terminated, causing it to run into the subsequent block.	2025-08-17 13:01:05 +02:00
Matthias Springer	0d8aa9d9ec	[mlir][SparseTensor] Simplify pipeline (#152908 ) This refactoring improves compilation time.	2025-08-16 18:45:26 +02:00
Maksim Levental	6fc1deb8b7	[mlir][python] handle more undefined symbols not covered by nanobind (#153861 ) Introduced (but omitted from this CMake) in https://github.com/llvm/llvm-project/pull/151246.	2025-08-16 09:25:15 -04:00
Matthias Springer	2692ff8213	[mlir][LLVM] Fix build (#153947 ) Fix build after #153937.	2025-08-16 13:06:58 +02:00
Matthias Springer	f8f23e838a	[mlir][LLVM] `ControlFlowToLLVM`: Add 1:N type conversion support (#153937 ) Add support for 1:N type conversions to the `ControlFlowToLLVM` lowering patterns. Not applicable to `cf.switch` and `cf.assert`. --------- Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>	2025-08-16 12:51:40 +02:00
Matthias Springer	f0967fca04	[mlir][LLVM] `FuncToLLVM`: Add 1:N type conversion support (#153823 ) Add support for 1:N type conversions to the `FuncToLLVM` lowering patterns. This commit does not change the lowering of any types (such as `MemRefType`). It just sets up the infrastructure, such that 1:N type conversions can be used during `FuncToLLVM`. Note: When the converted result types of a `func.func` have more than 1 type, then the results are wrapped in an `llvm.struct`. That's because `llvm.func` does not support multiple result values. This "wrapping" was already implemented for cases where the original `func.func` has multiple results. With 1:N conversions, even a single result can now expand to multiple converted results, triggering the same wrapping mechanism. The test cases are exercised with both the old and the new no-rollback conversion driver.	2025-08-16 09:45:08 +02:00
Chao Chen	9c4e571ae8	[mlir][xegpu] Add definitions of MemDescType and related ops. (#153273 )	2025-08-15 18:02:13 -05:00
Aiden Grossman	ca8ee49c1f	[MLIR] Set LLVM_LIT_ARGS in Standalone Example CMake (#152423 ) Setting LLVM_LIT_ARGS to include --quiet and then running check-mlir in a standard checkout will otherwise cause test failures here because LLVM_LIT_ARGS gets propagated into this project.	2025-08-15 12:40:32 -07:00
asraa	b045729eb4	[mlir][presburger] add functionality to compute local mod in IntegerRelation (#153614 ) Similar to `IntegerRelation::addLocalFloorDiv`, this adds a utility `IntegerRelation::addLocalModulo` that adds a local variable equal to an affine function of the variables modulo a constant modulus. The function returns the absolute index of the new var in the relation. This is computed by first finding the floordiv of `exprs // modulus = q` and then computing the remainder `result = exprs - q * modulus`. Signed-off-by: Asra Ali <asraa@google.com>	2025-08-15 09:55:13 -07:00
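The arithmetic in this commit message (a floor quotient `q`, then `result = exprs - q * modulus`) can be checked on plain integers. The sketch below is not the Presburger library API. `floorDiv` and `modViaFloorDiv` are hypothetical names, and a positive modulus is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Floor division: rounds toward negative infinity, unlike C++'s `/`,
// which truncates toward zero.
int64_t floorDiv(int64_t lhs, int64_t rhs) {
  assert(rhs > 0 && "sketch only handles a positive divisor");
  int64_t q = lhs / rhs;
  // Truncation rounded up if lhs is negative and the division is inexact.
  if (lhs % rhs != 0 && lhs < 0)
    --q;
  return q;
}

// Remainder derived from the floor quotient: result = expr - q * modulus.
int64_t modViaFloorDiv(int64_t expr, int64_t modulus) {
  int64_t q = floorDiv(expr, modulus);
  return expr - q * modulus;
}
```

With a floor quotient the result always lies in `[0, modulus)`, including for negative `expr`, which is why the remainder is built from floordiv rather than from truncating division.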
Andrey Timonin	dfa1335db1	[mlir][emitc] Add verification for the emitc.get_field op (#152577 ) This MR adds a `verifier` for the `emitc.get_field` op. - The `verifier` checks that the `emitc.get_field` operation is nested inside an `emitc.class` op. - Additionally, appropriate tests for erroneous cases were added for class-related operations in `invalid_ops.mlir`.	2025-08-15 18:32:12 +02:00
Tim Gymnich	ffaba758fb	[MLIR][ROCDL] Add permlane16.swap and permlane32.swap (#153804 ) Add rocdl.permlane16.swap and rocdl.permlane32.swap	2025-08-15 17:35:31 +02:00
Kazu Hirata	f4bc3151bb	[mlir] Fix warnings This patch fixes: mlir/lib/Target/Wasm/TranslateFromWasm.cpp:82:1: error: unused variable 'wasmSectionName<(anonymous namespace)::WasmSectionType::DATACOUNT>' [-Werror,-Wunused-const-variable] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:100:5: error: unused variable 'valueTypesEncodings' [-Werror,-Wunused-const-variable] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:735:13: error: unused function 'buildLiteralType<unsigned int>' [-Werror,-Wunused-function] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:740:13: error: unused function 'buildLiteralType<unsigned long>' [-Werror,-Wunused-function] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:292:33: error: private field 'symbols' is not used [-Werror,-Wunused-private-field]	2025-08-15 07:24:31 -07:00
Guray Ozen	4c389178ee	[MLIR][NVVM] Print readable modifier (NFC) (#153779 ) Currently, the modifier is printed as an address, so it is not readable and not useful. This PR adds readable printing for it. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-15 15:47:39 +02:00
Guray Ozen	af92cabdef	[MLIR][NVVM] Combine griddepcontrol Ops (#152525 ) There are two ops: 1. nvvm.griddepcontrol.wait 2. nvvm.griddepcontrol.launch_dependents Both relate to the same concept, Grid Dependent Launch (programmatic dependent launch in CUDA). This PR unifies them into a single op.	2025-08-15 15:47:12 +02:00
Erick Ochoa Lopez	61caab7789	[mlir][llvm] Add `align` attribute to `llvm.intr.masked.{expandload,compressstore}` (#153063 ) * Add `requiresArgsAndResultsAttr` to `LLVM_OneResultIntrOp` * Add `args_attrs` to `llvm.intr.masked.{expandload,compressstore}` The LLVM intrinsics [`llvm.intr.masked.expandload`](https://llvm.org/docs/LangRef.html#llvm-masked-expandload-intrinsics) and [`llvm.intr.masked.compressstore`](https://llvm.org/docs/LangRef.html#llvm-masked-compressstore-intrinsics) both allow an optional align parameter attribute to be set which defaults to one. Inlining the documentation below for [`llvm.intr.masked.expandload` 's ](https://llvm.org/docs/LangRef.html#id1522) and [`llvm.intr.masked.compressstore`'s](https://llvm.org/docs/LangRef.html#id1522) arguments respectively > The `align` parameter attribute can be provided for the first argument. The pointer alignment defaults to 1. > The `align` parameter attribute can be provided for the second argument. The pointer alignment defaults to 1.	2025-08-15 08:34:14 -04:00
Mehdi Amini	69453d7021	[MLIR] Fix memory leak in importWebAssemblyToModule when it fails to import (#153794 )	2025-08-15 12:33:25 +00:00
Mehdi Amini	7640645f79	[MLIR][Wasm] Remove statistics as they depend on global ctors (#153795 ) Use a debug log instead for now.	2025-08-15 12:29:20 +00:00
Markus Böck	8582025f1f	[mlir][Transforms] Turn 1:N -> 1:1 dispatch fatal error into match failure (#153605 ) Prior to this PR, the default behaviour of a conversion pattern that receives operands of a 1:N type conversion is to abort the compilation. This has historically been useful when the 1:N type conversion got merged into the dialect conversion as it allowed us to easily find patterns that should be capable of handling 1:N type conversions but didn't. However, this behaviour has the disadvantage of being non-composable: While the pattern in question cannot handle the 1:N type conversion, another pattern in the set might, but doesn't get the chance as compilation is aborted. This PR fixes this behaviour by failing to match instead of aborting, giving other patterns the chance to legalize an op. The implementation uses a reusable function called `dispatchTo1To1` to allow derived conversion patterns to also implement the behaviour.	2025-08-15 11:45:25 +02:00
Matthias Springer	21b607adbe	[mlir][SCF] `scf.for`: Add support for unsigned integer comparison (#153379 ) Add a new unit attribute to allow for unsigned integer comparison. Example: ```mlir scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 { // body } ``` Discussion: https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655	2025-08-15 10:59:14 +02:00
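The difference the `unsigned` attribute makes can be seen by counting iterations of the same 32-bit bounds under both comparisons. This is a sketch on plain integers, not the `scf.for` implementation: the function names are hypothetical, a positive step is assumed, and 64-bit induction variables are used so the sketch itself never wraps around.

```cpp
#include <cassert>
#include <cstdint>

// Trip count of `for (iv = lb; iv < ub; iv += step)` with signed comparison.
uint64_t tripCountSigned(int32_t lb, int32_t ub, int32_t step) {
  assert(step > 0 && "sketch assumes a positive step");
  uint64_t count = 0;
  for (int64_t iv = lb; iv < ub; iv += step)
    ++count;
  return count;
}

// Same bounds, but the 32-bit patterns are reinterpreted as unsigned values
// before comparing.
uint64_t tripCountUnsigned(int32_t lb, int32_t ub, int32_t step) {
  assert(step > 0 && "sketch assumes a positive step");
  uint64_t ulb = static_cast<uint32_t>(lb);
  uint64_t uub = static_cast<uint32_t>(ub);
  uint64_t ustep = static_cast<uint32_t>(step);
  uint64_t count = 0;
  for (uint64_t iv = ulb; iv < uub; iv += ustep)
    ++count;
  return count;
}
```

For example, an upper bound whose bit pattern is `0xFFFFFFFF` is `-1` when read as signed, so a loop starting at 0 runs zero times, while the unsigned reading treats it as the largest 32-bit value.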
Ferdinand Lemaire	6bb8f6f2d0	[MLIR][WASM] Introduce an importer for Wasm binaries (#152131 ) First step in introducing the wasm-import target to mlir-translate. This is the first PR to introduce the pass. With this PR, there is very little support for the actual WebAssembly language; it is mostly there to introduce the skeleton of the importer. A follow-up will come with support for a wider range of operators. It was split to make it easier to review, since it's a good chunk of work. --------- Co-authored-by: Luc Forget <dev@alias.lforget.fr> Co-authored-by: Ferdinand Lemaire <ferdinand.lemaire@woven-planet.global> Co-authored-by: Jessica Paquette <jessica.paquette@woven-planet.global> Co-authored-by: Luc Forget <luc.forget@woven.toyota>	2025-08-15 10:54:40 +02:00
Chenguang Wang	3f797a8342	[mlir][spirv] Add missing #include in SPIRVImageInterfaces.h (#153727 ) SPIRVImageInterfaces.h.inc uses some types, e.g. mlir::TypedValue, without #include the necessary headers. This is fine most of the time, but we did run into a weird case where bazel fails to compile //mlir:SPIRVImageInterfaces on clang19 for ChromiumOS when parse_headers (see [1]) is specified. [1]: https://bazel.build/docs/bazel-and-cpp#toolchain-features	2025-08-14 19:07:54 -07:00
Erich Keane	e5e3e4bdb5	[OpenACC] Add firstprivate recipe helper methods to ACC dialect (#153604 ) Like we did for the 'private' clause, this adds an easier to use helper function to add the 'firstprivate' clause + recipe to the Parallel and Serial ops.	2025-08-14 13:07:59 -07:00
Jianhui Li	98728d9dc8	[MLIR][XeGPU] Add lowering from transfer_read/transfer_write to load_gather/store_scatter (#152429 ) Lower transfer_read/transfer_write to load_gather/store_scatter when the target uArch doesn't support load_nd/store_nd. The high-level steps: 1. compute strides; 2. compute offsets; 3. collapse the memref to 1D (collapseMemrefTo1D); 4. create the load_gather or store_scatter op	2025-08-14 11:27:07 -07:00
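The first three steps (strides, offsets, collapsing to 1D) amount to linearizing multi-dimensional indices. Here is a minimal sketch of that arithmetic, assuming a contiguous row-major layout. The function names are hypothetical, and the real lowering works on memref strides and vector offsets rather than on `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major strides: the innermost dimension has stride 1 and each outer
// stride is the product of all inner dimension sizes.
std::vector<int64_t> computeRowMajorStrides(const std::vector<int64_t> &shape) {
  std::vector<int64_t> strides(shape.size(), 1);
  for (int i = static_cast<int>(shape.size()) - 2; i >= 0; --i)
    strides[i] = strides[i + 1] * shape[i + 1];
  return strides;
}

// Offset of one element in the collapsed 1-D view: sum of index * stride.
int64_t linearOffset(const std::vector<int64_t> &indices,
                     const std::vector<int64_t> &strides) {
  assert(indices.size() == strides.size() && "rank mismatch");
  int64_t offset = 0;
  for (std::size_t i = 0; i < indices.size(); ++i)
    offset += indices[i] * strides[i];
  return offset;
}
```

A gather or scatter would then use one such offset per lane into the flattened buffer.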
Boyana Norris	ada191136b	[mlir][cmake] Fix mlir target export (#153341 ) In https://github.com/llvm/llvm-project/pull/152195, target export was accidentally moved inside a conditional, but it should have been left outside. This patch undoes that change.	2025-08-14 11:24:44 -06:00
Matthias Springer	e2ae634cc1	[mlir][LLVM][NFC] Simplify `copyUnrankedDescriptors` (#153597 ) Split the function into two: one that copies a single unranked descriptor and one that copies multiple unranked descriptors. This is in preparation of adding 1:N support to the Func->LLVM lowering patterns.	2025-08-14 18:25:19 +02:00
Boyana Norris	1945753700	[mlir][linalg] Fix incorrect linalg short form printing (#153219 ) Both `linalg.map` and `linalg.reduce` are sometimes printed in short form incorrectly, resulting in a round-trip output with different semantics. This patch adds additional `yield` operand checks to ensure that all criteria for short-form printing are satisfied. Updated/added comments and renamed the `findPayloadOp` function to `canUseShortForm`, which more accurately reflects its purpose. A couple of new lit tests check for the proper use of long form when short-form conditions are not met. Fixes #117528	2025-08-14 17:19:16 +01:00
Renato Golin	8cc22ee674	[MLIR][Maintainers] Add maintainer list for core sub-categories (#152136 ) Ref: https://discourse.llvm.org/t/mlir-project-maintainers/87189 See also: * #151721 * #150945 Compared to the original proposal, one change is included: * The `ub` dialect has @Hardcode84 as maintainer. Please accept to validate your nomination, let's keep new nominations for follow up PRs.	2025-08-14 16:08:15 +01:00
Matthias Springer	0ff92fe2f0	[mlir][LLVM][NFC] Simplify `computeSizes` function (#153588 ) Rename `computeSizes` to `computeSize` and make it compute just a single size. This is in preparation of adding 1:N support to the Func->LLVM lowering patterns.	2025-08-14 17:00:03 +02:00
Jaden Angella	bfda0e777d	[mlir][EmitC] Expand the MemRefToEmitC pass - Lowering `CopyOp` (#151206 ) This patch lowers `memref.copy` to `emitc.call_opaque "memcpy"`. From: ``` func.func @copying(%arg0 : memref<9x4x5x7xf32>, %arg1 : memref<9x4x5x7xf32>) { memref.copy %arg0, %arg1 : memref<9x4x5x7xf32> to memref<9x4x5x7xf32> return } ``` To: ```cpp #include <cstring> void copying(float v1[9][4][5][7], float v2[9][4][5][7]) { size_t v3 = 0; float* v4 = &v2[v3][v3][v3][v3]; float* v5 = &v1[v3][v3][v3][v3]; size_t v6 = sizeof(float); size_t v7 = 1260; size_t v8 = v6 * v7; memcpy(v5, v4, v8); return; } ```	2025-08-14 05:25:55 -07:00
lonely eagle	6d08a39eeb	[mlir][nvgpu] Add tma last dim bytes check (#153451 ) Add a check that the number of bytes in the last dimension of a TMA tensor is a multiple of 16.	2025-08-14 20:14:20 +08:00
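The condition being checked is simple arithmetic on the last dimension. The sketch below is only an illustration of the rule stated in the message, not the nvgpu verifier. `lastDimBytesAreTmaCompatible` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when the last dimension occupies a whole number of bytes and
// that byte count is a multiple of 16.
bool lastDimBytesAreTmaCompatible(int64_t lastDimSize,
                                  int64_t elementBitWidth) {
  assert(lastDimSize >= 0 && elementBitWidth > 0 && "invalid dimension");
  int64_t bits = lastDimSize * elementBitWidth;
  if (bits % 8 != 0)
    return false;
  return (bits / 8) % 16 == 0;
}
```

For instance, 64 elements of a 16-bit type give 128 bytes and pass, while 3 elements of a 32-bit type give 12 bytes and fail.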
Igor Wodiany	87de48d11f	[mlir][spirv] Add spirv validation for module.mlir target test (#153227 ) Creating this patch as an example on using the new `mlir-translate` flag. Eventually all tests will be updated to validate SPIR-V modules.	2025-08-14 12:45:55 +01:00
Andrzej Warzyński	8d4f3171fa	[mlir][linalg] Fix UnPackOp::getTiledOuterDims (#152960 ) Fixes `getTiledOuterDims` by making sure that the `outer_dims_perm` attribute from `linalg.unpack` is taken into account. Fixes #152037	2025-08-14 11:39:50 +01:00
Ege Beysel	8de85e753f	[mlir][linalg] Add support for scalable vectorization of `linalg.batch_mmt4d` (#152984 ) This PR builds upon the previous #146531 and enables scalable vectorization for `batch_mmt4d` as well. --------- Signed-off-by: Ege Beysel <beyselege@gmail.com>	2025-08-14 11:47:51 +02:00
Jordan Rupprecht	1d55b70ec3	[MLIR][GPU][XeVM] Add missing #include for standalone header build (#153532 ) This header uses GPUModuleOp but does not directly include the header: `error: no type named 'GPUModuleOp' in namespace 'mlir::gpu'; did you mean 'ModuleOp'?` Needed for #148286	2025-08-14 04:13:41 +00:00
Sayan Saha	8432f24831	[mlir][tosa] Don't fold mul with zero lhs/rhs if resulting type is dynamic (#153420 ) Canonicalizing the following IR: ``` func.func @mul_zero_dynamic_nofold(%arg0: tensor<?x17xf32>) -> tensor<?x17xf32> { %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1x1xf32> %1 = "tosa.const"() <{values = dense<0> : tensor<1xi8>}> : () -> tensor<1xi8> %2 = tosa.mul %arg0, %0, %1 : (tensor<?x17xf32>, tensor<1x1xf32>, tensor<1xi8>) -> tensor<?x17xf32> return %2 : tensor<?x17xf32> } ``` resulted in a crash ``` #0 0x000056513187e8db backtrace (./build-release/bin/mlir-opt+0x9d698db) #1 0x0000565131b17737 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:838:8 #2 0x0000565131b187f3 PrintStackTraceSignalHandler(void) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:918:1 #3 0x0000565131b18c30 llvm::sys::RunSignalHandlers() /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Signals.cpp:105:18 #4 0x0000565131b18c30 SignalHandler(int, siginfo_t, void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:409:3 #5 0x00007f2e4165b050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050) #6 0x00007f2e416a9eec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #7 0x00007f2e4165afb2 raise ./signal/../sysdeps/posix/raise.c:27:6 #8 0x00007f2e41645472 abort ./stdlib/abort.c:81:7 #9 0x00007f2e41645395 _nl_load_domain ./intl/loadmsgcat.c:1177:9 #10 0x00007f2e41653ec2 (/lib/x86_64-linux-gnu/libc.so.6+0x34ec2) #11 0x00005651443ec4ba mlir::DenseIntOrFPElementsAttr::getRaw(mlir::ShapedType, llvm::ArrayRef<char>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:1361:3 #12 0x00005651443f1209 mlir::DenseElementsAttr::resizeSplat(mlir::ShapedType) 
/local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:0:10 #13 0x000056513f76f2b6 mlir::tosa::MulOp::fold(mlir::tosa::MulOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute>>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:0:0 ``` from the folder for `tosa::mul`, because the zero value was being reshaped to size `?x17`, which isn't supported. AFAIK, `tosa.const` requires all dimensions to be static, so in this case the fix is not to fold the op.	2025-08-13 19:45:06 -04:00


23873 Commits