llvm-project

Author	SHA1	Message	Date
Tim Gymnich	ffaba758fb	[MLIR][ROCDL] Add permlane16.swap and permlane32.swap (#153804 ) add rocdl.permlane16.swap and rocdl.permlane32.swap	2025-08-15 17:35:31 +02:00
Kazu Hirata	f4bc3151bb	[mlir] Fix warnings This patch fixes: mlir/lib/Target/Wasm/TranslateFromWasm.cpp:82:1: error: unused variable 'wasmSectionName<(anonymous namespace)::WasmSectionType::DATACOUNT>' [-Werror,-Wunused-const-variable] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:100:5: error: unused variable 'valueTypesEncodings' [-Werror,-Wunused-const-variable] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:735:13: error: unused function 'buildLiteralType<unsigned int>' [-Werror,-Wunused-function] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:740:13: error: unused function 'buildLiteralType<unsigned long>' [-Werror,-Wunused-function] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:292:33: error: private field 'symbols' is not used [-Werror,-Wunused-private-field]	2025-08-15 07:24:31 -07:00
Guray Ozen	4c389178ee	[MLIR][NVVM] Print readable modifier (NFC) (#153779 ) Currently, modifier is printed as address, so it is not readable and not useful. This PR adds readable printing for it. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-15 15:47:39 +02:00
Guray Ozen	af92cabdef	[MLIR][NVVM] Combine griddepcontrol Ops (#152525 ) We've 2 ops: 1. nvvm.griddepcontrol.wait 2. nvvm.griddepcontrol.launch_dependents They are related to Grid Dependent Launch (or programmatic dependent launch in CUDA) and same concept. This PR unifies both ops into a single one.	2025-08-15 15:47:12 +02:00
Erick Ochoa Lopez	61caab7789	[mlir][llvm] Add `align` attribute to `llvm.intr.masked.{expandload,compressstore}` (#153063 ) * Add `requiresArgsAndResultsAttr` to `LLVM_OneResultIntrOp` * Add `args_attrs` to `llvm.intr.masked.{expandload,compressstore}` The LLVM intrinsics [`llvm.intr.masked.expandload`](https://llvm.org/docs/LangRef.html#llvm-masked-expandload-intrinsics) and [`llvm.intr.masked.compressstore`](https://llvm.org/docs/LangRef.html#llvm-masked-compressstore-intrinsics) both allow an optional align parameter attribute to be set which defaults to one. Inlining the documentation below for [`llvm.intr.masked.expandload` 's ](https://llvm.org/docs/LangRef.html#id1522) and [`llvm.intr.masked.compressstore`'s](https://llvm.org/docs/LangRef.html#id1522) arguments respectively > The `align` parameter attribute can be provided for the first argument. The pointer alignment defaults to 1. > The `align` parameter attribute can be provided for the second argument. The pointer alignment defaults to 1.	2025-08-15 08:34:14 -04:00
Mehdi Amini	69453d7021	[MLIR] Fix memory leak in importWebAssemblyToModule when it fails to import (#153794 )	2025-08-15 12:33:25 +00:00
Mehdi Amini	7640645f79	[MLIR][Wasm] Remove statistics as they depend on global ctors (#153795 ) Use a debug log instead for now.	2025-08-15 12:29:20 +00:00
Markus Böck	8582025f1f	[mlir][Transforms] Turn 1:N -> 1:1 dispatch fatal error into match failure (#153605 ) Prior to this PR, the default behaviour of a conversion pattern which receives operands of a 1:N is to abort the compilation. This has historically been useful when the 1:N type conversion got merged into the dialect conversion as it allowed us to easily find patterns that should be capable of handling 1:N type conversions but didn't. However, this behaviour has the disadvantage of being non-composable: While the pattern in question cannot handle the 1:N type conversion, another pattern part of the set might, but doesn't get the chance as compilation is aborted. This PR fixes this behaviour by failing to match and instead of aborting, giving other patterns the chance to legalize an op. The implementation uses a reusable function called `dispatchTo1To1` to allow derived conversion patterns to also implement the behaviour.	2025-08-15 11:45:25 +02:00
Matthias Springer	21b607adbe	[mlir][SCF] `scf.for`: Add support for unsigned integer comparison (#153379 ) Add a new unit attribute to allow for unsigned integer comparison. Example: ```mlir scf.for unsigned %iv_32 = %lb_32 to %ub_32 step %step_32 : i32 { // body } ``` Discussion: https://discourse.llvm.org/t/scf-should-scf-for-support-unsigned-comparison/84655	2025-08-15 10:59:14 +02:00
Ferdinand Lemaire	6bb8f6f2d0	[MLIR][WASM] Introduce an importer for Wasm binaries (#152131 ) First step in introducing the wasm-import target to mlir-translate. This is the first PR to introduce the pass, with this PR, there is very little support for the actual WebAssembly language, it's mostly there to introduce the skeleton of the importer. A follow-up will come with support for a wider range of operators. It was split to make it easier to review, since it's a good chunk of work. --------- Co-authored-by: Luc Forget <dev@alias.lforget.fr> Co-authored-by: Ferdinand Lemaire <ferdinand.lemaire@woven-planet.global> Co-authored-by: Jessica Paquette <jessica.paquette@woven-planet.global> Co-authored-by: Luc Forget <luc.forget@woven.toyota>	2025-08-15 10:54:40 +02:00
Chenguang Wang	3f797a8342	[mlir][spirv] Add missing #include in SPIRVImageInterfaces.h (#153727 ) SPIRVImageInterfaces.h.inc uses some types, e.g. mlir::TypedValue, without #include the necessary headers. This is fine most of the time, but we did run into a weird case where bazel fails to compile //mlir:SPIRVImageInterfaces on clang19 for ChromiumOS when parse_headers (see [1]) is specified. [1]: https://bazel.build/docs/bazel-and-cpp#toolchain-features	2025-08-14 19:07:54 -07:00
Erich Keane	e5e3e4bdb5	[OpenACC] Add firstprivate recipe helper methods to ACC dialect (#153604 ) Like we did for the 'private' clause, this adds an easier to use helper function to add the 'firstprivate' clause + recipe to the Parallel and Serial ops.	2025-08-14 13:07:59 -07:00
Jianhui Li	98728d9dc8	[MLIR][XeGPU] Add lowering from transfer_read/transfer_write to load_gather/store_scatter (#152429 ) Lowering transfer_read/transfer_write to load_gather/store_scatter in case the target uArch doesn't support load_nd/store_nd. The high level steps: 1. compute Strides; 2. compute Offsets; 3. collapseMemrefTo1D; 4. create Load gather or store_scatter op	2025-08-14 11:27:07 -07:00
Boyana Norris	ada191136b	[mlir][cmake] Fix mlir target export (#153341 ) In https://github.com/llvm/llvm-project/pull/152195, target export was accidentally moved inside a conditional, but it should have been left outside. This patch undoes that change.	2025-08-14 11:24:44 -06:00
Matthias Springer	e2ae634cc1	[mlir][LLVM][NFC] Simplify `copyUnrankedDescriptors` (#153597 ) Split the function into two: one that copies a single unranked descriptor and one that copies multiple unranked descriptors. This is in preparation of adding 1:N support to the Func->LLVM lowering patterns.	2025-08-14 18:25:19 +02:00
Boyana Norris	1945753700	[mlir][linalg] Fix incorrect linalg short form printing (#153219 ) Both `linalg.map` and `linalg.reduce` are sometimes printed in short form incorrectly, resulting in a round-trip output with different semantics. This patch adds additional `yield` operand checks to ensure that all criteria for short-form printing are satisfied. Updated/added comments and renamed the `findPayloadOp` function to `canUseShortForm`, which more accurately reflects its purpose. A couple of new lit tests check for the proper use of long form when short-form conditions are not met. Fixes #117528	2025-08-14 17:19:16 +01:00
Renato Golin	8cc22ee674	[MLIR][Maintainers] Add maintainer list for core sub-categories (#152136 ) Ref: https://discourse.llvm.org/t/mlir-project-maintainers/87189 See also: * #151721 * #150945 Compared to the original proposal, one change is included: * The `ub` dialect has @Hardcode84 as maintainer. Please accept to validate your nomination, let's keep new nominations for follow up PRs.	2025-08-14 16:08:15 +01:00
Matthias Springer	0ff92fe2f0	[mlir][LLVM][NFC] Simplify `computeSizes` function (#153588 ) Rename `computeSizes` to `computeSize` and make it compute just a single size. This is in preparation of adding 1:N support to the Func->LLVM lowering patterns.	2025-08-14 17:00:03 +02:00
Jaden Angella	bfda0e777d	[mlir][EmitC] Expand the MemRefToEmitC pass - Lowering `CopyOp` (#151206 ) This patch lowers `memref.copy` to `emitc.call_opaque "memcpy"`. From: ``` func.func @copying(%arg0 : memref<9x4x5x7xf32>, %arg1 : memref<9x4x5x7xf32>) { memref.copy %arg0, %arg1 : memref<9x4x5x7xf32> to memref<9x4x5x7xf32> return } ``` To: ```cpp #include <cstring> void copying(float v1[9][4][5][7], float v2[9][4][5][7]) { size_t v3 = 0; float* v4 = &v2[v3][v3][v3][v3]; float* v5 = &v1[v3][v3][v3][v3]; size_t v6 = sizeof(float); size_t v7 = 1260; size_t v8 = v6 * v7; memcpy(v5, v4, v8); return; } ```	2025-08-14 05:25:55 -07:00
lonely eagle	6d08a39eeb	[mlir][nvgpu] Add tma last dim bytes check (#153451 ) Add a check that the number of bytes in the last dimension of the TMA is a multiple of 16.	2025-08-14 20:14:20 +08:00
Igor Wodiany	87de48d11f	[mlir][spirv] Add spirv validation for module.mlir target test (#153227 ) Creating this patch as an example on using the new `mlir-translate` flag. Eventually all tests will be updated to validate SPIR-V modules.	2025-08-14 12:45:55 +01:00
Andrzej Warzyński	8d4f3171fa	[mlir][linalg] Fix UnPackOp::getTiledOuterDims (#152960 ) Fixes `getTiledOuterDims` by making sure that the `outer_dims_perm` attribute from `linalg.unpack` is taken into account. Fixes #152037	2025-08-14 11:39:50 +01:00
Ege Beysel	8de85e753f	[mlir][linalg] Add support for scalable vectorization of `linalg.batch_mmt4d` (#152984 ) This PR builds upon the previous #146531 and enables scalable vectorization for `batch_mmt4d` as well. --------- Signed-off-by: Ege Beysel <beyselege@gmail.com>	2025-08-14 11:47:51 +02:00
Jordan Rupprecht	1d55b70ec3	[MLIR][GPU][XeVM] Add missing #include for standalone header build (#153532 ) This header uses GPUModuleOp but does not directly include the header: `error: no type named 'GPUModuleOp' in namespace 'mlir::gpu'; did you mean 'ModuleOp'?` Needed for #148286	2025-08-14 04:13:41 +00:00
Sayan Saha	8432f24831	[mlir][tosa] Don't fold mul with zero lhs/rhs if resulting type is dynamic (#153420 ) Canonicalizing the following IR: ``` func.func @mul_zero_dynamic_nofold(%arg0: tensor<?x17xf32>) -> tensor<?x17xf32> { %0 = "tosa.const"() <{values = dense<0.000000e+00> : tensor<1x1xf32>}> : () -> tensor<1x1xf32> %1 = "tosa.const"() <{values = dense<0> : tensor<1xi8>}> : () -> tensor<1xi8> %2 = tosa.mul %arg0, %0, %1 : (tensor<?x17xf32>, tensor<1x1xf32>, tensor<1xi8>) -> tensor<?x17xf32> return %2 : tensor<?x17xf32> } ``` resulted in a crash ``` #0 0x000056513187e8db backtrace (./build-release/bin/mlir-opt+0x9d698db) #1 0x0000565131b17737 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:838:8 #2 0x0000565131b187f3 PrintStackTraceSignalHandler(void) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:918:1 #3 0x0000565131b18c30 llvm::sys::RunSignalHandlers() /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Signals.cpp:105:18 #4 0x0000565131b18c30 SignalHandler(int, siginfo_t, void*) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/llvm/lib/Support/Unix/Signals.inc:409:3 #5 0x00007f2e4165b050 (/lib/x86_64-linux-gnu/libc.so.6+0x3c050) #6 0x00007f2e416a9eec __pthread_kill_implementation ./nptl/pthread_kill.c:44:76 #7 0x00007f2e4165afb2 raise ./signal/../sysdeps/posix/raise.c:27:6 #8 0x00007f2e41645472 abort ./stdlib/abort.c:81:7 #9 0x00007f2e41645395 _nl_load_domain ./intl/loadmsgcat.c:1177:9 #10 0x00007f2e41653ec2 (/lib/x86_64-linux-gnu/libc.so.6+0x34ec2) #11 0x00005651443ec4ba mlir::DenseIntOrFPElementsAttr::getRaw(mlir::ShapedType, llvm::ArrayRef<char>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:1361:3 #12 0x00005651443f1209 mlir::DenseElementsAttr::resizeSplat(mlir::ShapedType) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/IR/BuiltinAttributes.cpp:0:10 #13 0x000056513f76f2b6 mlir::tosa::MulOp::fold(mlir::tosa::MulOpGenericAdaptor<llvm::ArrayRef<mlir::Attribute>>) /local-ssd/sayans/Softwares/llvm-repo/llvm-project-latest/mlir/lib/Dialect/Tosa/IR/TosaCanonicalizations.cpp:0:0 ``` from the folder for `tosa::mul` since the zero value was being reshaped to `?x17` size which isn't supported. AFAIK, `tosa.const` requires all dimensions to be static. So in this case, the fix is to not to fold the op.	2025-08-13 19:45:06 -04:00
Sang Ik Lee	9f953fa62f	[MLIR] XeVM Target: Add missing SPIR-V backend dependency libraries. (#153505 ) Adding missing dependency SPIRVDesc, SPIRVInfo Fixes post commit build issue with #148286	2025-08-13 16:03:16 -07:00
Krzysztof Drewniak	bbe3d64b39	[mlir][ROCDL] Annotate lane ID functions with noundef, ranges (#151396 ) Now that we have general support for setting argument and result attributes on LLVM intrinsics, extend the definitions of mbcnt.lo and mbcnt.hi to carry such attributes. With that, update the construction of the mbcnt.lo/mbcnt.hi calls used to get the lane ID to be `noundef` (since the lane ID is always defined) and to be annotated with the correct ranges (so that generic LLVM passes can correctly optimize based on the fact that there are never more than 32/64 lanes). (Also, handle a pattern that wasn't using getLaneId() and get rid of a dead argument)	2025-08-13 17:44:03 -05:00
Nishant Patel	af87214b84	[MLIR][XeGPU] Add pattern for arith.constant for wg to sg distribution (#151977 )	2025-08-13 13:52:07 -07:00
James Newling	2796336152	[mlir][vector] Improve vector.gather description (#153278 ) Improve/elaborate example describing semantics	2025-08-13 13:50:06 -07:00
Sang Ik Lee	baae949f19	[MLIR][GPU][XeVM] Add XeVM target and XeVM dialect integration tests. (#148286 ) As part of XeVM dialect upstreaming, covers remaining parts required for XeVM dialect integration and testing. It has two high level components - XeVM target and serialization support - XeVM dialect integration tests using level zero runtime Co-Authored-by: Artem Kroviakov <artem.kroviakov@intel.com>	2025-08-13 13:17:10 -07:00
Mehdi Amini	bfd490e0cd	Revert "[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call" (#153477 ) Reverts llvm/llvm-project#153373 Sanitizer bot is broken	2025-08-13 19:43:04 +00:00
Matthias Springer	c888addc9f	[mlir][Transforms] Fix build (#153447 ) Fix build after #151865.	2025-08-13 18:21:45 +02:00
Igor Wodiany	d4045a448d	[mlir][spirv] Add .spv extension to validation files (#153440 )	2025-08-13 16:13:06 +00:00
Matthias Springer	7e7c9d975e	[mlir][Transforms] Dialect Conversion Driver without Rollback (#151865 ) This commit improves the `allowPatternRollback` flag handling in the dialect conversion driver. Previously, this flag was used to merely detect cases that are incompatible with the new One-Shot Dialect Conversion driver. This commit implements the driver itself: when the flag is set to "false", all IR changes are materialized immediately, bypassing the `IRRewrite` and `ConversionValueMapping` infrastructure. A few selected test cases now run with both the old and the new driver. RFC: https://discourse.llvm.org/t/rfc-a-new-one-shot-dialect-conversion-driver/79083	2025-08-13 17:40:55 +02:00
Mohammadreza Ameri Mahabadian	187f2967df	[mlir][spirv] Conditionally add SPV_KHR_non_semantic_info extension u… (#152686 ) …pon serialization If serialization option `emitDebugInfo` is enabled, then it is required to serialize `SPV_KHR_non_semantic_info` extension provided that it is available in the target environment. --------- Signed-off-by: Mohammadreza Ameri Mahabadian <mohammadreza.amerimahabadian@arm.com>	2025-08-13 11:33:10 -04:00
Shenghang Tsai	2f93693f76	[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call (#153373 ) This PR introduces a mechanism to defer JIT engine initialization, enabling registration of required symbols before global constructor execution. ## Problem Modules containing `gpu.module` generate global constructors (e.g., kernel load/unload) that execute during engine creation. This can force premature symbol resolution, causing failures when: - Symbols are registered via `mlirExecutionEngineRegisterSymbol` after creation - Global constructors exist (even if not directly using unresolved symbols, e.g., an external function declaration) - GPU modules introduce mandatory binary loading logic ## Usage ```c // Create engine without initialization MlirExecutionEngine jit = mlirExecutionEngineCreate(...); // Register required symbols mlirExecutionEngineRegisterSymbol(jit, ...); // Explicitly initialize (runs global constructors) mlirExecutionEngineInitialize(jit); ``` --------- Co-authored-by: Mehdi Amini <joker.eph@gmail.com>	2025-08-13 15:22:01 +02:00
Matthias Springer	2fcdabaf39	[mlir][DialectUtils] Fix div by zero crash (#153380 )	2025-08-13 13:38:57 +02:00
Baz	e141da8a62	[mlir][ExecutionEngine] fix default free function in `OwningMemRef`. (#153133 ) `basePtr` should be freed instead of `data` because it is the one which is storing the output of `malloc`. In `allocAligned()`, the `data` is malloced and then assigned to `basePtr`.	2025-08-13 10:23:04 +00:00
Adam Siemieniuk	7d1b9cad87	[mlir][amx] Vector to AMX conversion pass (#151121 ) Adds a pass for Vector to AMX operation conversion. Initially, a direct rewrite for vector contraction in packed VNNI layout is supported. Operations are expected to already be in shapes which are AMX-compatible for the rewriting to occur.	2025-08-13 11:08:52 +02:00
Longsheng Mou	2edee0bc79	[mlir][gpu] Support outlining nested `gpu.launch` (#152696 ) This PR fixes a crash in `GpuKernelOutliningPass` that occurred when encountering a symbol that was not a `FlatSymbolRefAttr`, enabling outlining of nested `gpu.launch` operations. Fixes #149318.	2025-08-13 11:42:52 +08:00
Maksim Levental	2b842e5600	[mlir][python] fix PyThreadState_GetFrame again (#153333 ) add more APIs missing from 3.8 (fix rocm builder)	2025-08-12 21:29:23 -05:00
Maksim Levental	9df846bf71	[mlir][python] fix PyThreadState_GetFrame (#153325 ) `PyThreadState_GetFrame` wasn't added until 3.9 (fixes currently failing rocm builder)	2025-08-13 01:16:04 +00:00
Maksim Levental	a40f47c972	[mlir][python] automatic location inference (#151246 ) This PR implements "automatic" location inference in the bindings. The way it works is it walks the frame stack collecting source locations (Python captures these in the frame itself). It is inspired by JAX's [implementation](`523ddcfbca/jax/_src/interpreters/mlir.py (L462)`) but moves the frame stack traversal into the bindings for better performance. The system supports registering "included" and "excluded" filenames; frames originating from functions in included filenames will not be filtered and frames originating from functions in excluded filenames will be filtered (in that order). This allows excluding all the generated `*_ops_gen.py` files. The system is also "toggleable" and off by default to save people who have their own systems (such as JAX) from the added cost. Note, the system stores the entire stacktrace (subject to `locTracebackFramesLimit`) in the `Location` using specifically a `CallSiteLoc`. This can be useful for profiling tools (flamegraphs etc.). Shoutout to the folks at JAX for coming up with a good system. --------- Co-authored-by: Jacques Pienaar <jpienaar@google.com>	2025-08-12 16:59:59 -05:00
Nick Smith	21473462f7	[MLIR][Python] MLIR Enum Python bindings infinite recursion (#151584 ) (#151588 ) Fixes an infinite recursion bug when using I32BitEnumAttrCaseGroup with python bindings. For more info, see issue: - https://github.com/llvm/llvm-project/issues/151584	2025-08-12 14:27:05 -04:00
modiking	38d854c6e8	[MLIR][NVVM] Update MLIR mapa to reflect new address space (#146031 ) The mapa.shared.cluster variant that takes in address-space 3 now should output address-space 7. This patch updates the NVVMOps.td file to reflect this.	2025-08-12 21:43:51 +05:30
Gao Yanfeng	24f5385a85	[MLIR][NVVM] Support generating all the ldmatrix intrinsics from NVVM ops (#148783 ) Previously, the NVVM dialect's ldmatrix operation could only generate a limited subset of the available NVVM ldmatrix intrinsics. The intrinsics generating new ops introduced in BlackWell are not accessible through the NVVM ops. This commit extends the ldmatrix operation to support all available ldmatrix intrinsics.	2025-08-12 15:13:15 +01:00
Akash Banerjee	e1a694cd16	[NFC] Remove invalid conversions in ComplexToROCDLLibraryCalls	2025-08-12 15:06:03 +01:00
Akash Banerjee	c1f410779a	Revert "[NFC] Remove invalid conversions in ComplexToROCDLLibraryCalls" This reverts commit b8104fa320f006bacd3e16afb431b5980dd5000a.	2025-08-12 14:18:57 +01:00
Matthias Springer	ef2b8805bf	[mlir][vector] Implement `InferTypeOpInterface` on `vector.to_elements` (#153172 ) Just for convenience. This auto-generates an additional builder that infers the result type.	2025-08-12 15:15:30 +02:00
Akash Banerjee	b8104fa320	[NFC] Remove invalid conversions in ComplexToROCDLLibraryCalls	2025-08-12 14:05:00 +01:00

23848 Commits