llvm-project

Author	SHA1	Message	Date
Srinivasa Ravi	ab9e447fb1	[MLIR][NVVM] Add support for mapa MLIR Ops (#124514 ) Adds `mapa` and `mapa.shared.cluster` MLIR Ops to generate mapa instructions. `mapa` - Map the address of the shared variable in the target CTA. - `mapa` - source is a register containing generic address pointing to shared memory. - `mapa.shared.cluster` - source is a shared memory variable or a register containing a valid shared memory address. PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-mapa	2025-01-30 11:05:12 +05:30
Krzysztof Drewniak	cdc09a118a	[mlir][IntRangeInference] Infer values for {memref,tensor}.dim (#122945 ) Implement the integer range inference interface for memref.dim and tensor.dim using shared code. The inference will infer the `dim` of dynamic dimensions to [0, index_max] and take the union of all the dimensions that the `dim` argument could be validly referring to.	2025-01-29 18:43:53 -06:00
Jerry-Ge	956c0707d9	[mlir][tosa] Change the start and size of slice to tosa shape type (#124209 ) Update to use getConstShapeValue to collect shape information along the graph. Change-Id: Ic6fc2341e3bcfbec06a1d08986e26dd08573bd9c Co-authored-by: TatWai Chong <tatwai.chong@arm.com>	2025-01-29 13:43:35 -08:00
Jay Foad	aa2952165c	Fix typo "tranpose" (#124929 )	2025-01-29 17:49:54 +00:00
Rolf Morel	0d4efa2725	[MLIR][Linalg] Introduce linalg.contract (#123618 ) A new op that allows for representing arbitrary contractions on operands of arbitrary rank, with arbitrary transposes and arbitrary broadcasts specified through its indexing_maps attribute. Supports the expected lowerings to linalg.generic and to vector.contract. Corresponding RFC is here: https://discourse.llvm.org/t/mlir-rfc-introduce-linalg-contract/83589	2025-01-29 17:28:52 +00:00
Matthias Springer	6900768719	[mlir][Conversion] Fix typos in MemRef descriptor comments (#124923 )	2025-01-29 17:13:14 +01:00
Igor Wodiany	a01097faca	[mlir][spirv] Add definition for VectorTimesMatrixOp (#124571 ) Adding op as defined in section 3.52.13. (Arithmetic Instructions) of the SPIR-V specification.	2025-01-29 10:16:38 -05:00
Andrea Faulds	25ae1a266d	[mlir][spirv] Make ConvertToSPIRVPass into a test pass (non-public) With the removal of mlir-vulkan-runner (as part of #73457) in e7e3c45bc70904e24e2b3221ac8521e67eb84668, this pass no longer has to be public (previously it had to be so the runner could use it). This commit makes it instead only available for use by mlir-opt. This is a recommit of 058d183980a2f334d085a46c32abded0557aa789 (#124301) which had been reverted in 4573c857da88b3210d497d9a88a89351a74b5964 due to a missing linker dependency on MLIRSPIRVTransforms in mlir/test/lib/Pass/CMakeLists.txt (fixed in this commit).	2025-01-29 15:51:42 +01:00
Andrea Faulds	4573c857da	Revert "[mlir][spirv] Make ConvertToSPIRVPass into a test pass (non-public) (#124301 )" This reverts commit 058d183980a2f334d085a46c32abded0557aa789 due to build failures (missing symbols when linking).	2025-01-29 15:06:32 +01:00
Andrea Faulds	058d183980	[mlir][spirv] Make ConvertToSPIRVPass into a test pass (non-public) (#124301 ) With the removal of mlir-vulkan-runner (as part of #73457) in e7e3c45bc70904e24e2b3221ac8521e67eb84668, this pass no longer has to be public (previously it had to be so the runner could use it). This commit makes it instead only available for use by mlir-opt.	2025-01-29 14:36:55 +01:00
Ingo Müller	8d6b24167b	[mlir] Make `TypedStrAttr` actually enforce the string type. (#124770 ) The TableGen definition `TypedStrAttr` is an attribute constraint that is meant to restrict the type of a `StringAttr` to the type given as parameter. However, the definition did not previously restrict the type; any `StringAttr` was accepted. This PR makes the definition actually enforce the type. To test the constraint, the PR also changes the test op that was previously used to test this constraint such that the enforced type is `AnyInteger` instead of `AnyType`. The latter allowed any type, so not enforcing that constraint had no observable effect. The PR then adds a test case with a wrong type and ensures that diagnostics are produced. Signed-off-by: Ingo Müller <ingomueller@google.com>	2025-01-29 13:32:36 +01:00
Adam Siemieniuk	87782b216f	[mlir][x86vector] AVX512-BF16 Dot op (#124800 ) Adds AVX512 bf16 dot-product operation and defines lowering to LLVM intrinsics. AVX512 intrinsic operation definition is extended with an optional extension field that allows specifying necessary LLVM mnemonic suffix e.g., `"bf16"` for `x86_avx512bf16_` intrinsics.	2025-01-29 13:07:41 +01:00
Dmitriy Smirnov	f20b8e35b3	[MLIR][Linalg] Fixes for Winograd decomposition and for tiling (#123675 ) The PR addresses issues with the filters of 1 x r and of r x 1 and with the tiling. --------- Signed-off-by: Dmitriy Smirnov <dmitriy.smirnov@arm.com>	2025-01-29 10:38:29 +00:00
Henrich Lauko	2a1f79582f	[MLIR] Fix import of invokes with mismatched variadic types (#124828 ) This resolves the same issue addressed in https://github.com/llvm/llvm-project/pull/124286, but for invoke operations. The issue arose from duplicated logic for both imports. This PR also refactors the common import code for call and invoke instructions to mitigate issues in the future.	2025-01-29 10:43:00 +01:00
Matthias Gehre	5d3ae51612	Reapply "[mlir][python] allow DenseIntElementsAttr for index type (#118947 )" (#124804 ) This reapplies #118947 and adapts to nanobind.	2025-01-29 09:14:37 +01:00
Matthias Gehre	2ec27848c0	[MLIR] normalize-memrefs: Normalize memref.alloca (#123293 ) The pass was only handling `memref.alloc` and this extends it to also handle `memref.alloca`.	2025-01-29 08:34:33 +01:00
Srinivasa Ravi	d4159e2a1d	[MLIR][NVVM] Add support for griddepcontrol Ops (#124603 ) Adds `griddepcontrol.wait` and `griddepcontrol.launch.dependents` MLIR Ops to generate griddepcontrol instructions. `griddepcontrol` - Allows dependent and prerequisite grids as defined by the runtime to control execution in the following ways: - `griddepcontrol.wait` - causes the executing thread to wait until all prerequisite grids in flight have completed and all the memory operations from the prerequisite grids are performed and made visible to the current grid. - `griddepcontrol.launch.dependents` - signals that specific dependents the runtime system designated to react to this instruction can be scheduled as soon as all other CTAs in the grid issue the same instruction or have completed. PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-griddepcontrol	2025-01-29 10:57:51 +05:30
Diego Caballero	35df525fd0	[mlir][Vector] Add support for poison indices to `Extract/IndexOp` (#123488 ) Following up on #122188, this PR adds support for poison indices to `ExtractOp` and `InsertOp`. It also includes canonicalization patterns to turn extract/insert ops with poison indices into `ub.poison`.	2025-01-28 13:51:50 -08:00
Matthias Gehre	1b729c3d70	Revert "[mlir][python] allow DenseIntElementsAttr for index type (#118947 )" This reverts commit 9dd762e8b10586e749b0ddf3542e5dccf8392395.	2025-01-28 18:35:50 +01:00
Matthias Gehre	9dd762e8b1	[mlir][python] allow DenseIntElementsAttr for index type (#118947 ) Model the `IndexType` as `uint64_t` when converting to a python integer. With the python bindings, ```python DenseIntElementsAttr(op.attributes["attr"]) ``` used to `assert` when `attr` had `index` type like `dense<[1, 2, 3, 4]> : vector<4xindex>`. --------- Co-authored-by: Christopher McGirr <christopher.mcgirr@amd.com> Co-authored-by: Tiago Trevisan Jost <tiago.trevisanjost@amd.com>	2025-01-28 18:31:58 +01:00
Jack Frankland	a58e774fba	[mlir][tosa] Make TOSA MUL's Shift an Input (#121953 ) The TOSA-v1.0 specification makes the shift attribute of the MUL (Hadamard product) operator an input. Move the `shift` parameter of the MUL operator in the MLIR TOSA dialect from an attribute to an input and update any lit tests appropriately. Expand the verifier of the `tosa::MulOp` operation to check the various constraints defined in the TOSA-v1.0 specification. Specifically, ensure that all input operands (excluding the optional shift) are of the same rank. This means that broadcasting tests which previously checked rank-0 tensors would be broadcast are no longer valid and are removed. Signed-off-by: Jack Frankland <jack.frankland@arm.com> Co-authored-by: TatWai Chong <tatwai.chong@arm.com>	2025-01-28 16:25:22 +00:00
Luohao Wang	e84f6b6a88	[mlir] Fix conflict of user defined reserved functions with internal prototypes (#123378 ) On lowering from `memref` to LLVM, `malloc` and other intrinsic functions from `libc` will be declared in the current module. A user's redefinition of these reserved functions will poison the internal analysis with the wrong prototype. This patch adds an assertion on the found function's type and reports if it does not match the intended type. Related to #120950 --------- Co-authored-by: Luohao Wang <Luohaothu@users.noreply.github.com>	2025-01-28 14:40:47 +01:00
Hongren Zheng	3a439e2caf	[mlir][dataflow] disallow outside use of propagateIfChanged for DataFlowSolver (#120885 ) Detailed writeup is in https://github.com/google/heir/issues/1153. See also https://github.com/llvm/llvm-project/pull/120881. In short, `propagateIfChanged` is used outside of the `DataFlowAnalysis` scope, because it is public, but it does not propagate as expected as the `DataFlowSolver` has stopped running. To solve such misuse, `propagateIfChanged` should be made protected/private. For downstream users affected by this, to correctly propagate the change, the Analysis should be re-run (check #120881) instead of just calling `propagateIfChanged`. The change to `IntegerRangeAnalysis` is just an expansion of the `solver->propagateIfChanged`. The `Lattice` has already been updated by the `join`. Propagation is done by `onUpdate`. Cc @Mogball for review	2025-01-28 13:32:28 +08:00
Hongren Zheng	3c64f86314	[mlir] Add OpAsmTypeInterface for pretty-print (#121187 ) See https://discourse.llvm.org/t/rfc-introduce-opasm-type-attr-interface-for-pretty-print-in-asmprinter/83792 for detailed introduction. This PR acts as the first part of it * Add `OpAsmTypeInterface` and `getAsmName` API for deducing ASM name from type * Add default impl in `OpAsmOpInterface` to respect this API when available. The `OpAsmAttrInterface` / hooking into Alias system part should be another PR, using a `getAlias` API. ### Discussion * Instead of using `StringRef getAsmName()` as the API, I use `void getAsmName(OpAsmSetNameFn)`, as returning StringRef might be unsafe (a std::string constructed inside and then returned as a _ref_); this also aligns with the design of `getAsmResultNames`. * On the result packing of an op, the current approach is that when not all of the result types are `OpAsmTypeInterface`, then do nothing (old default impl) ### Review Cc @j2kun and @Alexanderviand-intel for downstream; Cc @River707 and @joker-eph for relevant commit history; Cc @ftynse for discourse.	2025-01-28 13:31:41 +08:00
Scott Manley	e492083f55	[OpenACC] Add AutomaticAllocationScope to recipe ops (#124337 ) The recipe operations should have AutomaticAllocationScope so recipes can be converted using operators that require parent ops to have AutomaticAllocationScope	2025-01-27 07:47:45 -08:00
MaheshRavishankar	092372da15	[mlir][Tensor] Rework `ReifyRankedShapedTypeInterface` implementation for `tensor.expand_shape` op. (#113501 ) The op carries the output-shape directly. This can be used directly. Also adds a method to get the shape as a `SmallVector<OpFoldResult>`. Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-01-27 07:05:34 -08:00
Samuel Ginzburg	43a50deb63	[MLIR][ROCDL] Add GFX940 SMFMAC (2:4 sparsity) instructions to the ROCDL dialect (#124435 ) # Overview This PR adds 2:4 structured sparsity (sparse A, dense B) matrix multiply instructions to ROCDL. # Testing I've added tests to Dialect/mlir and Target/mlir	2025-01-27 11:58:26 +01:00
Longsheng Mou	8f17f51deb	[mlir][tosa] Fix comments format(NFC) (#124520 ) This PR corrects the formatting of comments in Markdown. The previous format was as follows: https://mlir.llvm.org/docs/Dialects/TOSA/#tosaerf-mlirtosaerfop ![image](https://github.com/user-attachments/assets/1d1d10d5-c960-4724-9fb4-29c17ea39b11) https://mlir.llvm.org/docs/Dialects/TOSA/#tosarescale-mlirtosarescaleop ![image](https://github.com/user-attachments/assets/fb23cbf6-be10-4a60-8b43-b28dc2db6918)	2025-01-27 10:50:53 +00:00
Jacques Pienaar	3b35b4c7f9	[mlir] Allow fallback from file line col range to loc (#124321 ) This was discussed during the original review but I made it stricter than discussed. Making it a pure view but adding a helper for bytecode serialization (I could avoid the helper, but it ends up with more logic and stronger coupling).	2025-01-24 18:08:44 -08:00
Jianjian Guan	990837f91d	[mlir][arith][tensor] Disable index type for bitcast (#121455 ) Fixes #121397.	2025-01-24 16:53:04 +08:00
donald chen	45d83ae7df	[mlir] [math] Fix the precision issue of expand math (#120865 ) The convertFloorOp pattern incurs precision loss when floating-point numbers exceed the representable range of int64. This pattern should be removed. Fixes https://github.com/llvm/llvm-project/issues/119836	2025-01-24 14:46:41 +08:00
Yi Qian	c118864223	[MLIR][ROCDL]Add MFMA_*_F8F6F4 instructions to the ROCDL dialect (#123830 ) This PR adds mfma.scale.f32.32x32x64.f8f6f4 and mfma.scale.f32.16x16x128.f8f6f4 to the ROCDL dialect. They are converted to the corresponding intrinsics in the mlir-to-llvmir pass.	2025-01-23 19:27:56 +00:00
Durgadoss R	2e6cc79f81	[MLIR][NVVM] Migrate CpAsyncOp to intrinsics (#123789 ) Intrinsics are available for the 'cpSize' variants also. So, this patch migrates the Op to lower to the intrinsics for all cases. * Update the existing tests to check the lowering to intrinsics. * Add newer cp_async_zfill tests to verify the lowering for the 'cpSize' variants. * Tidy-up CHECK lines in cp_async() function in nvvmir.mlir (NFC) PTX spec link: https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-01-23 16:15:52 +05:30
Jack Frankland	8388040fc9	[mlir][tosa] Add NaN Propagation Mode Support (#121951 ) The TOSA-V1.0 specification adds "nan propagation" modes as attributes for several operators. Adjust the ODS definitions of the relevant operations to include this attribute. The defined modes are "PROPAGATE" and "IGNORE" and the PROPAGATE mode is set by default. MAXIMUM, MINIMUM, REDUCE_MAX, REDUCE_MIN, MAX_POOL, CLAMP, and ARGMAX support this attribute. Signed-off-by: Jack Frankland <jack.frankland@arm.com> Co-authored-by: TatWai Chong <tatwai.chong@arm.com>	2025-01-23 10:14:00 +00:00
Jakub Kuderski	98de5dfe6a	[mlir] Add NamedAttribute ctor taking StringRef. NFC. (#123974 ) This is a small QoL improvement so that we don't have to go through helpers when building `NamedAttribute`s.	2025-01-22 19:02:17 -05:00
Jacques Pienaar	a77250fd78	[mlir] Add C and Python interface for file range (#123276 ) Plumbs through creating file ranges to C and Python.	2025-01-22 14:33:19 -08:00
Jerry-Ge	7e622b6132	[TOSA] Change PadOp padding to tosa.shape (#123133 ) This patch changes PadOp's padding input to type !tosa.shape<2 * rank>, (where rank is the rank of the PadOp's input), instead of a <rank x 2> tensor. This patch is also a part of TOSA v1.0 effort: https://discourse.llvm.org/t/rfc-tosa-dialect-increment-to-v1-0/83708 This patch updates the PadOp to match all against the TOSA v1.0 form. Original Authors include: @Tai78641 @wonjeon Co-authored-by: Tai Ly <tai.ly@arm.com>	2025-01-22 12:36:48 -08:00
Anchu Rajendran S	afcbcae668	[mlir][OpenMP] inscan reduction modifier and scan op mlir support (#114737 ) The scan directive allows specifying scan reductions within a worksharing loop, worksharing loop simd, or simd directive, which should have an `InScan` modifier associated with it. This change adds the MLIR support for the same. Related PR: [Parsing and Semantic Support for scan](https://github.com/llvm/llvm-project/pull/102792)	2025-01-22 09:53:54 -08:00
Igor Wodiany	f78359cf43	[mlir][spirv] Add definition for OpEmitVertex and OpEndPrimitive (#123759 ) This is hopefully the first patch in the series of patches adding some missing SPIR-V ops to MLIR over the next weeks/months, starting with something simple: `OpEmitVertex` and `OpEndPrimitive`. Since the ops have no inputs and outputs, and the only condition is "This instruction must only be used when only one stream is present.", which I don't think can be validated at the instruction level in isolation, I set `hasVerifier` to 0. I hope I didn't miss anything, but I'm more than happy to address any comments.	2025-01-22 12:45:23 -05:00
Petr Kurapov	fa6f88af10	[MLIR][XeGPU] Allow some nd ops to have argument shapes mismatch for … (#120566 ) …the distributed IR case. This patch allows `nd_load` and `nd_store` to preserve the tensor descriptor shape during distribution to SIMT. The validation now expects the distributed instruction to retain the `sg_map` attribute and uses it to verify the consistency.	2025-01-22 18:03:36 +01:00
Tai Ly	729f958c4f	[TOSA] Add SameOperandsAndResultRank to TOSA Ops (#104501 ) [note: this is blocked by: https://github.com/tensorflow/tensorflow/pull/73891 otherwise tensorflow may have lit test failures] This patch adds the SameOperandsAndResultRank trait to TOSA operators with the ResultsBroadcastableShape trait. The SameOperandsAndResultRank trait requires that all operands and results have matching ranks unless the operand/result is unranked. This also renders the TosaMakeBroadcastable pass unnecessary - but this pass is left in for now just in case it is still used in some flows. The lit test, broadcast.mlir, is removed. This also adds verification of the SameOperandsAndResultRank trait in the TosaInferShapes pass to validate inferred shapes. Signed-off-by: Tai Ly <tai.ly@arm.com>	2025-01-22 13:21:04 +00:00
Diego Caballero	d25a1f8887	[mlir][Vector][NFC] Add `vector-transform-options` flag to ConvertVectorToLLVMPass (#123491 ) This flag enables the configuration of some transformations such as the lowering of contractions and transposes. The default configuration preserves the existing behavior.	2025-01-21 16:41:57 -08:00
Hyunsung Lee	83cdcd5da4	[MLIR/linalg] Update arg name of `generalizeNamedOp` in `Transforms.h` (#123679 ) `Generalization.cpp:53` ```cpp FailureOr<GenericOp> mlir::linalg::generalizeNamedOp(RewriterBase &rewriter, LinalgOp linalgOp) { if (failed(generalizeNamedOpPrecondition(linalgOp))) return rewriter.notifyMatchFailure(linalgOp, "preconditions not met"); SmallVector<Value> inputs = linalgOp.getDpsInputs(); ValueRange outputs = linalgOp.getDpsInits(); SmallVector<AffineMap> indexingMaps = linalgOp.getIndexingMapsArray(); SmallVector<utils::IteratorType> iterators = linalgOp.getIteratorTypesArray(); SmallVector<Type> resultTypes = linalgOp.hasPureTensorSemantics() ? TypeRange(ValueRange(outputs)) : TypeRange{}; ... ``` `generalizeNamedOp` in `Generalization.cpp` has a different arg name than `generalizeNamedOp` in `Transforms.h` Sync to use `linalgOp`	2025-01-21 23:54:06 +00:00
Yi Qian	4564ac91e1	Add gfx950 mfma instructions to ROCDL dialect (#123361 ) Add ROCDL support to the following instructions: V_MFMA_F32_16X16X32_BF16 V_MFMA_I32_16X16X64_I8 V_MFMA_F32_16X16X32_F16 V_MFMA_F32_32X32X16_BF16 V_MFMA_I32_32X32X32_I8 V_MFMA_F32_32X32X16_F16 --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com> Co-authored-by: Jungwook Park <jungwook.park@amd.com>	2025-01-21 18:21:30 +00:00
plognjen	27f15add7c	[MLIR][ROCDL] Add ops for LDS read transpose and global to LDS intrinsics (#123530 ) This PR adds missing ds.read.tr4.b64, ds.read.tr8.b64, ds.read.tr6.b96, ds.read.tr16.b64 and global.load.lds ops to the ROCDL dialect. The ops are converted to the corresponding intrinsic calls during the translation from MLIR to LLVM IRs. --------- Co-authored-by: Ognjen Plavsic <plognjen@amd.com>	2025-01-21 16:40:46 +00:00
Andrea Faulds	e7e3c45bc7	[mlir] Remove mlir-vulkan-runner and GPUToVulkan conversion passes (#123750 ) This follows up on 733be4ed7dcf976719f424c0cb81b77a14f91f5a, which made mlir-vulkan-runner and its associated passes redundant, and completes the main goal of #73457. The mlir-vulkan-runner tests become part of the integration test suite, and the Vulkan runner runtime components become part of ExecutionEngine, just as was done when removing other target-specific runners.	2025-01-21 16:51:27 +01:00
Karlo Basioli	a53abb2386	[mlir][IR] CommonTypeConstraints: fix syntax error (#123765 )	2025-01-21 16:31:50 +01:00
Emilio Cota	67a412f072	[mlir][IR] CommonTypeConstraints: fully qualify low-precision FP type… (#123738 ) …s isa<> calls in isa<> calls To ease integration with downstream projects. Follow-up to PR #123326.	2025-01-21 13:46:09 +00:00
Andrea Faulds	733be4ed7d	[mlir][spirv] Add GpuToLLVM cconv suited to Vulkan, migrate last tests (#123384 ) This commit is a follow-up to 99a562b3cb17e89273ba0fe77129f2fb17a19381, which migrated some of the mlir-vulkan-runner tests to mlir-cpu-runner using a new pipeline and set of wrappers. That commit could not migrate all the tests, because the existing calling conventions/ABIs for kernel arguments generated by GPUToLLVMConversionPass were not a good fit for the Vulkan runtime. This commit fixes this and migrates the remaining tests. With this commit, mlir-vulkan-runner and many related components are now unused, and they will be removed in a later commit (see #73457). The old calling conventions require both the caller (host LLVM code) and callee (device code) to have compile-time knowledge of the precise argument types. This works for CUDA, ROCm and SYCL, where there is a C-like calling convention agreed between the host and device code, and the runtime passes through arguments as raw data without comprehension. For Vulkan, however, the interface declared by the shader/kernel is in a more abstract form, so the device code has indirect access to the argument data, and the runtime must process the arguments to set up and bind appropriately-sized buffer descriptors. This commit introduces a new calling convention option to meet the Vulkan runtime's needs. It lowers memref arguments to {void*, size_t} pairs, which can be trivially interpreted by the runtime without it needing to know the original argument types. Unlike the stopgap measure in the previous commit, this system can support memrefs of various ranks and element types, which unblocked migrating the remaining tests.	2025-01-21 13:27:45 +01:00
Durgadoss R	0f9e913466	[MLIR][NVVM] Add TMA Bulk Copy Ops (#123186 ) PR #122344 adds intrinsics for Bulk Async Copy (non-tensor variants) using TMA. This patch adds the corresponding NVVM Dialect Ops. lit tests are added to verify the lowering to all variants of the intrinsics. Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-01-21 13:56:59 +05:30

11109 Commits