getMixedOffsets() calls getMixedValues() with `static_offsets` and
`offsets`. It is assumed that the number of dynamic entries in
`static_offsets` equals the size of `offsets`; otherwise, an assertion
fails when the array is accessed out of bounds.
The same applies to getMixedSizes() and getMixedStrides().
A verification of this assumption is added to
verifyOffsetSizeAndStrideOp(), and a clear assertion is added in
getMixedValues().
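For illustration, a minimal sketch of the added check (the helper name and exact signature below are assumptions, not the upstream code):
```cpp
// Hypothetical sketch: every dynamic marker in staticValues must be
// matched by one entry in dynamicValues, otherwise the merge loop below
// would index dynamicValues out of bounds.
SmallVector<OpFoldResult> getMixedValuesSketch(ArrayRef<int64_t> staticValues,
                                               ValueRange dynamicValues,
                                               Builder &b) {
  assert(static_cast<size_t>(
             llvm::count_if(staticValues, ShapedType::isDynamic)) ==
             dynamicValues.size() &&
         "expected one dynamic value per dynamic marker");
  SmallVector<OpFoldResult> result;
  unsigned nextDynamic = 0;
  for (int64_t value : staticValues)
    result.push_back(ShapedType::isDynamic(value)
                         ? OpFoldResult(dynamicValues[nextDynamic++])
                         : OpFoldResult(b.getI64IntegerAttr(value)));
  return result;
}
```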
These unused includes were identified by misc-include-cleaner. I've
filtered out those whose removal breaks builds. I'm also staying away
from llvm-config.h, config.h, and Compiler.h, whose removal would
likely cause platform- or compiler-specific build failures.
The motivation is to avoid having to negate `isDynamic*` checks, avoid
double negations, and allow `ShapedType::isStaticDim` to be used in
ADT functions without wrapping it in a lambda that performs the
negation.
Also add the new functions to C and Python bindings.
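A small usage sketch (assuming the new static predicate mirrors the existing static `isDynamic` helper):
```cpp
ArrayRef<int64_t> shape = shapedType.getShape();
// Before: a lambda is needed to negate the dynamic check.
bool allStatic = llvm::all_of(
    shape, [](int64_t dim) { return !ShapedType::isDynamic(dim); });
// After: the static predicate can be passed to ADT functions directly.
bool allStaticNow = llvm::all_of(shape, ShapedType::isStatic);
```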
Fix the following compile error when building libMLIRDialectUtils.a
only:
In file included from
mlir/include/mlir/Dialect/Utils/ReshapeOpsUtils.h:17,
from mlir/lib/Dialect/Utils/ReshapeOpsUtils.cpp:9:
mlir/include/mlir/Dialect/Arith/IR/Arith.h:28:10:
fatal error: mlir/Dialect/Arith/IR/ArithOpsDialect.h.inc: No such file
or directory
An ArithDialect dependency has been needed since
0515449f6dcb452ea0b089fb3057d469c3cffa3f, which creates an `arith.muli`
op.
Following up from https://github.com/llvm/llvm-project/pull/143467,
this PR adds support for
`ReductionTilingStrategy::PartialReductionOuterParallel` to
`tileUsingSCF`. The implementation of
`PartialReductionTilingInterface` for `Linalg` ops has been updated to
support this strategy as well. This brings `tileUsingSCF` on par with
`linalg::tileReductionUsingForall`, which will be deprecated
subsequently.
Changes summary:
- `PartialReductionTilingInterface` changes:
  - The `tileToPartialReduction` method now takes the induction
    variables of the generated tile loops. This was needed to keep the
    generated code similar to `linalg::tileReductionUsingForall`,
    specifically to create a simplified access for slicing the
    intermediate partial results tensor when tiled in `num_threads`
    mode.
  - The `getPartialResultTilePosition` method needs the induction
    variables of the generated tile loops for the same reason as above,
    and also needs the `tilingStrategy` to be passed in to generate
    correct code.
The tests in `transform-tile-reduction.mlir` exercising
`linalg::tileReductionUsingForall` have been moved over to test
`scf::tileUsingSCF` with the
`ReductionTilingStrategy::PartialReductionOuterParallel` strategy. Some
of the tests that performed a further cyclic distribution of the
transformed code after tiling have been removed; those seem like two
separate transformations that were merged into one. Ideally that would
happen when resolving the `scf.forall` rather than during tiling.
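As a rough usage sketch (option and result names follow the SCF tiling API as I understand it, and `tileSizes` is an assumed variable):
```cpp
// Tile a reduction op with the partial-reduction-outer-parallel
// strategy, distributing the partial reductions over an scf.forall.
scf::SCFTilingOptions options;
options.setTileSizes(tileSizes);
options.setReductionTilingStrategy(
    ReductionTilingStrategy::PartialReductionOuterParallel);
FailureOr<scf::SCFTilingResult> tilingResult =
    scf::tileUsingSCF(rewriter, cast<TilingInterface>(op), options);
```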
Please review only the top commit. Depends on
https://github.com/llvm/llvm-project/pull/143467
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
This patch enhances `MemRefType::areTrailingDimsContiguous` to also
handle memrefs with dynamic dimensions.
The implementation is based on a new member function,
`MemRefType::getMaxCollapsableTrailingDims`, that returns the maximum
number of trailing dimensions that can be collapsed: trivially all
dimensions for memrefs with an identity layout, or otherwise determined
by examining the memref strides, stopping at the first discontiguous
or statically unknown stride.
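A simplified sketch of the stride walk (illustrative, not the exact upstream implementation):
```cpp
// Count trailing dims whose stride equals the running product of the
// inner dim sizes; stop at the first dynamic or mismatching stride.
int64_t maxCollapsableTrailingDims(ArrayRef<int64_t> shape,
                                   ArrayRef<int64_t> strides) {
  int64_t numCollapsable = 0;
  int64_t expectedStride = 1;
  for (int64_t i = static_cast<int64_t>(shape.size()) - 1; i >= 0; --i) {
    if (ShapedType::isDynamic(strides[i]) || strides[i] != expectedStride)
      break;
    ++numCollapsable;
    if (ShapedType::isDynamic(shape[i]))
      break; // The product of sizes is unknown past a dynamic dim.
    expectedStride *= shape[i];
  }
  return numCollapsable;
}
```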
Commit 6e5a142 changed the behavior of the function when computing
reassociations between tensors (consisting of unit/dynamic dimensions)
and scalars/0d vectors. The IR representation for such reshapes actually
expects an empty reassociation, like so:
```
func.func @example(%arg0 : tensor<?x?x?xf32>) -> tensor<f32> {
  %0 = tensor.collapse_shape %arg0 [] : tensor<?x?x?xf32> into tensor<f32>
  return %0 : tensor<f32>
}
```
Restore the original behavior: the routine should report failure when
compile-time-known non-unit dimensions are part of the attempted
reassociation.
Signed-off-by: Artem Gindinson <gindinson@roofline.ai>
The original PR https://github.com/llvm/llvm-project/pull/137963 hit an
NVIDIA bot failure. This appears to have been a flaky test, because
rerunning the build succeeded.
This change needs commit 6f2ba47 to fix incorrect usage of
`getReassociationIndicesForCollapse`.
Reverts llvm/llvm-project#142639
Co-authored-by: Artem Gindinson <gindinson@roofline.ai>
The main idea behind the change is to allow expand-of-collapse folds for
reshapes like `?x?xk` -> `?` (k>1). The rationale here is that the
expand op must have a coherent index/affine expression specified in its
`output_shape` argument (see example below), and if it doesn't, the IR
has already been invalidated at an earlier stage:
```
%c32 = arith.constant 32 : index
%div = arith.divsi %<some_index>, %c32 : index
%collapsed = tensor.collapse_shape %41#1 [[0], [1, 2], [3, 4]]
: tensor<9x?x32x?x32xf32> into tensor<9x?x?xf32>
%affine = affine.apply affine_map<()[s0] -> (s0 * 32)> ()[%div]
%expanded = tensor.expand_shape %collapsed [[0], [1, 2], [3]] output_shape [9, %div, 32, %affine]
: tensor<9x?x?xf32> into tensor<9x?x32x?xf32>
```
On the above assumption, adjust the routine in
`getReassociationIndicesForCollapse()` to allow dynamic reshapes beyond
just `?x..?x1x1x..x1` -> `?`. Dynamic subshapes introduce two kinds of
issues:
1. n>2 consecutive dynamic dimensions in the source shape cannot be
collapsed together into 1<k<n neighboring dynamic dimensions in the
target shape, since there'd be more than one suitable reassociation
(example: `?x?x10x? into ?x?`)
2. When figuring out static subshape reassociations based on products,
there are cases where a static dimension is collapsed with a dynamic
one, and should therefore be skipped when comparing products of source &
target dimensions (e.g. `?x2x3x4 into ?x12`)
To address 1, we should detect such sequences in the target shape before
assigning multiple dynamic dimensions into the same index set. For 2, we
take note that a static target dimension was preceded by a dynamic one
and allow an "offset" subshape of source static dimensions, as long as
there's an exact sequence for the target size later in the source shape.
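For instance, the ambiguous case from point 1 should make the routine report failure (a hedged illustration, assuming the current `getReassociationIndicesForCollapse` signature):
```cpp
// ?x?x10x? -> ?x? admits both [[0], [1, 2, 3]] and [[0, 1], [2, 3]],
// so no unique reassociation can be returned.
std::optional<SmallVector<ReassociationIndices>> reassociation =
    getReassociationIndicesForCollapse(
        /*sourceShape=*/{ShapedType::kDynamic, ShapedType::kDynamic, 10,
                         ShapedType::kDynamic},
        /*targetShape=*/{ShapedType::kDynamic, ShapedType::kDynamic});
assert(!reassociation.has_value() && "ambiguous reassociations are rejected");
```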
This PR aims to address all reshapes that can be determined based purely
on shapes (and original reassociation maps, as done in
`ComposeExpandOfCollapseOp::findCollapsingReassociation`). It doesn't
seem possible to fold all qualifying dynamic shape patterns in a
deterministic way without also looking into the affine expressions.
That would be difficult to maintain in a single general utility, so a
path forward would be to provide dialect-specific implementations for
Linalg/Tensor.
Signed-off-by: Artem Gindinson <gindinson@roofline.ai>
---------
Signed-off-by: Artem Gindinson <gindinson@roofline.ai>
Co-authored-by: Ian Wood <ianwood2024@u.northwestern.edu>
The revision adds an `isOneInteger` helper and simplifies the existing
code with the two methods (`isZeroInteger` and `isOneInteger`). It
removes some lambdas, which makes the code cleaner.
Downstream users can update their code with the script below.
```bash
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.h
sed -i "s/isZeroIndex/isZeroInteger/g" **/*.cpp
```
---------
Signed-off-by: hanhanW <hanhan0912@gmail.com>
In `int64_t r = strides.size() - 2`, the subtraction can wrap around on
32-bit systems when `strides.size()` is 1, because `strides.size()`
returns an unsigned 32-bit type there.
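A minimal sketch of the fix:
```cpp
// Widen to a signed 64-bit type before subtracting so the result can
// become negative instead of wrapping around.
int64_t r = static_cast<int64_t>(strides.size()) - 2;
```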
The method does not need to create any operation, so we can use Builder
instead of OpBuilder. This way it can be reused by any attribute getter
implementation without having to declare an OpBuilder.
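A hedged sketch of the distinction (the getter below is hypothetical):
```cpp
// A Builder can create attributes and types without an insertion
// point, so attribute getters do not need the heavier OpBuilder.
Attribute getUnitStridesAttr(Builder &builder, int64_t rank) {
  return builder.getI64ArrayAttr(SmallVector<int64_t>(rank, /*Value=*/1));
}
```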
Signed-off-by: hanhanW <hanhan0912@gmail.com>
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin iterators like succ_begin for now.
Adds checks in `isPermutationVector` for indices that are out of bounds
and removes the assert.
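A rough sketch of the hardened check (illustrative, not the exact upstream code):
```cpp
// Out-of-range or repeated indices now make the check return false
// instead of tripping an assert deeper in the implementation.
bool isPermutationVectorSketch(ArrayRef<int64_t> interchange) {
  llvm::SmallDenseSet<int64_t> seen;
  int64_t rank = static_cast<int64_t>(interchange.size());
  for (int64_t value : interchange)
    if (value < 0 || value >= rank || !seen.insert(value).second)
      return false;
  return true;
}
```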
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
Moves `PackOp` and `UnPackOp` from the Tensor dialect to Linalg. This change
was discussed in the following RFC:
* https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg
This change involves significant churn but only relocates existing code - no new
functionality is added.
**Note for Downstream Users**
Downstream users must update references to `PackOp` and `UnPackOp` as follows:
* Code: `s/tensor::(Un)PackOp/linalg::(Un)PackOp/g`
* Tests: `s/tensor.(un)pack/linalg.(un)pack/g`
No other modifications should be required.
The op carries the output shape directly, so it can be used as-is.
Also adds a method to get the shape as a `SmallVector<OpFoldResult>`.
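A small usage sketch (assuming this refers to `tensor.expand_shape`'s `output_shape`; the accessor name here is an assumption):
```cpp
// Read the op's output shape directly as mixed static/dynamic values.
SmallVector<OpFoldResult> outputShape = expandShapeOp.getMixedOutputShape();
```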
Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:
// FIXME: Replace the uses of is(), get() and dyn_cast() with
// isa<T>, cast<T> and the llvm::dyn_cast<T>
I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
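The mechanical migration looks like this (illustrative pointee types):
```cpp
#include "llvm/ADT/PointerUnion.h"

void example(llvm::PointerUnion<int *, float *> u) {
  // Before (soft-deprecated member functions):
  if (u.is<int *>())
    (void)u.get<int *>();
  // After (free-function casts):
  if (llvm::isa<int *>(u))
    (void)llvm::cast<int *>(u);
}
```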
This PR *restricts* `GeneralizeOuterUnitDimsPackOpPattern` to follow its
intended purpose (as per the documentation), which is to:
> require all outer dimensions of tensor.pack to be 1.
There was one in-tree test that violated this assumption (and happened
to work) – see `@simple_KCRS_to_KRSCsr` in
"generalize-tensor-pack.mlir". That test has been updated to satisfy the
new requirements of the pattern.
By enforcing the pattern to follow its intended design (i.e., making it
stricter), the calculation of shapes and sizes for various Ops that the
pattern generates (PadOp, ExtractSliceOp, EmptyOp, TransposeOp, and
InsertSliceOp) becomes much simpler and easier to document. This also
helped *generalize* the pattern to support cases like the one below:
```mlir
func.func @simple_pad_and_pack_dynamic_tile_cst(
%src: tensor<5x1xf32>,
%dest: tensor<1x1x?x2xf32>,
%pad: f32) -> tensor<1x1x?x2xf32> {
%tile_dim_0 = arith.constant 8 : index
%0 = tensor.pack %src
padding_value(%pad : f32)
inner_dims_pos = [0, 1]
inner_tiles = [%tile_dim_0, 2]
into %dest : tensor<5x1xf32> -> tensor<1x1x?x2xf32>
return %0 : tensor<1x1x?x2xf32>
}
```
Note that the inner tile size is dynamic, yet its value is a
compile-time constant.
`getPackOpSourceOrPaddedSource`, which is used to generate PadOp,
detects this and generates a PadOp with static shapes. This is a good
optimization, but it means that all shapes/sizes for Ops generated by
`GeneralizeOuterUnitDimsPackOpPattern` also need to be updated to be
constant/static. By restricting the pattern and simplifying the
size/shape calculation, supporting the case above becomes much easier.
Notable implementation changes:
* PadOp processes the original source (no change in dimensions/rank).
ExtractSliceOp extracts the tile to pack and may reduce the rank. All
following ops work on the tile extracted by ExtractSliceOp (possibly
rank-reduced).
* All shape/size calculations assume that trailing dimensions match
inner_tiles from tensor.pack. All leading dimensions (i.e., outer
dimensions) are assumed to be 1.
* Dynamic sizes for ops like ExtractSliceOp are taken from inner_tiles
rather than computed as, for example, `tensor.dim %dest, 2`. It's the
responsibility of the "producers" of tensor.pack to ensure that
dimensions in %dest match the specified tile sizes.
In the insert_slice bufferization interface implementation, the
destination tensor is not considered read if the full tensor is
overwritten by the slice. This PR adds the same check for
tensor.parallel_insert_slice.
Adds two new StaticValueUtils:
- `isAllConstantIntValue` checks whether all elements of an array of
`OpFoldResult` equal a given `int64_t` value.
- `areConstantIntValues` checks whether the elements of an array of
`OpFoldResult` are equal, element-wise, to a given array of `int64_t`
values.
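For example, such helpers can express a "slice covers the whole destination" check concisely (a sketch under assumed variable names):
```cpp
// All offsets are 0, all strides are 1, and the sizes equal the
// destination shape: the slice overwrites the full destination tensor.
bool coversFullDest =
    isAllConstantIntValue(mixedOffsets, /*value=*/0) &&
    isAllConstantIntValue(mixedStrides, /*value=*/1) &&
    areConstantIntValues(mixedSizes, destType.getShape());
```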
fixes https://github.com/llvm/llvm-project/issues/112435
---------
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
This patch fixes an off-by-one error in
`mlir::getReassociationIndicesForCollapse()` that occurs when the last
two dims of the source tensor satisfy the while loop condition.
Previously this caused an assertion failure due to an out-of-bounds
access; that is now fixed.
This patch generalizes tensor.expand_shape and memref.expand_shape to
consume the output shape as a list of SSA values. This enables us to
implement generic reshape operations with dynamic shapes using
collapse_shape/expand_shape pairs.
The output_shape input to expand_shape follows the static/dynamic
representation that's also used in `tensor.extract_slice`.
Differential Revision: https://reviews.llvm.org/D140821
---------
Signed-off-by: Gaurav Shukla<gaurav.shukla@amd.com>
Signed-off-by: Gaurav Shukla <gaurav.shukla@amd.com>
Co-authored-by: Ramiro Leal-Cavazos <ramiroleal050@gmail.com>
It looks like the affine map generated to compute the indices of the
collapsed dimensions used the wrong dim size. For indices
`[idx0][idx1]`, we computed the collapsed index as `idx0*size0 + idx1`
instead of `idx0*size1 + idx1`: linearizing over a `size0 x size1`
shape must multiply the outer index by the inner dimension's size. This
led to correctness issues in convolution tests when enabling this
transformation internally.
They can be simplified to reshape ops if outer_dims_perm is an identity
permutation. The revision adds an `isIdentityPermutation` method to
IndexingUtils.
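A sketch of what such a helper amounts to (illustrative, not the exact upstream code):
```cpp
// A permutation is the identity iff every element equals its position.
bool isIdentityPermutationSketch(ArrayRef<int64_t> permutation) {
  for (auto [index, value] : llvm::enumerate(permutation))
    if (static_cast<int64_t>(index) != value)
      return false;
  return true;
}
```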
This commit fixes a crash in the canonicalizer when slice ops have
offset/size SSA values with a negative constant value. Such ops are
invalid if they are reachable, and their offsets/sizes should not be
folded to static integer values. (But such ops may appear in
unreachable blocks.)
This commit fixes #71150.
This change refactors some of the utilities used to unroll larger vector
computations into smaller vector computations. In fact, the indexing
computations used here are rather generic and are useful in other dialects or
downstream projects. Therefore, a utility for iterating over all possible tile
offsets for a particular pair of static (shape, tiled shape) is introduced in
IndexingUtils and replaces the existing computations in the vector unrolling
transformations. This builds on the refactoring of IndexingUtils introduced
in 203fad476b7e.
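A hedged usage sketch of the new utility (assuming it is the `StaticTileOffsetRange` iterator in IndexingUtils):
```cpp
// Visit each tile offset once; with an 8x8 shape and 4x4 tiles the
// offsets are {0, 0}, {0, 4}, {4, 0}, {4, 4} (order assumed row-major).
for (SmallVector<int64_t> offsets :
     StaticTileOffsetRange(/*shape=*/{8, 8}, /*tileShape=*/{4, 4})) {
  process(offsets); // hypothetical per-tile consumer
}
```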
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D150000
Some GPU backends (SPIR-V) lower memrefs to bare pointers, so lowering
dynamically sized/strided memrefs will fail.
This pass extracts sizes and strides via
`memref.extract_strided_metadata` outside the `gpu.launch` body, does
the index/offset calculation explicitly, and then reconstructs the
memrefs via `memref.reinterpret_cast`.
`memref.reinterpret_cast` is then lowered via
https://reviews.llvm.org/D155011
Differential Revision: https://reviews.llvm.org/D155247
This revision adds support for directly lowering a linalg.copy on
buffers between global and shared memory to a TMA async load plus
synchronization operations.
This uses the recently introduced Hopper NVVM and NVGPU abstractions to
connect things end to end.
Differential Revision: https://reviews.llvm.org/D157087
* Move `foldDynamicIndexList` to `DialectUtils` and simplify function.
* Move `OpWithOffsetSizesAndStridesConstantArgumentFolder` to `ViewLikeInterface` and add documentation.
Differential Revision: https://reviews.llvm.org/D156581