This patch enhances `MemRefType::areTrailingDimsContiguous` to also
handle memrefs with dynamic dimensions.
The implementation itself is based on a new member function
`MemRefType::getMaxCollapsableTrailingDims` that returns the maximum
number of trailing dimensions that can be collapsed: trivially all
dimensions for memrefs with an identity layout, or otherwise determined by
examining the memref strides and stopping at the first discontiguous or
statically unknown stride.
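To make the stride walk concrete, here is a minimal hypothetical C++ sketch of that check (the function name, the dynamic-value sentinel, and the exact handling of dynamic sizes are assumptions, not the actual MLIR implementation):
```
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel standing in for a statically unknown (dynamic) size or stride.
constexpr int64_t kDynamic = std::numeric_limits<int64_t>::min();

// Count how many trailing dimensions are contiguous: walk the strides from
// the innermost dimension outwards and stop at the first stride that is
// dynamic or does not equal the product of the inner sizes (the row-major
// stride).
int64_t countContiguousTrailingDims(const std::vector<int64_t> &sizes,
                                    const std::vector<int64_t> &strides) {
  int64_t count = 0;
  int64_t expectedStride = 1;
  for (int64_t i = static_cast<int64_t>(strides.size()) - 1; i >= 0; --i) {
    if (strides[i] == kDynamic || strides[i] != expectedStride)
      break;
    ++count;
    if (sizes[i] == kDynamic)
      break; // the next expected stride cannot be computed statically
    expectedStride *= sizes[i];
  }
  return count;
}
```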
In `int64_t r = strides.size() - 2`, the subtraction is performed in the
unsigned type returned by `strides.size()`, so when `strides.size()` is 1
it wraps around instead of yielding -1. On 32-bit systems, where that type
is 32 bits wide, the wrapped value is then stored in `r` as a large
positive number rather than -1.
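A sketch of one possible fix (not necessarily the exact patch that landed): perform the subtraction in a signed 64-bit type so the index can legitimately become negative.
```
#include <cstdint>
#include <vector>

void walkStridePairs(const std::vector<int64_t> &strides) {
  // Cast before subtracting so that a size of 1 yields r == -1 and the loop
  // simply does not execute, instead of wrapping around to a huge value.
  for (int64_t r = static_cast<int64_t>(strides.size()) - 2; r >= 0; --r) {
    // ... compare strides[r] against strides[r + 1] ...
  }
}
```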
Adds checks in `isPermutationVector` for indices that are out of bounds
and removes the assert.
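For illustration, a bounds-checked permutation check could look like the following minimal sketch (hypothetical code, not the exact MLIR utility):
```
#include <cstdint>
#include <vector>

// A vector is a valid permutation of [0, n) iff every entry is in range and
// no entry repeats; out-of-range entries now return false instead of
// tripping an assert.
bool isPermutationVector(const std::vector<int64_t> &interchange) {
  std::vector<bool> seen(interchange.size(), false);
  for (int64_t idx : interchange) {
    if (idx < 0 || static_cast<size_t>(idx) >= interchange.size())
      return false; // out of bounds
    if (seen[idx])
      return false; // duplicate entry
    seen[idx] = true;
  }
  return true;
}
```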
Signed-off-by: Ian Wood <ianwood2024@u.northwestern.edu>
It looks like the affine map generated to compute the indices of the
collapsed dimensions used the wrong dim size. For indices `[idx0][idx1]`
we computed the collapsed index as `idx0*size0 + idx1` instead of
`idx0*size1 + idx1`. This led to correctness issues in convolution tests
when enabling this transformation internally.
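As a concrete illustration of the fix (a sketch with made-up names, assuming two collapsed dimensions of sizes size0 x size1 in row-major order; size0 only bounds idx0 and does not appear in the formula):
```
#include <cstdint>

// The stride of idx0 in the collapsed index is the size of the inner
// dimension (size1), not the size of its own dimension (size0).
int64_t collapseIndex(int64_t idx0, int64_t idx1, int64_t size1) {
  return idx0 * size1 + idx1;
}
```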
Pack/unpack ops can be simplified to reshape ops if `outer_dims_perm` is
an identity permutation. The revision adds an `isIdentityPermutation`
method to IndexingUtils.
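A minimal sketch of what such a helper amounts to (hypothetical code; the real utility presumably operates on ArrayRef):
```
#include <cstdint>
#include <vector>

// A permutation is the identity iff permutation[i] == i for every i.
bool isIdentityPermutation(const std::vector<int64_t> &permutation) {
  for (size_t i = 0; i < permutation.size(); ++i)
    if (permutation[i] != static_cast<int64_t>(i))
      return false;
  return true;
}
```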
This change refactors some of the utilities used to unroll larger vector
computations into smaller vector computations. In fact, the indexing
computations used here are rather generic and are useful in other dialects or
downstream projects. Therefore, a utility for iterating over all possible tile
offsets for a particular pair of static (shape, tiled shape) is introduced in
IndexingUtils and replaces the existing computations in the vector unrolling
transformations. This builds off of the refactoring of IndexingUtils introduced
in 203fad476b7e.
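To illustrate the kind of indexing computation being factored out (a hypothetical, dependency-free sketch rather than the actual IndexingUtils API): enumerate, in row-major order, the offset of every tile of a given tile shape within a static shape.
```
#include <cstdint>
#include <functional>
#include <vector>

// Invoke `callback` with the offset of each tile of `tileShape` within
// `shape`, innermost dimension varying fastest. Partial boundary tiles are
// visited as well because the tile counts use ceiling division.
void forEachTileOffset(
    const std::vector<int64_t> &shape, const std::vector<int64_t> &tileShape,
    const std::function<void(const std::vector<int64_t> &)> &callback) {
  size_t rank = shape.size();
  if (rank == 0)
    return;
  std::vector<int64_t> numTiles(rank);
  for (size_t d = 0; d < rank; ++d)
    numTiles[d] = (shape[d] + tileShape[d] - 1) / tileShape[d];

  std::vector<int64_t> tileIndex(rank, 0);
  while (true) {
    std::vector<int64_t> offset(rank);
    for (size_t d = 0; d < rank; ++d)
      offset[d] = tileIndex[d] * tileShape[d];
    callback(offset);
    // Advance the multi-dimensional tile index like an odometer.
    size_t d = rank;
    while (d > 0) {
      --d;
      if (++tileIndex[d] < numTiles[d])
        break;
      tileIndex[d] = 0;
      if (d == 0)
        return; // wrapped past the outermost dimension: all tiles visited
    }
  }
}
```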
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D150000
Some GPU backends (e.g. SPIR-V) lower memrefs to bare pointers, so the lowering fails for dynamically sized/strided memrefs.
This pass extracts sizes and strides via `memref.extract_strided_metadata` outside the `gpu.launch` body, does the index/offset calculation explicitly, and then reconstructs the memrefs via `memref.reinterpret_cast`.
`memref.reinterpret_cast` is then lowered via https://reviews.llvm.org/D155011
Differential Revision: https://reviews.llvm.org/D155247
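The explicit index/offset calculation presumably amounts to the usual strided-addressing formula; as a plain C++ sketch of that arithmetic (not the pass itself, names assumed):
```
#include <cstdint>
#include <vector>

// Linear offset of a multi-dimensional access into a strided memref:
// baseOffset + sum_i(indices[i] * strides[i]).
int64_t linearOffset(int64_t baseOffset, const std::vector<int64_t> &indices,
                     const std::vector<int64_t> &strides) {
  int64_t offset = baseOffset;
  for (size_t i = 0; i < indices.size(); ++i)
    offset += indices[i] * strides[i];
  return offset;
}
```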
This revision adds support for directly lowering a linalg.copy on buffers between global and shared memory to a TMA async load plus synchronization operations.
This uses the recently introduced Hopper NVVM and NVGPU abstractions to connect things end to end.
Differential Revision: https://reviews.llvm.org/D157087
This patch recognizes when tensor.pack/unpack operations are simple
tensor.pad/unpad (a.k.a. tensor.extract_slice) and lowers them to a simpler
sequence of operations.
For pack, instead of doing:
```
pad
expand_shape
transpose
```
we do
```
pad
insert_slice
```
For unpack, instead of doing:
```
transpose
collapse_shape
extract_slice
```
we do
```
extract_slice
```
Note: returning nullptr for the transform dialect is fine. The related
handles are just ignored by the following transformation.
Differential Revision: https://reviews.llvm.org/D148159
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
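As a trivial illustration of the kind of replacement involved (a made-up snippet, not a call site from the patch):
```
#include <optional>

// Anywhere `None` denoted "no value", it is now spelled `std::nullopt`.
std::optional<int> lookupAlignment(bool hasAlignment, int alignment) {
  if (!hasAlignment)
    return std::nullopt; // previously written as: return None;
  return alignment;
}
```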
This revision refactors and cleans up a bunch of infra related to vector, shapes and indexing into more reusable APIs.
Differential Revision: https://reviews.llvm.org/D138501
This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.
Differential Revision: https://reviews.llvm.org/D118533