8 Commits

Author SHA1 Message Date
Aart Bik
ab6334dd11
[mlir][sparse] add expanded size to API (#68614)
Used for asserting we do not run out of bounds on the expanded access
pattern.
2023-10-09 14:42:11 -07:00
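The bounds check that ab6334dd11 describes can be sketched roughly as follows. This is a minimal illustration, not the actual runtime API: the helper `drainExpanded` and its parameter names (`values`, `filled`, `added`, `count`, `expsz`) are hypothetical. It assumes only that the expanded size is passed explicitly so each access into the expanded buffers can be asserted to be in bounds.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not the MLIR runtime API): gathers the values at
// the "added" coordinates out of a dense expanded buffer, then resets
// the buffer for reuse. `expsz` is the size of the expanded buffers,
// passed explicitly so that every access can be bounds-checked.
inline std::vector<double> drainExpanded(double *values, bool *filled,
                                         const uint64_t *added,
                                         uint64_t count, uint64_t expsz) {
  std::vector<double> out;
  out.reserve(count);
  for (uint64_t i = 0; i < count; ++i) {
    const uint64_t c = added[i];
    assert(c < expsz && "coordinate out of bounds of expanded buffer");
    assert(filled[c] && "added coordinate was never marked as filled");
    out.push_back(values[c]);
    // Restore the "all zero / all false" invariant of the buffers.
    values[c] = 0.0;
    filled[c] = false;
  }
  return out;
}
```

Without the explicit size, an out-of-range coordinate would silently read or write past the end of the buffers; with it, the failure surfaces as an assertion in debug builds.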
Aart Bik
427f120f60
[mlir][sparse] minor edits in runtime lib Cpp files (#68165)
2023-10-03 16:28:54 -07:00
wren romano
84cd51bb97 [mlir][sparse] Renaming "pointer/index" to "position/coordinate"
The old "pointer/index" names often cause confusion since these names clash with names of unrelated things in MLIR; so this change rectifies this by changing everything to use "position/coordinate" terminology instead.

In addition to the basic terminology, there have also been various conventions for making certain distinctions like: (1) the overall storage for coordinates in the sparse-tensor, vs the particular collection of coordinates of a given element; and (2) particular coordinates given as a `Value` or `TypedValue<MemRefType>`, vs particular coordinates given as `ValueRange` or similar.  I have striven to maintain these distinctions as follows:

  * "p/c" are used for individual position/coordinate values, when there is no risk of confusion.  (Just like we use "d/l" to abbreviate "dim/lvl".)

  * "pos/crd" are used for individual position/coordinate values, when a longer name is helpful to avoid ambiguity or to form compound names (e.g., "parentPos").  (Just like we use "dim/lvl" when we need a longer form of "d/l".)

    I have also used these forms for a handful of compound names where the old name had been using a three-letter form previously, even though a longer form would be more appropriate.  I've avoided renaming these to use a longer form purely for expediency's sake, since changing them would require a cascade of other renamings.  They should be updated to follow the new naming scheme, but that can be done in future patches.

  * "coords" is used for the complete collection of crd values associated with a single element.  In the runtime library this includes both `std::vector` and raw pointer representations.  In the compiler, this is used specifically for buffer variables with C++ type `Value`, `TypedValue<MemRefType>`, etc.

    The bare form "coords" is discouraged, since it fails to make the dim/lvl distinction; so the compound names "dimCoords/lvlCoords" should be used instead.  (Though there may exist a rare few cases where it is appropriate to be intentionally ambiguous about what coordinate-space the coords live in; in which case the bare "coords" is appropriate.)

    There is seldom the need for the pos variant of this notion.  In most circumstances we use the term "cursor", since the same buffer is reused for a 'moving' pos-collection.

  * "dcvs/lcvs" is used in the compiler as the `ValueRange` analogue of "dimCoords/lvlCoords".  (The "vs" stands for "`Value`s".)  I haven't found the need for it, but "pvs" would be the obvious name for a pos-`ValueRange`.

    The old "ind"-vs-"ivs" naming scheme does not seem to have been sustained in more recent code, which instead prefers other mnemonics (e.g., adding "Buf" to the end of the names for `TypedValue<MemRefType>`).  I have cleaned up a lot of these to follow the "coords"-vs-"cvs" naming scheme, though I haven't done an exhaustive cleanup.

  * "positions/coordinates" are used for larger collections of pos/crd values; in particular, these are used when referring to the complete sparse-tensor storage components.

    I also prefer to use these unabbreviated names in the documentation, unless there is some specific reason why using the abbreviated forms helps resolve ambiguity.

In addition to making this terminology change, this change also does some cleanup along the way:
  * correcting the dim/lvl terminology in certain places.
  * adding `const` when it requires no other code changes.
  * miscellaneous cleanup that was entailed in order to make the proper distinctions.  Most of these are in CodegenUtils.{h,cpp}.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D144773
2023-03-06 12:23:33 -08:00
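To make the terminology of 84cd51bb97 concrete, here is a small sketch of a CSR-like structure named according to the "position/coordinate" scheme. The struct `CsrMatrix` and the function `lookup` are hypothetical and are not taken from the MLIR sources. Only the naming pattern ("positions/coordinates" for whole storage components, "pos/crd" for individual values, "lvlCoords" for the coordinates of one element) follows the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CSR-like storage, named with the new terminology.
struct CsrMatrix {
  // Formerly "pointers": row r owns entries [positions[r], positions[r+1]).
  std::vector<uint64_t> positions;
  // Formerly "indices": the column coordinate of each stored entry.
  std::vector<uint64_t> coordinates;
  std::vector<double> values;
};

// Returns the element at lvlCoords = (row, col), or 0.0 if not stored.
inline double lookup(const CsrMatrix &m,
                     const std::vector<uint64_t> &lvlCoords) {
  assert(lvlCoords.size() == 2 && "expected exactly two level-coordinates");
  const uint64_t row = lvlCoords[0];
  const uint64_t col = lvlCoords[1];
  assert(row + 1 < m.positions.size() && "row coordinate out of bounds");
  // "pos" is an individual position; "crd" is an individual coordinate.
  for (uint64_t pos = m.positions[row], end = m.positions[row + 1];
       pos < end; ++pos) {
    const uint64_t crd = m.coordinates[pos];
    if (crd == col)
      return m.values[pos];
  }
  return 0.0;
}
```

The names avoid the older "pointer" and "index", which are easy to confuse with C++ pointers and the MLIR `index` type.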
wren romano
c518745bba [mlir][sparse] Making way for SparseTensorRuntime to support non-permutations
Systematically updates the SparseTensorRuntime to properly distinguish tensor-dimensions from storage-levels (and their associated ranks, shapes, sizes, indices, etc).  With a few exceptions which are noted in the code, this ensures the runtime has all the **semantic** changes necessary to support non-permutations.

(Whereas **operationally**, since we're still using `std::vector<uint64_t>` to represent the mappings, there's no way to pass in any interesting non-permutations.  Changing the representation to `std::function` will be done in a separate differential.)

Depends On D137680

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D137681
2022-11-14 13:48:41 -08:00
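The difference between the two representations mentioned in c518745bba can be sketched as follows. All names here (`permuteToLvl`, `DimToLvlFn`, `makeBlock2x2`) are hypothetical, and the convention that `dim2lvl[d]` names the level storing dimension `d` is an assumption made for the illustration, not a statement about the runtime. A vector of this kind can only encode permutations, whereas a `std::function` can encode a mapping whose level-rank differs from the dimension-rank.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Permutation-only mapping. Assumed convention for this sketch:
// dim2lvl[d] is the level that stores dimension d.
inline std::vector<uint64_t>
permuteToLvl(const std::vector<uint64_t> &dim2lvl,
             const std::vector<uint64_t> &dimCoords) {
  assert(dim2lvl.size() == dimCoords.size() && "rank mismatch");
  std::vector<uint64_t> lvlCoords(dimCoords.size());
  for (uint64_t d = 0, rank = dimCoords.size(); d < rank; ++d) {
    assert(dim2lvl[d] < rank && "permutation entry out of bounds");
    lvlCoords[dim2lvl[d]] = dimCoords[d];
  }
  return lvlCoords;
}

// General mapping from dim-coords to lvl-coords.
using DimToLvlFn =
    std::function<std::vector<uint64_t>(const std::vector<uint64_t> &)>;

// A non-permutation: 2x2 blocking of a matrix, which maps two
// dimensions to four levels: (i, j) -> (i/2, j/2, i%2, j%2).
inline DimToLvlFn makeBlock2x2() {
  return [](const std::vector<uint64_t> &dimCoords) {
    assert(dimCoords.size() == 2 && "expected a 2-D tensor");
    const uint64_t i = dimCoords[0], j = dimCoords[1];
    return std::vector<uint64_t>{i / 2, j / 2, i % 2, j % 2};
  };
}
```

The blocked mapping produces four level-coordinates from two dimension-coordinates, which a same-length vector of level numbers cannot express.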
bixia1
461c461a7c [mlir][sparse] Rename SparseTensorFile to SparseTensorReader.
This is to prepare for adding SparseTensorWriter.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D135477
2022-10-10 08:24:37 -07:00
wren romano
0011c0a159 [mlir][sparse] Renaming x-macros for better hygiene
Now that mlir_sparsetensor_utils is a public library, this differential renames the x-macros to help avoid namespace pollution issues.

Reviewed By: aartbik, Peiming

Differential Revision: https://reviews.llvm.org/D134988
2022-09-30 14:04:58 -07:00
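As an illustration of the hygiene concern in 0011c0a159, the sketch below shows an x-macro whose name carries a library prefix. The prefix `MYLIB_`, the type list, and the names `ValueKind` and `sizeOf` are all hypothetical and do not reproduce the actual macros of the MLIR library. Helper macros are also prefixed and are undefined right after use, so nothing unprefixed leaks into a client's translation unit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical prefixed x-macro: applies DO to every (name, type) pair.
#define MYLIB_FOREVERY_V(DO)                                                   \
  DO(F64, double)                                                              \
  DO(F32, float)                                                               \
  DO(I64, int64_t)                                                             \
  DO(I32, int32_t)

// Generate one enumerator per value type.
enum class ValueKind {
#define MYLIB_DECL_ENUM(VNAME, V) k##VNAME,
  MYLIB_FOREVERY_V(MYLIB_DECL_ENUM)
#undef MYLIB_DECL_ENUM
};

// Generate one switch case per value type.
inline uint64_t sizeOf(ValueKind kind) {
  switch (kind) {
#define MYLIB_CASE(VNAME, V)                                                   \
  case ValueKind::k##VNAME:                                                    \
    return sizeof(V);
    MYLIB_FOREVERY_V(MYLIB_CASE)
#undef MYLIB_CASE
  }
  return 0;
}
```

A short generic name such as `FOREVERY_V` in a public header could collide with a macro of the same name in client code; a library-specific prefix makes that much less likely.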
wren romano
329f2f103a [mlir][sparse] refactoring SparseTensorUtils: (3 of 4) code-cleanup
Previously, the SparseTensorUtils.cpp library contained a C++ core implementation, but hid it in an anonymous namespace and only exposed a C-API for accessing it. Now we are factoring out that C++ core into a standalone C++ library so that it can be used directly by downstream clients (per request of one such client). This refactoring has been decomposed into a stack of differentials in order to simplify the code review process; however, the full stack of changes should be considered together.

* D133462: Part 1: split one file into several
* D133830: Part 2: Reorder chunks within files
* (this): Part 3: General code cleanup
* D133833: Part 4: Update documentation

This part performs some general code cleanup including:
* making more things `const`, especially for the targets of pointers
* using preincrement wherever possible ([[ https://llvm.org/docs/CodingStandards.html#prefer-preincrement | per LLVM style guide ]])
* adding messages to most `assert` statements.
* moving argument casting from the core function/method definitions to the CPP wrappers

Depends On D133830

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133831
2022-09-29 14:44:07 -07:00
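The cleanup items listed in 329f2f103a can be illustrated together in one small sketch. The functions `sumSizes` and `wrapperSumSizes` are hypothetical and do not appear in the library. They only show the four patterns: a `const` pointer target, preincrement in the loop, an `assert` with a message, and argument casting done in the wrapper rather than in the core function.

```cpp
#include <cassert>
#include <cstdint>

// Core function: takes properly typed arguments, and the pointer
// target is `const` because the function only reads from it.
inline uint64_t sumSizes(const uint64_t *sizes, uint64_t rank) {
  assert(sizes && "sizes pointer must not be null");
  uint64_t total = 0;
  for (uint64_t r = 0; r < rank; ++r) // preincrement, per LLVM style
    total += sizes[r];
  return total;
}

// Wrapper: receives an untyped argument and performs the cast here,
// so the core function above stays free of casts.
inline uint64_t wrapperSumSizes(const void *sizes, uint64_t rank) {
  return sumSizes(static_cast<const uint64_t *>(sizes), rank);
}
```

Keeping the casts at the boundary means the core code is type-checked normally, and the unchecked conversions are confined to one thin layer.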
wren romano
0fca5c5f45 [mlir][sparse] refactoring SparseTensorUtils: (1 of 4) file-splitting
Previously, the SparseTensorUtils.cpp library contained a C++ core implementation, but hid it in an anonymous namespace and only exposed a C-API for accessing it. Now we are factoring out that C++ core into a standalone C++ library so that it can be used directly by downstream clients (per request of one such client). This refactoring has been decomposed into a stack of differentials in order to simplify the code review process; however, the full stack of changes should be considered together.

* (this): Part 1: split one file into several
* D133830: Part 2: Reorder chunks within files
* D133831: Part 3: General code cleanup
* D133833: Part 4: Update documentation

This part aims to make no changes other than the 1:N file splitting, and things which are forced to accompany that change.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D133462
2022-09-29 14:35:27 -07:00