11 Commits

Author SHA1 Message Date
Boian Petkantchin
abfac563f5
[mlir][mesh] Make sharding propagation and spmdization work on FuncOpInterface (#84415)
Make them more general instead of only supporting `func::FuncOp`.
2024-03-08 08:14:36 -08:00
Boian Petkantchin
4f7ab789bf
[mlir][mesh] add support in spmdization for incomplete sharding annotations (#82442)
Don't require that `mesh.shard` operations come in pairs. If there is
only a single `mesh.shard` operation we assume that the producer result
and consumer operand have the same sharding.
2024-02-22 11:06:14 -08:00
Boian Petkantchin
dc3258c617
[mlir][mesh] Add all-slice operation (#81218)
This op is the inverse of all-gather. It is useful to have an explicit
concise representation instead of having a blob of slicing logic.

Add lowering for the op that slices from the tensor based on the
in-group process index.

Make resharding generate an all-slice instead of inserting the slicing
logic directly.
2024-02-15 13:03:58 -08:00
Boian Petkantchin
adbf21f12b
[mlir][mesh] Add spmdization pass (#80518)
Add a pass that converts a function that has sharding annotations into
SPMD form.
2024-02-06 20:55:14 -08:00
Boian Petkantchin
31fc0a12e1
[mlir][mesh] Refactoring code organization, tests and docs (#79606)
* Split out `MeshDialect.h` form `MeshOps.h` that defines the dialect
class. Reduces include clutter if you care only about the dialect and
not the ops.

* Expose functions `getMesh` and `collectiveProcessGroupSize`. There
functions are useful for outside users of the dialect.

* Remove unused code.

* Remove examples and tests of mesh.shard attribute in tensor encoding.
Per the decision that Spmdization would be performed on sharding
annotations and there will be no tensors with sharding specified in the
type.
For more info see this RFC comment:
https://discourse.llvm.org/t/rfc-sharding-framework-design-for-device-mesh/73533/81
2024-01-31 07:20:14 -08:00
Boian Petkantchin
9a8437f504
[mlir][mesh] Rename cluster to mesh (#79484)
Rename
* Op mesh.cluster -> mesh.mesh
* Op mesh.cluster_shape -> mesh.mesh_shape
* variables and attributes.

The name `mesh` is more specific to what it really represents. It is a
mesh of devices.
The name `cluster` implies a broader posibility of device
configurations. When just the word `mesh` is used the meaning can often
be inferred from the context whether it refers to the mesh dialect or a
device mesh. The full name can be used when needed.
2024-01-26 07:03:29 -08:00
Boian Petkantchin
5df2c00af3
[mlir][mesh] Remove rank attribute and rename dim_sizes to shape in ClusterOp (#77838)
Remove the somewhat redundant rank attribute.
Before this change
```
mesh.cluster @mesh(rank = 3, dim_sizes = 2x3)
```
After
```
mesh.cluster @mesh(shape = 2x3x?)
```

The rank is instead determined by the provided shape. With this change
no longer `getDimSizes()` can be wrongly assumed to have size equal to
the cluster rank.
Now `getShape().size()` will always equal `getRank()`.
2024-01-15 07:39:09 -08:00
Boian Petkantchin
79aa776267
[mlir][mesh] Add lowering of process multi-index op (#77490)
* Rename mesh.process_index -> mesh.process_multi_index.
* Add mesh.process_linear_index op.
* Add lowering of mesh.process_multi_index into an expression using
mesh.process_linear_index, mesh.cluster_shape and
affine.delinearize_index.

This is useful to lower mesh ops and prepare them for further lowering
where the runtime may have only the linear index of a device/process.
For example in MPI we have a rank (linear index) in a communicator.
2024-01-10 07:01:16 -08:00
Boian Petkantchin
7a4c49756d
[mlir][mesh] Use one type for mesh axis (#76830)
Make all ops and attributes use the types MeshAxis and MeshAxesAttr
instead of int16_t, int32_t, DenseI16ArrayAttr and DenseI32ArrayAttr.
2024-01-03 15:47:11 -08:00
Jie Fu
ab43cf26ca [mlir][mesh] Fix -Wunused-variable in Spmdization.cpp (NFC)
llvm-project/mlir/lib/Dialect/Mesh/Transforms/Spmdization.cpp:573:14:
 error: unused variable 'targetShardType' [-Werror,-Wunused-variable]
  ShapedType targetShardType =
             ^
1 error generated.
2024-01-03 09:29:14 +08:00
Boian Petkantchin
1a8fb88719
[mlir][mesh] Add resharding spmdization on a 1D device mesh (#76179)
The current implementation supports only sharding of tensor axes that
have size divisible by the mesh axis size.
2024-01-02 15:50:07 -08:00