Charitha Saumya 6b5c440a67
[mlir][xegpu] Add support for vector.reduction and vector.multi_reduction subgroup to work-item distribution. (#180308)
This PR adds support for lowering `vector.reduction` and
`vector.multi_reduction` ops in subgroup to work-item distribution.

The following cases are currently handled (more support will be added
later):

* `vector.reduction`: This assumes the source vector is distributed across
all lanes, so the lanes must shuffle data to perform a collaborative
reduction. The result is shared among all lanes. This is done by emitting
`gpu::ShuffleOp`s and doing a butterfly reduction. Refer to
`VectorDistribution` for more details.
* `vector.multi_reduction`: Two cases are considered:

1. **Reduction is lane-local:** Simply lower to a lane-local multi-reduction
op. Each lane does its own reduction, and the result is distributed.
2. **Reduction is not lane-local:** This case is handled indirectly. The
reduction is rewritten in terms of `vector.reduction` ops (plus
extract/insert ops) before the WI distribution even begins. The whole
thing is then distributed using `gpu::ShuffleOp`s later (not fully
supported yet).
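
For illustration, the two reduction styles described above can be sketched as a host-side C++ simulation. This is not the code added by this PR. The helper names (`shuffleXor`, `butterflyReduce`, `laneLocalReduce`) are hypothetical, each vector element stands in for one lane's value, and the sketch assumes an xor-mode shuffle and a power-of-two subgroup size.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Combine = std::function<float(float, float)>;

// Hypothetical stand-in for an xor-mode shuffle: lane i reads the value
// currently held by lane (i ^ mask). Element i of `lanes` models lane i.
std::vector<float> shuffleXor(const std::vector<float> &lanes,
                              std::size_t mask) {
  std::vector<float> out(lanes.size());
  for (std::size_t lane = 0; lane < lanes.size(); ++lane)
    out[lane] = lanes[lane ^ mask];
  return out;
}

// Butterfly reduction: log2(N) shuffle-and-combine steps. After the last
// step every lane holds the reduction of all N inputs.
std::vector<float> butterflyReduce(std::vector<float> lanes,
                                   const Combine &combine) {
  const std::size_t n = lanes.size();
  assert(n != 0 && (n & (n - 1)) == 0 && "subgroup size must be a power of two");
  for (std::size_t mask = 1; mask < n; mask <<= 1) {
    const std::vector<float> partner = shuffleXor(lanes, mask);
    for (std::size_t lane = 0; lane < n; ++lane)
      lanes[lane] = combine(lanes[lane], partner[lane]);
  }
  return lanes;
}

// Lane-local case: source[r][lane] is reduced over r. Lane `lane` only
// touches its own column, so no shuffles are needed and the result stays
// distributed (one element per lane).
std::vector<float> laneLocalReduce(const std::vector<std::vector<float>> &source,
                                   const Combine &combine) {
  assert(!source.empty());
  std::vector<float> result = source[0];
  for (std::size_t r = 1; r < source.size(); ++r)
    for (std::size_t lane = 0; lane < result.size(); ++lane)
      result[lane] = combine(result[lane], source[r][lane]);
  return result;
}
```

With an add combiner, `butterflyReduce` over the lane values 1 through 8 leaves 36 in every lane, matching the "result is shared among all lanes" behavior. `laneLocalReduce` keeps one partial result per lane, matching the distributed result of the lane-local case.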
2026-02-13 11:49:55 -08:00