Charitha Saumya 244ebef1dd
Reapply [mlir][vector] Refactor WarpOpScfForOp to support unused or swapped forOp results. (#148313)
Reapply attempt for: https://github.com/llvm/llvm-project/pull/148291
Fix for the build failure reported in:
https://lab.llvm.org/buildbot/#/builders/116/builds/15477

-----

This crash is caused by a mismatch between the distributed type returned by
`getDistributedType` and the intended distributed type for forOp results.

Solution diff:
20c2cf6766

Example:
```
func.func @warp_scf_for_broadcasted_result(%arg0: index) -> vector<1xf32> {
  %c128 = arith.constant 128 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<1xf32>) {
    %ini = "some_def"() : () -> (vector<1xf32>)
    %0 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini) -> (vector<1xf32>) {
      %1 = "some_op"(%arg4) : (vector<1xf32>) -> (vector<1xf32>)
      scf.yield %1 : vector<1xf32>
    }
    gpu.yield %0 : vector<1xf32>
  }
  return %2 : vector<1xf32>
}
``` 
In this case the distributed type for the forOp result is `vector<1xf32>`
(the result is not distributed; it is broadcast to all lanes instead).
However, `getDistributedType` returns a NULL type here, since no dimension
of `vector<1xf32>` can absorb the 32 lanes of the warp.
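
For context, the sketch below models why that lookup fails. It is an
approximation of the shape computation done by `getDistributedType` in
VectorDistribute.cpp, not a reproduction of it:

```
#include "mlir/IR/AffineMap.h"
#include "mlir/IR/BuiltinTypes.h"
#include "llvm/ADT/SmallVector.h"

using namespace mlir;

// Sketch (assumption: approximates the helper in VectorDistribute.cpp):
// divide the dimensions targeted by the distribution map by the warp size;
// if the lanes are never fully consumed, report failure with a null type.
static VectorType getDistributedType(VectorType originalType, AffineMap map,
                                     int64_t warpSize) {
  SmallVector<int64_t> shape = llvm::to_vector(originalType.getShape());
  for (unsigned i = 0, e = map.getNumResults(); i < e; ++i) {
    unsigned pos = map.getDimPosition(i);
    if (shape[pos] % warpSize != 0)
      return VectorType(); // dimension not evenly distributable
    shape[pos] /= warpSize;
    warpSize = 1; // all lanes consumed by this dimension
    break;
  }
  // For vector<1xf32> no dimension can absorb the 32 lanes, so warpSize
  // never reaches 1 and a null type comes back, even though broadcasting
  // the value to every lane is a perfectly legal "distribution".
  if (warpSize != 1)
    return VectorType();
  return VectorType::get(shape, originalType.getElementType());
}
```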

Therefore, if the distributed type can be recovered from the warpOp's
result types, we should always do that first before falling back to
`getDistributedType`.
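
A minimal sketch of that preference order, using simplified names: the
actual refactoring lives in the WarpOpScfForOp pattern in
VectorDistribute.cpp, and `pickDistributedType` here is a hypothetical
helper, assuming the warp op's `getTerminator()` accessor:

```
#include "mlir/Dialect/GPU/IR/GPUDialect.h"

using namespace mlir;

// Hypothetical helper: `escapingValue` is a vector value produced inside
// the scf.for region. If it is forwarded through gpu.yield, the matching
// warp op result already carries the intended distributed type (including
// the broadcast case above, where it stays vector<1xf32>). Only when the
// value does not escape through the warp op do we fall back to the
// map-based computation, which may return a null type.
static Type pickDistributedType(Value escapingValue,
                                gpu::WarpExecuteOnLane0Op warpOp,
                                AffineMap map, int64_t warpSize) {
  for (OpOperand &operand : warpOp.getTerminator()->getOpOperands())
    if (operand.get() == escapingValue)
      return warpOp->getResult(operand.getOperandNumber()).getType();
  // Fall back to the shape computation sketched above.
  return getDistributedType(cast<VectorType>(escapingValue.getType()), map,
                            warpSize);
}
```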
2025-07-14 15:41:56 -07:00