Conversion from gpu.subgroup_mma_constant_matrix to spirv.MatrixTimesScalar didn't check that the op type was a multiplication and thus would incorrectly convert other elementwise scalar operations.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D140081
Along the way, make the default pattern fail instead of crashing
when an elementwise op is not supported yet.
Reviewed By: kuhar
Differential Revision: https://reviews.llvm.org/D139280
Add support for loading, computing, and storing `gpu.subgroup` WMMA ops
in transpose mode as well. Update the GPU to NVVM lowerings to support
`transpose` mode and update integration tests as well.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D139021
Enables transposed gpu.subgroup_mma_load_matrix and updates the lowerings in Vector to GPU and GPU to SPIRV. Needed to enable B transpose matmuls lowering to wmma ops.
Taken over from author: stanley-nod <stanley@nod-labs.com>
Reviewed By: ThomasRaoux, antiagainst
Differential Revision: https://reviews.llvm.org/D138770