If we can not prove that f16 operands of a buildvector are canonicalized, then we can not lower into a V_PACK. In this scenario, we would previously lower into some combination of and(sdwa), shr, or. This patch allows for matching into V_PERM instead.
Change-Id: Ifa4a74fdb81ef44f22ba490c7fdf81ec8aebc945
BUILD_VECTOR of i16 and undef gets expanded to the COPY_TO_REGCLASS.
The latter is further lowererd to the copy instructions.
We need to provide the correct register class for the uniform and divergent BUILD_VECTOR nodes
to avoid VGPR to SGPR copies.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D122068
This change enables divergence-driven instruction selection for the build_vector DAG nodes.
It also enables packed i16 instructions for GFX9.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D116187