This allows the instruction selector to realize that it can directly
broadcast the low byte of the memset value, rather than replicating
it to a 64-bit GPR before broadcasting.
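
As a rough illustration of why the two forms are interchangeable, the sketch below models both routes in plain C++. It is not code from the patch, which works on DAG nodes rather than on values like these. The helper names (splat_byte_to_u64, fill_via_u64, fill_direct, check_all_bytes) and the 16-byte vector width are assumptions made for the example only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: replicate a byte across a 64-bit integer, which is
// one common way to splat a memset value into a general-purpose register.
std::uint64_t splat_byte_to_u64(std::uint8_t b) {
    return static_cast<std::uint64_t>(b) * 0x0101010101010101ULL;
}

// Assumed vector width for the example: 16 byte lanes.
using Vec16 = std::array<std::uint8_t, 16>;

// Route 1: build the 64-bit splat first, then broadcast that 64-bit
// pattern into every 8-byte chunk of the vector.
Vec16 fill_via_u64(std::uint8_t b) {
    const std::uint64_t wide = splat_byte_to_u64(b);
    Vec16 v{};
    for (std::size_t i = 0; i < v.size(); ++i)
        v[i] = static_cast<std::uint8_t>(wide >> (8 * (i % 8)));
    return v;
}

// Route 2: broadcast the low byte directly into every lane.
Vec16 fill_direct(std::uint8_t b) {
    Vec16 v{};
    v.fill(b);
    return v;
}

// Both routes yield the same lanes for every possible byte value.
void check_all_bytes() {
    for (unsigned b = 0; b < 256; ++b)
        assert(fill_via_u64(static_cast<std::uint8_t>(b)) ==
               fill_direct(static_cast<std::uint8_t>(b)));
}
```

Since every lane ends up equal to the low byte either way, the intermediate 64-bit replication contributes nothing to the final vector, which is what makes the direct broadcast a valid replacement.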

This fixes PR50985.

Differential Revision: https://reviews.llvm.org/D108354