If we're logically shifting an all-signbits value (every element is either all zeros or all ones), then we can replace the shift with an AND that masks out the shifted bits.
This helps remove some unnecessary bitcasted vXi16 shifts used for vXi8 shifts (which SimplifyDemandedBits struggles to remove through the bitcast), and allows some AVX1 shifts of 256-bit values to stay as YMM instructions.
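
As a rough illustration of why the fold is sound, the scalar sketch below models a single i8 lane and checks that a logical shift of an all-signbits value matches an AND with a constant mask. The helper names are hypothetical and exist only for this example; this is not the DAG combine itself, just the arithmetic identity it relies on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; these are not LLVM APIs.

// Logical right shift of one 8-bit lane.
constexpr uint8_t lshr8(uint8_t x, unsigned amt) {
  return static_cast<uint8_t>(x >> amt);
}

// Candidate replacement: AND with the mask of bits that survive the shift.
constexpr uint8_t lshr8_as_mask(uint8_t x, unsigned amt) {
  return static_cast<uint8_t>(x & (0xFFu >> amt));
}

// Logical left shift of one 8-bit lane (result truncated back to 8 bits).
constexpr uint8_t shl8(uint8_t x, unsigned amt) {
  return static_cast<uint8_t>(x << amt);
}

// Candidate replacement for the left shift.
constexpr uint8_t shl8_as_mask(uint8_t x, unsigned amt) {
  return static_cast<uint8_t>(x & (0xFFu << amt));
}

// Returns true if shift and mask agree for both all-signbits lane values
// (0x00 and 0xFF) and every in-range shift amount.
bool all_signbits_shift_is_mask() {
  const uint8_t lanes[] = {0x00, 0xFF};
  for (uint8_t x : lanes)
    for (unsigned amt = 0; amt < 8; ++amt)
      if (lshr8(x, amt) != lshr8_as_mask(x, amt) ||
          shl8(x, amt) != shl8_as_mask(x, amt))
        return false;
  return true;
}
```

The identity only holds because every bit of the lane equals the sign bit. For a lane such as 0x80 the shift and the mask disagree, which is why the fold needs the all-signbits precondition.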
Noticed in codegen from #82290.