This enables support for atomic RMW ops (add, sub, min and max to be precise) with `bfloat16` operands, via the [SPV_INTEL_16bit_atomics extension](https://github.com/intel/llvm/pull/20009). It's logically a successor to #166031 (I should've used a stack), but I'm putting it up for early review. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>