These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
This patch adds support for i64, f64 values in `gpu.shuffle`, rewriting 64bit shuffles into two 32bit shuffles.
The reason behind this change is that both CUDA & HIP support this kind of shuffling.
The implementation provided by this patch is based on the LLVM IR emitted by clang for 64bit shuffles when using `-O3`.
Reviewed By: makslevental
Differential Revision: https://reviews.llvm.org/D148974