
With this PR, if we have customized implementation for scalar or vector length = 2, we don't need to write new macros, e.g. https://github.com/intel/llvm/blob/fb18321705f6/libclc/clc/include/clc/clcmacro.h#L15 Undef __HALF_ONLY, __FLOAT_ONLY and __DOUBLE_ONLY at the end of clc/include/clc/math/gentype.inc llvm-diff shows no change to nvptx64--nvidiacl.bc and amdgcn--amdhsa.bc