This patch relands #67410 and fixes the cmpfail below:
#include <immintrin.h>
__attribute__((target("avx"))) void test(__m128 a, __m128 b) {
_mm_cmp_ps(a, b, 14);
}
According to Intel SDM, SSE/SSE2 instructions cmp[p|s][s|d] are
supported when imm8 is in range of [0, 7]