Philip Ginsbach-Chen 1d821b0c6b
[AArch64] use isTRNMask to calculate shuffle costs (#171524)
This builds on #169858 to fix the divergence in codegen
(https://godbolt.org/z/a9az3h6oq) between two very similar
functions initially observed in #137447 (represented in the diff by test
cases `@transpose_splat_constants` and `@transpose_constants_splat`:
```
int8x16_t f(int8_t x)
{
  return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                       x, 4, x, 5, x, 6, x, 7 };
}

int8x16_t g(int8_t x)
{
  return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                       4, x, 5, x, 6, x, 7, x };
}
```

The PR uses an additional `isTRNMask` call in
`AArch64TTIImpl::getShuffleCost` to ensure that we treat shuffle masks
as transpose masks even if `isTransposeMask` fails to recognise them
(meaning that `Kind == TTI::SK_Transpose` cannot be relied upon).

Follow-up work could consider modifying `isTransposeMask`, but that
would also impact other backends than AArch64.
2025-12-15 05:34:30 +00:00
..
2025-08-11 05:53:55 -07:00