OverMighty ea045b99da [AArch64] Add patterns for scalar FMUL, FMULX
Scalar FMUL and FMULX instructions perform at least as well as the indexed
(by-element) FMUL and FMULX variants.

For example, the Arm Cortex-A55 Software Optimization Guide lists the following
instructions with a throughput of 2 IPC:
 - "FP multiply" FMUL
 - "ASIMD FP multiply" FMULX

whereas it lists the following with a throughput of 1 IPC:
 - "ASIMD FP multiply, by element" FMUL, FMULX

The Arm Cortex-A510 Software Optimization Guide, however, does not separately
list "by element" variants of the "ASIMD FP multiply" instructions, which are
listed with the same throughput as the non-ASIMD ones.
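
As a rough illustration of the Cortex-A55 figures above, the sketch below
computes a lower bound on issue cycles for a batch of independent multiplies
at each listed throughput. The helper name and the instruction count are
hypothetical, and the model ignores latency, dependencies and other pipeline
effects, so it shows only the throughput ratio, not measured performance.

```python
import math


def min_issue_cycles(num_instructions: int, ipc: float) -> int:
    """Lower bound on cycles needed to issue independent instructions.

    Hypothetical helper: assumes throughput is the only limit.
    """
    return math.ceil(num_instructions / ipc)


n = 1000  # hypothetical number of independent multiplies

# Cortex-A55: "FP multiply" FMUL and "ASIMD FP multiply" FMULX, 2 IPC.
scalar_cycles = min_issue_cycles(n, 2)

# Cortex-A55: "ASIMD FP multiply, by element" FMUL, FMULX, 1 IPC.
indexed_cycles = min_issue_cycles(n, 1)

print(scalar_cycles, indexed_cycles)
```

Under these assumptions the by-element forms need about twice as many issue
cycles on Cortex-A55, while on Cortex-A510 the two forms would come out equal.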

Fixes #60817.

Differential Revision: https://reviews.llvm.org/D153207
2023-06-30 08:34:20 +01:00