9 Commits

Author SHA1 Message Date
David Green
02a1d311bd
[AArch64] Extend and rewrite load zero and load undef patterns (#108185)
The ldr instructions implicitly zero any upper lanes, so we can use them
for insert(zerovec, load, 0) patterns. Likewise insert(undef, load, 0)
or scalar_to_reg can reuse the scalar loads as the top bits are undef.

This patch makes sure there are patterns for each type and for each of
the normal, unaligned, roW and roX addressing modes.
2024-09-19 14:52:52 +01:00
David Green
300161761d [AArch64] Add tests for scalar_to_vector(load) and extend load into zero tests. NFC 2024-09-11 09:34:14 +01:00
Usman Nadeem
cc82f1290a
[AArch64] Update latencies for Cortex-A510 scheduling model (#87293)
Updated according to the Software Optimization Guide for Arm®
Cortex®‑A510 Core Revision: r1p3 Issue 6.0.
2024-04-17 11:42:52 -07:00
Harvin Iriawan
db158c7c83 [AArch64] Update generic sched model to A510
Refresh of the generic scheduling model to use A510 instead of A55.
The main benefits are for the little core and the introduction of SVE scheduling information.
The changes were tested on various OoO cores, and no performance degradation was seen.

Differential Revision: https://reviews.llvm.org/D156799
2023-08-21 12:25:15 +01:00
Fangrui Song
d39b4ce3ce [test] Replace aarch64-*-eabi with aarch64
Using "eabi" for aarch64 targets is a common mistake that the Clang driver warns about.
We want to avoid it elsewhere as well. Just use the common "aarch64" without
other triple components.
2023-06-27 20:02:52 -07:00
David Green
1c6ea96193 [AArch64] Fix load-insert-zero patterns with i8 and negative offsets.
These should have been using the LDURBi instructions where the offset is
negative, as reported from the reproducer in D144086.
2023-03-08 12:48:21 +00:00
David Green
a10ac6554d [AArch64] Extend load insert into zero patterns to SVE.
This extends the patterns for loading into the zeroth lane of a zero vector
from D144086 to SVE, which work in the same way as the existing patterns. Only
full length vectors are added here, not the narrower floating point vector
types.
2023-03-06 23:26:08 +00:00
David Green
83bbd3fdbd [AArch64] Load into zero vector patterns
An LDR will implicitly zero the rest of the vector, so vector_insert(zeros,
load, 0) can use a single load. This adds tablegen patterns for both scaled and
unscaled loads, detecting where we are inserting a load into the lower element
of a zero vector.

Differential Revision: https://reviews.llvm.org/D144086
2023-03-01 13:54:03 +00:00
David Green
afa557fad6 [AArch64] Add a test for loading into a zerovector. NFC 2023-02-21 14:42:53 +00:00
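
For orientation, the shape of input these load-into-zero commits describe can be sketched in LLVM IR. This is a minimal illustration and not a copy of the tests added above: the function name `load_insert_zero` is hypothetical, and the expected instruction in the comment is inferred from the commit messages rather than from checked output.

```llvm
; A scalar load inserted into lane 0 of an all-zero vector.
; Per the commit messages, an LDR to a vector register implicitly zeroes
; the upper lanes, so this would be expected to select to a single
; "ldr s0, [x0]" with no separate zeroing or lane-insert instruction.
define <4 x i32> @load_insert_zero(ptr %p) {
  %l = load i32, ptr %p
  %v = insertelement <4 x i32> zeroinitializer, i32 %l, i32 0
  ret <4 x i32> %v
}
```

Compiling a file like this with `llc -mtriple=aarch64` and inspecting the assembly is one way to see which load instruction is actually chosen.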