9 Commits

Author SHA1 Message Date
David Green
96819daa3d
[AArch64] Handle v2i16 and v2i8 in concat load combine. (#86264)
This extends the concat load patch from
https://reviews.llvm.org/D121400, which was later moved to a combine, to
handle v2i8 and v2i16 concat loads too.
2024-03-25 17:10:23 +00:00
David Green
99d8c25b31 [AArch64] Extra tests for v2i8 concat loads. NFC 2024-03-22 09:55:18 +00:00
Harvin Iriawan
db158c7c83 [AArch64] Update generic sched model to A510
Refresh of the generic scheduling model to use A510 instead of A55.
Main benefits are to the little core, and introducing SVE scheduling information.
Changes tested on various OoO cores, no performance degradation is seen.

Differential Revision: https://reviews.llvm.org/D156799
2023-08-21 12:25:15 +01:00
Nikita Popov
5ddce70ef0 [AArch64] Convert some tests to opaque pointers (NFC) 2022-12-19 12:36:19 +01:00
David Green
1ba8f4f67d [AArch64] Move v4i8 concat load lowering to a combine.
The existing code was not updating the uses of loads that it recreated,
leading to incorrect chains which could break the ordering between
nodes. This moves the code to a combine instead, and makes sure we
update the chain references. This does mean it happens earlier -
potentially before the concats are simplified. This can lead to
inefficiencies in the codegen, which will be fixed in followups.
2022-04-14 15:19:33 +01:00
David Green
fe6057a293 [AArch64] Custom lower concat(v4i8 load, ...)
We already have custom lowering for v4i8 load, which loads as a f32,
converts to a vector and bitcasts and extends the result to a v4i16.
This adds some custom lowering of concat(v4i8 load, ...) to keep the
result as an f32 and create a buildvector of the resulting f32 loads.
This helps not create all the extends and bitcasts, which are often
difficult to fully clean up.

Differential Revision: https://reviews.llvm.org/D121400
2022-03-18 11:58:02 +00:00
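Note on fe6057a293: the idea of keeping each v4i8 load as a single 32-bit lane can be illustrated with a small model. This is a sketch only, not the LLVM lowering code. It assumes a little-endian byte layout, uses 32-bit integer lanes in place of the f32 lanes named in the commit message (only the bit pattern matters here), and the names `memory`, `load_v4i8`, `load_32` and `addrs` are hypothetical.

```python
import struct

# Hypothetical flat memory image used only for this illustration.
memory = bytes(range(64))

def load_v4i8(addr):
    """Model a v4i8 load: four consecutive bytes as four i8 lanes."""
    return list(memory[addr:addr + 4])

def load_32(addr):
    """Model the same load as one 32-bit lane (little-endian assumed)."""
    return struct.unpack_from("<I", memory, addr)[0]

# Four unrelated addresses, each providing one v4i8 operand of the concat.
addrs = [0, 16, 32, 48]

# concat(v4i8 load, v4i8 load, v4i8 load, v4i8 load) -> v16i8
concat_of_loads = [b for a in addrs for b in load_v4i8(a)]

# buildvector of four 32-bit loads, then a bitcast back to v16i8
lanes = [load_32(a) for a in addrs]
bitcast_of_buildvector = list(struct.pack("<4I", *lanes))

print(concat_of_loads == bitcast_of_buildvector)
```

Both forms yield the same 16 bytes, which is why the 32-bit-lane form can stand in for the concat without the intermediate extends and bitcasts the message mentions.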
David Green
0fa4aeb453 [AArch64] Add extra insert-subvector tests. NFC 2022-03-17 15:29:07 +00:00
David Green
e348b09bb5 [AArch64] Turn UZP1 with undef operand into truncate
This turns uzp1(x, undef) to concat(truncate(x), undef), as the truncate
is simpler and can often be optimized away, and it helps some of the
insert-subvector tests optimize more cleanly.

Differential Revision: https://reviews.llvm.org/D120879
2022-03-04 11:12:26 +00:00
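Note on e348b09bb5: the equivalence between UZP1 with an undef second operand and a truncate can be checked with a small model of the instruction. This is a sketch under stated assumptions, not the combine itself: it assumes little-endian lane order, models undef lanes as `None`, and the helper names `uzp1`, `truncate_lanes` and `bitcast_i32_to_i16` are hypothetical.

```python
import struct

def uzp1(vn, vm):
    """Model UZP1: even-indexed elements of the concatenation vn ++ vm."""
    return (vn + vm)[0::2]

def truncate_lanes(wide_lanes, narrow_bits):
    """Model a vector truncate: keep the low narrow_bits of every lane."""
    mask = (1 << narrow_bits) - 1
    return [lane & mask for lane in wide_lanes]

def bitcast_i32_to_i16(lanes32):
    """Reinterpret i32 lanes as twice as many i16 lanes (little-endian assumed)."""
    raw = struct.pack("<%dI" % len(lanes32), *lanes32)
    return list(struct.unpack("<%dH" % (2 * len(lanes32)), raw))

x32 = [0x11112222, 0x33334444, 0x55556666, 0x77778888]  # x viewed as v4i32
x16 = bitcast_i32_to_i16(x32)                            # x viewed as v8i16
undef = [None] * len(x16)                                # undef second operand

result = uzp1(x16, undef)
low_half, high_half = result[:4], result[4:]

print(low_half == truncate_lanes(x32, 16))
print(all(lane is None for lane in high_half))
```

The low half of the result matches the truncate of x and the high half comes entirely from the undef operand, which is the concat(truncate(x), undef) shape described in the message.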
David Green
04661a4d8e [AArch64] Additional insert-subvector codegen tests. NFC 2022-03-04 09:04:09 +00:00