Momchil Velikov a0cc1ab978
[AArch64] Add intrinsics for multi-vector to ZA array vector accumulators (#91606)
[Recommit of e88ba6d975d887ca001cae30bfa0c53d91165148]

According to the specification in
https://github.com/ARM-software/acle/pull/309 this adds the intrinsics

void_svadd_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svadd_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x2_f16(uint32_t slice, svfloat16x2_t zn)
__arm_streaming __arm_inout("za");
void_svsub_za16_vg1x4_f16(uint32_t slice, svfloat16x4_t zn)
__arm_streaming __arm_inout("za");

as well as the corresponding `bf16` variants.
2024-05-17 15:02:53 +01:00
..
2023-08-08 21:59:53 +01:00

++ SVE CodeGen Warnings ++

When the WARN check lines fail in the SVE codegen tests it most likely means you
have introduced a warning due to:
1. Adding an invalid call to VectorType::getNumElements() or EVT::getVectorNumElements()
   when the type is a scalable vector.
2. Relying upon an implicit cast conversion from TypeSize to uint64_t.

For generic code, please modify your code to work with ElementCount and TypeSize directly.
For target-specific code that only deals with fixed-width vectors, use the fixed-size interfaces.
Please refer to the code where those functions live for more details.