This adds a `only-if-required-by-ops` flag to the `enable-arm-streaming`
pass. This flag defaults to `false` (which preserves the original
behaviour), however, if set to `true` the pass will only add the
selected ZA/streaming mode to functions that contain ops that implement
`ArmSMETileOpInterface`.
This simplifies enabling these modes, as we can now first try lowering
ops to ArmSME, then only if we succeed, add the relevant function
attributes.
Previously, we were inserting za.enable/disable intrinsics for functions
with the "arm_za" attribute (at the MLIR level), rather than using the
backend attributes. This was done to avoid a dependency on the SME ABI
functions from compiler-rt (which have only recently been implemented).
Doing things this way did have correctness issues, for example, calling
a streaming-mode function from another streaming-mode function (both
with ZA enabled) would lead to ZA being disabled after returning to the
caller (where it should still be enabled). Fixing issues like this would
require re-doing the ABI work already done in the backend within MLIR.
Instead, this patch switches to use the "arm_new_za" (backend) attribute
for enabling ZA for an MLIR function. For the integration tests, this
requires some way of linking the SME ABI functions. This is done via the
`%arm_sme_abi_shlib` lit substitution. By default, this expands to a
stub implementation of the SME ABI functions, but this can be overridden
by providing the `ARM_SME_ABI_ROUTINES_SHLIB` CMake cache variable
(pointing it at an alternative implementation). For now, the ArmSME
integration tests pass with just stubs, as we don't make use of nested
ZA-enabled calls.
A future patch may add an option to compiler-rt to build the SME
builtins into a standalone shared library to allow easily
building/testing with the actual implementation.
This attribute makes the `enable_arm_streaming` pass ignore a function
(i.e. not add the enable streaming/za attributes). The main use case for
this is to prevent helper functions within tests being made streaming
functions.
This patch extends the 'enable-arm-streaming' pass with a new option to
enable the ZA storage array by adding the 'arm_za' attribute to
'func.func' ops.
A later patch will insert `llvm.aarch64.sme.za.enable` at the beginning
of 'func.func' ops and `llvm.aarch64.sme.za.disable` before
`func.return` statements when lowering to LLVM dialect.
Currently the pass only supports enabling ZA with streaming-mode on but
the SME LDR, STR and ZERO instructions can access ZA when not in
streaming-mode (section B1.1.1, IDGNQM [1]), so it may be worth making
these options independent in the future.
N.B. This patch is generally useful in the context of SME enablement in
MLIR, but it will help enable writing an integration test for rewrite
pattern that lowers `vector.transfer_write` -> `zero {za}` (D152508).
[1] https://developer.arm.com/documentation/ddi0616/aa
Reviewed By: awarzynski, dcaballe
Differential Revision: https://reviews.llvm.org/D152695