To run integration tests using qemu-aarch64 on x64 host, below flags are
added to the cmake command when building mlir/llvm:
-DMLIR_INCLUDE_INTEGRATION_TESTS=ON \
-DMLIR_RUN_ARM_SVE_TESTS=ON \
-DMLIR_RUN_ARM_SME_TESTS=ON \
-DARM_EMULATOR_EXECUTABLE="<...>/qemu-aarch64" \
-DARM_EMULATOR_OPTIONS="-L /usr/aarch64-linux-gnu" \
-DARM_EMULATOR_MLIR_CPU_RUNNER_EXECUTABLE="<llvm_arm64_build_top>/bin/mlir-cpu-runner-arm64"
\
-DARM_EMULATOR_LLI_EXECUTABLE="<llvm_arm64_build_top>/bin/lli" \
-DARM_EMULATOR_UTILS_LIB_DIR="<llvm_arm64_build_top>/lib"
The last three above are prebuilt on, or cross-built for, an aarch64
host.
This patch introduced substittutions of "%native_mlir_runner_utils" etc. and use
them in SVE/SME integration tests. When configured to run using qemu-aarch64,
mlir runtime util libs will be loaded from ARM_EMULATOR_UTILS_LIB_DIR, if set.
Some tests marked with 'UNSUPPORTED: target=aarch64{{.*}}' are still run
when configured with ARM_EMULATOR_EXECUTABLE and the default target is
not aarch64.
A lit config feature 'mlir_arm_emulator' is added in
mlir/test/lit.site.cfg.py.in and to UNSUPPORTED list of such tests.
Codegen "vectors" for pos/crd/val use the capacity as memref size, not
the actual used size. Although the sparsifier itself always uses just
the defined pos/crd/val parts, printing these and passing them back to a
runtime environment could benefit from wrapping the basic pos/crd/val
getters into a proper memref view that sets the right size.
This change lifts the restriction that purely allocated empty sparse
tensors cannot escape the method. Instead it makes a best effort to add
a finalizing operation before the escape.
This assumes that
(1) we never build sparse tensors across method boundaries
(e.g. allocate in one, insert in other method)
(2) if we have other uses of the empty allocation in the
same method, we assume that either that op will fail
or will do the finalization for us.
This is best-effort, but fixes some very obvious missing cases.