Summary: This PR adds a pointer argument to the initialization routine for passing global options. Currently it lets the user constrain which backends they wish to use. If a null argument is passed, the previous behavior is preserved. The struct is expected to be extensible because the user must encode its size, so old executables record which fields they have access to. We use a macro helper to obtain this struct rather than a runtime call so that the struct size is baked into the executable at compile time rather than looked up from the runtime; a runtime call would just return the size that the (potentially newer) runtime sees.
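The size-encoding scheme can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's code: InitArgs, INIT_ARGS_INIT, NumBackends, Flags, and runtimeReadFlags are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical options struct; the names are illustrative, not the PR's API.
// The first field records how large the struct was when the caller was built.
struct InitArgs {
  std::size_t Size;          // sizeof(InitArgs) as seen by the caller
  std::uint32_t NumBackends; // imagined as present since the first version
  std::uint32_t Flags;       // imagined as added in a later version
};

// Expanded in the caller's translation unit, so sizeof is fixed when the
// caller is compiled instead of being queried from the runtime.
#define INIT_ARGS_INIT {sizeof(InitArgs), 0, 0}

// Runtime-side accessor: read a field only if the caller's struct covers it.
inline std::uint32_t runtimeReadFlags(const InitArgs *Args) {
  if (!Args)
    return 0; // null pointer: keep the previous default behavior
  assert(Args->Size >= sizeof(std::size_t) && "caller must set Size");
  if (Args->Size < offsetof(InitArgs, Flags) + sizeof(Args->Flags))
    return 0; // older caller: the field does not exist in its layout
  return Args->Flags;
}
```

A caller would write `InitArgs Args = INIT_ARGS_INIT;` and then set the fields it knows about. Because the macro expands in the caller, an executable built against an older header keeps reporting the older, smaller size even when it runs against a newer runtime.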
GPU Math Conformance Tests
Overview
This test suite provides a framework to systematically measure the accuracy of math functions on GPUs and verify their conformance with standards like OpenCL.
While the primary focus is validating the implementations in the C standard math library (LLVM-libm), these tests can also be executed against other math library providers, such as CUDA Math and HIP Math, for comparison.
The goals of this project are to empower LLVM-libm contributors with a robust tool for validating their implementations and to build trust with end-users by providing transparent accuracy data.
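Accuracy in this setting is usually expressed in units in the last place (ULP). The sketch below shows one common approximation: counting the representable single-precision values between a device result and a host reference. The helper names orderedBits and ulpDistance are hypothetical and are not taken from the test suite, which may compute the error differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Map a float's bit pattern onto an ordered integer line so that adjacent
// representable floats differ by exactly 1.
inline std::int64_t orderedBits(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits)); // well-defined type punning
  // Negative floats have the sign bit set; mirror them below zero.
  if (Bits & 0x80000000u)
    return -static_cast<std::int64_t>(Bits & 0x7FFFFFFFu);
  return static_cast<std::int64_t>(Bits);
}

// Number of representable steps between Actual and Expected.
// Assumes neither value is NaN (hypothetical helper, not the suite's API).
inline std::uint64_t ulpDistance(float Actual, float Expected) {
  assert(!std::isnan(Actual) && !std::isnan(Expected));
  std::int64_t Diff = orderedBits(Actual) - orderedBits(Expected);
  return static_cast<std::uint64_t>(Diff < 0 ? -Diff : Diff);
}
```

A check built on this helper would compare `ulpDistance(gpuResult, hostResult)` against the tolerance chosen for the function under test. Note that +0 and -0 map to the same point, so they are treated as identical.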
Getting Started
This guide covers how to build the necessary dependencies, which include the new Offload API and the C standard library for both host and GPU targets.
System Requirements
Before you begin, ensure your system meets the following requirements:
- A system with an AMD or NVIDIA GPU.
- The latest proprietary GPU drivers installed.
- The corresponding development SDK for your hardware:
  - AMD: ROCm SDK
  - NVIDIA: CUDA Toolkit
Building the Dependencies
The official documentation for building LLVM-libc for GPUs provides a detailed guide and should be considered the primary reference. Please follow the instructions in the "Standard runtimes build" section of that guide:
Important
For the conformance tests, the standard cmake command from the official documentation must be adapted slightly. You must also add libc to the main -DLLVM_ENABLE_RUNTIMES list. This is a crucial step because the tests need a host-side build of libc to use as the reference oracle for validating GPU results.
Running the Tests
Default Test
To build and run the conformance test for a given function (e.g., logf) against the default provider, the C standard math library (llvm-libm), use the following command. This will execute the test on all available and supported platforms.
ninja -C build/runtimes/runtimes-bins offload.conformance.logf
Testing Other Providers
Once the test binary has been built, you can run it against other math library providers using the --test-configs flag.
- For cuda-math on an NVIDIA GPU:
  ./build/runtimes/runtimes-bins/offload/logf.conformance --test-configs=cuda-math:cuda
- For hip-math on an AMD GPU:
  ./build/runtimes/runtimes-bins/offload/logf.conformance --test-configs=hip-math:amdgpu
You can also run all available configurations for a test with:
./build/runtimes/runtimes-bins/offload/logf.conformance --test-configs=all
Adding New Tests
To add a conformance test for a new math function, follow these steps:
- Implement the Device Kernels: Create a kernel wrapper for the new function in each provider's source file. For CUDA Math and HIP Math, you must also add a forward declaration for the vendor function in /device_code/DeviceAPIs.hpp.
- Implement the Host Test: Create a new .cpp file in /tests. This file defines the FunctionConfig (function and kernel names, as well as ULP tolerance) and the input generation strategy.
  - Use exhaustive testing (ExhaustiveGenerator) for functions with small input spaces (e.g., half-precision functions and single-precision univariate functions). This strategy iterates over every representable point in the input space, ensuring complete coverage.
  - Use randomized testing (RandomGenerator) for functions with large input spaces (e.g., single-precision bivariate and double-precision functions), where exhaustive testing is computationally infeasible. Although not exhaustive, this strategy is deterministic, using a fixed seed to sample a large, reproducible subset of points from the input space.
- Add the Build Target: Add a new add_conformance_test(...) entry to /tests/CMakeLists.txt to make the test buildable.
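The two input generation strategies described above can be illustrated with a small standalone sketch. It does not use the suite's ExhaustiveGenerator or RandomGenerator classes, whose interfaces are not shown here; floatFromBits, forEachFloat, and sampleBitPatterns64 are hypothetical helpers that only demonstrate the underlying idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>
#include <vector>

// Reinterpret a 32-bit pattern as a float (well-defined via memcpy).
inline float floatFromBits(std::uint32_t Bits) {
  float X;
  std::memcpy(&X, &Bits, sizeof(X));
  return X;
}

// Exhaustive strategy: visit every bit pattern in [First, Last] exactly once.
// Passing 0 and 0xFFFFFFFF covers all single-precision inputs.
template <typename Fn>
std::uint64_t forEachFloat(std::uint32_t First, std::uint32_t Last, Fn Visit) {
  assert(First <= Last);
  std::uint64_t Count = 0;
  // A 64-bit counter avoids wrap-around when Last is 0xFFFFFFFF.
  for (std::uint64_t B = First; B <= Last; ++B, ++Count)
    Visit(floatFromBits(static_cast<std::uint32_t>(B)));
  return Count;
}

// Randomized strategy: a fixed seed makes the sampled inputs reproducible.
// Each returned value is a 64-bit pattern that a caller could reinterpret
// as a double-precision input.
inline std::vector<std::uint64_t> sampleBitPatterns64(std::uint64_t Seed,
                                                      std::size_t N) {
  std::mt19937_64 Engine(Seed); // output sequence is fixed by the C++ standard
  std::vector<std::uint64_t> Samples;
  Samples.reserve(N);
  for (std::size_t I = 0; I < N; ++I)
    Samples.push_back(Engine());
  return Samples;
}
```

Exhaustive iteration over 2^32 single-precision patterns is practical, while the 2^64 patterns of a double-precision or bivariate single-precision input space are not, which is why the seeded sampler is the fallback for those cases. Drawing raw engine output rather than using a standard distribution keeps the sample identical across standard library implementations.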