Ross Brunton 2c11a83691
[Offload] Add olCalculateOptimalOccupancy (#142950)
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is
currently
only implemented on Cuda; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare <callum@codeplay.com>
2025-08-19 15:16:47 +01:00
..

Liboffload

This directory contains the implementation of the work-in-progress new API for Offload. It builds on top of the existing plugin implementations but provides a single level of abstraction suitable for implementation of many offloading language runtimes, rather than just OpenMP.

Testing liboffload

The main test suite for liboffload can be run with the check-offload-unit target, which runs the offload.unittests executable. The test suite will automatically run on every available device, but can be restricted to a single platform (CUDA, AMDGPU) with a command line argument:

$ ./offload.unittests --platform=CUDA

Tracing of Offload API calls can be enabled by setting the OFFLOAD_TRACE environment variable. This works with any program that uses liboffload.

$ OFFLOAD_TRACE=1 ./offload.unittests
---> olInit()-> OL_SUCCESS
# etc

The host plugin is not currently supported.

Modifying liboffload

The main header (OffloadAPI.h) and some implementation details are autogenerated with tablegen. See the API definition README for implementation details.