This sprinkles a few mutexes around the plugin interface so that the olLaunchKernel CTS test now passes when run on multiple threads. Part of this also involved changing the interface for device synchronise so that it can optionally leave the underlying queue in place rather than freeing it; freeing the queue on every synchronise introduced a race condition in liboffload.
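To illustrate the shape of the change, here is a minimal sketch of a mutex-guarded device whose synchronise call takes a flag controlling whether the queue is freed. It is not the code from this patch: `Queue`, `Device`, `launch`, `synchronize`, the `ReleaseQueue` parameter, and `runDemo` are all hypothetical names, and kernel completion is simulated with a counter.

```cpp
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a plugin queue; not a liboffload type.
struct Queue {
  int Pending = 0; // work items enqueued but not yet completed
};

// Hypothetical device wrapper that shows the locking pattern only.
class Device {
  std::mutex Mutex;         // guards Q and Completed
  std::unique_ptr<Queue> Q; // created lazily, shared by all threads
  int Completed = 0;

public:
  // Enqueue one unit of work, creating the queue on first use.
  void launch() {
    std::lock_guard<std::mutex> Lock(Mutex);
    if (!Q)
      Q = std::make_unique<Queue>();
    ++Q->Pending;
  }

  // Simulate waiting for pending work by marking it complete. The queue is
  // freed only when ReleaseQueue is true, so other threads that are still
  // launching work keep a valid queue otherwise.
  void synchronize(bool ReleaseQueue) {
    std::lock_guard<std::mutex> Lock(Mutex);
    if (!Q)
      return;
    Completed += Q->Pending;
    Q->Pending = 0;
    if (ReleaseQueue)
      Q.reset();
  }

  int completed() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Completed;
  }

  bool hasQueue() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Q != nullptr;
  }
};

constexpr int NumThreads = 8;
constexpr int LaunchesPerThread = 1000;

// Launch and synchronise from several threads, then free the queue once.
int runDemo() {
  Device Dev;
  std::vector<std::thread> Threads;
  for (int T = 0; T < NumThreads; ++T)
    Threads.emplace_back([&Dev] {
      for (int I = 0; I < LaunchesPerThread; ++I) {
        Dev.launch();
        Dev.synchronize(/*ReleaseQueue=*/false);
      }
    });
  for (std::thread &T : Threads)
    T.join();
  Dev.synchronize(/*ReleaseQueue=*/true);
  return Dev.completed();
}
```

In this sketch every launch is counted exactly once, so `runDemo()` returns `NumThreads * LaunchesPerThread`. Only the final synchronise, issued after all worker threads have joined, frees the queue.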
Liboffload
This directory contains the implementation of the work-in-progress new API for Offload. It builds on top of the existing plugin implementations but provides a single level of abstraction suitable for implementation of many offloading language runtimes, rather than just OpenMP.
Testing liboffload
The main test suite for liboffload can be run with the check-offload-unit
target, which runs the offload.unittests executable. The test suite will
automatically run on every available device, but can be restricted to a single
platform (CUDA, AMDGPU) with a command line argument:
$ ./offload.unittests --platform=CUDA
Tracing of Offload API calls can be enabled by setting the OFFLOAD_TRACE
environment variable. This works with any program that uses liboffload.
$ OFFLOAD_TRACE=1 ./offload.unittests
---> olInit()-> OL_SUCCESS
# etc
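As a rough illustration of how environment-variable-gated tracing of this kind can work, the sketch below wraps a call and prints a line in the format shown above. It is not liboffload's implementation: `traceEnabled`, `formatTrace`, and `tracedCall` are hypothetical helpers, and treating only the value `1` as "enabled" is an assumption.

```cpp
#include <cstdlib>
#include <functional>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical: tracing is on only when the variable is set to exactly "1".
bool traceEnabled(const char *Value) {
  return Value != nullptr && std::string(Value) == "1";
}

// Format one trace line in the style of the output shown above.
std::string formatTrace(const std::string &Call, const std::string &Result) {
  return "---> " + Call + "()-> " + Result;
}

// Run Fn and, if Enabled, write a trace line for it to Out.
// A return value of 0 is treated as success (an assumption of this sketch).
int tracedCall(const std::string &Name, bool Enabled, std::ostream &Out,
               const std::function<int()> &Fn) {
  int Rc = Fn();
  if (Enabled)
    Out << formatTrace(Name, Rc == 0 ? "OL_SUCCESS" : "error") << "\n";
  return Rc;
}
```

A caller could pass `traceEnabled(std::getenv("OFFLOAD_TRACE"))` as the `Enabled` argument and `std::cerr` as the stream, so that the check happens at the call site and the wrapped function itself stays unaware of tracing.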
The host plugin is not currently supported.
Modifying liboffload
The main header (OffloadAPI.h) and some implementation details are
autogenerated with TableGen. See the API definition README
for implementation details.