Previously, `olDestroyQueue` did not actually destroy the queue; it left the queue for the device to clean up when the device itself was destroyed. Now, the queue is released immediately if it is complete, or added to a list of "pending" queues if it is not. Whenever a new queue is created, this list is checked for queues that have since completed. If any are found, their resources are released and they are reused instead of pulling a queue from the pool. This prevents long-running programs that create and drop many queues without syncing them from leaking memory.
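The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `Queue`, `QueueManager`, `Complete`, and `releaseResources` are hypothetical names, completion is modelled as a plain flag rather than a device query, and only the first completed pending queue is reclaimed on each call to `create`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device queue; not the real liboffload type.
struct Queue {
  bool Complete = true; // true once all submitted work has finished
  int Resets = 0;       // counts how often resources were released
  void releaseResources() {
    ++Resets;
    Complete = true;
  }
};

// Hypothetical owner of the queue pool and the "pending" list.
class QueueManager {
  std::vector<std::unique_ptr<Queue>> Pool;    // idle queues, ready for reuse
  std::vector<std::unique_ptr<Queue>> Pending; // destroyed while still busy

public:
  // Prefer a pending queue that has since completed over the pool.
  std::unique_ptr<Queue> create() {
    for (auto It = Pending.begin(); It != Pending.end(); ++It) {
      if ((*It)->Complete) {
        std::unique_ptr<Queue> Q = std::move(*It);
        Pending.erase(It);
        Q->releaseResources();
        return Q;
      }
    }
    if (!Pool.empty()) {
      std::unique_ptr<Queue> Q = std::move(Pool.back());
      Pool.pop_back();
      return Q;
    }
    return std::make_unique<Queue>();
  }

  // Release a complete queue right away; park a busy one as pending.
  void destroy(std::unique_ptr<Queue> Q) {
    assert(Q && "destroy() needs a valid queue");
    if (Q->Complete) {
      Q->releaseResources();
      Pool.push_back(std::move(Q));
    } else {
      Pending.push_back(std::move(Q));
    }
  }

  std::size_t pendingCount() const { return Pending.size(); }
  std::size_t pooledCount() const { return Pool.size(); }
};
```

In this sketch the pending list only grows while destroyed queues remain busy, and it is drained lazily on the next `create` call, so no background thread or explicit sync is needed to reclaim them.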
Liboffload
This directory contains the implementation of the new, work-in-progress API for Offload. It builds on top of the existing plugin implementations but provides a single level of abstraction suitable for implementing many offloading language runtimes, rather than just OpenMP.
Testing liboffload
The main test suite for liboffload can be run with the `check-offload-unit`
target, which runs the `offload.unittests` executable. The test suite will
automatically run on every available device, but can be restricted to a single
platform (CUDA, AMDGPU) with a command line argument:
$ ./offload.unittests --platform=CUDA
Tracing of Offload API calls can be enabled by setting the OFFLOAD_TRACE
environment variable. This works with any program that uses liboffload.
$ OFFLOAD_TRACE=1 ./offload.unittests
---> olInit()-> OL_SUCCESS
# etc
The host plugin is not currently supported.
Modifying liboffload
The main header (OffloadAPI.h) and some implementation details are
autogenerated with tablegen. See the API definition README
for implementation details.