16 Commits

Author SHA1 Message Date
Ross Brunton
2c11a83691
[Offload] Add olCalculateOptimalOccupancy (#142950)
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is
currently
only implemented on Cuda; AMDGPU and Host return unsupported.

---------

Co-authored-by: Callum Fare <callum@codeplay.com>
2025-08-19 15:16:47 +01:00
Ross Brunton
910d7e90bf
[Offload] Make olLaunchKernel test thread safe (#149497)
This sprinkles a few mutexes around the plugin interface so that the
olLaunchKernel CTS test now passes when ran on multiple threads.

Part of this also involved changing the interface for device synchronise
so that it can optionally not free the underlying queue (which
introduced a race condition in liboffload).
2025-08-08 10:57:04 +01:00
Ross Brunton
690c3ee5be
[Offload] Replace "EventOut" parameters with olCreateEvent (#150217)
Rather than having every "enqueue"-type function have an output pointer
specifically for an output event, just provide an `olCreateEvent`
entrypoint which pushes an event to the queue.

For example, replace:
```cpp
olMemcpy(Queue, ..., EventOut);
```
with
```cpp
olMemcpy(Queue, ...);
olCreateEvent(Queue, EventOut);
```
2025-07-24 14:31:06 +01:00
Ross Brunton
2726b7fb1c
[Offload] Rename olWaitEvent/Queue to olSyncEvent/Queue (#150023)
This more closely matches the nomenclature used by CUDA, AMDGPU and
the plugin interface.
2025-07-23 10:52:13 +01:00
Ross Brunton
eee723f928
[Offload] Replace GetKernel with GetSymbol with global support (#148221)
`olGetKernel` has been replaced by `olGetSymbol` which accepts a
`Kind` parameter. As well as loading information about kernels, it
can now also load information about global variables.
2025-07-11 14:48:10 +01:00
Ross Brunton
466357ab51
[Offload] Change ol_kernel_handle_t -> ol_symbol_handle_t (#147943)
In the future, we want `ol_symbol_handle_t` to represent both kernels
and global variables The first step in this process is a rename and
promotion to a "typed handle".
2025-07-10 14:54:10 +01:00
Ross Brunton
bed9fe77dc
[Offload] Tests for global memory and constructors (#147537)
Adds two "launch kernel" tests for lib offload, one testing that
global memory works and persists between different kernels, and one
verifying that `[[gnu::constructor]]` works correctly.

Since we now have tests that contain multiple kernels in the same
binary, the test framework has been updated a bit.
2025-07-09 14:26:50 +01:00
Ross Brunton
8ae8d31832
[Offload] Add liboffload unit tests for shared/local memory (#147040) 2025-07-07 16:20:02 +01:00
Ross Brunton
613c38a992
[Offload] Fix type mismatch warning in test (#143700) 2025-06-23 10:14:12 +01:00
Ross Brunton
4f60321ca1
[Offload] Add ol_dimensions_t and convert ranges from size_t -> uint32_t (#143901)
This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.
2025-06-12 09:59:59 -05:00
Ross Brunton
269c29ae67
[Offload] Allow setting null arguments in olLaunchKernel (#141958) 2025-06-06 07:05:11 -05:00
Ross Brunton
41e22aa31b
[Offload] Set size correctly in olLaunchKernel cts test (#142398)
It was previously not scaled by `sizeof(uint32_t)`.
2025-06-02 09:27:09 -05:00
Ross Brunton
050892d2f8
[Offload] Use new error code handling mechanism and lower-case messages (#139275)
[Offload] Use new error code handling mechanism

This removes the old ErrorCode-less error method and requires
every user to provide a concrete error code. All calls have been
updated.

In addition, for consistency with error messages elsewhere in LLVM, all
messages have been made to start lower case.
2025-05-20 08:50:20 -05:00
Ross Brunton
f6ac5276ee
[Offload] Ensure all llvm::Errors are handled (#137339)
`llvm::Error`s containing errors must be explicitly handled or an assert
will be raised. With this change, `ol_impl_result_t` can accept and
consume an `llvm::Error` for errors raised by PluginInterface that have
multiple causes and other places now call `llvm::consumeError`.

Note that there is currently no facility for PluginInterface to
communicate exact error codes, but the constructor is designed in
such a way that it can be easily added later. This MR is to
convert a crash into an error code.

A new test was added, however due to the aforementioned issue with
error codes, it does not pass and instead is marked as a skip.
2025-05-02 07:37:19 -05:00
Callum Fare
6022a5214b
[Offload] Add check-offload-unit for liboffload unittests (#137312)
Adds a `check-offload-unit` target for running the liboffload unit test
suite. This unit test binary runs the tests for every available device.
This can optionally filtered to devices from a single platform, but the
check target runs on everything.

The target is not part of `check-offload` and does not get propagated to
the top level build. I'm not sure if either of these things are
desirable, but I'm happy to look into it if we want.

Also remove the `offload/unittests/Plugins` test as it's dead code and
doesn't build.
2025-04-29 11:21:59 -05:00
Callum Fare
800d949bb3
[Offload] Implement the remaining initial Offload API (#122106)
Implement the complete initial version of the Offload API, to the extent
that is usable for simple offloading programs. Tested with a basic SYCL
program.

As far as possible, these are simple wrappers over existing
functionality in the plugins.

* Allocating and freeing memory (host, device, shared).
* Creating a program 
* Creating a queue (wrapper over asynchronous stream resource)
* Enqueuing memcpy operations
* Enqueuing kernel executions
* Waiting on (optional) output events from the enqueue operations
* Waiting on a queue to finish

Objects created with the API have reference counting semantics to handle
their lifetime. They are created with an initial reference count of 1,
which can be incremented and decremented with retain and release
functions. They are freed when their reference count reaches 0. Platform
and device objects are not reference counted, as they are expected to
persist as long as the library is in use, and it's not meaningful for
users to create or destroy them.

Tests have been added to `offload.unittests`, including device code for
testing program and kernel related functionality.

The API should still be considered unstable and it's very likely we will
need to change the existing entry points.
2025-04-22 13:27:50 -05:00