Summary:
This check is unnecessarily restrictive and currently incorrectly fires
for any size less than eight bytes. Just remove it, we do sanity checks
elsewhere and at some point need to trust the ABI.
This commit refactors the `add_offload_test_device_code` CMake function
to compile device code using the C++ compiler (`CMAKE_CXX_COMPILER`)
instead of the C compiler.
This change enables the use of C++ features, such as templates, within
device-side test kernels. This will allow for more advanced and reusable
kernel wrappers, reducing boilerplate code in the conformance test
suite.
As part of this change:
- All `.c` files for device code in `unittests/` have been renamed to
`.cpp`.
- Kernel definitions are now wrapped in `extern "C"` to ensure C linkage
and prevent name mangling.
This change affects the `OffloadAPI` and `Conformance` test suites.
cc @callumfare @RossBrunton @jhuber6
`olGetKernel` has been replaced by `olGetSymbol` which accepts a
`Kind` parameter. As well as loading information about kernels, it
can now also load information about global variables.
Adds two "launch kernel" tests for lib offload, one testing that
global memory works and persists between different kernels, and one
verifying that `[[gnu::constructor]]` works correctly.
Since we now have tests that contain multiple kernels in the same
binary, the test framework has been updated a bit.
Summary:
I'll probably want to use this as a more generic utility in the future.
This patch reworks it to make it a top level function. I also tried to
decouple this from the OpenMP utilities to make that easier in the
future. Instead, I just use `-march=native` functionality which is the
same thing. Needed a small hack to skip the linker stage for checking if
that works.
This should still create the same output as far as I'm aware.
Adds a `check-offload-unit` target for running the liboffload unit test
suite. This unit test binary runs the tests for every available device.
This can optionally filtered to devices from a single platform, but the
check target runs on everything.
The target is not part of `check-offload` and does not get propagated to
the top level build. I'm not sure if either of these things are
desirable, but I'm happy to look into it if we want.
Also remove the `offload/unittests/Plugins` test as it's dead code and
doesn't build.
Implement the complete initial version of the Offload API, to the extent
that is usable for simple offloading programs. Tested with a basic SYCL
program.
As far as possible, these are simple wrappers over existing
functionality in the plugins.
* Allocating and freeing memory (host, device, shared).
* Creating a program
* Creating a queue (wrapper over asynchronous stream resource)
* Enqueuing memcpy operations
* Enqueuing kernel executions
* Waiting on (optional) output events from the enqueue operations
* Waiting on a queue to finish
Objects created with the API have reference counting semantics to handle
their lifetime. They are created with an initial reference count of 1,
which can be incremented and decremented with retain and release
functions. They are freed when their reference count reaches 0. Platform
and device objects are not reference counted, as they are expected to
persist as long as the library is in use, and it's not meaningful for
users to create or destroy them.
Tests have been added to `offload.unittests`, including device code for
testing program and kernel related functionality.
The API should still be considered unstable and it's very likely we will
need to change the existing entry points.