llvm-project

Author	SHA1	Message	Date
Ross Brunton	c8986d1ecb	[Offload] Guard olMemAlloc/Free with a mutex (#153786 ) Both these functions update an `AllocInfoMap` structure in the context, however they did not use any locks, causing random failures in threaded code. Now they use a mutex.	2025-08-20 13:23:57 +01:00
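The locking described in this commit can be sketched roughly as follows. This is a minimal illustration, not liboffload's code: `AllocTable`, `allocate`, `release`, `liveCount` and `hammer` are hypothetical names, and `std::malloc` stands in for the real device allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the context's allocation map (not the real type).
struct AllocTable {
  std::mutex Mutex;
  std::unordered_map<void *, std::size_t> Sizes;

  void *allocate(std::size_t Size) {
    void *Ptr = std::malloc(Size);
    if (!Ptr)
      return nullptr;
    // Every update to the shared map happens under the lock.
    std::lock_guard<std::mutex> Lock(Mutex);
    Sizes.emplace(Ptr, Size);
    return Ptr;
  }

  bool release(void *Ptr) {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      if (Sizes.erase(Ptr) == 0)
        return false; // Unknown pointer: nothing to free.
    }
    std::free(Ptr);
    return true;
  }

  std::size_t liveCount() {
    std::lock_guard<std::mutex> Lock(Mutex);
    return Sizes.size();
  }
};

// Allocates and frees from several threads at once to exercise the lock.
inline void hammer(AllocTable &Table, unsigned Threads, unsigned Iterations) {
  std::vector<std::thread> Workers;
  for (unsigned T = 0; T < Threads; ++T)
    Workers.emplace_back([&Table, Iterations] {
      for (unsigned I = 0; I < Iterations; ++I) {
        void *Ptr = Table.allocate(16);
        bool Freed = Ptr && Table.release(Ptr);
        assert(Freed && "every tracked allocation must be releasable");
        (void)Freed;
      }
    });
  for (std::thread &Worker : Workers)
    Worker.join();
}
```

Without the `lock_guard`s, concurrent `emplace` and `erase` calls on the same map would be a data race, which matches the kind of intermittent failure the commit message describes.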
Ross Brunton	2c11a83691	[Offload] Add olCalculateOptimalOccupancy (#142950 ) This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently only implemented on Cuda; AMDGPU and Host return unsupported. --------- Co-authored-by: Callum Fare <callum@codeplay.com>	2025-08-19 15:16:47 +01:00
Rafal Bielski	9c9d9e4cb6	[Offload] Define additional device info properties (#152533 ) Add the following properties in Offload device info: * VENDOR_ID * NUM_COMPUTE_UNITS * [SINGLE\|DOUBLE\|HALF]_FP_CONFIG * NATIVE_VECTOR_WIDTH_[CHAR\|SHORT\|INT\|LONG\|FLOAT\|DOUBLE\|HALF] * MAX_CLOCK_FREQUENCY * MEMORY_CLOCK_RATE * ADDRESS_BITS * MAX_MEM_ALLOC_SIZE * GLOBAL_MEM_SIZE Add a bitfield option to enumerators, allowing the values to be bit-shifted instead of incremented. Generate the per-type enums using `foreach` to reduce code duplication. Use macros in unit test definitions to reduce code duplication.	2025-08-19 13:02:01 +01:00
Ross Brunton	30c7951136	[Offload] `olLaunchHostFunction` (#152482 ) Add an `olLaunchHostFunction` method that allows enqueueing host work to the stream.	2025-08-15 09:39:48 +01:00
Ross Brunton	3e9f29cfee	[Offload] Store globals in the program's global list rather than the kernel list (#153441 )	2025-08-13 17:18:25 +01:00
Ross Brunton	910d7e90bf	[Offload] Make olLaunchKernel test thread safe (#149497 ) This sprinkles a few mutexes around the plugin interface so that the olLaunchKernel CTS test now passes when run on multiple threads. Part of this also involved changing the interface for device synchronise so that it can optionally not free the underlying queue (which introduced a race condition in liboffload).	2025-08-08 10:57:04 +01:00
Ross Brunton	197d1c1570	[Offload] OL_QUEUE_INFO_EMPTY (#152473 ) Add a queue query that (if possible) reports whether the queue is empty	2025-08-08 10:20:45 +01:00
Ross Brunton	a44532544b	[Offload] Don't create events for empty queues (#152304 ) Add a device function to check if a device queue is empty. If liboffload tries to create an event for an empty queue, we create an "empty" event that is already complete. This allows `olCreateEvent`, `olSyncEvent` and `olWaitEvent` to run quickly for empty queues.	2025-08-07 10:16:33 +01:00
Ross Brunton	d03692a00e	[Offload] Rework `MAX_WORK_GROUP_SIZE` (#151926 ) `MAX_WORK_GROUP_SIZE` now represents the maximum total number of work groups the device can allocate, rather than the maximum per dimension. `MAX_WORK_GROUP_SIZE_PER_DIMENSION` has been added, which has the old behaviour.	2025-08-04 15:21:24 +01:00
Ross Brunton	adb2421202	[Offload] Refactor device information queries to use new tagging (#147318 ) Instead of using strings to look up device information (which is brittle and slow), use the new tags that the plugins specify when building the nodes.	2025-07-25 14:51:51 +01:00
Ross Brunton	690c3ee5be	[Offload] Replace "EventOut" parameters with `olCreateEvent` (#150217 ) Rather than having every "enqueue"-type function have an output pointer specifically for an output event, just provide an `olCreateEvent` entrypoint which pushes an event to the queue. For example, replace: ```cpp olMemcpy(Queue, ..., EventOut); ``` with ```cpp olMemcpy(Queue, ...); olCreateEvent(Queue, EventOut); ```	2025-07-24 14:31:06 +01:00
Ross Brunton	081b74caf5	[Offload] Add olWaitEvents (#150036 ) This function causes a queue to wait until all the provided events have completed before running any future scheduled work.	2025-07-23 14:12:16 +01:00
Ross Brunton	2726b7fb1c	[Offload] Rename olWaitEvent/Queue to olSyncEvent/Queue (#150023 ) This more closely matches the nomenclature used by CUDA, AMDGPU and the plugin interface.	2025-07-23 10:52:13 +01:00
Ross Brunton	55b417a75f	[Offload] Cache symbols in program (#148209 ) When creating a new symbol, check whether it already exists. If it does, return that pointer rather than building a new symbol structure.	2025-07-16 18:32:47 +01:00
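The lookup-before-create behaviour can be sketched like this. The `Program` and `Symbol` types and the `getSymbol` member are hypothetical simplifications for illustration, not the real liboffload structures.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified symbol record.
struct Symbol {
  std::string Name;
};

// Hypothetical program object that owns its symbols and caches them by name.
struct Program {
  std::map<std::string, std::unique_ptr<Symbol>> Symbols;

  Symbol *getSymbol(const std::string &Name) {
    // Return the cached symbol if one was already built for this name.
    auto It = Symbols.find(Name);
    if (It != Symbols.end())
      return It->second.get();
    // Otherwise build it once and remember it.
    auto Inserted =
        Symbols.emplace(Name, std::make_unique<Symbol>(Symbol{Name}));
    assert(Inserted.second && "name was just checked to be absent");
    return Inserted.first->second.get();
  }
};
```

Repeated requests for the same name then return the same pointer, so callers can compare handles directly.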
Callum Fare	47c9609a86	[Offload] Check plugins aren't already deinitialized when tearing down (#148642 ) This is a hotfix for #148615 - it fixes the issue for me locally. I think a broader issue is that in the test environment we're calling olShutDown from a global destructor in the test binaries. We should do something more controlled, either calling olInit/olShutDown in every test, or move those to a GTest global environment. I didn't do that originally because it looked like it needed changes to LLVM's GTest wrapper.	2025-07-14 16:17:10 +01:00
Ross Brunton	2fdeeefacf	[Offload] Add global variable address/size queries (#147972 ) Add two new symbol info types for getting the bounds of a global variable, as well as a number of tests for reading from and writing to it.	2025-07-11 16:12:48 +01:00
Ross Brunton	84e15d08c2	[Offload] Add `olGetSymbolInfo[Size]` (#147962 ) This mirrors the similar functions for other handles. The only implemented info at the moment is the symbol's kind.	2025-07-11 15:29:53 +01:00
Ross Brunton	eee723f928	[Offload] Replace `GetKernel` with `GetSymbol` with global support (#148221 ) `olGetKernel` has been replaced by `olGetSymbol` which accepts a `Kind` parameter. As well as loading information about kernels, it can now also load information about global variables.	2025-07-11 14:48:10 +01:00
Ross Brunton	466357ab51	[Offload] Change `ol_kernel_handle_t` -> `ol_symbol_handle_t` (#147943 ) In the future, we want `ol_symbol_handle_t` to represent both kernels and global variables. The first step in this process is a rename and promotion to a "typed handle".	2025-07-10 14:54:10 +01:00
Callum Fare	7c6edf4a05	[Offload] Implement olGetQueueInfo, olGetEventInfo (#142947 ) Add info queries for queues and events. `olGetQueueInfo` only supports getting the associated device. We were already tracking this so we can implement this for free. We will likely add other queries to it in the future (whether the queue is empty, what flags it was created with, etc) `olGetEventInfo` only supports getting the associated queue. This is another thing we were already storing in the handle. We'll be able to add other queries in future (the event type, status, etc)	2025-07-09 17:09:31 +01:00
Ross Brunton	7d52b0983e	[Offload] Add `MAX_WORK_GROUP_SIZE` device info query (#143718 ) This adds a new device info query for the maximum workgroup/block size for each dimension.	2025-07-02 16:33:54 +01:00
Ross Brunton	67e73ba605	[Offload] Refactor device/platform info queries (#146345 ) This makes several small changes to how the platform and device info queries are handled: * ReturnHelper has been replaced with InfoWriter which is more explicit in how it is invoked. * InfoWriter consumes `llvm::Expected` rather than values directly, and will early exit if it returns an error. * As a result of the above, `GetInfoString` now correctly returns errors rather than empty strings. * The host device now has its own dedicated "getInfo" function rather than being checked in multiple places.	2025-06-30 15:00:43 +01:00
Ross Brunton	003145d0c8	[Offload] Implement `olShutDown` (#144055 ) `olShutDown` was not properly calling deinit on the platforms, resulting in random segfaults on AMD devices. As part of this, `olInit` and `olShutDown` now alloc and free the offload context rather than it being static. This allows `olShutDown` to be called within a destructor of a static object (like the tests do) without having to worry about destructor ordering.	2025-06-30 12:14:00 +01:00
Ross Brunton	39f19f2f1f	[Offload] Store device info tree in device handle (#145913 ) Rather than creating a new device info tree for each call to `olGetDeviceInfo`, we instead do it on device initialisation. As well as improving performance, this fixes a few lifetime issues with returned strings. This does unfortunately mean that device information is immutable, but hopefully that shouldn't be a problem for any queries we want to implement. This also meant allowing offload initialization to fail, which it can now do.	2025-06-27 15:10:43 +01:00
Ross Brunton	0870c8838b	[Offload] Add an `unloadBinary` interface to PluginInterface (#143873 ) This allows removal of a specific Image from a Device, rather than requiring all image data to outlive the device they were created for. This is required for `ol_program_handle_t`s, which now specify the lifetime of the buffer used to create the program.	2025-06-25 14:53:18 +01:00
Ross Brunton	4359e55838	[Offload] Properly report errors when jit compiling (#145498 ) Previously, if a binary failed to load due to failures when jit compiling, the function would return success with nullptr. Now it returns a new plugin error, `COMPILE_FAILURE`.	2025-06-24 16:27:12 +01:00
Ross Brunton	f242360e15	[Offload] Add type information to device info nodes (#144535 ) Rather than being "stringly typed", store values as a std::variant that can hold various types. This means that liboffload doesn't have to do any string parsing for integer/bool device info keys.	2025-06-20 09:05:05 -05:00
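As a rough illustration of moving from "stringly typed" values to a `std::variant`, consider the sketch below. `InfoNode` and `expectInteger` are hypothetical names, and the set of alternative types is an assumption rather than the plugin interface's actual definition.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical device info node: the value keeps its type, so no parsing is
// needed when a caller asks for an integer or a bool.
struct InfoNode {
  std::string Key;
  std::variant<std::monostate, uint64_t, bool, std::string> Value;
};

// Returns a pointer to the integer payload, or nullptr for any other type.
inline const uint64_t *getInteger(const InfoNode &Node) {
  return std::get_if<uint64_t>(&Node.Value);
}

// Variant for callers that already know the node must hold an integer.
inline uint64_t expectInteger(const InfoNode &Node) {
  const uint64_t *Value = getInteger(Node);
  assert(Value && "node does not hold an integer");
  return *Value;
}
```

A consumer can then branch on the stored type instead of converting a string and guessing whether the conversion was meaningful.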
Ross Brunton	e0633d59b9	[Offload] Check for initialization (#144370 ) All entry points (except olInit) now check that offload has been initialized. If not, a new `OL_ERRC_UNINITIALIZED` error is returned.	2025-06-20 09:04:50 -05:00
Ross Brunton	53336ad488	[Offload] Move (most) global state to an `OffloadContext` struct (#144494 ) Rather than having a number of static local variables, we now use a single `OffloadContext` struct to store global state. This is initialised by `olInit`, but is never deleted (de-initialization of Offload isn't yet implemented). The error reporting mechanism has not been moved to the struct, since that's going to cause issues with teardown (error messages must outlive liboffload).	2025-06-19 16:02:03 -05:00
Ross Brunton	e6a3579653	[Offload] Replace device info queue with a tree (#144050 ) Previously, device info was returned as a queue with each element having a "Level" field indicating its nesting level. This replaces this queue with a more traditional tree-like structure. This should not result in a change to the output of `llvm-offload-device-info`.	2025-06-13 09:22:47 -05:00
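To make the difference between the two representations concrete, the sketch below converts a flat list whose entries carry a nesting level into a tree. `FlatEntry`, `TreeNode`, `buildChildren` and `buildTree` are all hypothetical names; the actual plugin data structures are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat representation: each entry records its nesting level.
struct FlatEntry {
  std::string Key;
  unsigned Level;
};

// Hypothetical tree representation: nesting is expressed by ownership.
struct TreeNode {
  std::string Key;
  std::vector<TreeNode> Children;
};

// Consumes consecutive entries at `Level` (and their descendants) starting at
// `Pos`, attaching them as children of `Parent`.
inline void buildChildren(const std::vector<FlatEntry> &Flat, std::size_t &Pos,
                          unsigned Level, TreeNode &Parent) {
  while (Pos < Flat.size() && Flat[Pos].Level >= Level) {
    assert(Flat[Pos].Level == Level && "entry skips a nesting level");
    Parent.Children.push_back(TreeNode{Flat[Pos].Key, {}});
    ++Pos;
    // Deeper entries that follow belong to the node just added.
    buildChildren(Flat, Pos, Level + 1, Parent.Children.back());
  }
}

inline TreeNode buildTree(const std::vector<FlatEntry> &Flat) {
  TreeNode Root{"root", {}};
  std::size_t Pos = 0;
  buildChildren(Flat, Pos, 0, Root);
  return Root;
}
```

With the tree form, a consumer walks `Children` directly rather than tracking level changes while scanning a flat sequence.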
Ross Brunton	4f60321ca1	[Offload] Add `ol_dimensions_t` and convert ranges from size_t -> uint32_t (#143901 ) This is a three element x, y, z size_t vector that can be used any place where a 3D vector is required. This ensures that all vectors across liboffload are the same and don't require any resizing/reordering dances.	2025-06-12 09:59:59 -05:00
Callum Fare	835497a4dc	[Offload] Make olMemcpy src parameter const (#143161 )	2025-06-06 10:25:00 -05:00
Ross Brunton	7efb79b705	[Offload] Fix Error checking (#141939 ) All errors must be checked - this includes the local variable we were using to increase the lifetime of `Res`. As we were not explicitly checking it, it resulted in an `abort` in debug builds.	2025-05-29 08:17:08 -05:00
Joseph Huber	0ebe5557d9	[Offload] Add specifier for the host type (#141635 ) Summary: We use this special type to indicate a host value; this will be refined later, but for now it's used as a stand-in device for transfers and queues. It needs a special kind because it is not a device target like the other ones, so we need to differentiate it from a CPU or GPU type. Fixes: https://github.com/llvm/llvm-project/issues/141436	2025-05-28 08:51:14 -05:00
Joseph Huber	a9b64bb318	[Offload] Fix segfault when looking for host device name (#141632 ) Summary: This is done using the generic device info pointer, but no such thing exists for the host device, leading to a segfault. This patch fixes that for now, but in the future we should probably be more careful in general about handling the possibility that the handle is null everywhere. Fixes: https://github.com/llvm/llvm-project/issues/141434	2025-05-27 13:43:29 -05:00
Ross Brunton	7e9d708be0	[Offload] Use llvm::Error throughout liboffload internals (#140879 ) This removes the `ol_impl_result_t` helper class, replacing it with `llvm::Error`. In addition, some internal functions that returned `ol_errc_t` now return `llvm::Error` (with a fancy message).	2025-05-27 13:42:56 -05:00
Ross Brunton	050892d2f8	[Offload] Use new error code handling mechanism and lower-case messages (#139275 ) [Offload] Use new error code handling mechanism This removes the old ErrorCode-less error method and requires every user to provide a concrete error code. All calls have been updated. In addition, for consistency with error messages elsewhere in LLVM, all messages have been made to start lower case.	2025-05-20 08:50:20 -05:00
Ross Brunton	f6ac5276ee	[Offload] Ensure all `llvm::Error`s are handled (#137339 ) `llvm::Error`s containing errors must be explicitly handled or an assert will be raised. With this change, `ol_impl_result_t` can accept and consume an `llvm::Error` for errors raised by PluginInterface that have multiple causes and other places now call `llvm::consumeError`. Note that there is currently no facility for PluginInterface to communicate exact error codes, but the constructor is designed in such a way that it can be easily added later. This MR is to convert a crash into an error code. A new test was added, however due to the aforementioned issue with error codes, it does not pass and instead is marked as a skip.	2025-05-02 07:37:19 -05:00
Callum Fare	7bc16a0f63	[Offload] Adding missing Offload unit tests for event entry points (#137315 ) A couple of liboffload entry points were missed out from the tests, and unsurprisingly a crash in one of them made it in. Add the tests and fix the unchecked error in `olDestroyEvent`.	2025-04-30 09:06:00 -05:00
Callum Fare	6022a5214b	[Offload] Add check-offload-unit for liboffload unittests (#137312 ) Adds a `check-offload-unit` target for running the liboffload unit test suite. This unit test binary runs the tests for every available device. This can optionally be filtered to devices from a single platform, but the check target runs on everything. The target is not part of `check-offload` and does not get propagated to the top level build. I'm not sure if either of these things are desirable, but I'm happy to look into it if we want. Also remove the `offload/unittests/Plugins` test as it's dead code and doesn't build.	2025-04-29 11:21:59 -05:00
Callum Fare	800d949bb3	[Offload] Implement the remaining initial Offload API (#122106 ) Implement the complete initial version of the Offload API, to the extent that is usable for simple offloading programs. Tested with a basic SYCL program. As far as possible, these are simple wrappers over existing functionality in the plugins. * Allocating and freeing memory (host, device, shared). * Creating a program * Creating a queue (wrapper over asynchronous stream resource) * Enqueuing memcpy operations * Enqueuing kernel executions * Waiting on (optional) output events from the enqueue operations * Waiting on a queue to finish Objects created with the API have reference counting semantics to handle their lifetime. They are created with an initial reference count of 1, which can be incremented and decremented with retain and release functions. They are freed when their reference count reaches 0. Platform and device objects are not reference counted, as they are expected to persist as long as the library is in use, and it's not meaningful for users to create or destroy them. Tests have been added to `offload.unittests`, including device code for testing program and kernel related functionality. The API should still be considered unstable and it's very likely we will need to change the existing entry points.	2025-04-22 13:27:50 -05:00
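The reference counting semantics described above (initial count of 1, retain and release, freed at 0) can be sketched as below. `RefCounted`, `Queue`, `retain` and `release` are hypothetical illustrations and are not the generated Offload entry points.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical base for API objects: created with a reference count of 1.
struct RefCounted {
  std::atomic<uint32_t> RefCount{1};
};

// Hypothetical object type standing in for a queue, program, or event.
struct Queue : RefCounted {};

inline void retain(RefCounted *Obj) {
  Obj->RefCount.fetch_add(1, std::memory_order_relaxed);
}

// Drops one reference. Returns true if this call destroyed the object.
template <typename T> bool release(T *Obj) {
  uint32_t Previous = Obj->RefCount.fetch_sub(1, std::memory_order_acq_rel);
  assert(Previous > 0 && "object released more often than it was retained");
  if (Previous != 1)
    return false;
  delete Obj; // The last reference is gone, so the object is freed.
  return true;
}
```

Platform and device objects would sit outside this scheme, since the commit message says they are not reference counted.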
Callum Fare	fd3907ccb5	Reland #118503 : [Offload] Introduce offload-tblgen and initial new API implementation (#118614 ) Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON` (see last commit). Otherwise the changes are identical. --- ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh # From the runtime build directory $ ninja LibomptUnitTests $ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests ``` ### Open questions and future work * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. * It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.	2024-12-05 09:34:04 +01:00
Jan Patrick Lehr	51553227f0	Revert "Reland of #108413 : [Offload] Introduce offload-tblgen and initial new API implementation" (#118541 ) Reverts llvm/llvm-project#118503 Broke bot https://lab.llvm.org/staging/#/builders/131/builds/9701/steps/5/logs/stdio	2024-12-03 21:42:38 +01:00
Callum Fare	8da490320f	Reland of #108413 : [Offload] Introduce offload-tblgen and initial new API implementation (#118503 ) This is another attempt to reland the changes from #108413 The previous two attempts introduced regressions and were reverted. This PR has been more thoroughly tested with various configurations so shouldn't cause any problems this time. If anyone is aware of any likely remaining problems then please let me know. The changes are identical other than the fixes contained in the last 5 commits. --- ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh # From the runtime build directory $ ninja LibomptUnitTests $ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests ``` ### Open questions and future work * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. 
* It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.	2024-12-03 16:28:35 +00:00
Jan Patrick Lehr	5208bc3694	Revert "Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413 . #117704 )" (#117995 ) Reverts llvm/llvm-project#117894 Buildbot failures in OpenMP/Offload bots. https://lab.llvm.org/buildbot/#/builders/30/builds/11193	2024-11-28 12:58:17 +01:00
Callum Fare	992b00020f	Reland #2 - [Offload] Introduce offload-tblgen and initial new API implementation (#108413 . #117704 ) (#117894 ) Relands #117704, which relanded changes from #108413 - this was reverted due to build issues. The new offload library did not build with `LIBOMPTARGET_OMPT_SUPPORT` enabled, which was not picked up by pre-merge testing. The last commit contains the fix; everything else is otherwise identical to the approved PR. ___ ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh # From the runtime build directory $ ninja LibomptUnitTests $ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests ``` ### Open questions and future work * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. 
* It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.	2024-11-28 10:19:37 +00:00
Fraser Cormack	0cb5846a68	Revert "Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413 ) (#117704 )" This reverts commit c979ec05642f292737d250c6682d85ed49bc7b6e. This showed failures in the post-merge CI.	2024-11-27 10:49:01 +00:00
Callum Fare	c979ec0564	Reland - [Offload] Introduce offload-tblgen and initial new API implementation (#108413 ) (#117704 ) Relands changes from #108413 - this was reverted due to build issues. The problem was just that the `offload-tblgen` tool was behind recent changes to tablegen that ensure `const` records. This has been fixed and the PR is otherwise identical. ___ ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh # From the runtime build directory $ ninja LibomptUnitTests $ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests ``` ### Open questions and future work * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. 
* It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.	2024-11-27 10:39:07 +00:00
Joseph Huber	d047bee496	Revert "[Offload] Introduce offload-tblgen and initial new API implementation (#108413 )" This reverts commit 8a2311c4bf9993230e37dc20b57973dc917f2338.	2024-11-25 12:16:46 -06:00
Callum Fare	8a2311c4bf	[Offload] Introduce offload-tblgen and initial new API implementation (#108413 ) Introduce `offload-tblgen` and an initial implementation of a subset of the new API. The tablegen files are intended to be the single source of truth for the new API, with the header files, documentation, and others bits of source all automatically generated. TODO (based on review feedback so far): - [x] Check in the generated headers - [x] Add an `offload-generate` target to trigger the generation rather than building them every time - [x] Decide how error handling should work - [x] Finish up new error handling implementation - [x] Decide naming convention - [x] Add testing for the new API - [x] Add tablegen specific testing - [x] clang-tidy and use llvm:: types when possible - [x] Add optional code location arguments - [x] Avoid multiple returns from one function ### offload-tblgen See the included [README](`d80db06491/offload/new-api/API/README.md`) for more information on how the API definition and generation works. I'm happy to answer any questions about it and plan to walk through it in a future LLVM Offload call. It should be noted that struct definitions have not been fully implemented/tested as they aren't used by the initial API definitions, but finishing that off in the future shouldn't be too much work. The tablegen tooling has been designed to be easily extended with new backends, using the classes in `RecordTypes.hpp` to abstract over the tablegen records. ### New API Previous discussions at the LLVM/Offload meeting have brought up the need for a new API for exposing the functionality of the plugins. This change introduces a very small subset of a new API, which is primarily for testing the offload tooling and demonstrating how a new API can fit into the existing code base without being too disruptive. Exact designs for these entry points and future additions can be worked out over time. 
The new API does however introduce the bare minimum functionality to implement device discovery for Unified Runtime and SYCL. This means that the `urinfo` and `sycl-ls` tools can be used on top of Offload. A (rough) implementation of a Unified Runtime adapter (aka plugin) for Offload is available [here](https://github.com/callumfare/unified-runtime/tree/offload_adapter). Our intention is to maintain this and use it to implement and test Offload API changes with SYCL. ### Demoing the new API ```sh $ git clone -b offload_adapter https://github.com/callumfare/unified-runtime.git $ cd unified-runtime $ mkdir build $ cd build $ cmake .. -GNinja -DUR_BUILD_ADAPTER_OFFLOAD=ON \ -DUR_OFFLOAD_INSTALL_DIR=<offload build dir containing liboffload_new.so> \ -DUR_OFFLOAD_INCLUDE_DIR=<offload build dir containing 'offload' headers directory> $ ninja urinfo export LD_LIBRARY_PATH=<offload build dir containing offload plugin libraries> $ UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo [cuda:gpu][cuda:0] CUDA, NVIDIA GeForce GT 1030 [12030] # Demo with tracing $ OFFLOAD_TRACE=1 UR_ADAPTERS_FORCE_LOAD=$PWD/lib/libur_adapter_offload.so ./bin/urinfo ---> offloadPlatformGet(.NumEntries = 0, .phPlatforms = {}, .pNumPlatforms = 0x7ffd05e4d6e0 (2))-> OFFLOAD_RESULT_SUCCESS ---> offloadPlatformGet(.NumEntries = 2, .phPlatforms = {0x564bf4040220, 0x564bf4040240}, .pNumPlatforms = nullptr)-> OFFLOAD_RESULT_SUCCESS ... ``` ### Open questions and future work * The new API is implemented in a separate library (`liboffload_new.so`). It could just as easily be part of the existing `libomptarget` library - I have no strong feelings on which is better. * Only some of the available device info is exposed, and not all the possible device queries needed for SYCL are implemented by the plugins. A sensible next step would be to refactor and extend the existing device info queries in the plugins. 
The existing info queries are all strings, but the new API introduces the ability to return any arbitrary type. * It may be sensible at some point for the plugins to implement the new API directly, and the higher level code on top of it could be made generic, but this is more of a long-term possibility.	2024-11-25 11:34:14 -06:00

50 Commits