llvm-project

Author	SHA1	Message	Date
Kevin Sala Penades	ac71b185c2	[offload] Remove LIBOMPTARGET_SHARED_MEMORY_SIZE envar (#186231 ) This commit removes the `LIBOMPTARGET_SHARED_MEMORY_SIZE` envar and outputs a runtime warning if it is defined. Access to dynamic shared memory should be obtained through the `dyn_groupprivate` clause (OpenMP 6.1) or the launch arguments in liboffload kernel launch.	2026-03-12 21:21:29 -07:00
Joseph Huber	4376fbd793	[OpenMP] Move OpenMP implicit argument to the end and reformat (#185989 ) Summary: We use this `dyn_ptr` argument in Clang/OpenMP to handle the `KernelLaunchEnvironment`. This is a per-kernel argument used to share some information. Currenetly, it's prepended to the argument list and we generate storage for it in the runtime. This is bad for a few reasons: 1. It changes the ABI by shifting user arguments 2. It cannot be trivially be left uninitialized if unused 3. The runtime must allocate its own memory for it This PR changes it to be appended instead. Additionally, space for this is always emitted. This means the OMPIRBuilder itself will provide the storage, we simply need to populate it in the runtime if it is used. This means that if it's unused we don't always pay the cost and it's easier for non-OpenMP users to ignore it. Backward compatibility is maintained by auto-upgrading the kernel arguments. In `libomptarget` we completely allocate a new buffer to store this in the new format. The plugins still need to respect the old ABI of the called device object, so we simply rotate it if it's the old version.	2026-03-12 18:08:22 -05:00
Kevin Sala Penades	1f583c6dee	[OpenMP][Offload] Add offload runtime support for dyn_groupprivate clause (#152831 ) Part 3 adding offload runtime support. See https://github.com/llvm/llvm-project/pull/152651. --------- Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>	2026-03-12 01:13:06 -07:00
Alex Duran	be021b8433	[OFFLOAD] Add interface to extend image validation (#185663 ) As discussed in #185404 we might want to provide a way for plugins to validate images not recognized by the common layer. This PR adds such extension and uses it to validate pure SPIRV images by the Level Zero plugin.	2026-03-10 18:41:23 +01:00
Jan Patrick Lehr	92447ed273	[Offload] Fix copy-elision warning (#182848 ) This fixes a warning about a prohibited copy-elision due to the move of a temporary object.	2026-02-23 13:58:07 +00:00
fineg74	1c6d774baa	[OFFLOAD] Extend olMemRegister API to handle cases when a memory block may have been mapped outside of liboffload. (#172226 ) This PR adds extends liboffload olMemRegister API to handle a case when a memory block may have been mapped before calling olMemRegister to support some use cases in libomptarget	2026-02-17 20:53:00 +00:00
Hansang Bae	85d64d1201	[Offload] Cast to `void ` in the debug message (#177019 ) There are a few places where data types based on character array or string are printed in the debug message while they do not represent strings. Such expressions should be casted to `void ` unless they represent actual strings. Change also includes casting from integral type to pointer type when appropriate.	2026-01-20 15:44:08 -06:00
fineg74	848d736e64	[OFFLOAD] Add asynchronous queue query API for libomptarget migration (#172231 ) Add liboffload asynchronous queue query API for libomptarget migration This PR adds liboffload asynchronous queue query API that needed to make libomptarget to use liboffload	2026-01-20 10:53:32 -08:00
Hansang Bae	edd857aad8	[Offload] Remove unnecessary `maybe_unused` attribute (#175855 ) The attribute is not necessary in the new debug messaging.	2026-01-15 14:31:58 -06:00
Hansang Bae	dae3b49cba	[Offload] Update debug message printig in the plugins (#175205 ) * Prepare a set of debug types in llvm::offload::debug to be used in plugin code * Update debug messages in the plugins	2026-01-12 14:26:43 -06:00
Hansang Bae	ecb94bcfe2	[Offload] Debug message update part 3 (#171684 ) Update debug messages based on the new method from #170425. Updated the following files. - plugins-nextgen/common/include/MemoryManager.h - plugins-nextgen/common/include/PluginInterface.h - plugins-nextgen/common/src/GlobalHandler.cpp - plugins-nextgen/common/src/PluginInterface.cpp - plugins-nextgen/host/dynamic_ffi/ffi.cpp	2025-12-17 09:05:16 -06:00
Joseph Huber	aaddd8d38a	[OpenMP] Fix tests relying on the heap size variable Summary: I made that an unimplemented error, but forgot that it was used for this environment variable.	2025-11-06 13:00:26 -06:00
Joseph Huber	670c453aeb	[Offload] Remove handling for device memory pool (#163629 ) Summary: This was a lot of code that was only used for upstream LLVM builds of AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so just use that. Simplifies code, can be added back if we start providing alternate forms but I don't think there's a single use-case that would justify it yet.	2025-11-06 10:15:18 -06:00
Robert Imschweiler	dc94f2cbad	[Offload] Add device UID (#164391 ) Introduced in OpenMP 6.0, the device UID shall be a unique identifier of a device on a given system. (Not necessarily a UUID.) Since it is not guaranteed that the (U)UIDs defined by the device vendor libraries, such as HSA, do not overlap with those of other vendors, the device UIDs in offload are always combined with the offload plugin name. In case the vendor library does not specify any device UID for a given device, we fall back to the offload-internal device ID. The device UID can be retrieved using the `llvm-offload-device-info` tool.	2025-11-04 20:15:47 +01:00
Nicole Aschenbrenner	16641ad8a2	[OpenMP] Adds omp_target_is_accessible routine (#138294 ) Adds omp_target_is_accessible routine. Refactors common code from omp_target_is_present to work for both routines. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>	2025-10-22 17:35:16 +02:00
Alex Duran	45757b9284	[OFFLOAD] Remove unused init_device_info plugin interface (#162650 ) This was used for the old interop code. It's dead code after #143491	2025-10-09 08:38:24 -05:00
Kevin Sala Penades	01d761a776	[Offload] Use Error for allocating/deallocating in plugins (#160811 ) Co-authored-by: Joseph Huber <huberjn@outlook.com>	2025-09-26 13:50:00 -05:00
Ross Brunton	e60a5733f0	[Offload] Print Image location rather than casting it (#160309 ) This squishes a warning where the runtime tries to bind a StringRef to a `%p`.	2025-09-24 10:57:55 +01:00
Joseph Huber	23efc67e19	[Offload] Remove non-blocking allocation type (#159851 ) Summary: This was originally added in as a hack to work around CUDA's limitation on allocation. The `libc` implementation now isn't even used for CUDA so this code is never hit. Even if this case, this code never truly worked. A true solution would be to use CUDA's virtual memory API instead to allocate 2MiB slabs independenctly from the normal memory management done in the stream.	2025-09-20 09:07:14 -05:00
Joseph Huber	51e3c3d51b	[Offload] Implement 'olIsValidBinary' in offload and clean up (#159658 ) Summary: This exposes the 'isDeviceCompatible' routine for checking if a binary can be loaded. This is useful if people don't want to consume errors everywhere when figuring out which image to put to what device. I don't know if this is a good name, I was thining like `olIsCompatible` or whatever. Let me know what you think. Long term I'd like to be able to do something similar to what OpenMP does where we can conditionally only initialize devices if we need them. That's going to be support needed if we want this to be more generic.	2025-09-19 12:15:57 -05:00
Nick Sarnie	f74583fbe8	[offload] Fix build with debug libomptarget (#159144 ) Currently get this error ``` offload/plugins-nextgen/common/src/PluginInterface.cpp:859:63: error: member reference type 'StringRef' is not a pointer; did you mean to use '.'? ``` We pass the full image binary now so we can't really print anything useful here. Seems introduced in https://github.com/llvm/llvm-project/pull/158748. --------- Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>	2025-09-16 18:40:02 +00:00
Joseph Huber	e7101dac9c	[Offload] Copy loaded images into managed storage (#158748 ) Summary: Currently we have this `__tgt_device_image` indirection which just takes a reference to some pointers. This was all find and good when the only usage of this was from a section of GPU code that came from an ELF constant section. However, we have expanded beyond that and now need to worry about managing lifetimes. We have code that references the image even after it was loaded internally. This patch changes the implementation to instaed copy the memory buffer and manage it locally. This PR reworks the JIT and other image handling to directly manage its own memory. We now don't need to duplicate this behavior externally at the Offload API level. Also we actually free these if the user unloads them. Upside, less likely to crash and burn. Downside, more latency when loading an image.	2025-09-16 08:57:28 -05:00
Joseph Huber	5d550bf41c	[OpenMP] Move `__omp_rtl_data_environment' handling to OpenMP (#157182 ) Summary: This operation is done every time we load a binary, this behavior should be moved into OpenMP since it concerns an OpenMP specific data struct. This is a little messy, because ideally we should only be using public APIs, but more can be extracted later.	2025-09-08 09:58:38 -05:00
Ross Brunton	9e5d8bd3d1	[Offload] Improve `olDestroyQueue` logic (#153041 ) Previously, `olDestroyQueue` would not actually destroy the queue, instead leaving it for the device to clean up when it was destroyed. Now, the queue is either released immediately if it is complete or put into a list of "pending" queues if it is not. Whenever we create a new queue, we check this list to see if any are now completed. If there are any we release their resources and use them instead of pulling from the pool. This prevents long running programs that create and drop many queues without syncing them from leaking memory all over the place.	2025-08-29 09:39:00 +01:00
Dominik Adamski	87db8e9130	[OpenMP][Offload] Add SPMD-No-Loop mode to OpenMP offload runtime (#154105 ) Kernels which are marked as SPMD-No-Loop should be launched with sufficient number of teams and threads to cover loop iteration space. No-Loop mode is described in RFC: https://discourse.llvm.org/t/rfc-no-loop-mode-for-openmp-gpu-kernels/87517/	2025-08-28 09:19:14 +02:00
Callum Fare	0b18d2da70	[Offload] Implement olMemFill (#154102 ) Implement olMemFill to support filling device memory with arbitrary length patterns. AMDGPU support will be added in a follow-up PR.	2025-08-22 14:31:16 +01:00
Ross Brunton	4c0c295775	[Offload] `OL_EVENT_INFO_IS_COMPLETE` (#153194 ) A simple info query for events that returns whether the event is complete or not.	2025-08-22 13:40:31 +01:00
Abhinav Gaba	79cf877627	[Offload] Introduce dataFence plugin interface. (#153793 ) The purpose of this fence is to ensure that any `dataSubmit`s inserted into a queue before a `dataFence` finish before finish before any `dataSubmit`s inserted after it begin. This is a no-op for most queues, since they are in-order, and by design any operations inserted into them occur in order. But the interface is supposed to be functional for out-of-order queues. The addition of the interface means that any operations that rely on such ordering (like ATTACH map-type support in #149036) can invoke it, without worrying about whether the underlying queue is in-order or out-of-order. Once a plugin supports out-of-order queues, the plugin can implement this function, without requiring any change at the libomptarget level. --------- Co-authored-by: Alex Duran <alejandro.duran@intel.com>	2025-08-15 11:49:35 -07:00
Ross Brunton	30c7951136	[Offload] `olLaunchHostFunction` (#152482 ) Add an `olLaunchHostFunction` method that allows enqueueing host work to the stream.	2025-08-15 09:39:48 +01:00
Ross Brunton	910d7e90bf	[Offload] Make olLaunchKernel test thread safe (#149497 ) This sprinkles a few mutexes around the plugin interface so that the olLaunchKernel CTS test now passes when ran on multiple threads. Part of this also involved changing the interface for device synchronise so that it can optionally not free the underlying queue (which introduced a race condition in liboffload).	2025-08-08 10:57:04 +01:00
Ross Brunton	a44532544b	[Offload] Don't create events for empty queues (#152304 ) Add a device function to check if a device queue is empty. If liboffload tries to create an event for an empty queue, we create an "empty" event that is already complete. This allows `olCreateEvent`, `olSyncEvent` and `olWaitEvent` to run quickly for empty queues.	2025-08-07 10:16:33 +01:00
hidekisaito	83e5a99ff6	[AMDGPU][Offload] Enable memory manager use for up to ~3GB allocation size in omp_target_alloc (#151882 ) Enables AMD data center class GPUs to use memory manager memory pooling up to 3GB allocation by default, up from the "1 << 13" threshold that all plugin-nextgen devices use.	2025-08-06 14:41:20 -07:00
Alex Duran	f092b820d1	[OFFLOAD] Fix typo in assert (#152316 ) Fixes an issue introduced by PR https://github.com/llvm/llvm-project/pull/143491.	2025-08-06 17:01:47 +02:00
Alex Duran	66d1c37eb6	[OFFLOAD][OPENMP] 6.0 compatible interop interface (#143491 ) The following patch introduces a new interop interface implementation with the following characteristics: * It supports the new 6.0 prefer_type specification * It supports both explicit objects (from interop constructs) and implicit objects (from variant calls). * Implements a per-thread reuse mechanism for implicit objects to reduce overheads. * It provides a plugin interface that allows selecting the supported interop types, and managing all the backend related interop operations (init, sync, ...). * It enables cooperation with the OpenMP runtime to allow progress on OpenMP synchronizations. * It cleanups some vendor/fr_id mismatchs from the current query routines. * It supports extension to define interop callbacks for library cleanup.	2025-08-06 16:34:39 +02:00
Ross Brunton	ae44418f28	[Offload] Erase entries from JIT cache when program is destroyed (#148847 ) When `unloadBinary` is called, any entries in the JITEngine's cache for that binary will be cleared. This fixes a nasty issue with liboffload program handles. If two handles happen to have had the same address (after one was free'd, for example), the cache would be hit and return the wrong program.	2025-07-25 16:11:30 +01:00
Ross Brunton	4f02965ae2	[Offload] Store kernel name in GenericKernelTy (#142799 ) GenericKernelTy has a pointer to the name that was used to create it. However, the name passed in as an argument may not outlive the kernel. Instead, GenericKernelTy now contains a std::string, and copies the name into there.	2025-07-02 14:11:05 +01:00
Ross Brunton	0870c8838b	[Offload] Add an `unloadBinary` interface to PluginInterface (#143873 ) This allows removal of a specific Image from a Device, rather than requiring all image data to outlive the device they were created for. This is required for `ol_program_handle_t`s, which now specify the lifetime of the buffer used to create the program.	2025-06-25 14:53:18 +01:00
Ross Brunton	4359e55838	[Offload] Properly report errors when jit compiling (#145498 ) Previously, if a binary failed to load due to failures when jit compiling, the function would return success with nullptr. Now it returns a new plugin error, `COMPILE_FAILURE`.	2025-06-24 16:27:12 +01:00
Ross Brunton	e6a3579653	[Offload] Replace device info queue with a tree (#144050 ) Previously, device info was returned as a queue with each element having a "Level" field indicating its nesting level. This replaces this queue with a more traditional tree-like structure. This should not result in a change to the output of `llvm-offload-device-info`.	2025-06-13 09:22:47 -05:00
Ethan Luis McDonough	67ff66e677	[PGO][Offload] Fix offload coverage mapping (#143490 ) This pull request fixes coverage mapping on GPU targets. - It adds an address space cast to the coverage mapping generation pass. - It reads the profiled function names from the ELF directly. Reading it from public globals was causing issues in cases where multiple device-code object files are linked together.	2025-06-10 20:19:38 -05:00
Ross Brunton	050892d2f8	[Offload] Use new error code handling mechanism and lower-case messages (#139275 ) [Offload] Use new error code handling mechanism This removes the old ErrorCode-less error method and requires every user to provide a concrete error code. All calls have been updated. In addition, for consistency with error messages elsewhere in LLVM, all messages have been made to start lower case.	2025-05-20 08:50:20 -05:00
Joseph Huber	92bba68634	[Offload] Fix handling of 'bare' mode when environment missing (#136794 ) Summary: We treated the missing kernel environment as a unique mode, but it was kind of this random bool that was doing the same thing and it explicitly expects the kernel environment to be zero. It broke after the previous change since it used to default to SPMD and didn't handle zero in any of the other cases despite being used. This fixes that and queries for it without needing to consume an error.	2025-04-23 08:16:39 -05:00
Nikita Popov	979c275097	[IR] Store Triple in Module (NFC) (#129868 ) The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibilty purses, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.	2025-03-06 10:27:47 +01:00
Ethan Luis McDonough	9e5c136d5a	[PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (#93365 ) This pull request is the second part of an ongoing effort to extends PGO instrumentation to GPU device code and depends on #76587. This PR makes the following changes: - Introduces `__llvm_write_custom_profile` to PGO compiler-rt library. This is an external function that can be used to write profiles with custom data to target-specific files. - Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so that it can write the collected data to a profraw file. - Adds `PGODump` debug flag and only displays dump when the aforementioned flag is set	2025-02-11 23:30:54 -06:00
Joseph Huber	5812d0bf8e	[Offload] Make only a single thread handle the RPC server thread (#126067 ) Summary: This patch just changes the interface to make starting the thread multiple times permissable since it will only be done the first time. Note that this does not refcount it or anything, so it's onto the user to make sure that they don't shut down the thread before everyone is done using it. That is the case today because the shutDown portion is run by a single thread in the destructor phase. Another question is if we should make this thread truly global state, because currently it will be private to each plugin instance, so if you have an AMD and NVIDIA image there will be two, similarly if you have those inside of a shared library.	2025-02-06 11:38:14 -06:00
Joseph Huber	7a8779422d	[Offload] Stop the RPC server faiilng with more than one GPU (#125982 ) Summary: Pretty dumb mistake of me, forgot that this is run per-device and per-plugin, which fell through the cracks with my testing because I have two GPUs that use different plugins.	2025-02-05 20:51:28 -06:00
Joseph Huber	a284a6ed17	[OpenMP] Guard OpenMP specific entry handling	2025-02-03 16:16:18 -06:00
Christian Clauss	1f56bb3137	[Offload][NFC] Fix typos discovered by codespell (#125119 ) https://github.com/codespell-project/codespell % `codespell --ignore-words-list=archtype,hsa,identty,inout,iself,nd,te,ths,vertexes --write-changes`	2025-01-31 09:35:29 -06:00
Joseph Huber	38b3f45a81	[Offload] Fix offload-info interface Summary: The offload info tool doesn't initialize things properly, just check this first instead.	2025-01-27 10:36:09 -06:00
Joseph Huber	f07505849c	[Offload] Fix server thread from being shut down if unused	2025-01-27 08:29:41 -06:00

1 2

74 Commits