llvm-project

Author	SHA1	Message	Date
Joseph Huber	07896d44a3	[OpenMP] Emit aggregate kernel prototypes and remove libffi dependency (#186261 ) Summary: This PR changes the handling of the emitted kernels when targeting a CPU to be a pointer struct. The old handling emitted a standard function prototype, this necessitated a target specific ABI to call it because the signature differed with the number of arguments. Instead, this PR emits a void pointer to a naturally aligned struct, this is what APIs like `pthreads` assert. This allows us to remove all the complexity around launching host kernels and just pass the argument list.	2026-03-20 13:08:23 -05:00
Bruce Changlong Xu	cbab7e65a7	[AMDGPU] Minor cleanups in offload plugin and AMDGPUEmitPrintf. NFC. (#187587 ) Use empty() in assert, brace-init instead of std::make_pair in the AMDGPU offload plugin, and fix a comment typo in AMDGPUEmitPrintf.	2026-03-19 18:16:47 -04:00
fineg74	2890f9883c	[OFFLOAD] Improve handling of synchronization errors in L0 plugin and reenable tests (#186927 ) This change improves handling of errors during synchronization in Level Zero plugin by ensuring cleanup of queues and events in case of a synchronization error. As a result multiple tests stopped hanging. --------- Co-authored-by: Duran, Alex <alejandro.duran@intel.com>	2026-03-18 05:50:06 +01:00
Joseph Huber	154a128c65	Reapply "[OpenMP] Move OpenMP implicit argument to the end and reformat" (#186309 ) Should be working downstream now This reverts commit 9b61ff210fdff752d5db55b128474e9990258488.	2026-03-13 15:48:37 -05:00
Piotr Balcer	1b9a4a0f72	[Offload][L0] clear completed events from a wait list (#186379 ) Queue's WaitEvent collection wasn't being cleared after synchronization and resetting of the events. This led to hangs on subsequent host synchronizations if not preceded by any other operation.	2026-03-13 13:56:27 +00:00
theRonShark	9b61ff210f	Revert "[OpenMP] Move OpenMP implicit argument to the end and reformat" (#186309 ) Reverts llvm/llvm-project#185989	2026-03-13 05:20:40 +00:00
Kevin Sala Penades	ac71b185c2	[offload] Remove LIBOMPTARGET_SHARED_MEMORY_SIZE envar (#186231 ) This commit removes the `LIBOMPTARGET_SHARED_MEMORY_SIZE` envar and outputs a runtime warning if it is defined. Access to dynamic shared memory should be obtained through the `dyn_groupprivate` clause (OpenMP 6.1) or the launch arguments in liboffload kernel launch.	2026-03-12 21:21:29 -07:00
Joseph Huber	4376fbd793	[OpenMP] Move OpenMP implicit argument to the end and reformat (#185989 ) Summary: We use this `dyn_ptr` argument in Clang/OpenMP to handle the `KernelLaunchEnvironment`. This is a per-kernel argument used to share some information. Currently, it's prepended to the argument list and we generate storage for it in the runtime. This is bad for a few reasons: 1. It changes the ABI by shifting user arguments 2. It cannot trivially be left uninitialized if unused 3. The runtime must allocate its own memory for it This PR changes it to be appended instead. Additionally, space for this is always emitted. This means the OMPIRBuilder itself will provide the storage, we simply need to populate it in the runtime if it is used. This means that if it's unused we don't always pay the cost and it's easier for non-OpenMP users to ignore it. Backward compatibility is maintained by auto-upgrading the kernel arguments. In `libomptarget` we completely allocate a new buffer to store this in the new format. The plugins still need to respect the old ABI of the called device object, so we simply rotate it if it's the old version.	2026-03-12 18:08:22 -05:00
Kevin Sala Penades	1f583c6dee	[OpenMP][Offload] Add offload runtime support for dyn_groupprivate clause (#152831 ) Part 3 adding offload runtime support. See https://github.com/llvm/llvm-project/pull/152651. --------- Co-authored-by: Krzysztof Parzyszek <Krzysztof.Parzyszek@amd.com>	2026-03-12 01:13:06 -07:00
Alex Duran	789fea83bb	[offload][l0][nfc] remove duplicated entry (#185855 ) Remove a function mistakenly left over from #185404	2026-03-11 11:55:30 +01:00
Alex Duran	3ff332ad0f	[Offload][L0] Add support for OffloadBinary format in L0 plugin (#185404 ) - Accept OffloadBinaries as valid images by plugins that support them in the PluginInterface. - Add support in L0 plugin to extract SPIRV images and their associated metadata from an OffloadBinary image. Depends on: - #185663 Follow-up PRs: - #185413 (Changes SPIRV wrapper generation to use OffloadBinary) - #185425 (Adjusts llvm-objdump) - #184774 (Adjusts llvm-offload-binary)	2026-03-11 11:42:36 +01:00
Alex Duran	be021b8433	[OFFLOAD] Add interface to extend image validation (#185663 ) As discussed in #185404 we might want to provide a way for plugins to validate images not recognized by the common layer. This PR adds such extension and uses it to validate pure SPIRV images by the Level Zero plugin.	2026-03-10 18:41:23 +01:00
Joseph Huber	a9e457a82f	[Offload][AMDGPU] Fix RPC server on mixed w32 w64 workloads (#185496 ) Summary: This was a regression from the original LLVM-gpu-loader. We used to handle `-mwavefrontsize64` correctly in the loader by over-allocating memory and just leaving the upper 32-bits masked off. In order to handle this in offload we need to scan loaded kernels to see how much memory we need to allocate. This should be safe, the protocol is designed to handle an arbitrary size and worst-case this just wastes space.	2026-03-09 17:13:59 -05:00
Łukasz Plewa	57614e8810	[OFFLOAD] Replace C-style casts with C++ style casts in obtainInfoImpl (#185023 ) Replace C-style bool casts (bool)TmpInt with C++ functional casts bool(TmpInt)	2026-03-06 10:28:38 -06:00
Hansang Bae	8f268e63e4	[Offload] Remove unused data type (#183840 )	2026-02-27 15:46:59 -06:00
Hansang Bae	a347e1298c	[Offload] Enable memory usage printing with `alloc` debug type (#182938 )	2026-02-23 17:19:41 -06:00
Jan Patrick Lehr	92447ed273	[Offload] Fix copy-elision warning (#182848 ) This fixes a warning about a prohibited copy-elision due to the move of a temporary object.	2026-02-23 13:58:07 +00:00
Alex Duran	7ed0aa2652	[OFFLOAD][L0] Remove leftover global constructor (#182611 ) (#182665 ) fixes #182611	2026-02-21 18:09:46 +01:00
Joseph Huber	21b3461440	[flang-rt] Implement basic support for I/O from OpenMP GPU Offloading (#181039 ) Summary: This PR provides the minimal support for Fortran I/O coming from a GPU in OpenMP offloading. We use the same support the `libc` uses for its printing through the RPC server. The helper functions `rpc::dispatch` and `rpc::invoke` help make this mostly automatic. Because Fortran I/O is not reentrant, the vast majority of complexity comes from needing to stitch together calls from the GPU until they can be executed all at once. This is needed not only because of the limitations of recursive I/O, but without this the output would all be interleaved because of the GPU's lock-step execution. As such, the return values from the intermediate functions are meaningless, all returning true. The final value is correct however. For cookies we create a context pointer on the server to chain these together. Works on both my AMD and NVIDIA GPUs. ```fortran program hello_gpu implicit none !$omp target teams num_teams(1) !$omp parallel num_threads(2) ! Print strings print *, "Hello from GPU" !$omp end parallel !$omp end target teams end program hello_gpu ``` ```console > flang hello.f90 -O2 -fopenmp --offload-arch=gfx1030 > ./a.out Hello from GPU Hello from GPU > flang hello.f90 -O2 -fopenmp --offload-arch=sm_89 > ./a.out Hello from GPU Hello from GPU ```	2026-02-20 07:56:59 -06:00
Jan Patrick Lehr	e1e0e86e60	[Offload] Always check/consume Error (#182008 ) This fixes an issue introduced in https://github.com/llvm/llvm-project/pull/172226 where an llvm::Error is not checked in the "good" code path.	2026-02-18 13:46:21 +01:00
fineg74	1c6d774baa	[OFFLOAD] Extend olMemRegister API to handle cases when a memory block may have been mapped outside of liboffload. (#172226 ) This PR extends the liboffload olMemRegister API to handle the case where a memory block may have been mapped before calling olMemRegister, to support some use cases in libomptarget	2026-02-17 20:53:00 +00:00
Joseph Huber	d85576d368	[libc] Replace RPC 'close()' mechanism with RAII handler (#181690 ) Summary: Closing ports was previously done manually. This makes the protocol more error prone as unclosed ports will leak and eventually the locks will run out. I believe the original fear was that the RAII portion would negatively impact code generation but I have not noticed anything significant.	2026-02-16 15:14:30 -06:00
fineg74	b58a31d3ce	[OFFLOAD] Add support for host offloading device (#177307 ) The purpose of this PR is to add support of host as an offloading device to liboffload. Both OpenMP and sycl support offloading to a host as their normal workflow and therefore would require such capability from liboffload library.	2026-02-13 10:27:52 +01:00
Hansang Bae	0deb1b6e05	[Offload] Try to load Level Zero loader with version suffix (#180042 ) The default Level Zero loader `libze_loader.so` may not be available on systems that don't have Level Zero development package. Level Zero loaders with major version suffix are searched in that case.	2026-02-11 15:13:26 -06:00
Alex Duran	8b9fd4803c	[OFFLOAD] Support host plugin on Windows (#180401 ) Changes to make host plugin compile on Windows: * Change IO code to be portable * Adjust Makefiles Allow plugin to work partially when libffi support is not found dynamically (compilation works fine even on Windows because of the wrapper support).	2026-02-11 08:54:47 +01:00
Joseph Huber	2f00977fea	[Offload] Make the RPC callbacks private to each running server (#178901 ) Summary: The static object mixes callbacks from different plugins because ever since we moved to the object library target these are actually shared. Just make it a member of the base class and make it a pointer set just to do some basic deduplication.	2026-02-06 08:28:57 -06:00
Alex Duran	4096cb6017	[OFFLOAD] Fix TARGET_NAME in plugins common code (#180151 ) Unlike other names, it is set between quotes, which prevents our debug macros from matching it properly.	2026-02-06 14:12:04 +01:00
Joseph Huber	1a86c146ae	[Offload] Add a function to register an RPC Server callback (#178774 ) Summary: We provide an RPC server to manage calls initiated by the device to run on the host. This is very useful for the built-in handling we have, however there are cases where we would want to extend this functionality. Cases like Fortran or MPI would be useful, but we cannot put references to these in the core offloading runtime. This way, we can provide this as a library interface that registers custom handlers for whatever code people want.	2026-01-30 08:03:13 -06:00
Hansang Bae	85d64d1201	[Offload] Cast to `void *` in the debug message (#177019 ) There are a few places where data types based on character array or string are printed in the debug message while they do not represent strings. Such expressions should be cast to `void *` unless they represent actual strings. Change also includes casting from integral type to pointer type when appropriate.	2026-01-20 15:44:08 -06:00
fineg74	848d736e64	[OFFLOAD] Add asynchronous queue query API for libomptarget migration (#172231 ) Add liboffload asynchronous queue query API for libomptarget migration This PR adds the liboffload asynchronous queue query API that is needed to make libomptarget use liboffload	2026-01-20 10:53:32 -08:00
Hansang Bae	edd857aad8	[Offload] Remove unnecessary `maybe_unused` attribute (#175855 ) The attribute is not necessary in the new debug messaging.	2026-01-15 14:31:58 -06:00
Hansang Bae	90b6d33755	[Offload] Small debug message fix in Level Zero plugin (#175958 ) Do not include trailing zeros in the device name.	2026-01-14 09:42:19 -06:00
Alex Duran	efad3563ea	[OFFLOAD] Update CUDA and AMD plugins to new debug format (#175787 )	2026-01-13 17:53:59 +01:00
Alex Duran	86e114a9b2	Revert "[OFFLOAD] Update CUDA and AMD plugins to new debug format" (#175786 ) Reverts llvm/llvm-project#175757	2026-01-13 17:13:46 +01:00
Alex Duran	7c2f49373b	[OFFLOAD] Update CUDA and AMD plugins to new debug format (#175757 ) This should be the last step before completely removing the DP macro.	2026-01-13 17:06:35 +01:00
Hansang Bae	13cd7003ad	[NFC][Offload] Rename a function (#175673 ) Renamed a function as suggested in #175664.	2026-01-12 19:40:17 -06:00
Hansang Bae	496729fe7e	[Offload] Fix level_zero plugin build (#175664 ) Build has been broken when OMPTARGET_DEBUG is undefined.	2026-01-12 16:53:23 -06:00
Hansang Bae	dae3b49cba	[Offload] Update debug message printing in the plugins (#175205 ) * Prepare a set of debug types in llvm::offload::debug to be used in plugin code * Update debug messages in the plugins	2026-01-12 14:26:43 -06:00
fineg74	1232599032	[OFFLOAD] Add memory data locking API for libomptarget migration (#173138 ) Add liboffload memory data locking API for libomptarget migration This PR adds the liboffload memory data locking API that is needed to make libomptarget use liboffload	2026-01-12 13:07:57 -06:00
Alex Duran	dbd52bd558	[OFFLOAD][OpenMP] Remove old style REPORT support (#175607 ) Fix the few remaining usages and remove the support for the old REPORT macro.	2026-01-12 19:48:40 +01:00
Joseph Huber	c722ef4874	[OpenMP] Remove testing LTO variant on CPU targets (#175187 ) Summary: This is only really meaningful for the NVPTX target. Not all build environments support host LTO and these are redundant tests, just clean this up and make it run faster.	2026-01-09 10:13:44 -06:00
fineg74	583ce49a40	[OFFLOAD] Make L0 provide more information about device to be consistent with other plugins (#172946 ) Update information about devices provided by level zero plugin in order to be more consistent with other plugins.	2026-01-08 22:10:44 +00:00
Alex Duran	280e609d4e	[OFFLOAD][L0] Expose native ELF to upper layers (#172819 ) This PR refactors how the device image is built so we can expose the native ELF of the device to DeviceImageTy which solves several issues regarding symbol look up (as DeviceImageTy expects an ELF). It also simplifies the module linking code taking into account the latest changes in the driver (which adds "-library-compilation" when necessary). --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>	2025-12-18 18:03:12 +00:00
Alex Duran	5559918321	[OFFLOAD][L0] Improve symbol device lookup (#172820 ) When looking for the device address of a symbol, we need to also look if it's a function symbol if not found as global symbol in the device. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-12-18 15:31:20 +00:00
Alex Duran	3ac0ff2f36	[OFFLOAD][L0] Fix usages of getDebugLevel in L0 plugin (#172815 ) Support for getDebugLevel was removed as part of the new debug macros (#165416). This PR updates such usages to use the new ODBG_* macros. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>	2025-12-18 15:30:59 +00:00
Alex Duran	f125c8db5c	[OFFLOAD] Add plugin with support for Intel oneAPI Level Zero (#158900 ) Add a new nextgen plugin that supports GPU devices through the Intel oneAPI Level Zero library. The plugin is not enabled by default and needs to be added to LIBOMPTARGET_PLUGINS_TO_BUILD explicitly. --------- Co-authored-by: Alexey Sachkov <alexey.sachkov@intel.com> Co-authored-by: Nick Sarnie <nick.sarnie@intel.com> Co-authored-by: Joseph Huber <huberjn@outlook.com>	2025-12-18 08:53:03 +01:00
Hansang Bae	ecb94bcfe2	[Offload] Debug message update part 3 (#171684 ) Update debug messages based on the new method from #170425. Updated the following files. - plugins-nextgen/common/include/MemoryManager.h - plugins-nextgen/common/include/PluginInterface.h - plugins-nextgen/common/src/GlobalHandler.cpp - plugins-nextgen/common/src/PluginInterface.cpp - plugins-nextgen/host/dynamic_ffi/ffi.cpp	2025-12-17 09:05:16 -06:00
Kevin Sala Penades	35315a84b4	[offload] Fix CUDA args size by subtracting tail padding (#172249 ) This commit makes the cuLaunchKernel call to pass the total arguments size without tail padding.	2025-12-14 21:57:25 -08:00
Alex Duran	66ddc9b3e7	[OFFLOAD] Add support for more fine grained debug messages control (#165416 ) This PR introduces new debug macros that allow finer control of which debug messages to output and introduces C++ stream style for debug messages. Changing existing messages (except a few that I changed for testing) will come in subsequent PRs. I also think that we should make debug enabling OpenMP agnostic but, for now, I prioritized maintaining the current libomptarget behavior, and we might need more changes further down the line as we decouple libomptarget.	2025-11-20 18:39:56 +01:00
Joseph Huber	eea62159e8	[Offload] Make the RPC thread sleep briefly when idle (#168596 ) Summary: We start this thread if the RPC client symbol is detected in the loaded binary. We should make this sleep if there's no work to avoid the thread running at high priority when the (scarcely used) RPC call is actually required. So, right now after 25 microseconds we will assume the server is inactive and begin sleeping. This resets once we do find work. AMD supports a more intelligent way to do this. HSA signals can wake a sleeping thread from the kernel, and signals can be sent from the GPU side. This would be nice to have and I'm planning on working with it in the future to make this infrastructure more usable with existing AMD workloads.	2025-11-19 15:56:25 -06:00
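
The aggregate kernel prototype described in commit 07896d44a3 can be illustrated with a minimal C++ sketch. It is not the code the PR emits: `SaxpyArgs`, `saxpyKernel`, `launchHostKernel`, and `runSaxpyDemo` are hypothetical names made up for illustration. The only idea taken from the commit message is that every host kernel receives a single `void *` to a naturally aligned struct, so the launcher needs no per-signature ABI handling.

```cpp
#include <cstddef>

// Hypothetical argument aggregate: one naturally aligned struct per kernel.
struct SaxpyArgs {
  float A;
  const float *X;
  float *Y;
  std::size_t N;
};

// Every kernel shares this one signature, whatever its argument count.
using KernelFn = void (*)(void *);

// Hypothetical host kernel: unpacks its arguments from the aggregate.
void saxpyKernel(void *Raw) {
  auto *Args = static_cast<SaxpyArgs *>(Raw);
  for (std::size_t I = 0; I < Args->N; ++I)
    Args->Y[I] = Args->A * Args->X[I] + Args->Y[I];
}

// Hypothetical launcher: a plain indirect call, no libffi-style marshalling.
void launchHostKernel(KernelFn Fn, void *ArgStruct) { Fn(ArgStruct); }

// Small driver: packs the arguments, launches, and returns the last element.
float runSaxpyDemo() {
  float X[3] = {1.0f, 2.0f, 3.0f};
  float Y[3] = {10.0f, 20.0f, 30.0f};
  SaxpyArgs Args{2.0f, X, Y, 3};
  launchHostKernel(&saxpyKernel, &Args);
  return Y[2];
}
```

Because the launcher only ever sees `void (*)(void *)`, adding or removing kernel arguments changes the struct layout but not the call site, which matches the shape of a `pthread_create` start routine.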


221 Commits