llvm-project

Author	SHA1	Message	Date
Callum Fare	47c9609a86	[Offload] Check plugins aren't already deinitialized when tearing down (#148642 ) This is a hotfix for #148615 - it fixes the issue for me locally. I think a broader issue is that in the test environment we're calling olShutDown from a global destructor in the test binaries. We should do something more controlled, either calling olInit/olShutDown in every test, or move those to a GTest global environment. I didn't do that originally because it looked like it needed changes to LLVM's GTest wrapper.	2025-07-14 16:17:10 +01:00
Kenneth Benzie (Benie)	508f9a0274	[Offload] Skip event tests on AMDGPU (#148632 ) Add `OffloadDeviceTest::getPlatformBackend()` and use it to skip event tests which currently fail on AMDGPU due to: ``` OL_ERRC_UNIMPLEMENTED: synchronize event not implemented ```	2025-07-14 09:19:53 -05:00
Ross Brunton	a71187e976	[Offload] Return error rather than dropping it (#148609 )	2025-07-14 14:05:58 +01:00
Kenneth Benzie (Benie)	b520d21c02	[Offload] Add tagged type to enumerator docs (#147998 ) When `EnumRec::isTyped()` is true, include the `EnumValueRec::getTaggedType()` to the documentation.	2025-07-14 13:35:36 +01:00
Ross Brunton	2fdeeefacf	[Offload] Add global variable address/size queries (#147972 ) Add two new symbol info types for getting the bounds of a global variable. As well as a number of tests for reading/writing to it.	2025-07-11 16:12:48 +01:00
Ross Brunton	84e15d08c2	[Offload] Add `olGetSymbolInfo[Size]` (#147962 ) This mirrors the similar functions for other handles. The only implemented info at the moment is the symbol's kind.	2025-07-11 15:29:53 +01:00
Ross Brunton	eee723f928	[Offload] Replace `GetKernel` with `GetSymbol` with global support (#148221 ) `olGetKernel` has been replaced by `olGetSymbol` which accepts a `Kind` parameter. As well as loading information about kernels, it can now also load information about global variables.	2025-07-11 14:48:10 +01:00
Ross Brunton	466357ab51	[Offload] Change `ol_kernel_handle_t` -> `ol_symbol_handle_t` (#147943 ) In the future, we want `ol_symbol_handle_t` to represent both kernels and global variables. The first step in this process is a rename and promotion to a "typed handle".	2025-07-10 14:54:10 +01:00
Ross Brunton	abb878438a	[Offload] Allow querying the size of globals (#147698 ) The `GlobalTy` helper has been extended to make both the Size and Ptr be optional. Now `getGlobalMetadataFromDevice`/`Image` is able to write the size of the global to the struct, instead of just verifying it.	2025-07-10 12:05:31 +01:00
Kenneth Benzie (Benie)	cea33304c0	[Offload] Add Offload API Sphinx documentation (#147323 ) * Add spec generation to offload-tblgen tool * This patch adds generation of Sphinx compatible reStructuredText utilizing the C domain to document the Offload API directly from the spec definition `.td` files. * Add Sphinx HTML documentation target * Introduces the `docs-offload-html` target when CMake is configured with `LLVM_ENABLE_SPHINX=ON` and `SPHINX_OUTPUT_HTML=ON`. Utilized `offload-tblgen -gen-spen` to generate Offload API specification docs.	2025-07-10 11:50:51 +01:00
Callum Fare	7c6edf4a05	[Offload] Implement olGetQueueInfo, olGetEventInfo (#142947 ) Add info queries for queues and events. `olGetQueueInfo` only supports getting the associated device. We were already tracking this so we can implement this for free. We will likely add other queries to it in the future (whether the queue is empty, what flags it was created with, etc) `olGetEventInfo` only supports getting the associated queue. This is another thing we were already storing in the handle. We'll be able to add other queries in future (the event type, status, etc)	2025-07-09 17:09:31 +01:00
Ye Luo	9f6784cc1f	[libomptarget] fix test offloading/disable_default_device.c Fixes the incorrect lit command line introduced in 536ba87726d8dea862d964678dbb761ca32e21fb	2025-07-09 09:52:00 -05:00
Ross Brunton	bed9fe77dc	[Offload] Tests for global memory and constructors (#147537 ) Adds two "launch kernel" tests for lib offload, one testing that global memory works and persists between different kernels, and one verifying that `[[gnu::constructor]]` works correctly. Since we now have tests that contain multiple kernels in the same binary, the test framework has been updated a bit.	2025-07-09 14:26:50 +01:00
Ross Brunton	0740db9bc1	[Offload] Add `_LAST` variant for generated enumerations (#147314 )	2025-07-09 13:55:25 +01:00
Michael Kruse	4be3e95284	[Flang-RT][Offload] Always use LLVM-built GTest (#143682 ) The Offload and Flang-RT had the ability to compile GTest themselves. But in bootstrapping builds, LLVM_LIBRARY_OUTPUT_INTDIR points to the same location as the stage1 build. If both are building GTest, they overwrite each other's `libllvm_gtest.a` and `libllvm_test_main.a` which causes #143134. This PR removes the ability for the Offload/Flang-RT runtimes to build their own GTest and instead relies on the stage1 build of GTest. This was already the case with LLVM_INSTALL_GTEST=ON configurations. For LLVM_INSTALL_GTEST=OFF configurations, we now also export gtest into the buildtree configuration. Ultimately, this reduces combinatorial explosion of configurations in which unittests could be built (LLVM_INSTALL_GTEST=ON, GTest built by Offload, GTest built by Flang-RT, GTest built by Offload and also used by Flang-RT). GTest and therefore Offload/Runtime unittests will not be available if the runtimes are configured against an LLVM install tree. Since llvm-lit isn't available in the install tree either, it doesn't matter. Note that compiler-rt and libc also use GTest in non-default configurations. libc also depends on LLVM's GTest build (and would error-out if unavailable), but compiler-rt builds it completely differently. Fixes #143134	2025-07-09 12:53:33 +02:00
Ross Brunton	8c06d0e547	[Offload] Generate OffloadInfo.inc (#147316 ) This is a generated file which contains a macro for all Device Info keys. This is visible to the plugin interface so that it can use the definitions in a future patch.	2025-07-09 11:35:22 +01:00
Ross Brunton	8e104d69fc	[Offload] Provide proper memory management for Images on host device (#146066 ) The `unloadBinaryImpl` method on the host plugin is now implemented properly (rather than just being a stub). When an image is unloaded, it is deallocated and the library associated with it is closed.	2025-07-08 12:42:06 +01:00
Abhinav Gaba	ae4a81e849	[NFC][OpenMP] Add tests for mapping pointers and their dereferences. (#146934 ) The output of the compile-and-run tests is incorrect. These will be used for reference in future commits that resolve the issues. Also updated the existing clang LIT test, target_map_both_pointer_pointee_codegen.cpp, with more constructs and fewer CHECKs (through more update_cc_test_checks filters).	2025-07-08 06:52:38 -04:00
Callum Fare	fdf6ab2a53	[Offload] Implement 'Vendor Name' device info for CUDA (#147334 ) After #146345 the device info implementation requires a value for every query, rather than silently returning an empty string. This broke the test for `OL_DEVICE_INFO_VENDOR` on CUDA. Add a value to the CUDA plugin. We can quite safely hard code this one.	2025-07-08 10:04:48 +01:00
Giorgi Gvalia	5110ac4113	[Offload] Allow CUDA Kernels to use arbitrarily large shared memory (#145963 ) Previously, the user was not able to use more than 48 KB of shared memory on NVIDIA GPUs. In order to do so, setting the function attribute `CU_FUNC_ATTRIBUTE_MAX_THREADS_PER_BLOCK` is required, which was not present in the code base. With this commit, we add the ability to set this attribute, allowing the user to utilize the full power of their GPU. In order to not have to reset the function attribute for each launch of the same kernel, we keep track of the maximum memory limit (as the variable `MaxDynCGroupMemLimit`) and only set the attribute if our desired amount exceeds the limit. By default, this limit is set to 48 KB. Feedback is greatly appreciated, especially around setting the new variable as mutable. I did this because the `launchImpl` method is const and I am not able to modify my variable otherwise. --------- Co-authored-by: Giorgi Gvalia <ggvalia@login33.chn.perlmutter.nersc.gov> Co-authored-by: Giorgi Gvalia <ggvalia@login07.chn.perlmutter.nersc.gov>	2025-07-07 15:26:16 -04:00
Ross Brunton	8ae8d31832	[Offload] Add liboffload unit tests for shared/local memory (#147040 )	2025-07-07 16:20:02 +01:00
Ross Brunton	6b19cdcefa	[Offload][amdgpu] Map `INVALID_CODE_OBJECT` to `INVALID_BINARY` (#147070 )	2025-07-04 16:17:51 +01:00
Callum Fare	3c0571a749	[Offload] Add missing license header to Common.td (#146737 ) All other tablegen files in this directory have the license header, but `Common.td` is missing it	2025-07-02 17:17:30 +01:00
Ross Brunton	7d52b0983e	[Offload] Add `MAX_WORK_GROUP_SIZE` device info query (#143718 ) This adds a new device info query for the maximum workgroup/block size for each dimension.	2025-07-02 16:33:54 +01:00
Ross Brunton	4f02965ae2	[Offload] Store kernel name in GenericKernelTy (#142799 ) GenericKernelTy has a pointer to the name that was used to create it. However, the name passed in as an argument may not outlive the kernel. Instead, GenericKernelTy now contains a std::string, and copies the name into there.	2025-07-02 14:11:05 +01:00
Callum Fare	acb52a8a98	[Offload] Improve liboffload documentation (#142403 ) - Update the main README to reflect the current project status - Rework the main API generation documentation. General fixes/tidying, but also spell out explicitly how to make API changes at the top of the document since this is what most people will care about. --------- Co-authored-by: Martin Grant <martingrant@outlook.com>	2025-07-02 13:52:27 +01:00
Kewen12	2b16af8df2	[Offload][cmake] Add GPU test job limit for AMDGPU buildbot cmake cache (#146611 ) Added GPU test job limit to make it consistent with current config https://github.com/llvm/llvm-zorg/blob/main/buildbot/osuosl/master/config/builders.py#L2027C31-L2027C77	2025-07-01 19:18:28 -05:00
Joseph Huber	3cff3d882b	[Offload] Add skeleton for offload conformance tests (#146391 ) Summary: This adds a basic outline for adding 'conformance' tests. These are tests that are intended to check device code against a standard. In this case, we will expect this to be filled with math conformance tests to make sure their results are within the ULP requirements we demand. Right now this just assumes the GPU libc is there, meaning you'll likely need to do a manual `ninja` before doing `ninja -C runtimes/runtimes-bins offload.conformance`.	2025-07-01 10:20:40 -05:00
Callum Fare	1a253e213d	[NFC][Offload] Fix possible edge cases in offload-tblgen (#146511 ) Fix a couple of unhandled edge cases in offload-tblgen that were found by static analysis * `LineStart` may wrap around to 0 when processing multi-line strings. The value is not actually being used in that case, but still better to explicitly handle it * Possible unchecked nullptr when processing parameter flags	2025-07-01 14:09:49 +01:00
Ye Luo	536ba87726	[libomptarget] Add a test for OMP_TARGET_OFFLOAD=disabled (#146385 ) closes https://github.com/llvm/llvm-project/issues/144786	2025-06-30 13:29:36 -05:00
Ross Brunton	67e73ba605	[Offload] Refactor device/platform info queries (#146345 ) This makes several small changes to how the platform and device info queries are handled: * ReturnHelper has been replaced with InfoWriter which is more explicit in how it is invoked. * InfoWriter consumes `llvm::Expected` rather than values directly, and will early exit if it returns an error. * As a result of the above, `GetInfoString` now correctly returns errors rather than empty strings. * The host device now has its own dedicated "getInfo" function rather than being checked in multiple places.	2025-06-30 15:00:43 +01:00
Ross Brunton	003145d0c8	[Offload] Implement `olShutDown` (#144055 ) `olShutDown` was not properly calling deinit on the platforms, resulting in random segfaults on AMD devices. As part of this, `olInit` and `olShutDown` now alloc and free the offload context rather than it being static. This allows `olShutDown` to be called within a destructor of a static object (like the tests do) without having to worry about destructor ordering.	2025-06-30 12:14:00 +01:00
Julian Brown	b62b58d1bb	[OpenMP] Fix crash with duplicate mapping on target directive (#146136 ) OpenMP allows duplicate mappings, i.e. in OpenMP 6.0, 7.9.6 "map Clause": "Two list items of the map clauses on the same construct must not share original storage unless one of the following is true: they are the same list item [or other omitted reasons]" Duplicate mappings can arise as a result of user-defined mapper processing (which I think is a separate bug, and is not addressed here), but also in straightforward cases such as: #pragma omp target map(tofrom: s.mem[0:10]) map(tofrom: s.mem[0:10]) Both these cases cause crashes at runtime at present, due to an unfortunate interaction between reference counting behaviour and shadow pointer handling for blocks. This is what happens: 1. The member "s.mem" is copied to the target 2. A shadow pointer is created, modifying the pointer on the target 3. The member "s.mem" is copied to the target again 4. The previous shadow pointer metadata is still present, so the runtime doesn't modify the target pointer a second time. The fix is to disable step 3 if we've already done step 2 for a given block that has the "is new" flag set.	2025-06-29 22:41:24 +01:00
Ross Brunton	39f19f2f1f	[Offload] Store device info tree in device handle (#145913 ) Rather than creating a new device info tree for each call to `olGetDeviceInfo`, we instead do it on device initialisation. As well as improving performance, this fixes a few lifetime issues with returned strings. This does unfortunately mean that device information is immutable, but hopefully that shouldn't be a problem for any queries we want to implement. This also meant allowing offload initialization to fail, which it can now do.	2025-06-27 15:10:43 +01:00
Ross Brunton	102cf1b999	[Offload] Make CUDA Driver Version a string (#146049 ) AMD treats this value as a string, so for consistency require this in NVIDIA as well. This shouldn't change the output of the `llvm-offload-device-info` tool, but does fix an issue in liboffload when it tries to query the version.	2025-06-27 15:07:04 +01:00
Joseph Huber	df5097dd94	[Offload] Add default for HSA agent type to silence warning (#145943 ) Summary: There's a new one called the AIE (AI Engine). We could handle this, but since we don't use it currently I'm just making it future-proof. Adding the AIE check would require checking the HSA version which isn't worthwhile just yet.	2025-06-26 14:46:08 -05:00
Ross Brunton	3e337bc308	[Offload] Add a stub unloadBinaryImpl for host device (#145716 )	2025-06-25 17:06:17 +01:00
Ross Brunton	0870c8838b	[Offload] Add an `unloadBinary` interface to PluginInterface (#143873 ) This allows removal of a specific Image from a Device, rather than requiring all image data to outlive the device they were created for. This is required for `ol_program_handle_t`s, which now specify the lifetime of the buffer used to create the program.	2025-06-25 14:53:18 +01:00
Ross Brunton	4359e55838	[Offload] Properly report errors when jit compiling (#145498 ) Previously, if a binary failed to load due to failures when jit compiling, the function would return success with nullptr. Now it returns a new plugin error, `COMPILE_FAILURE`.	2025-06-24 16:27:12 +01:00
Ross Brunton	4785832144	[Offload] Fix cmake warning (#145488 ) Cmake was unhappy that there was no space between arguments, now it is.	2025-06-24 13:42:03 +01:00
Ross Brunton	02d2a1646a	[Offload] Fix entry_points.td test (#145292 ) This was broken as part of #144494 , and just needs an update to the check lines.	2025-06-23 11:09:08 +01:00
Ross Brunton	613c38a992	[Offload] Fix type mismatch warning in test (#143700 )	2025-06-23 10:14:12 +01:00
Joseph Huber	3f1de197b1	[Offload] Rework compiling device code for unit test suites (#144776 ) Summary: I'll probably want to use this as a more generic utility in the future. This patch reworks it to make it a top level function. I also tried to decouple this from the OpenMP utilities to make that easier in the future. Instead, I just use `-march=native` functionality which is the same thing. Needed a small hack to skip the linker stage for checking if that works. This should still create the same output as far as I'm aware.	2025-06-20 10:31:54 -05:00
Ross Brunton	f242360e15	[Offload] Add type information to device info nodes (#144535 ) Rather than being "stringly typed", store values as a std::variant that can hold various types. This means that liboffload doesn't have to do any string parsing for integer/bool device info keys.	2025-06-20 09:05:05 -05:00
Ross Brunton	e0633d59b9	[Offload] Check for initialization (#144370 ) All entry points (except olInit) now check that offload has been initialized. If not, a new `OL_ERRC_UNINITIALIZED` error is returned.	2025-06-20 09:04:50 -05:00
Ross Brunton	53336ad488	[Offload] Move (most) global state to an `OffloadContext` struct (#144494 ) Rather than having a number of static local variables, we now use a single `OffloadContext` struct to store global state. This is initialised by `olInit`, but is never deleted (de-initialization of Offload isn't yet implemented). The error reporting mechanism has not been moved to the struct, since that's going to cause issues with teardown (error messages must outlive liboffload).	2025-06-19 16:02:03 -05:00
Jan Patrick Lehr	dd65e6e060	[Offload][libc] Add cmake cache AMDGPU buildbot (#144500 ) An upcoming libc4GPU buildbot will be using this CMake cache file for its build configuration.	2025-06-17 20:51:40 +02:00
Ross Brunton	e6a3579653	[Offload] Replace device info queue with a tree (#144050 ) Previously, device info was returned as a queue with each element having a "Level" field indicating its nesting level. This replaces this queue with a more traditional tree-like structure. This should not result in a change to the output of `llvm-offload-device-info`.	2025-06-13 09:22:47 -05:00
Ethan Luis McDonough	daee5eee85	[Offload][PGO] Fix new GPU PGO tests (#143645 ) `pgo_atomic_teams.c` and `pgo_atomic_threads.c` currently are set to run on NVPTX despite the changes for that target not being upstreamed yet. This patch also replaces instances of `llvm-profdata` with `%profdata` in those tests.	2025-06-12 11:14:21 -05:00
Ross Brunton	4f60321ca1	[Offload] Add `ol_dimensions_t` and convert ranges from size_t -> uint32_t (#143901 ) This is a three element x, y, z size_t vector that can be used any place where a 3D vector is required. This ensures that all vectors across liboffload are the same and don't require any resizing/reordering dances.	2025-06-12 09:59:59 -05:00

351 Commits
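
The shared-memory change in 5110ac4113 describes caching a per-kernel limit so that the function attribute is only set when a launch requests more than any earlier launch did. A minimal C++ sketch of that pattern follows, with the driver call replaced by a counter. `SketchKernel`, `launch`, `limit` and `attributeUpdates` are hypothetical names used for illustration and are not taken from the plugin source; only `MaxDynCGroupMemLimit`, its 48 KB default, and the use of `mutable` inside a const launch method come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a kernel object; NOT the actual plugin class.
class SketchKernel {
public:
  // Simulates a launch that requests DynSharedBytes of dynamic shared memory.
  // Returns true if the (simulated) function attribute had to be raised.
  bool launch(uint32_t DynSharedBytes) const {
    if (DynSharedBytes <= MaxDynCGroupMemLimit)
      return false; // Already within the cached limit: nothing to set.

    // A real plugin would call into the driver here (assumption); this
    // sketch only counts how often that would happen.
    ++AttributeUpdates;
    MaxDynCGroupMemLimit = DynSharedBytes;
    return true;
  }

  uint32_t limit() const { return MaxDynCGroupMemLimit; }
  unsigned attributeUpdates() const { return AttributeUpdates; }

private:
  // `mutable` lets the const launch method update the cached state,
  // mirroring the design question raised in the commit message.
  mutable uint32_t MaxDynCGroupMemLimit = 48u * 1024u; // 48 KB default
  mutable unsigned AttributeUpdates = 0;
};
```

Repeated launches at or below the cached limit skip the attribute update entirely, which is the saving the commit message is after. The cost is that a const method now mutates state, so concurrent launches of the same kernel would need synchronisation in a real implementation.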