llvm-project

Author	SHA1	Message	Date
Kevin Sala	7b97941721	[OpenMP][libomptarget] Add missing symbols in dynamic_hsa This patch prepares for the new AMDGPU NextGen plugin. Differential Revision: https://reviews.llvm.org/D140213	2022-12-17 00:01:24 +01:00
Joseph Huber	d8b0f007cb	[libomptarget] Add HSA definitions for memory faults to dynamic_hsa Summary: We use the dynamic HSA file to forward declare needed definitions from the HSA runtime if not present at build time. These definitions were not included so using them caused problems on systems without it if used. Just add them.	2022-12-16 07:06:44 -06:00
Kevin Sala	a66826a233	Revert "[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior" This reverts commit 87e6b96b0009983996bfe0aa27d358008c1d1087.	2022-12-16 11:53:45 +01:00
Kevin Sala	87e6b96b00	[OpenMP][libomptarget] Add AMDGPU NextGen plugin with asynchronous behavior This commit adds the AMDGPU NextGen plugin inheriting from PluginInterface's classes. It also implements the asynchronous behavior in the plugin operations: kernel launches and memory transfers. To this end, it implements the concept of streams of asynchronous operations. The streams are implemented using the HSA signals to define input and output dependencies between asynchronous operations. Missing features: - Retrieve the maximum number of threads per group that a kernel can run. This requires reading the image. - Implement __tgt_rtl_sync_event, not used on the libomptarget side. Differential Revision: https://reviews.llvm.org/D138389	2022-12-16 00:30:43 +01:00
Kevin Sala	39fe657b66	[OpenMP][libomptarget] Add utility header for AMDGPU plugins This patch prepares the PluginInterface for the new AMDGPU NextGen plugin. The original and the NextGen plugin will share some structures and functionalities. We use this header for defining them and avoiding code duplication. Differential Revision: https://reviews.llvm.org/D139792	2022-12-15 21:06:04 +01:00
Guilherme Valarini	89c82c8394	[OpenMP] Add non-blocking support for target nowait regions This patch better integrates the target nowait functions with the tasking runtime. It splits the nowait execution into two stages: a dispatch stage, which triggers all the necessary asynchronous device operations and stores a set of post-processing procedures that must be executed after said ops; and a synchronization stage, responsible for synchronizing the previous operations in a non-blocking manner and running the appropriate post-processing functions. Suppose during the synchronization stage the operations are not completed. In that case, the attached hidden helper task is re-enqueued to any hidden helper thread to be later synchronized, allowing other target nowait regions to be concurrently dispatched. Reviewed By: jdoerfert, tianshilei1992 Differential Revision: https://reviews.llvm.org/D132005	2022-12-14 14:03:32 -03:00
Jon Chesterfield	56ec7ce80d	[openmp][amdgpu] Let fine grain and kernarg pools differ	2022-12-14 02:04:21 +00:00
Shilei Tian	59ae452983	[OpenMP] Refactor CMake files related to `PluginInterface` in `plugins-nextgen` This patch refactors CMake files related to `PluginInterface` in `plugins-nextgen` to handle LLVM dependences in a better way. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D139371	2022-12-06 17:39:41 -05:00
Roman Lebedev	aa6ea6009f	Revert "[OpenMP] Use `add_llvm_library` to build the target `PluginInterface` in `plugins-nextgen`" This is still not working for me: ``` -- Configuring done CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.amdgpu" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.cuda.nextgen" which requires target "PluginInterface" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "elf_common" that is not in any export set. CMake Error: install(EXPORT "LLVMExports" ...) includes target "omptarget.rtl.x86_64.nextgen" which requires target "PluginInterface" that is not in any export set. -- Generating done ``` This reverts commit e682a76c3bf61c52628d79d6ec4db221430768c0.	2022-12-06 20:47:20 +03:00
Shilei Tian	e682a76c3b	[OpenMP] Use `add_llvm_library` to build the target `PluginInterface` in `plugins-nextgen` This patch uses `add_llvm_library` to build the target `PluginInterface` since it can handle LLVM dependences much better. One temporary drawback of using this is that currently LLVM CMake macro doesn't support object libraries very well (there was a try a couple years ago but it was reverted later `29e5722949`). After switching to that, `CXX_VISIBILITY_PRESET` can not be set correctly, which can cause runtime error that a function call from one plugin could go to another. As a consequence, `PluginInterface` is built as a static library for now. I have asked the question in CMake community (https://discourse.cmake.org/t/set-target-properties-doesnt-work-properly/7016). Once that issue is solved, I'll switch it back to object library. It is not necessarily too bad to use static library, especially `BUILDTREE_ONLY` is already set such that `PluginInterface.a` will not be installed. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D139371	2022-12-06 11:37:37 -05:00
Ron Lieberman	b09a5e5cb3	Revert "Add mean_anyway to hpc config" my bad, wrong repo ,so sorry. This reverts commit 0b9350f3da7daf1d740bbbfab79d01613fcd29f4.	2022-11-29 15:20:23 -06:00
Ron Lieberman	0b9350f3da	Add mean_anyway to hpc config	2022-11-29 15:11:57 -06:00
Joseph Huber	3458a2b737	[Libomptarget][NFC] Add missing LLVM header	2022-11-29 09:46:51 -06:00
Vitaly Buka	98441fc9e4	[NFC][OpenMP] Remove unused label	2022-11-17 23:35:08 -08:00
Vitaly Buka	a35ad711d9	[NFC][OpenMP] Fix const cast warning	2022-11-17 23:24:40 -08:00
Vitaly Buka	e42080ae3f	[NFC][OpenMP] Remove extra ";"	2022-11-17 23:24:40 -08:00
Kevin Sala	846904195b	[OpenMP][libomptarget] New plugin infrastructure and new CUDA plugin This patch adds a new infrastructure for OpenMP target plugins. It also implements the CUDA and GenericELF64bit plugins under this new infrastructure. We place the sources in a separate directory named plugins-nextgen, and we build the new plugins as different plugin libraries. The original plugins, which remain untouched, will be used by default. However, the user can change this behavior at run-time through the boolean envar LIBOMPTARGET_NEXTGEN_PLUGINS. If enabled, the libomptarget will try to load the NextGen version of each plugin, falling back to the original if they are not present or valid. The idea of this new plugin infrastructure is to implement the common parts of target plugins in generic classes (defined in files inside plugins-next/common/PluginInterface folder), and then, each specific plugin defines its own specific classes inheriting from the common ones. In this way, most logic remains on the common interface while reducing the plugin-specific source code. It is also beneficial in the sense that now most code and behavior are the same across the different plugins. As an example, we define classes for a plugin, a device, a device image, a stream manager, etc. The plugin object (a single instance per plugin library) holds different device objects (i.e., one per available device), while the latter are responsible for managing their own resources. Most code on this patch is based on the changes made by @jdoerfert (Johannes Doerfert) Reviewed By: jhuber6, jdoerfert Differential Revision: https://reviews.llvm.org/D134396	2022-10-27 18:10:14 +00:00
Joseph Huber	429d3d4e9d	[Libomptarget] Build plugins with protected visibility by default The plugins all define the same interface symbols. This is generally not a problem when calling the plugin directly from the dynamic library's handle. However, when calling from within the plugin itself it is possible for another plugin's symbols to preempt the symbols. This was observed with the `__tgt_rtl_is_valid_binary` call in the `__tgt_rtl_is_valid_binary_info` function being mapped to the x86_64 plugin. This patch changes the default visibility to `protected` instead. This visibility ensures that these symbols are all externally visible from the plugin, but ensures their definitions are fixed within the shared library. Having protected visibility makes such symbol preemption impossible. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D136365	2022-10-20 11:12:18 -05:00
Joseph Huber	1af7541741	[Libomptarget] Fix missing semicolon in exports	2022-10-14 09:02:42 -05:00
Joseph Huber	619dced0fc	[Libomptarget] Don't use full names for exported plugin symbols Summary: This patch changes the `exports` file to export all `__tgt_rtl` functions. This is a better option as not each plugin implements all of these functions, furthermore any new functions added will be automatically included.	2022-10-14 08:57:57 -05:00
Slava Zakharin	88da0de14f	Revert "[Libomp] Do not error on undefined version script symbols" This reverts commit 096f93e73dc3f88636cdcb57515e3732385b452d. Revert "[Libomptarget] Make the plugins ingore undefined exported symbols" This reverts commit 3f62314c235bd2475c8e2b5b874b2932a444e823. Revert "[LLD] Enable --no-undefined-version by default." This reverts commit 7ec8b0d162e354c703f5390784287054601f9c69. Three commits are reverted because of the current omp build fail with GNU ld. See discussion here: https://reviews.llvm.org/rG096f93e73dc3	2022-10-13 14:12:07 -07:00
Joseph Huber	3f62314c23	[Libomptarget] Make the plugins ingore undefined exported symbols Summary: Recent changes made the default behaviour to error when given an undefined symbol in a version script. A previous patch fixed this for `libomptarget` by removing the single undefined symbol. However, the plugins are expected to only define a subset of the available functions so we shouldn't treat it as an error. This patch updates the build flags to work appropriately.	2022-10-13 08:13:03 -05:00
Dan Palermo	db021abf33	[OpenMP][AMDGPU] Enable OpenMP device runtime build for gfx110[0123] Add OpenMP device runtime build support for the gfx1100, gfx1101, gfx1102, and gfx1103 targets. Differential Revision: https://reviews.llvm.org/D134465	2022-09-23 01:49:51 +00:00
Joseph Huber	292cb114b0	[Libomptarget] Revert changes to AMDGPU plugin destructors These patches exposed a lot of problems in the AMD toolchain. Rather than keep it broken we should revert it to its old semi-functional state. This will prevent us from using device destructors but should remove some new bugs. In the future this interface should be changed once these problems are addressed more correctly. This reverts commit ed0f21811544320f829124efbb6a38ee12eb9155. This reverts commit 2b7203a35972e98b8521f92d2791043dc539ae88. Fixes #57536 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133997	2022-09-16 06:55:51 -05:00
Joseph Huber	23bc343855	[Libomptarget] Change device free routines to accept the allocation kind Previous support for device memory allocators used a single free routine and did not provide the original kind of the allocation. This is problematic as some of these memory types required different handling. Previously this was worked around using a map in runtime to record the original kind of each pointer. Instead, this patch introduces new free routines similar to the existing allocation routines. This allows us to avoid a map traversal every time we free a device pointer. The only interfaces defined by the standard are `omp_target_alloc` and `omp_target_free`, these do not take a kind as `omp_alloc` does. The standard dictates the following: "The omp_target_alloc routine returns a device pointer that references the device address of a storage location of size bytes. The storage location is dynamically allocated in the device data environment of the device specified by device_num." Which suggests that these routines only allocate the default device memory for the kind. So this has been changed to reflect this. This change is somewhat breaking if users were using `omp_target_free` as previously shown in the tests. Reviewed By: JonChesterfield, tianshilei1992 Differential Revision: https://reviews.llvm.org/D133053	2022-09-14 12:14:07 -05:00
Joseph Huber	c2acb1e5d3	[Libomptarget][NFC] Remove unused variable	2022-09-09 15:26:02 -05:00
Joseph Huber	83fcba82cc	[Libomptarget] Add proper LLVM libraries now that the AMDGPU plugin uses them Summary: The AMDGPU and CUDA plugins now rely on the Object and Support libraries. This patch adds them explicitly rather than hoping that they share the symbols loaded from the standard `libomptarget`.	2022-09-09 10:33:26 -05:00
Joseph Huber	8d2a447bf9	[Libomptarget] Remove leftover ELF header from x86 plugin Summary: We removed the linking support for `gelf.h` in a previous patch. This header was incorrectly leftover causing build problems on some systems.	2022-09-07 13:41:40 -05:00
Joseph Huber	300155911a	[Libomptarget] Replace libelf with LLVM's Elf libraries This patch replaces the dependency on `libelf` with LLVM's ELF support. With this patch the user no-longer needs to have `libelf` on their system to build and configure OpenMP offloading. The replacement is mostly mechanical, with the exception of the hash table support which was added in D131309. Depends on D131309 Reviewed By: JonChesterfield, saiislam Differential Revision: https://reviews.llvm.org/D131401	2022-09-07 12:38:51 -05:00
Joseph Huber	894531f59b	[Libomptarget] Add utility functions for loading an ELF symbol by name The `SHT_HASH` sections in an ELF are used to look up a symbol in the symbol table using a symbol's name. This is done by obtaining the `SHT_HASH` section and using its `sh_link` attribute to access the associated symbol table, from which we can access the string table containing the associated name. We can then search for the symbol using the hash of the name and the buckets and chains in the hash table itself. This patch adds utility functions that allow us to look up a symbol in an ELF file by name. It will first attempt to look through the hash tables, and then search the section tables manually if failed. This allows us to pull out constants necessary for setting up offloading without first loading the object. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131309	2022-09-07 12:38:50 -05:00
Joseph Huber	31f434ee3b	[Libomptarget][NFC] Clean up CUDA plugin and address warnings	2022-09-06 15:28:57 -05:00
Joseph Huber	f8b1f93f26	[libomptarget] Enable the device allocator for AMDGPU This patch adds support for the device memory type, this is currently equivalent to the default type so it should be treated as the same. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D133128	2022-09-01 12:40:59 -05:00
Jon Chesterfield	ffabe997a5	[openmp][amdgpu] Implement target_alloc_host as fine grain HSA memory The cuda plugin maps TARGET_ALLOC_HOST onto cuMemAllocHost which is page locked host memory. Fine grain HSA memory is not necessarily page locked but has the same read/write from host or device semantics. The cuda plugin does this per-gpu and this patch makes it accessible from any gpu, but it can be locked down to match the cuda behaviour if preferred. Enabling tests requires an equivalent to // RUN: %libomptarget-compile-run-and-check-nvptx64-nvidia-cuda for amdgpu which doesn't seem to be in use yet. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D132660	2022-08-25 16:27:52 +01:00
Ye Luo	322ea53144	[libomptarget][amdgpu] enable tests whenever possible. if(TARGET amdgpu-arch) doesn't work when ENABLE_LLVM_PROJECTS=openmp because openmp subdirectory is processed before clang subdirectory. Adopt the same logic of enabling tests like the CUDA plugin. Differential Revision: https://reviews.llvm.org/D132579	2022-08-24 14:33:28 -05:00
Joseph Huber	540a13652f	[Libomptarget] Replace use of `dlopen` with LLVM's dynamic library support This patch replaces uses of `dlopen` and `dlsym` with LLVM's support with `loadPermanentLibrary` and `getSymbolAddress`. This allows us to remove the explicit dependency on the `dl` libraries in the CMake. This removes another explicit dependency and solves an issue encountered while building on Windows platforms. The one downside to this is that the LLVM library does not currently support `dlclose` functionality, but this could be added in the future. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131507	2022-08-24 10:46:21 -05:00
Joseph Huber	30efb459e0	[Libomptarget] Remove use of ELF link_address in x86_64 plugin We use the offloading entries array to determine the relative names and addresses of device-side kernel functions. The x86_64 plugin previously derived the device-side entry table by first identifying the `omp_offloading_entries` section offset in the loaded elf. Then we would use the base offset of the loaded dynamic library to identify the entries array within the loaded image. This relied on some more unconventional methods which prevented us from using the LLVM dynamic library loader for this plugin. This patch simplifies this by instead copying the host-side entry and replacing its address with the device-side address looked up through `dlsym`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131516	2022-08-24 10:46:20 -05:00
Joseph Huber	fdbb15355e	[Libomptarget][CUDA] Check CUDA compatibility correctly We recently added support for multi-architecture binaries in libomptarget. This is done by extracting the architecture from the embedded image and comparing it with the major and minor version supported by the current CUDA installation. Previously we just compared these directly, which was not correct for binary compatibility. The CUDA documentation states that we can consider any image with an equivalent major or a greater or equal to minor compatible with the current image. Change the check to use this new logic in the CUDA plugin. Fixes #57049 Reviewed By: jdoerfert, ye-luo Differential Revision: https://reviews.llvm.org/D131567	2022-08-10 11:15:27 -04:00
Fangrui Song	0972a390b9	LLVM_FALLTHROUGH => [[fallthrough]]. NFC	2022-08-09 04:06:52 +00:00
Jon Chesterfield	104f11630a	[nfc][openmp] clang-format system.cpp prior to D131401	2022-08-08 16:24:34 +01:00
Joseph Huber	b3335e8ed7	[Libomptarget][NFC] Clang format the AMDGPU plugin Summary: A previous patch did not format the plugin again after making changes. Ensure that libomptarget stays formatted.	2022-08-03 15:18:16 -04:00
Joseph Huber	2b7203a359	[Libomptarget] Deinitialize AMDGPU global state more intentionally A previous patch made the destruction of the HSA plugin more deterministic. However, there were still other global values that are not handled this way. When attempting to call a destructor kernel, the device would have already been uninitialized and we could not find the appropriate kernel to call. This is because they were stored in global containers that had their destructors called already. Merges this global state into the rest of the info state by putting those global values inside of the global pointer already allocated and deallocated by the constructor and destructor. This should allow the AMDGPU plugin to correctly identify the destructors if we were to run them. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D131011	2022-08-02 18:24:39 -04:00
Jon Chesterfield	ed0f218115	[openmp][amdgpu] Tear down amdgpu plugin accurately Moves DeviceInfo global to heap to accurately control lifetime. Moves calls from libomptarget to deinit_plugin later, plugins need to stay alive until very shortly before libomptarget is destructed. Leaving the deinit_plugin calls where initially inserted hits use after free from the dynamic_module.c offloading test (verified with valgrind that the new location is sound with respect to this) Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D130714	2022-07-28 20:00:03 +01:00
Jon Chesterfield	c214cb6a68	[amdgpu][openmp][nfc] Restore stb_local on DeviceInfo symbol	2022-07-28 16:50:46 +01:00
Jon Chesterfield	75aa521064	[openmp][amdgpu] Move global DeviceInfo behind call syntax prior to using D130712	2022-07-28 16:40:42 +01:00
Jon Chesterfield	1f9d3974e4	[openmp] Introduce optional plugin init/deinit functions Will allow plugins to migrate away from using global variables to manage lifetime, which will fix a segfault discovered in relation to D127432 Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D130712	2022-07-28 16:21:38 +01:00
Saiyedul Islam	4075a811ad	[Libomptarget] Add checks for AMDGPU TargetID using new image info This patch extends the is_valid_binary routine to also check if the binary's target ID matches the one parsed from the system's runtime environment. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for AMDGPU. It also handles compatibility testing of target IDs of the image and the environment. Depends on D127432 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127769	2022-07-26 02:44:31 -05:00
Saiyedul Islam	4cf30c5157	Revert "Revert "Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info""" This reverts commit 281eb9223cf2e9366b5356fafab275abf0ea1d2b.	2022-07-25 11:35:37 -05:00
Saiyedul Islam	281eb9223c	Revert "Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info"" This reverts commit 8cbf4a386b6740180fe48aaebbd1ca9f8ee14367.	2022-07-25 08:32:26 -05:00
Saiyedul Islam	8cbf4a386b	Revert "[Libomptarget] Add checks for AMDGPU TargetID using new image info" This reverts commit 471f2abc62d96b3ef97e13f4f7be2d386fc9f75f.	2022-07-25 05:32:59 -05:00
Saiyedul Islam	471f2abc62	[Libomptarget] Add checks for AMDGPU TargetID using new image info This patch extends the is_valid_binary routine to also check if the binary's target ID matches the one parsed from the system's runtime environment. This should allow us to only use the binary whose compute capability matches, allowing us to support basic multi-architecture binaries for AMDGPU. It also handles compatibility testing of target IDs of the image and the environment. Depends on D127432 Differential Revision: https://reviews.llvm.org/D127769	2022-07-25 04:44:36 -05:00


328 Commits