llvm-project

Author	SHA1	Message	Date
Jan Patrick Lehr	c7babfa6a3	[Offload] Find libc relative to DeviceRTL path (#118497) This was discussed as a potential solution in https://github.com/llvm/llvm-project/pull/118173	2024-12-03 16:37:57 +01:00
Joseph Huber	91f5f974cb	[OpenMP] Unconditionally provide an RPC client interface for OpenMP (#117933) Summary: This patch adds an RPC interface that lives directly in the OpenMP device runtime. This allows OpenMP to implement custom opcodes. Currently this is only providing the host call interface, which is the raw version of reverse offloading. Previously this lived in `libc/` as an extension which is not the correct place. The interface here uses a weak symbol for the RPC client by the same name that the `libc` interface uses. This means that it will defer to the libc one if both are present so we don't need to set up multiple instances. The presence of this symbol is what controls whether or not we set up the RPC server. Because this is an external symbol it normally won't be optimized out, so there's a special pass in OpenMPOpt that deletes this symbol if it is unused during linking. That means at `O0` the RPC server will always be present now, but will be removed trivially if it's not used at O1 and higher.	2024-12-02 14:31:51 -06:00
Joseph Huber	506ca19dc9	[OpenMP] Remove use of '__AMDGCN_WAVEFRONT_SIZE' (#113156) Summary: This is going to be deprecated in https://github.com/llvm/llvm-project/pull/112849. This patch ports it to use the builtin instead. This isn't a compile constant, so it could slightly negatively affect codegen. There really should be an IR pass to turn it into a constant if the function has known attributes. Using the builtin is correct when we just do it for knowing the size like we do here. Obviously guarding w32/w64 code with this check would be broken.	2024-11-25 07:38:28 -06:00
Matt Arsenault	a6fc489bb7	AMDGPU: Add gfx950 subtarget definitions (#116307) Mostly a stub, but adds some baseline tests and tests for removed instructions.	2024-11-18 10:41:14 -08:00
Carl Ritson	076aac59ac	[AMDGPU] Add a new target for gfx1153 (#113138)	2024-10-23 12:56:58 +09:00
Joseph Huber	e8d2057ca4	[OpenMP] Add critical region lock for NVPTX targets (#110148) Summary: We define this on AMDGCN but not NVPTX, which leads to some failures depending on the target.	2024-09-26 11:33:52 -07:00
Joseph Huber	c3ac3fe825	[OpenMP] Fix redefining `stdint.h` types (#108607) Summary: We can include `stdint.h` just fine as long as we don't allow it to find system headers, passing `-nostdlibinc` and `-nogpuinc` suppresses these extra paths so we will just use the clang resource headers for `stdint.h` and `stddef.h`.	2024-09-13 13:22:44 -05:00
Johannes Doerfert	08533a3ee8	[Offload][NFC] Reorganize `utils::` and make Device/Host/Shared clearer (#100280) We had three `utils::` namespaces, all with different "meaning" (host, device, hsa_utils). We should, when we can, keep "include/Shared" accessible from host and device, thus RefCountTy has been moved to a separate header. `hsa_utils` was introduced to make `utils::` less overloaded. And common functionality was de-duplicated, e.g., `utils::advance` and `utils::advanceVoidPtr` -> `utils::advancePtr`. Type punning now checks for the size of the result to make sure it matches the source type. No functional change was intended.	2024-09-05 13:36:26 -07:00
WÁNG Xuěruì	9adf81182e	[Offload] Fix stray libomptarget message helper calls (#106837) In #92581 the `LibomptargetUtils.cmake` helpers have been removed, but only uses of `libomptarget_say` were migrated. Migrate the remaining few warning and error messages so the `check-offload` target would not fail due to missing `libomptarget_warning_say`. While at it, update the `check-offload` unavailability message to say `check-offload` instead of `check-libomptarget`. Fixes #92581	2024-08-31 07:06:41 -05:00
Ethan Luis McDonough	fde2d23ee2	[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587) (#102691) This pull request is a revised version of #76587. This pull request fixes some build issues that were present in the previous version of this change. > This pull request is the first part of an ongoing effort to extend PGO instrumentation to GPU device code. This PR makes the following changes: > > - Adds blank registration functions to device RTL > - Gives PGO globals protected visibility when targeting a supported GPU > - Handles any addrspace casts for PGO calls > - Implements PGO global extraction in GPU plugins (currently only dumps info) > > These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.	2024-08-22 01:10:54 -05:00
Joseph Huber	74d23f15b6	[OpenMP] Implement 'omp_alloc' on the device (#102526) Summary: The 'omp_alloc' function should be callable from a target region. This patch implements it by simply calling `malloc` for every non-default trait value allocator. All the special access modifiers are unimplemented and return null. The null allocator returns null as the spec states it should not be usable from the target.	2024-08-14 13:38:55 -05:00
Joseph Huber	dbb8b7a0f4	Reapply "[OpenMP][libc] Remove special handling for OpenMP printf (#98940)" This reverts commit fea5914c926e2f013a8b5e27eaa74c7047fb2c71.	2024-07-26 17:21:56 -05:00
Joseph Huber	fea5914c92	Revert "[OpenMP][libc] Remove special handling for OpenMP printf (#98940)" This reverts commit 069e8bcd82c4420239f95c7e6a09e1f756317cfc. Summary: Some tests failing, revert this for now.	2024-07-26 16:39:12 -05:00
Joseph Huber	069e8bcd82	[OpenMP][libc] Remove special handling for OpenMP printf (#98940) Summary: Currently there are several layers to handle `printf`. Since we now have varargs and an implementation of `printf` this can be heavily simplified. 1. The frontend renames `printf` into `omp_vprintf` and gives it an argument buffer. Removing 1. triggered some code in the AMDGPU backend meant for HIP / OpenCL, so I added an exception to it. 2. Forward this to CUDA vprintf or ignore it. We no longer need special handling for it since we have varargs. So now we just forward this to CUDA vprintf if we have libc, otherwise just leave `printf` as an external function and expect that `libc` will be linked in.	2024-07-26 16:03:36 -05:00
Joseph Huber	7ebd97b852	[OpenMP] Do not define '__assert_fail' if we have the GPU libc (#100409) Summary: The C library is intended to provide `__assert_fail`, so in the cases that we have both we should defer to that. This means that if you build the C library for GPUs you'll get the RPC based assert, and if not you'll get the trap based one.	2024-07-26 15:18:10 -05:00
Shilei Tian	41f6599ae1	[NFC][Offload] Move variables to where they are used (#99956)	2024-07-22 19:52:16 -04:00
Joseph Huber	3c50cbfda4	[DeviceRTL] Make defined 'libc' functions weak in OpenMP (#97356) Summary: These functions provide special-case implementations internal to the OpenMP device runtime. This can potentially conflict with the symbols pulled in from the actual GPU `libc`. This patch makes these weak, so in the case that the GPU libc functions exist they will be overridden. This should not impact performance in the average case because the old `-mlink-builtin-bitcode` version does internalization, deleting weak, and the new LTO path will resolve to the strong reference and then internalize it.	2024-07-02 13:23:53 -05:00
Gheorghe-Teodor Bercea	1a478a69bc	[OpenMP][offload] Fix dynamic schedule tracking (#97065) This patch fixes the dynamic schedule tracking.	2024-07-01 10:23:11 -04:00
Ethan Luis McDonough	2c8b912f63	Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587)" This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.	2024-06-28 12:30:45 -05:00
Ethan Luis McDonough	5fd2af38e4	[PGO][OpenMP] Instrumentation for GPU devices (#76587) This pull request is the first part of an ongoing effort to extend PGO instrumentation to GPU device code. This PR makes the following changes: - Adds blank registration functions to device RTL - Gives PGO globals protected visibility when targeting a supported GPU - Handles any addrspace casts for PGO calls - Implements PGO global extraction in GPU plugins (currently only dumps info) These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.	2024-06-28 10:42:19 -05:00
Shilei Tian	1ca0055f45	[AMDGPU] Add a new target gfx1152 (#94534)	2024-06-06 12:16:11 -04:00
Shilei Tian	b448efb8ea	Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)" (#94139)	2024-06-03 11:17:36 -04:00
Shilei Tian	cf9eeb67e5	Revert "Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)"" This reverts commit 7b4865582299294455bc816358fd88a9c6e5e0be.	2024-05-26 01:04:39 -04:00
Shilei Tian	7b48655822	Reapply "[OpenMP][OMPX] Add shfl_down_sync (#93311)" This reverts commit 9b31cc71d66064dfaf2afabf4a835211321bb4a0.	2024-05-26 00:57:50 -04:00
Joseph Huber	9b31cc71d6	Revert "[OpenMP][OMPX] Add shfl_down_sync (#93311)" This reverts commit 098c6dfa8157681699a71fce9e3d94515e66311f. This reverts commit 8c718a3a91df4ab68dc3f1ca3887ea730c9aed84. This reverts commit 4fb02de9d490d0773441aa30124bb4d1272230d3.	2024-05-24 19:07:53 -05:00
Shilei Tian	4fb02de9d4	[OpenMP][OMPX] Add shfl_down_sync (#93311)	2024-05-24 14:00:43 -04:00
Shilei Tian	7eeec8e6d1	[OpenMP][OMPX] Add ballot_sync (#91297) This patch adds the support for `ballot_sync` in ompx.	2024-05-24 09:54:54 -04:00
Joseph Huber	770d928303	[Offload][NFC] Remove 'libomptarget' message helpers (#92581) Summary: This isn't `libomptarget` anymore, and these messages were always unnecessary because no other project uses these prefixed messages. The effect of this is that no longer will the logs have `LIBOMPTARGET --` in front of everything. We have a message stating when we start building the offload project so it'll still be trivial to find.	2024-05-17 13:24:32 -05:00
Joseph Huber	16bb7e89a9	[Offload][NFC] Remove all trailing whitespace from offload/ (#92578) Summary: This patch cleans up the trailing whitespace in a bunch of tests and CMake files. Most just in preparation for other cleanups.	2024-05-17 13:15:04 -05:00
Joseph Huber	c4017cda00	[Offload][NFC] Remove header license in CMake files (#92544) Summary: No other project has these in the CMake itself, and they're wildly inconsistent even within the project. These don't really add anything so I think they should be removed.	2024-05-17 09:05:03 -05:00
Johannes Doerfert	330d8983d2	[Offload] Move `/openmp/libomptarget` to `/offload` (#75125) In a nutshell, this moves our libomptarget code to populate the offload subproject. With this commit, users need to enable the new LLVM/Offload subproject as a runtime in their cmake configuration. No further changes are expected for downstream code. Tests and other components still depend on OpenMP and have also not been renamed. The results below are for a build in which OpenMP and Offload are enabled runtimes. In addition to the pure `git mv`, we needed to adjust some CMake files. Nothing is intended to change semantics. ``` ninja check-offload ``` Works with the X86 and AMDGPU offload tests ``` ninja check-openmp ``` Still works but doesn't build offload tests anymore. ``` ls install/lib ``` Shows all expected libraries, incl. - `libomptarget.devicertl.a` - `libomptarget-nvptx-sm_90.bc` - `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git` - `libomptarget.so` -> `libomptarget.so.18git` Fixes: https://github.com/llvm/llvm-project/issues/75124 --------- Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>	2024-04-22 09:51:33 -07:00

31 Commits