llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	e5a3d5ba88	[OpenMP][NFC] Enable more runtime tests and also run them with O3 The tests run fine on my AMD GPU machine; we should verify them on others too and put them into our regular testing. Not testing O1/2/3 is really bad and not testing all architectures is similarly problematic. Differential Revision: https://reviews.llvm.org/D148576	2023-07-31 15:45:53 -07:00
Joseph Huber	292eca41d9	[Libomptarget] Fix tests after previous patch Summary: The previous patch didn't remove these tests correctly.	2023-01-30 07:18:51 -06:00
Joseph Huber	47166968db	[OpenMP] Deprecate the old driver for OpenMP offloading Recently OpenMP has transitioned to using the "new" driver which primarily merges the device and host linking phases into a single wrapper that handles both at the same time. This replaced a few tools that were only used for OpenMP offloading, such as the `clang-offload-wrapper` and `clang-nvlink-wrapper`. The new driver carries some marked benefits compared to the old driver that is now being deprecated, such as device-side LTO, static library support, and more compatible tooling. As such, we should be able to completely deprecate the old driver, at least for OpenMP. The old driver support will still exist for CUDA and HIP, although both of these can currently be compiled on Linux with `--offload-new-driver` to use the new method. Note that this does not deprecate the `clang-offload-bundler`: although it is unused by OpenMP now, it is still used by the HIP toolchain both as their device binary format and object format. When I proposed deprecating this code I heard some vendors voice concerns about needing to update their code in their fork. They should be able to just revert this commit if it lands. Reviewed By: jdoerfert, MaskRay, ye-luo Differential Revision: https://reviews.llvm.org/D130020	2022-08-26 13:47:09 -05:00
Joseph Huber	d5d836635c	[Libomptarget] Add test config for compiling in LTO-mode We are planning on making LTO the default compilation mode for offloading. In order to make sure it works we should run these tests on the test suite. AMDGPU already uses the LTO compilation path for its linking, but in LTO mode it also links the static library late. Performing LTO requires the static library to be built, if we make the change this will be a hard requirement and the old bitcode library will go away. This means users will need to use either a two-step build or a runtimes build for libomptarget. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D127512	2022-06-14 10:16:03 -04:00
Joseph Huber	ae23be84cb	[OpenMP] Make the new offloading driver the default Previously an opt-in flag `-fopenmp-new-driver` was used to enable the new offloading driver. After passing tests for a few months it should be sufficiently mature to flip the switch and make it the default. The new offloading driver is now enabled if there is OpenMP and OpenMP offloading present and the new `-fno-openmp-new-driver` is not present. The new offloading driver has three main benefits over the old method: - Static library support - Device-side LTO - Unified clang driver stages Depends on D122683 Differential Revision: https://reviews.llvm.org/D122831	2022-04-18 15:05:09 -04:00
Joseph Huber	9582f09690	[Libomptarget] Increase stack size for bug49779 test The 'bug49779.cpp' test has been failing recently. This is because the runtime is sufficiently complex when using nested parallelism without optimizations that the CUDA tools cannot statically determine the stack size. Because of this the kernel can exceed the thread stack size and crash. Work around this using the 'LIBOMPTARGET_STACK_SIZE' environment variable and add an FAQ entry for this situation. Fixes #53670 Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D119357	2022-02-09 15:37:23 -05:00
Giorgis Georgakoudis	a2dbfb6b72	[OpenMP] Simplify offloading parallel call codegen This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D95976	2021-04-21 18:46:07 -07:00

7 Commits