llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	b8cbc5c02c	[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401 ) The KernelEnvironment is for compile time information about a kernel. It allows the compiler to feed information to the runtime. The KernelLaunchEnvironment is for dynamic information per kernel launch. It allows the rutime to feed information to the kernel that is not shared with other invocations of the kernel. The first use case is to replace the globals that synchronize teams reductions with per-launch versions. This allows concurrent teams reductions. More uses cases will follow, e.g., per launch memory pools. Fixes: https://github.com/llvm/llvm-project/issues/70249	2023-10-31 19:38:43 -07:00
Shilei Tian	10068cd654	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-26 13:35:14 -04:00
Shilei Tian	6bd74fd65f	Revert commits for kernel environment This reverts commits for kernel environments as they causes issues in AMD BB.	2023-07-23 23:32:31 -04:00
Shilei Tian	c5c8040390	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-23 18:36:01 -04:00
Sergio Afonso	63ca93c7d1	[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes `IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to `-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed to `omp.is_target_device`. Getters and setters of all these renamed properties are also updated accordingly. Many unit tests have been updated to use the new names, but an alias for the `-fopenmp-is-device` option is created so that external programs do not stop working after the name change. `IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the `-fopenmp-is-target-device` compiler frontend option, which is only added to the OpenMP device invocation for offloading-enabled programs. Differential Revision: https://reviews.llvm.org/D154591	2023-07-10 14:14:16 +01:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll'` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Shilei Tian	270f533e8b	[Clang][OpenMP] Update tests using update_cc_test_checks.py Make preparation for other patches Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D144320	2023-02-21 21:09:28 -05:00
Shilei Tian	2ce96784ba	Revert "[Clang][OpenMP] Update tests using update_cc_test_checks.py" This reverts commit 61faf261506f93819e48f050f318f96452831f92 as it makes buildbot unhappy.	2023-02-21 20:28:00 -05:00
Shilei Tian	61faf26150	[Clang][OpenMP] Update tests using update_cc_test_checks.py Make preparation for other patches Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D144320	2023-02-21 20:09:17 -05:00
Johannes Doerfert	90609fb68f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime Differential Revision: https://reviews.llvm.org/D136903	2022-12-13 18:44:19 -08:00
Johannes Doerfert	f9c29878b0	Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime" This reverts commit c1c8cbbf5f29257d084a23a2f6c4236c40b7afb9. One of the tests seems to be flaky/non-deterministic.	2022-12-12 22:08:28 -08:00
Johannes Doerfert	c1c8cbbf5f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime	2022-12-12 20:55:36 -08:00
Joseph Huber	8c1449a84d	[OpenMP] Make kernels have protected visibility This patch changes the kernels generated by OpenMP to have protected visibility. This is unlikely to change anything functionally. However, protected visibility better matches the behaviour of these GPU kernels. We do not expect any pending shared library load to preempt these kernels so we can specify a more restrictive visibility. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D136198	2022-10-18 16:37:28 -05:00
Nikita Popov	a290f3c8fc	[OpenMP] Convert tests to opaque pointers (NFC) Conversion performed using the script at: https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 These are only tests where no manual fixup was required.	2022-10-07 14:58:27 +02:00
Joseph Huber	2d26ecb1fb	[OpenMP] Remove simplified device runtime handling The old device runtime had a "simplified" version that prevented many of the runtime features from being initialized. The old device runtime was deleted in LLVM 14 and is no longer in use. Selectively deactivating features is now done using specific flags rather than the old technique. This patch simply removes the extra logic required for handling the old simple runtime scheme. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D133802	2022-09-14 09:41:50 -05:00
Nikita Popov	532dc62b90	[OpaquePtrs][Clang] Add -no-opaque-pointers to tests (NFC) This adds -no-opaque-pointers to clang tests whose output will change when opaque pointers are enabled by default. This is intended to be part of the migration approach described in https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322/9. The patch has been produced by replacing %clang_cc1 with %clang_cc1 -no-opaque-pointers for tests that fail with opaque pointers enabled. Worth noting that this doesn't cover all tests, there's a remaining ~40 tests not using %clang_cc1 that will need a followup change. Differential Revision: https://reviews.llvm.org/D123115	2022-04-07 12:09:47 +02:00
Joseph Huber	b9f67d44ba	[OpenMP] Replace device kernel linkage with weak_odr Currently the device kernels all have weak linkage to prevent linkage errors on multiple defintions. However, this prevents some optimizations from adequately analyzing them because of the nature of weak linkage. This patch replaces the weak linkage with weak_odr linkage so we can statically assert that multiple declarations of the same kernel will have the same definition. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D122443	2022-03-25 11:29:15 -04:00
Shilei Tian	423d34f74a	[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit` This is a follow-up of D110029, which uses bitset to indicate execution mode. This patches makes the changes in the function call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110279	2021-09-22 17:16:41 -04:00
Johannes Doerfert	e2cfbfcc0c	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 17:53:56 -05:00
Pushpinder Singh	4909cb1a0f	[OpenMP][AMDGPU] Use AMDGPU_KERNEL calling convention for entry function AMDGPU backend requires entry functions/kernels to have AMDGPU_KERNEL calling convention for proper linking. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94060	2021-01-06 02:03:30 -05:00
Jon Chesterfield	76bfbb74d3	[libomptarget][amdgpu] Call into deviceRTL instead of ockl [libomptarget][amdgpu] Call into deviceRTL instead of ockl Amdgpu codegen presently emits a call into ockl. The same functionality is already present in the deviceRTL. Adds an amdgpu specific entry point to avoid the dependency. This lets simple openmp code (specifically, that which doesn't use libm) run without rocm device libraries installed. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D93356	2021-01-04 16:48:47 +00:00
Pushpinder Singh	3a12ff0dac	[OpenMP][RTL] Remove dead code RequiresDataSharing was always 0, resulting dead code in device runtime library. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D88829	2020-10-06 05:43:47 -04:00
Saiyedul Islam	160ff83765	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3 Provides AMDGCN and NVPTX specific specialization of getGPUWarpSize, getGPUThreadID, and getGPUNumThreads methods. Adds tests for AMDGCN codegen for these methods in generic and simd modes. Also changes the precondition in InitTempAlloca to be slightly more permissive. Useful for AMDGCN OpenMP codegen where allocas are created with a cast to an address space. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D84260	2020-08-03 05:38:39 +00:00

24 Commits