llvm-project

Author	SHA1	Message	Date
Johannes Doerfert	cd3a4c31bc	[Attributor][NFC] update tests (#91011)	2024-05-03 16:38:55 -07:00
Johannes Doerfert	3de645efe3	[OpenMP][NFC] Split the reduction buffer size into two components Before, we tracked the size of the teams reduction buffer in order to allocate it at runtime per kernel launch. This patch splits the number into two parts, the size of the reduction data (=all reduction variables) and the (maximal) length of the buffer. This will allow us to allocate less if we need less, e.g., if we have fewer teams than the maximal length. It also allows us to move code from clang's codegen into the runtime as we now know how large the reduction data is.	2023-11-06 11:50:41 -08:00
Johannes Doerfert	b8cbc5c02c	[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401) The KernelEnvironment is for compile time information about a kernel. It allows the compiler to feed information to the runtime. The KernelLaunchEnvironment is for dynamic information per kernel launch. It allows the runtime to feed information to the kernel that is not shared with other invocations of the kernel. The first use case is to replace the globals that synchronize teams reductions with per-launch versions. This allows concurrent teams reductions. More use cases will follow, e.g., per launch memory pools. Fixes: https://github.com/llvm/llvm-project/issues/70249	2023-10-31 19:38:43 -07:00
Mehdi Amini	f390a76b7e	Revert "Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)"" This reverts commit ddbaa11e9f43a38d50d62a9b9b07c3653b6bf8ab. Reapply the original commit; the broken test was repaired in 5e51363f38d083ab326736c0d4d1b5f9fe0de080 in the meantime.	2023-10-26 17:30:01 -07:00
Mehdi Amini	ddbaa11e9f	Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)" This reverts commit c2a1249a8257ed033a98e32e425539c6da6700ec. The MLIR bots are broken with an omp test failure.	2023-10-26 17:25:20 -07:00
Johannes Doerfert	c2a1249a82	[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257) The runtime needs to know about the acceptable launch bounds, especially if the compiler (middle- or backend) assumed those bounds. While this patch does not yet inform the runtime, it stores the bounds in a place that can/will be accessed and is associated with the kernel.	2023-10-26 14:46:55 -07:00
Shilei Tian	10068cd654	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depends on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-26 13:35:14 -04:00
Shilei Tian	6bd74fd65f	Revert commits for kernel environment This reverts commits for kernel environments as they cause issues in AMD BB.	2023-07-23 23:32:31 -04:00
Shilei Tian	c5c8040390	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depends on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-23 18:36:01 -04:00
Johannes Doerfert	8f4fadd1b4	[OpenMP] Use "kernel" attribute consistently	2023-06-05 16:33:53 -07:00
Shilei Tian	d4ecd1241c	Revert "[OpenMP] Introduce kernel environment" This reverts commit 35cfadfbe2decd9633560b3046fa6c17523b2fa9. It makes a couple of buildbots unhappy because of the following test failures: - `Transforms/OpenMP/add_attributes.ll` - `mapping/declare_mapper_target_data.cpp` on AMDGPU	2023-04-22 20:56:35 -04:00
Shilei Tian	35cfadfbe2	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not on a per-kernel basis, preventing us from applying per-kernel optimization in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-04-22 20:46:38 -04:00
Ishaan Gandhi	aead502b11	[Attributor] Add convergent abstract attribute This patch adds the AANonConvergent abstract attribute. It removes the convergent attribute from functions that only call non-convergent functions. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D143228	2023-03-20 22:33:50 -07:00
Johannes Doerfert	56be9123ca	[Attributor][OpenMP][NFC] Cleanup tests via update script	2023-01-09 16:40:20 -08:00
Nikita Popov	aa8e9fac2a	[OpenMP] Convert some tests to opaque pointers (NFC)	2023-01-03 15:03:14 +01:00
Johannes Doerfert	23333bb6b7	[NFC] Rerun update test checks on Attributor and OpenMP-Opt tests	2022-12-13 18:44:19 -08:00
Johannes Doerfert	90609fb68f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime Differential Revision: https://reviews.llvm.org/D136903	2022-12-13 18:44:19 -08:00
Johannes Doerfert	f9c29878b0	Revert "[OpenMP][NFCI] Remove effectively dead code in clang and the runtime" This reverts commit c1c8cbbf5f29257d084a23a2f6c4236c40b7afb9. One of the tests seems to be flaky/non-deterministic.	2022-12-12 22:08:28 -08:00
Johannes Doerfert	c1c8cbbf5f	[OpenMP][NFCI] Remove effectively dead code in clang and the runtime	2022-12-12 20:55:36 -08:00
Doru Bercea	0b1160fdeb	Fix OpenMP Opt for target without a parallel region. Remove ctx redeclaration. Format code. Remove parallel check. Modify tests. Clean-up code. Fix another test. Move code to helper functions. Format file. Minor fixes.	2022-09-06 16:04:53 +00:00
Johannes Doerfert	d61aac76bf	[OpenMP][FIX] Do not signal SPMD-mode but then keep generic-mode If we assume SPMD-mode during the fixpoint iteration we have to execute the kernel in SPMD-mode. If we change our mind during manifest there is the chance of a mismatch between the simplification, e.g., of `__kmpc_is_spmd_exec_mode` calls, and the execution mode. This problem was introduced in D109438. This patch is a compromise to resolve the problem purely in OpenMP-opt while trying to keep the benefits of D109438 around. This might not always work, see `get_hardware_num_threads_in_block_fold`, but it often does. At the same time we do keep value specialization and execution mode in sync. Proper solutions to this problem should be considered. I believe a new execution mode is the easiest way forward (Singleton-SPMD). Alternatively, SPMD-mode execution can be used with a way to provide a new thread_limit (here 1) to the runtime. This is more general and could be useful if we see `num_threads` clauses or workshared loops with small trip counts in the kernel. In either proposal we need to disable the guarding for the kernel (which was the motivation for D109438). Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D112894	2021-11-02 23:22:04 -05:00
Shilei Tian	423d34f74a	[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit` This is a follow-up of D110029, which uses a bitset to indicate execution mode. This patch makes the changes in the function call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110279	2021-09-22 17:16:41 -04:00
Joseph Huber	6b9a3ec3a2	[OpenMP] Do not SPMDize generic regions with no parallel This patch changes SPMDization to not trigger for regions with no parallelism. Otherwise, this will introduce unnecessary barriers that will slow the single-threaded region down. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D109438	2021-09-08 14:33:15 -04:00
Joseph Huber	29a74a3915	[OpenMP] Add an option to always inline OpenMP device functions. Performance on GPU targets can be highly variable: sometimes inlining everything hurts performance and sometimes it greatly improves it. Add an option to toggle this behaviour to better investigate it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D109014	2021-08-31 18:48:30 -04:00

24 Commits