llvm-project

Author	SHA1	Message	Date
Nick Sarnie	008040482b	[clang] Add SPIR-V to some OpenMP clang tests (#133503) Just to get some more coverage. Some of the behavior might be weird and change in the future, but let's lock down what happens today to at least prevent regressions. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2025-04-03 14:36:46 +00:00
Alex MacLean	6c2e170d04	[NVPTX] Convert vector function nvvm.annotations to attributes (#127736) Replace some more nvvm.annotations with function attributes, auto-upgrading the annotations as needed. These new attributes will be more idiomatic and compile-time efficient than the annotations. - !"maxntid[xyz]" -> "nvvm.maxntid" - !"reqntid[xyz]" -> "nvvm.reqntid" - !"cluster_dim_[xyz]" -> "nvvm.cluster_dim"	2025-02-26 08:45:27 -08:00
Johannes Doerfert	3c8efd7928	[OpenMP] Ensure the actual kernel is annotated with launch bounds (#99927) In debug mode there is a wrapper (the kernel) around the function in which we generate the kernel code. We worked around this before to get the correct kernel name, but now we really distinguish both to attach the launch bounds to the kernel, not the inner function.	2024-07-23 09:02:47 -07:00
Joseph Huber	0287a5cc4e	[OpenMP] Remove 'minncta' attributes from NVPTX kernels (#88398) Summary: Currently we treat this attribute as a minimum number for the amount of blocks scheduled on the kernel. However, the documentation states that this applies to CTAs mapped onto a single SM. Currently we just set it to the total number of blocks, which will almost always result in a warning that the value is out of range and will be ignored. We don't have a good way to automatically know how many CTAs can be put on a single SM nor if we should do this, so we should probably leave this up to users manually adding it. https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#performance-tuning-directives-minnctapersm	2024-04-15 15:28:05 -05:00
Johannes Doerfert	b8cbc5c02c	[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401) The KernelEnvironment is for compile time information about a kernel. It allows the compiler to feed information to the runtime. The KernelLaunchEnvironment is for dynamic information per kernel launch. It allows the runtime to feed information to the kernel that is not shared with other invocations of the kernel. The first use case is to replace the globals that synchronize teams reductions with per-launch versions. This allows concurrent teams reductions. More use cases will follow, e.g., per launch memory pools. Fixes: https://github.com/llvm/llvm-project/issues/70249	2023-10-31 19:38:43 -07:00
Johannes Doerfert	0ba57c8bba	[OpenMP] Pass min/max thread and team count to the OMPIRBuilder (#70247) We now provide the information about the min/max thread and team count to the OMPIRBuilder, no matter what the source was. That means we unify `thread_limit`, `num_teams`, `num_threads` handling with the target specific attributes (`__launch_bounds__` and `amdgpu_flat_work_group_size`). This is in preparation to pass the values to the runtime, and to allow the middle-end (OpenMP-opt) to tighten the values if it seems appropriate. There is no "real" change after this commit.	2023-10-26 14:45:07 -07:00
Johannes Doerfert	08a220764b	Reapply "[OpenMP] Add the `ompx_attribute` clause for target directives" This reverts commit 0d12683046ca75fb08e285f4622f2af5c82609dc and reapplies ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2 with an extension to fix the Flang build. Differential Revision: https://reviews.llvm.org/D156184	2023-07-25 10:40:35 -07:00
Aaron Ballman	0d12683046	Revert "[OpenMP] Add the `ompx_attribute` clause for target directives" This reverts commit ef9ec4bbcca2fa4f64df47bc426f1d1c59ea47e2. The changes broke several bots: https://lab.llvm.org/buildbot/#/builders/176/builds/3408 https://lab.llvm.org/buildbot/#/builders/198/builds/4028 https://lab.llvm.org/buildbot/#/builders/197/builds/8491 https://lab.llvm.org/buildbot/#/builders/197/builds/8491	2023-07-25 07:57:36 -04:00
Johannes Doerfert	ef9ec4bbcc	[OpenMP] Add the `ompx_attribute` clause for target directives CUDA and HIP have kernel attributes to tune the code generation (in the backend). To reuse this functionality for OpenMP target regions we introduce the `ompx_attribute` clause that takes these kernel attributes and emits code as if they had been attached to the kernel function (which is implicitly generated). To limit the impact, we only support three kernel attributes: `amdgpu_waves_per_eu`, for AMDGPU; `amdgpu_flat_work_group_size`, for AMDGPU; `launch_bounds`, for NVPTX. The existing implementations of those attributes are used for error checking and code generation. `ompx_attribute` can be attached to any executable target region and it can hold more than one kernel attribute. Differential Revision: https://reviews.llvm.org/D156184	2023-07-24 22:04:45 -07:00

9 Commits