11 Commits

Author SHA1 Message Date
Joseph Huber
154a128c65 Reapply "[OpenMP] Move OpenMP implicit argument to the end and reformat" (#186309)
Should be working downstream now
This reverts commit 9b61ff210fdff752d5db55b128474e9990258488.
2026-03-13 15:48:37 -05:00
theRonShark
9b61ff210f
Revert "[OpenMP] Move OpenMP implicit argument to the end and reformat" (#186309)
Reverts llvm/llvm-project#185989
2026-03-13 05:20:40 +00:00
Joseph Huber
4376fbd793
[OpenMP] Move OpenMP implicit argument to the end and reformat (#185989)
Summary:
We use this `dyn_ptr` argument in Clang/OpenMP to handle the
`KernelLaunchEnvironment`. This is a per-kernel argument used to share
some information. Currenetly, it's prepended to the argument list and we
generate storage for it in the runtime.

This is bad for a few reasons:
1. It changes the ABI by shifting user arguments
2. It cannot be trivially be left uninitialized if unused
3. The runtime must allocate its own memory for it

This PR changes it to be appended instead. Additionally, space for this
is always emitted. This means the OMPIRBuilder itself will provide the
storage, we simply need to populate it in the runtime if it is used.
This means that if it's unused we don't always pay the cost and it's
easier for non-OpenMP users to ignore it.

Backward compatibility is maintained by auto-upgrading the kernel
arguments. In `libomptarget` we completely allocate a new buffer to
store this in the new format. The plugins still need to respect the old
ABI of the called device object, so we simply rotate it if it's the old
version.
2026-03-12 18:08:22 -05:00
Joseph Huber
cc374d8056
[OpenMP] Remove register_requires global constructor (#80460)
Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.

This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entires that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.

This function removes support for the old `__tgt_register_requires` and
replaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
`register_requires` having the old behavior. It is important that we
actively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for `libomptarget`. The
problem of `libomptarget` not having a define init and shutdown order
cascades into a lot of other issues so I have a strong incentive to be
rid of it.

It is worth noting that the current `__tgt_offload_entry` only has space
for a 32-bit integer here. I am planning to overhaul these at some point
as well.
2024-02-21 11:33:32 -06:00
Johannes Doerfert
b8cbc5c02c
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.

Fixes: https://github.com/llvm/llvm-project/issues/70249
2023-10-31 19:38:43 -07:00
Aaron Ballman
84a3aadf0f Diagnose use of VLAs in C++ by default
Reapplication of 7339c0f782d5c70e0928f8991b0c05338a90c84c with a fix
for a crash involving arrays without a size expression.

Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they're expressed in C have been considered by WG21
and rejected, are easy to use accidentally to the surprise of users
(e.g., https://ddanilov.me/default-non-standard-features/), and they
have potential security implications beyond constant-size arrays
(https://wiki.sei.cmu.edu/confluence/display/c/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range).
C++ users should strongly consider using other functionality such as
std::vector instead.

This seems like sufficiently compelling evidence to warn users about
VLA use by default in C++ modes. This patch enables the -Wvla-extension
diagnostic group in C++ language modes by default, and adds the warning
group to -Wall in GNU++ language modes. The warning is still opt-in in
C language modes, where support for VLAs is somewhat less surprising to
users.

RFC: https://discourse.llvm.org/t/rfc-diagnosing-use-of-vlas-in-c/73109
Fixes https://github.com/llvm/llvm-project/issues/62836
Differential Revision: https://reviews.llvm.org/D156565
2023-10-20 13:10:03 -04:00
Aaron Ballman
f5043f46c0 Revert "Diagnose use of VLAs in C++ by default"
This reverts commit 7339c0f782d5c70e0928f8991b0c05338a90c84c.

Breaks bots:
https://lab.llvm.org/buildbot/#/builders/139/builds/51875
https://lab.llvm.org/buildbot/#/builders/164/builds/45262
2023-10-20 10:00:18 -04:00
Aaron Ballman
7339c0f782 Diagnose use of VLAs in C++ by default
Clang supports VLAs in C++ as an extension, but we currently only warn
on their use when you pass -Wvla, -Wvla-extension, or -pedantic.
However, VLAs as they're expressed in C have been considered by WG21
and rejected, are easy to use accidentally to the surprise of users
(e.g., https://ddanilov.me/default-non-standard-features/), and they
have potential security implications beyond constant-size arrays
(https://wiki.sei.cmu.edu/confluence/display/c/ARR32-C.+Ensure+size+arguments+for+variable+length+arrays+are+in+a+valid+range).
C++ users should strongly consider using other functionality such as
std::vector instead.

This seems like sufficiently compelling evidence to warn users about
VLA use by default in C++ modes. This patch enables the -Wvla-extension
diagnostic group in C++ language modes by default, and adds the warning
group to -Wall in GNU++ language modes. The warning is still opt-in in
C language modes, where support for VLAs is somewhat less surprising to
users.

RFC: https://discourse.llvm.org/t/rfc-diagnosing-use-of-vlas-in-c/73109
Fixes https://github.com/llvm/llvm-project/issues/62836
Differential Revision: https://reviews.llvm.org/D156565
2023-10-20 09:50:21 -04:00
Johannes Doerfert
c5488c8dcc [OpenMP] Properly set static thread limit (w/o analysis)
We used to have two separate implementations to derive the number of
threads used in a target region. This lead us to sometimes miss out on
user provided thread bounds (num_threads, or thread_limit) when we
looked for "constant default values". If we might miss out on the
presence of those bounds, we cannot set the thread_limit statically
since the runtime will try to honor user input rather than cap it at the
"preferred default". This patch replaces the secondary implementation
with the primary in a mode that will not emit code but just look for the
presence, and potentially upper bounds, of thread limiting clauses.

The runtime test would not pass without this rewrite as we missed some
clauses, set the static limit on the device to the preferred value, but
then violated that value at runtime.

Fixes: https://github.com/llvm/llvm-project/issues/64845

Differential Revision: https://reviews.llvm.org/D158381
2023-08-23 11:12:03 -07:00
Sergio Afonso
63ca93c7d1
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.

`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.

Differential Revision: https://reviews.llvm.org/D154591
2023-07-10 14:14:16 +01:00
Dave Pagan
eb61bde829 [OpenMP][CodeGen] Add codegen for combined 'loop' directives.
The loop directive is a descriptive construct which allows the compiler
flexibility in how it generates code for the directive's associated
loop(s). See OpenMP specification 5.2 [257:8-9].

Codegen added in this patch for the combined 'loop' directives are:

'target teams loop'     -> 'target teams distribute parallel for'
'teams loop'            -> 'teams distribute parallel for'
'target parallel loop'  -> 'target parallel for'
'parallel loop'         -> 'parallel for'

NOTE: The implementation of the 'loop' directive itself is unchanged.

Differential Revision: https://reviews.llvm.org/D145823
2023-07-05 12:31:59 -05:00