8 Commits

Author SHA1 Message Date
theRonShark
9b61ff210f
Revert "[OpenMP] Move OpenMP implicit argument to the end and reformat" (#186309)
Reverts llvm/llvm-project#185989
2026-03-13 05:20:40 +00:00
Joseph Huber
4376fbd793
[OpenMP] Move OpenMP implicit argument to the end and reformat (#185989)
Summary:
We use this `dyn_ptr` argument in Clang/OpenMP to handle the
`KernelLaunchEnvironment`. This is a per-kernel argument used to share
some information. Currenetly, it's prepended to the argument list and we
generate storage for it in the runtime.

This is bad for a few reasons:
1. It changes the ABI by shifting user arguments
2. It cannot be trivially be left uninitialized if unused
3. The runtime must allocate its own memory for it

This PR changes it to be appended instead. Additionally, space for this
is always emitted. This means the OMPIRBuilder itself will provide the
storage, we simply need to populate it in the runtime if it is used.
This means that if it's unused we don't always pay the cost and it's
easier for non-OpenMP users to ignore it.

Backward compatibility is maintained by auto-upgrading the kernel
arguments. In `libomptarget` we completely allocate a new buffer to
store this in the new format. The plugins still need to respect the old
ABI of the called device object, so we simply rotate it if it's the old
version.
2026-03-12 18:08:22 -05:00
Hari Limaye
94473f4db6
[IRBuilder] Generate nuw GEPs for struct member accesses (#99538)
Generate nuw GEPs for struct member accesses, as inbounds + non-negative
implies nuw.

Regression tests are updated using update scripts where possible, and by
find + replace where not.
2024-08-09 13:25:04 +01:00
dhruvachak
b5d02bbd0d
[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. (#85363)
A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.

If approved, this patch should be backported to release 18.
2024-03-19 16:40:22 -07:00
Joseph Huber
cc374d8056
[OpenMP] Remove register_requires global constructor (#80460)
Summary:
Currently, OpenMP handles the `omp requires` clause by emitting a global
constructor into the runtime for every translation unit that requires
it. However, this is not a great solution because it prevents us from
having a defined order in which the runtime is accessed and used.

This patch changes the approach to no longer use global constructors,
but to instead group the flag with the other offloading entires that we
already handle. This has the effect of still registering each flag per
requires TU, but now we have a single constructor that handles
everything.

This function removes support for the old `__tgt_register_requires` and
replaces it with a warning message. We just had a recent release, and
the OpenMP policy for the past four releases since we switched to LLVM
is that we do not provide strict backwards compatibility between major
LLVM releases now that the library is versioned. This means that a user
will need to recompile if they have an old binary that relied on
`register_requires` having the old behavior. It is important that we
actively deprecate this, as otherwise it would not solve the problem of
having no defined init and shutdown order for `libomptarget`. The
problem of `libomptarget` not having a define init and shutdown order
cascades into a lot of other issues so I have a strong incentive to be
rid of it.

It is worth noting that the current `__tgt_offload_entry` only has space
for a 32-bit integer here. I am planning to overhaul these at some point
as well.
2024-02-21 11:33:32 -06:00
Johannes Doerfert
b8cbc5c02c
[OpenMP] Introduce the KernelLaunchEnvironment as implicit argument (#70401)
The KernelEnvironment is for compile time information about a kernel. It
allows the compiler to feed information to the runtime. The
KernelLaunchEnvironment is for dynamic information *per* kernel launch.
It allows the rutime to feed information to the kernel that is not
shared with other invocations of the kernel. The first use case is to
replace the globals that synchronize teams reductions with per-launch
versions. This allows concurrent teams reductions. More uses cases will
follow, e.g., per launch memory pools.

Fixes: https://github.com/llvm/llvm-project/issues/70249
2023-10-31 19:38:43 -07:00
Sergio Afonso
63ca93c7d1
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.

`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.

Differential Revision: https://reviews.llvm.org/D154591
2023-07-10 14:14:16 +01:00
Dave Pagan
eb61bde829 [OpenMP][CodeGen] Add codegen for combined 'loop' directives.
The loop directive is a descriptive construct which allows the compiler
flexibility in how it generates code for the directive's associated
loop(s). See OpenMP specification 5.2 [257:8-9].

Codegen added in this patch for the combined 'loop' directives are:

'target teams loop'     -> 'target teams distribute parallel for'
'teams loop'            -> 'teams distribute parallel for'
'target parallel loop'  -> 'target parallel for'
'parallel loop'         -> 'parallel for'

NOTE: The implementation of the 'loop' directive itself is unchanged.

Differential Revision: https://reviews.llvm.org/D145823
2023-07-05 12:31:59 -05:00