396 Commits

Author SHA1 Message Date
Sergio Afonso
88163ca79c
[OpenMPIRBuilder] Add support for distribute-parallel-for/do constructs (#127818)
This patch adds codegen for `kmpc_dist_for_static_init` runtime calls,
used to support worksharing a single loop across teams and threads. This
can be used to implement `distribute parallel for/do` support.
2025-02-25 10:30:17 +00:00
Sergio Afonso
cebb8f72b7
[OpenMPIRBuilder] Add support for distribute constructs (#127816)
This patch adds the `OpenMPIRBuilder::createDistribute()` function and
updates `OpenMPIRBuilder::applyStaticWorkshareLoop()` in preparation for
adding `distribute` support to flang.

Co-authored-by: Dominik Adamski <dominik.adamski@amd.com>
2025-02-24 16:18:01 +00:00
Akash Banerjee
785a5b4676
[MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers (#124746)
This patch adds OpenMPToLLVMIRTranslation support for the OpenMP Declare
Mapper directive.

Since both MLIR and Clang now support custom mappers, I've changed the
respective function params to no longer be optional as well.

Depends on #121005
2025-02-18 17:55:48 +00:00
Pranav Bhandarkar
c5ea469f4d
[OMPIRBuilder] - Fix emitTargetTaskProxyFunc to not generate empty functions (#126958)
This is a fix for https://github.com/llvm/llvm-project/issues/126949
There are two issues being fixed here.
First, in some cases, OMPIRBuilder generates empty target task proxy functions. This happens
when the target kernel doesn't use any stack-allocated data (either no data or only globals).

The second problem is encountered when the target task i.e the code that makes the target call
spans a single basic block. This usually happens when we do not generate a target or device kernel
launch and instead fall back to the host. In such cases, we end up not outlining the target task entirely. 
This can cause us to call target kernel twice - once via the target task proxy function and a second time
via the host fallback

This PR fixes both of these problems and updates some tests to catch these problems should this patch fail.
2025-02-17 09:45:06 -06:00
Alex MacLean
a282b6c486
[NVPTX] Convert scalar function nvvm.annotations to attributes (#125908)
Replace some more nvvm.annotations with function attributes,
auto-upgrading the annotations as needed. These new attributes will be
more idiomatic and compile-time efficient than the annotations.

- !"maxclusterrank" / !"cluster_max_blocks" -> "nvvm.maxclusterrank"
- !"minctasm" -> "nvvm.minctasm"
- !"maxnreg" -> "nvvm.maxnreg"
2025-02-12 07:33:22 -08:00
Abid Qadeer
196a1acc7d
[OMPIRBuilder][debug] Fix debug info for variables in target region. (#118314)
When a new function is created to handle OpenMP target region, the
variables used in it are passed as arguments. The scope and the location
of these variable contains still point to te parent function of the
target region. Such variables will fail in Verifier as the scope of the
variables will be different from the containing functions.

Currently, flang is the only user of createOutlinedFunction and it does
not generate any debug data for the the variables in the target region
to avoid this error. When this PR is in, we should be able to remove
this limit in the flang (and anyother client) and have the better debug
experience for the target region.

This PR changes the location and scope of the variables in the target
region to point to correct entities. It is similar to what
fixupDebugInfoPostExtraction does for CodeExtractor. I initially tried
to re-use that function but found quickly that it would require quite a
bit of re-factoring and additions before it could be used. It was much
simpler to make the changes locally.
2025-02-10 18:56:35 +00:00
Nick Sarnie
f3cd223838
[OpenMP][OpenMPIRBuilder] Add initial changes for SPIR-V target frontend support (#125920)
As Intel is working to add support for SPIR-V OpenMP device offloading
in upstream clang/liboffload, we need to modify the OpenMP frontend to
allow SPIR-V as well as generate valid IR for SPIR-V. For example, we
need the frontend to generate code to define and interact with device
globals used in the DeviceRTL.

This is the beginning of what I expect will be (many) other changes, but
let's get started with something simple.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-02-10 16:16:40 +00:00
Joseph Huber
f1e917d07b
[Offload] Unify offloading entries into a single section (#125731)
Summary:
This patch unifies the existing offloading entires into a single section
called `llvm_offload_entires`. This lets us use a more unified
offloading infrastructure so that all targets share the same handling.
The effect is that people in the runtimes now need to check if the kind
is what they expect, but the expectation is that you can combine
multiple potential providers into a compile job. Doesn't fully work
yet because of other runtime issues, but some day. Mostly this helps the
future of liboffload where we want to handle different languages than
OpenMP.
2025-02-06 08:24:01 -06:00
Abid Qadeer
5f7acf7259
[flang][OMPIRbuilder] Set debug loc on terminator created by splitBB. (#125897)
Fixes #125088.

When splitBB is called with createBranch=true, it creates a branch
instruction in the old block. But no debug loc is set on that branch
instruction. If that is used as InsertPoint in the restoreIP, it has the
potential to set the current debug location to null and subsequent
instruction will come out without a debug location. This caused the
verification check to fail as shown in the bug report.

This PR changes splitBB and spliceBB function to also take a debugLoc
parameter which can be used to set the debug location of the branch
instruction.
2025-02-05 22:35:43 +00:00
Kareem Ergawy
205b0bddcd
[OpenMP][IRBuilder] Handle target directives with both if & nowait (#125029)
This fixes a bug when a `target` directive has both an `if` and a
`nowait` clauses. The bug happens because we tried to `emitKernelLaunch`
for `else` branch of the `if` clause.
2025-01-30 18:40:34 +01:00
Joseph Huber
13dcc95dcd
[Offload] Rework offloading entry type to be more generic (#124018)
Summary:
The previous offloading entry type did not fit the current use-cases
very well. This widens it and adds a version to prevent further
annoyances. It also includes the kind to better sort who's using it.

The first 64-bytes are reserved as zero so the OpenMP runtime can detect
the old format for binary compatibilitry.
2025-01-28 07:26:13 -06:00
Kareem Ergawy
83433d9361
[OpenMP][IRBuilder] Handle target ... nowait when codegen targets host (#124720)
Fixes https://github.com/llvm/llvm-project/issues/124578

Handles the `nowait` clause for `omp.target` ops when the actual target
is the host (i.e. there is no target device). Rather than only checking
for the `HasNoWait` boolean, we also check for the presence/absence of a
`DeviceID` value. We only emit the target task if both are present.
2025-01-28 12:57:40 +01:00
Jeremy Morse
34b139594a
[NFC][DebugInfo] Switch more call-sites to using iterator-insertion (#124283)
To finalise the "RemoveDIs" work removing debug intrinsics, we're
updating call sites that insert instructions to use iterators instead.
This set of changes are those where it's not immediately obvious that
just calling getIterator to fetch an iterator is correct, and one or two
places where more than one line needs to change.

Overall the same rule holds though: iterators generated for the start of
a block such as getFirstNonPHIIt need to be passed into insert/move
methods without being unwrapped/rewrapped, everything else can use
getIterator.
2025-01-27 16:44:14 +00:00
Alex MacLean
07ed8187ac
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling
convention is a more idiomatic and compile-time performant than using
the `nvvm.annoation !"kernel"` metadata.

Transition OMPIRBuilder to use calling conventions for PTX kernels and
no longer emit `nvvm.annoation`. Update OpenMPOpt to work with kernels
specified via calling convention as well as metadata. Update OpenMP
tests to use the calling conventions.
2025-01-24 16:56:10 -08:00
Jeremy Morse
8e70273509
[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to moveBefore use iterators.

This patch adds a (guaranteed dereferenceable) iterator-taking
moveBefore, and changes a bunch of call-sites where it's obviously safe
to change to use it by just calling getIterator() on an instruction
pointer. A follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
insertBefore, but not before adding concise documentation of what
considerations are needed (very few).
2025-01-24 10:53:11 +00:00
Mats Jun Larsen
d7c14c8f97
[IR] Replace of PointerType::getUnqual(Type) with opaque version (NFC) (#123909)
Follow up to https://github.com/llvm/llvm-project/issues/123569
2025-01-23 18:23:05 +09:00
NimishMishra
d7e48fbf20
[llvm][OpenMP] Add implicit cast to omp.atomic.read (#114659)
Should the operands of `omp.atomic.read` differ, emit an implicit cast.
In case of `struct` arguments, extract the 0-th index, emit an implicit
cast if required, and store at the destination.

Fixes https://github.com/llvm/llvm-project/issues/112908
2025-01-17 01:40:33 -08:00
Thirumalai Shaktivel
1d890b06ee
[Flang, OpenMP] Add LLVM lowering support for PRIORITY in TASK (#120710)
Implementation details:
The PRIORITY clause is recognized by setting the flags = 32 to the 
`__kmpc_omp_task_alloc` runtime call. Also, store the priority-value 
to the `kmp_task_t` struct member
2025-01-16 10:02:30 +05:30
Sergio Afonso
9bc8828093
[OMPIRBuilder][MLIR] Add support for target 'if' clause (#122478)
This patch implements support for handling the 'if' clause of OpenMP
'target' constructs in the OMPIRBuilder and updates MLIR to LLVM IR
translation of the `omp.target` MLIR operation to make use of this new
feature.
2025-01-15 10:16:19 +00:00
Sergio Afonso
0fb0ac708a
[OMPIRBuilder] Simplify error handling while emitting target calls, NFC (#122477)
The OMPIRBuilder uses `llvm::Error`s to allow callbacks passed to it to
signal errors and prevent OMPIRBuilder functions to continue after one
has been triggered. This means that OMPIRBuilder functions taking
callbacks needs to be able to forward these errors, which must always be
checked.

However, in cases where these functions are called from within the
OMPIRBuilder with callbacks also defined inside of it, it can be known
in advance that no errors will be produced. This is the case of those
defined in `emitTargetCall`.

This patch introduces calls to the `cantFail` function instead of the
previous superfluous checks that still assumed calls wouldn't fail,
making these assumptions more obvious and simplifying their
implementation.
2025-01-14 16:08:38 +00:00
Sergio Afonso
d0b641b7e2
[OMPIRBuilder] Propagate attributes to outlined target regions (#117875)
This patch copies the target-cpu and target-features attributes of
functions containing target regions into the corresponding outlined
function holding the target region.

This mirrors what is currently being done for all other outlined
functions through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
2025-01-14 12:35:50 +00:00
Sergio Afonso
fabc443e93
[OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (#116051)
This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count
values passed to the runtime kernel offloading call.

Additionally, kernel type information is used to influence target device
code generation and the `IsSPMD` flag is replaced by `ExecFlags`, which
provides more granularity.
2025-01-14 12:34:37 +00:00
Sergio Afonso
27bc6bdaba
[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050)
This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs`
structure used to simplify passing default and constant values for
number of teams and threads, and possibly other target kernel-related
information in the future.

This is used to forward values passed to `createTarget` to
`createTargetInit`, which previously used a default unrelated set of
values.
2025-01-14 11:08:55 +00:00
Kareem Ergawy
42da12063f
[flang][OpenMP] Extend delayed privatization for omp.simd (#122156)
Adds support for delayed privatization for `simd` directives. This PR
includes PFT down to LLVM IR lowering.
2025-01-12 07:46:58 +01:00
Sergio Afonso
b79ed8729b
[OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863)
The preprocessor definition used to enable asserts and the one that
`llvm::Error` and `llvm::Expected` use to ensure all created instances are
checked are not the same. By making these checks inside of an `assert` in cases
where errors are not expected, certain build configurations would trigger
runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF
-DLLVM_UNREACHABLE_OPTIMIZE=ON`).

The `llvm::cantFail()` function, which was intended for this use case, is used
by this patch in place of `assert` to prevent these runtime failures. In tests,
new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and
`EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release
builds.
2025-01-09 10:28:16 +00:00
Kaviya Rajendiran
d3eb65f15d
[MLIR][OpenMP] Lowering aligned clause to LLVM IR for SIMD directive (#119536)
This patch,
- Added a translation support for aligned clause in SIMD directive by passing the alignment details to "llvm.assume" intrinsic.
- Updated the insertion point for llvm.assume intrinsic call in "OMPIRBuilder.cpp".
- Added a check in aligned clause MLIR lowering, to ensure that the alignment value must be a power of 2.
2025-01-03 16:22:38 +05:30
Akash Banerjee
aadf606d90 Fix #110001 build error. 2024-12-18 15:55:56 +00:00
Akash Banerjee
6f0e9c4a56
[OpenMP][Clang] Migrate OpenMP UserDefinedMapper from Clang to OMPIRBuilder (#110001)
This patch migrates the OpenMP UserDefinedMapper codegen from Clang to
the OpenMPIRBuilder. I will be adding further patches in the near future
so that OpenMP dialect in MLIR can make use of these.
2024-12-18 15:02:14 +00:00
Kareem Ergawy
f9734b9df1
[mlir][OpenMP] - MLIR to LLVMIR translation support for delayed privatization of allocatables in omp.target ops (#116576)
This PR adds support to translate the `private` clause from MLIR to
LLVMIR when used on allocatables in the context of an `omp.target` op.

This replaces https://github.com/llvm/llvm-project/pull/113208.

Parent PR: https://github.com/llvm/llvm-project/pull/116770. Only the
latest commit is relevant to the PR.
2024-12-12 14:39:58 +01:00
NimishMishra
9eb4056144
[mlir][llvm] Translation support for task detach (#116601)
This PR adds translation support for task detach. Essentially, if the
`detach` clause is present on a task, emit a
`__kmpc_task_allow_completion_event` on it, and store its return (of
type `kmp_event_t*`) into the `event_handle`.
2024-12-08 06:09:52 -08:00
NimishMishra
b9e3a769b9
[flang][mlir][llvm][OpenMP] Add lowering and translation support for mergeable clause on task (#114662)
Add FIR generation and LLVMIR translation support for mergeable clause
on task construct. If mergeable clause is present on a task, the
relevant flag in `ompt_task_flag_t` is set and passed to
`__kmpc_omp_task_alloc`.
2024-11-26 02:40:26 -08:00
smanna12
d936c0cef0
[Clang] [OMPIRBuilder] Prevent Null Pointer Dereference in OpenMP IR Builder (#115506)
This commit addresses Static Analyzer issues related to potential null
dereference by replacing dyn_cast<> with cast<> in OMPIRBuilder.cpp to
ensure that ArgStructType is not null before it is used, improving the
stability and reliability of the code.
2024-11-21 10:35:14 -06:00
Kazu Hirata
36ada1b9b2
[Frontend] Remove unused includes (NFC) (#116927)
Identified with misc-include-cleaner.
2024-11-20 06:52:17 -08:00
Sergio Afonso
d87964de78
[OpenMP][OMPIRBuilder] Error propagation across callbacks (#112533)
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.

This prevents a failed callback to leave the IR in an invalid state and
still continue the codegen process, triggering unrelated assertions or
segmentation faults. In the case of MLIR to LLVM IR translation of the
'omp' dialect, this change results in the compiler emitting errors and
exiting early instead of triggering a crash for not-yet-implemented
errors. The behavior in Clang and openmp-opt stays unchanged, since
callbacks will continue always returning 'success'.
2024-10-25 11:30:16 +01:00
Kareem Ergawy
ad70f3e095
[flang][OpenMP] Support target enter|update|exit .. nowait (#113305)
Extends `nowait` support for other device directives. This PR refactors
the task generation utils used for the `target` directive so that they
are general enough to be reused for other device directives as well.
2024-10-23 10:48:54 +02:00
NimishMishra
645e6f1114
[llvm][OpenMP] Handle complex types in atomic read (#111377)
This patch adds functionality for atomically reading `llvm.struct`
types.

Fixes: https://github.com/llvm/llvm-project/issues/93441
2024-10-22 18:34:22 -07:00
Kareem Ergawy
d0d03805f8
[flang][OpenMP] Support target ... nowait (#111823)
Adds MLIR to LLVM lowering support for `target ... nowait`. This
leverages the already existings code-gen patterns for `task` by treating
`target ... nowait` as `task ... if(1)` and `target` (without `nowait`)
as `task ... if(0)`; similar to what clang does.
2024-10-15 14:39:16 +02:00
NimishMishra
aec87a2143
[llvm][mlir][flang][OpenMP] Emit __atomic_load and __atomic_compare_exchange libcalls for complex types in atomic update (#92364)
This patch adds functionality to emit relevant libcalls in case
atomicrmw instruction can not be emitted (for instance, in case of
complex types). The IRBuilder is modified to directly emit __atomic_load
and __atomic_compare_exchange libcalls. The added functions follow a
similar codegen path as Clang, so that LLVM Flang generates almost
similar IR as Clang.

Fixes https://github.com/llvm/llvm-project/issues/83760 and
https://github.com/llvm/llvm-project/issues/75138

Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>
2024-10-02 23:32:36 -07:00
Youngsuk Kim
d071fdab44
[llvm][OMPIRBuilder] Avoid Type::getPointerTo() (NFC) (#110678)
`Type::getPointerTo()` is to be deprecated & removed soon.
2024-10-01 11:47:08 -04:00
Nikita Popov
ecb98f9fed [IRBuilder] Remove uses of CreateGlobalStringPtr() (NFC)
Since the migration to opaque pointers, CreateGlobalStringPtr()
is the same as CreateGlobalString(). Normalize to the latter.
2024-09-23 16:30:50 +02:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2024-09-06 16:19:20 +01:00
Akash Banerjee
2cf36f0293
[OpenMP]Update use_device_clause lowering (#101707)
This patch updates the use_device_ptr and use_device_addr clauses to use
the mapInfoOps for lowering. This allows all the types that are handle
by the map clauses such as derived types to also be supported by the
use_device_clauses.

This is patch 2/2 in a series of patches.
2024-09-04 12:36:03 +01:00
Abid Qadeer
9e08db796b
[OpenMPIRBuilder] Don't drop debug info for target region. (#80692)
When an outlined function is generated for omp target region, a
corresponding DISubprogram was not being generated. This resulted in all
the debug information for the target region being dropped.

This commit adds DISubprogram for the outlined function if there is one
available for the parent function. It also updates the current debug
location so that the right scope is used for the entries in the outlined
function.

There are places in the OpenMPIRBuilder which changes insertion point but
don't update the debug location accordingly. They cause issue when debug info
is enabled. I have fixed a few that I observed to cause issue. But there may be
more and a systematic cleanup may be required.

With this change in place, I can set source line breakpoint in target
region and run to them in debugger.
2024-09-04 10:16:14 +01:00
Kareem Ergawy
a195e2d461
[MLIR][OpenMP] Handle privatization for global values in MLIR->LLVM translation (#104407)
Potential fix for https://github.com/llvm/llvm-project/issues/102939 and
https://github.com/llvm/llvm-project/issues/102949.

The issues occurs because the CodeExtractor component only collect
inputs (to the parallel regions) that are defined in the same function
in which the parallel regions is present. Howerver, this is problematic
because if we are privatizing a global value (e.g. a `target` variable
which is emitted as a global), then we miss finding that input and we do
not privatize the variable.

This commit attempts to fix the issue by adding a flag to the
CodeExtractor so that we can collect global inputs.
2024-08-26 17:08:24 +02:00
Daniil Fukalov
0da2ba811a
[NFC] Cleanup in ADT and Analysis headers. (#104484)
Remove unused directly includes and forward declarations in ADT and
Analysis headers.
2024-08-17 13:11:18 +02:00
Shilei Tian
0551926fda
[Clang][OMPX] Add the code generation for multi-dim thread_limit clause (#102717) 2024-08-16 13:59:46 -04:00
Kazu Hirata
b57038a611
[OpenMP] Use range-based for loops (NFC) (#103511) 2024-08-14 20:03:45 -07:00
Nikita Popov
dc831e8422 [OMPIRBuilder] Use getAllOnesValue()
Split out from https://github.com/llvm/llvm-project/pull/80309.
2024-08-12 16:37:42 +02:00
Shilei Tian
ee8100ba02
[Clang][OMPX] Add the code generation for multi-dim num_teams (#101407)
This patch adds the code generation support for multi-dim `num_teams`
clause when it is used with `target teams ompx_bare` construct.
2024-08-09 10:33:41 -04:00