The preprocessor definition used to enable asserts and the one that
`llvm::Error` and `llvm::Expected` use to ensure all created instances are
checked are not the same. By making these checks inside of an `assert` in cases
where errors are not expected, certain build configurations would trigger
runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF
-DLLVM_UNREACHABLE_OPTIMIZE=ON`).
The `llvm::cantFail()` function, which was intended for this use case, is used
by this patch in place of `assert` to prevent these runtime failures. In tests,
new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and
`EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release
builds.
This patch,
- Added a translation support for aligned clause in SIMD directive by passing the alignment details to "llvm.assume" intrinsic.
- Updated the insertion point for llvm.assume intrinsic call in "OMPIRBuilder.cpp".
- Added a check in aligned clause MLIR lowering, to ensure that the alignment value must be a power of 2.
This patch migrates the OpenMP UserDefinedMapper codegen from Clang to
the OpenMPIRBuilder. I will be adding further patches in the near future
so that OpenMP dialect in MLIR can make use of these.
This adds support for the loongarch64 architecture to the offload host
plugin.
Similar to #115773
To fix some test issues, I've had to add the LoongArch64 target to:
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
Reviewed By: jhuber6
Pull Request: https://github.com/llvm/llvm-project/pull/120173
This PR adds translation support for task detach. Essentially, if the
`detach` clause is present on a task, emit a
`__kmpc_task_allow_completion_event` on it, and store its return (of
type `kmp_event_t*`) into the `event_handle`.
Add FIR generation and LLVMIR translation support for mergeable clause
on task construct. If mergeable clause is present on a task, the
relevant flag in `ompt_task_flag_t` is set and passed to
`__kmpc_omp_task_alloc`.
This commit addresses Static Analyzer issues related to potential null
dereference by replacing dyn_cast<> with cast<> in OMPIRBuilder.cpp to
ensure that ArgStructType is not null before it is used, improving the
stability and reliability of the code.
The main issue to solve is that OpenMP modifiers can be specified in any
order, so the parser cannot expect any specific modifier at a given
position. To solve that, define modifier to be a union of all allowable
specific modifiers for a given clause.
Additionally, implement modifier descriptors: for each modifier the
corresponding descriptor contains a set of properties of the modifier
that allow a common set of semantic checks. Start with the syntactic
properties defined in the spec: Required, Unique, Exclusive, Ultimate,
and implement common checks to verify each of them.
OpenMP modifier overhaul: #2/3
Summary:
For our offloading entries, we currently store all the string names of
kernels that the runtime will need to load from the target executable.
These are available via pointer in the `__tgt_offload_entry` struct,
however this makes it difficult to obtain from the object itself. This
patch simply puts the strings in a named section so they can be easily
queried.
The motivation behind this is that when the linker wrapper is doing
linking, it wants to know which kernels the host executable is calling.
We *could* get this already via the `.relaomp_offloading_entires`
section and trawling through the string table, but that's quite annoying
and not portable. The follow-up to this should be to make the linker
wrapper get a list of all used symbols the device link job should count
as "needed" so we can handle static linking more directly.
This patch implements an approach to communicate errors between the
OMPIRBuilder and its users. It introduces `llvm::Error` and
`llvm::Expected` objects to replace the values returned by callbacks
passed to `OMPIRBuilder` codegen functions. These functions then check
the result for errors when callbacks are called and forward them back to
the caller, which has the flexibility to recover, exit cleanly or dump a
stack trace.
This prevents a failed callback to leave the IR in an invalid state and
still continue the codegen process, triggering unrelated assertions or
segmentation faults. In the case of MLIR to LLVM IR translation of the
'omp' dialect, this change results in the compiler emitting errors and
exiting early instead of triggering a crash for not-yet-implemented
errors. The behavior in Clang and openmp-opt stays unchanged, since
callbacks will continue always returning 'success'.
Extends `nowait` support for other device directives. This PR refactors
the task generation utils used for the `target` directive so that they
are general enough to be reused for other device directives as well.
Adds MLIR to LLVM lowering support for `target ... nowait`. This
leverages the already existings code-gen patterns for `task` by treating
`target ... nowait` as `task ... if(1)` and `target` (without `nowait`)
as `task ... if(0)`; similar to what clang does.
This patch adds functionality to emit relevant libcalls in case
atomicrmw instruction can not be emitted (for instance, in case of
complex types). The IRBuilder is modified to directly emit __atomic_load
and __atomic_compare_exchange libcalls. The added functions follow a
similar codegen path as Clang, so that LLVM Flang generates almost
similar IR as Clang.
Fixes https://github.com/llvm/llvm-project/issues/83760 and
https://github.com/llvm/llvm-project/issues/75138
Co-authored-by: Michael Kruse <llvm-project@meinersbur.de>
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
This patch updates the use_device_ptr and use_device_addr clauses to use
the mapInfoOps for lowering. This allows all the types that are handle
by the map clauses such as derived types to also be supported by the
use_device_clauses.
This is patch 2/2 in a series of patches.
When an outlined function is generated for omp target region, a
corresponding DISubprogram was not being generated. This resulted in all
the debug information for the target region being dropped.
This commit adds DISubprogram for the outlined function if there is one
available for the parent function. It also updates the current debug
location so that the right scope is used for the entries in the outlined
function.
There are places in the OpenMPIRBuilder which changes insertion point but
don't update the debug location accordingly. They cause issue when debug info
is enabled. I have fixed a few that I observed to cause issue. But there may be
more and a systematic cleanup may be required.
With this change in place, I can set source line breakpoint in target
region and run to them in debugger.
Potential fix for https://github.com/llvm/llvm-project/issues/102939 and
https://github.com/llvm/llvm-project/issues/102949.
The issues occurs because the CodeExtractor component only collect
inputs (to the parallel regions) that are defined in the same function
in which the parallel regions is present. Howerver, this is problematic
because if we are privatizing a global value (e.g. a `target` variable
which is emitted as a global), then we miss finding that input and we do
not privatize the variable.
This commit attempts to fix the issue by adding a flag to the
CodeExtractor so that we can collect global inputs.
This patch moves utilities from
`offload/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h` to
`llvm/Frontend/Offloading/Utility.h` to be reused by
other projects.
Concretely the following changes were made:
- Rename `KernelMetaDataTy` to `AMDGPUKernelMetaData`.
- Remove unused fields `KernelObject`, `KernelSegmentSize`,
`ExplicitArgumentCount` and `ImplicitArgumentCount` from
`AMDGPUKernelMetaData`.
- Return the produced error if `ELFObj.sections()` failed instead of
using `cantFail`.
- Added `AGPRCount` field to `AMDGPUKernelMetaData`.
- Added a default invalid value to all the fields in
`AMDGPUKernelMetaData`.
* Separate wasi and emscripten as they have different constraints and
abilities
* Emscripten mimics Linux/POSIX by statically linking the musl runtime.
This allow nearly all KMP_OS_LINUX code paths to work correctly. There
are only a few places that need to be adjusted related to dynamic
linking (dl_open)
* Internally link openmp globals
* With CommonLinkage it is needed to emit them in an assembly file, now
they are defined and used within each compilation unit
* With ExternalLinkage they suffer from duplicate symbols during linking
for unnamed globals like reduction/critical
* Interestingly this aligns with the TODO comment above this code
This patch modifies MLIR to LLVM IR lowering of the OpenMP dialect to take into
consideration the contents of the `omp.target_triples` module attribute while
generating code for `omp.target` operations.
It adds the `OpenMPIRBuilderConfig::TargetTriples` field and initializes it
using the `amendOperation` flow of the `OpenMPToLLVMIRTranslation` pass. Some
changes are introduced into the `OpenMPIRBuilder` to allow passing the
information about whether a target region is intended to be offloaded from
outside.
The result of this change is that offloading calls are only generated when the
`--offload-arch` or `-fopenmp-targets` options are given to the compiler.
Otherwise, only the host fallback code is generated. This fixes linker errors
currently triggered by `flang-new` if a source file containing a `target`
construct is compiled without any of the aforementioned options.
Several unit tests impacted by these changes, which are intended to check host
code generated for `omp.target` operations, are updated to contain the new
attribute. Without it, no calls to `__tgt_target_kernel` and associated control
flow operations are generated.
Fixes#100209.
The kernel names for OpenMP are manually mangled and not ideal when we
report something to the user. We demangle them now, providing the
function and line number of the target region, together with the actual
kernel name.
This patch introduces a new interface in `OpenMPIRBuilder` that combines
the creation of the so-called offloading pointer arrays and their
subsequent preparation as arguments to the OpenMP runtime library. We
then use this in Clang.
This is intended to be used in the near future
by other frontends such as Flang when lowering MLIR to LLVMIR.
In debug mode there is a wrapper (the kernel) around the function in
which we generate the kernel code. We worked around this before to get
the correct kernel name, but now we really distinguish both to attach
the launch bounds to the kernel, not the inner function.
This patch handles dependencies specified by the `depend` clause on an
OpenMP target construct. It does this much the same way clang does it by
materializing an OpenMP `task` that is tagged with the dependencies.
The following functions are relevant to this patch -
1) `createTarget` - This function itself is largely unchanged except
that it now accepts a vector of `DependData` objects that it simply
forwards to `emitTargetCall`
2) `emitTargetCall` - This function has changed now to check if an outer
target-task needs to be materialized (i.e if `target` construct has
`nowait` or has `depend` clause). If yes, it calls `emitTargetTask` to
do all the heavy lifting for creating and dispatching the task.
3) `emitTargetTask` - Bulk of the change is here. See the large comment
explaining what it does at the beginning of this function
This patch migrates the CGOpenMPRuntimeGPU::emitReduction and related functions to the OpenMPIRBUilder. In future patches MLIR OpenMP translation would be making use of these functions.
Co-authored-by: Jan Leyonberg <jan.leyonberg@amd.com>
This PR attempts to fix common block mapping for regular mapping of
these types as well as when they have been marked as "declare target
link". This PR should allow correct mapping of both the members of a
common block and the full common block via its block symbol.
The main changes were some adjustments to the Fortran OpenMP lowering to
HLFIR/FIR, the lowering of the LLVM+OpenMP dialect to LLVM-IR and
adjustments to the way the we handle target kernel map argument
rebinding inside of the OMPIRBuilder.
For the Fortran OpenMP lowering were two changes, one to prevent the
implicit capture of common block members when the common block symbol
itself has been marked and the other creates intermediate member access
inside of the target region to be used in-place of those external to the
target region, this prevents external usages breaking the
IsolatedFromAbove pact.
In the latter case, there was an adjustment to the size calculation for
types to better handle cases where we pass an array as the type of a map
(as opposed to the bounds and the type of the element), which occurs in
the case of common blocks. There is also some adjustment to how
handleDeclareTargetMapVar handles renaming of declare target symbols in
the module to the reference pointer, now it will only apply to those
within the kernel that is currently being generated and we also perform
a modification to replace constants with instructions as necessary as we
cannot replace these with our reference pointer (non-constant and
constants do not mix nicely).
In the case of the OpenMPIRBuilder some changes were made to defer
global symbol rebinding to kernel arguments until all other arguments
have been rebound. This makes sure we do not replace uses that may refer
to the global (e.g. a GEP) but are themselves actually a separate
argument that needs bound.
Currently "declare target to" still needs some work, but this may be the
case for all types in conjunction with "declare target to" at the
moment.
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.
This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
Move PassInstrumentationAnalysis into PassInstrumentation.h and stop
including it in PassManager.h (effectively inverting the direction of
the dependency).
Most places using PassManager are not interested in PassInstrumentation,
and we no longer have any uses of it in PassManager.h itself (only in
PassManagerImpl.h).