390 Commits

Author SHA1 Message Date
Mehdi Amini
f390a76b7e Revert "Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)""
This reverts commit ddbaa11e9f43a38d50d62a9b9b07c3653b6bf8ab.

Reapply the original commit, the broken test was repaired in 5e51363f38d083ab326736c0d4d1b5f9fe0de080 in the meantime.
2023-10-26 17:30:01 -07:00
Mehdi Amini
ddbaa11e9f Revert "[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)"
This reverts commit c2a1249a8257ed033a98e32e425539c6da6700ec.

The MLIR bots are broken with an omp test failure.
2023-10-26 17:25:20 -07:00
Johannes Doerfert
c2a1249a82
[OpenMP][NFC] Add min/max threads/teams count into the KernelEnvironment (#70257)
The runtime needs to know about the acceptable launch bounds, especially
if the compiler (middle- or backend) assumed those bounds. While this
patch does not yet inform the runtime, it stores the bounds in a place
that can/will be accessed and is associated with the kernel.
2023-10-26 14:46:55 -07:00
Johannes Doerfert
0ba57c8bba
[OpenMP] Pass min/max thread and team count to the OMPIRBuilder (#70247)
We now provide the information about the min/max thread and team count
from to the OMPIRBuilder, no matter what the source was. That means we
unify `thread_limit`, `num_teams`, `num_threads` handling with the
target specific attriutes (`__launch_bounds__` and
`amdgpu_flat_work_group_size`). This is in preparation to pass the
values to the runtime, and to allow the middle-end (OpenMP-opt) to
tighten the values if it seems appropriate. There is no "real" change
after this commit.
2023-10-26 14:45:07 -07:00
Joseph Huber
078ae8cd64
[Offloading][NFC] Move creation of offloading entries from OpenMP (#70116)
Summary:
This patch is a first step to remove dependencies on the OpenMPIRBuilder
for creating generic offloading entries. This patch changes no
functionality and merely moves the code around. In the future the
interface will be changed to allow for more code re-use in the
registration and creation of offloading entries as well as a more
generic interface for CUDA, HIP, OpenMP, and SYCL(?). Doing this as a
first step to reduce the noise involved in the functional changes.
2023-10-25 09:25:43 -04:00
Shraiysh
9922aadf9e
[OpenMPIRBuilder] Added if clause for teams (#69139)
This patch adds support for the `if` clause on `teams` construct. The
value of the argument must be an integer value. If the value evaluates
to true (non-zero) integer, then the number of threads is determined by
`num_threads` clause (or default and ICV if `num_threads` is absent).
When the condition evaluates to false (zero), then the bounds are set to
1. ([OpenMP 5.2 Section
10.2](https://www.openmp.org/spec-html/5.2/openmpse58.html))

This essentially means that
```
upperbound = ifexpr ? upperbound : 1
lowerbound = ifexpr ? lowerbound : 1
```
2023-10-17 15:00:39 -05:00
Shraiysh
e41eaf4896
[OpenMPIRBuilder] Add ThreadLimit and NumTeams clauses to teams construct (#68364)
This patch adds support for `thread_limit` and bounds on `num_teams`
clause for the teams construct in OpenMP.

Added testcases for the same.
2023-10-11 10:36:03 -05:00
Shraiysh
9050b27bd5
[OpenMPIRBuilder] Remove wrapper function in createTask, createTeams (#67723)
This patch removes the wrapper function in `OpenMPIRBuilder::createTask`
and `OpenMPIRBuilder.createTeams`. The outlined function is directly of
the form that is expected by the runtime library calls. This patch also
adds a utility function to help add fake values and their uses, which
will be deleted in finalization callbacks.

**Why we needed wrappers earlier?**
Before the post outline callbacks are executed, the IR has the following
structure:
```
define @func() {
  ;...
  call void @outlined_fn(ptr %data)
  ;...
}
define void @outlined_fn(ptr %data)
```
OpenMP offloading expects a specific signature for the outlined function
in a runtime call. For example, `__kmpc_fork_teams` expects the
following signature:
```
define @outlined_fn(ptr %global.tid, ptr %data)
```
As there is no way to change a function's arguments after it has been
created, a wrapper function with the expected signature is created that
calls the outlined function inside it.

**How we are handling it now?**
To handle this in the current patch, we create a "fake" global tid and
add a "fake" use for it in the to-be-outlined region. We need to create
these fake values so the outliner sees it as something it needs to pass
to the outlined function. We also tell the outliner to exclude this
global tid value from the aggregate `data` argument, so it comes as a
separate argument in the beginning. This way, we are able to directly
get the outlined function in the expected format. This is inspired by
the way `createParallel` handles outlining (using fake values and then
deleting them later). Tasks are handled with a similar approach. This
simplifies the generated code and the code to do this itself also
becomes simpler (because we no longer have to construct a new function).
2023-10-09 09:20:31 -04:00
agozillon
2a1f1b5fde
[OpenMP][OpenMPIRBuilder] Move copyInput to a passed in lambda function and re-order kernel argument load/stores (#68124)
This patch moves the existing copyInput function
into a lambda argument that can be defined
by a caller to the function.

This allows more flexibility in how the function
is defined, allowing Clang and MLIR to utilise
their own respective functions and types inside
of the lamba without affecting the OMPIRBuilder
itself.

The idea is to eventually replace/build on
the existing copyInput function that's used
and moved into OpenMPToLLVMIRTranslation.cpp
to a slightly more complex implementation
that uses MLIRs map information (primarily
ByRef and ByCapture information at the
moment).

The patch also moves kernel load stores to the top
of the kernel, prior to the first openmp runtime 
invocation. Just makes the IR a little closer to Clang.
2023-10-06 16:47:27 +02:00
agozillon
bc0c1783fd
[Clang][OpenMP][OMPIRBuilder] Move Clang's OpenMP Member/MemberOf flag helpers into the OMPIRBuilder (#67844)
This patch seeks to move the following functions to the OMPIRBuilder:
 - getFlagMemberOffset
 - getMemberOfFlag
 - setCorrectMemberOfFlag

These small helper functions help set the end bits of the
OpenMPOffloadMappingFlags flag that correspond to the reserved segment
for OMP_MAP_MEMBER_OF.

They will be of use in the future for lowering MLIR types/values that
can contian members and can be lowered similarly to a structure or class
type within the OpenMPToLLVMIRTranslation step of the OpenMP dialects
lowering to LLVM-IR. In particular for Flang which currently uses this
flow. Types with descriptors like pointers/allocatables, and likely
derived types in certain cases can be lowered as if they were structures
with explicitly mapped members.
2023-10-03 15:20:44 +02:00
Benjamin Kramer
4731623f81 [LoopUnroll] Fold variable only used in assert into the assert
Avoids warnings in Release builds. NFCI.
2023-09-27 13:19:18 +02:00
Nikita Popov
18be23f82a [OPMIRBuilder] Fix typo in condition
Fix a condition I accidentally inverted in
296671f059198185bedc3eba80056ebf4bc7ad30.
2023-09-27 12:24:02 +02:00
Nikita Popov
296671f059 [LoopUnroll] Store more information in UnrollCostEstimator (NFCI)
Instead of having ApproximateLoopSize() use a bunch of out parameters,
from which we later construct an UnrollCostEstimator, directly
construct UnrollCostEstimator which holds all the information
derived from loop analysis. This makes it easier to add additional
metrics in the future.
2023-09-27 12:06:13 +02:00
Shraiysh
8d17875acb
[OMPIRBuilder] Added createTeams (#66807)
This patch adds basic support for `omp teams` to the OpenMPIRBuilder.

The outlined function after code extraction is called from a wrapper
function with appropriate arguments. This wrapper function is passed to
the runtime calls.

This approach is different from the Clang approach - clang directly
emits the runtime call to the outlined function. The outlining utility
(OutlineInfo) simply outlines the code and generates a function call to
the outlined function. After the function has been generated by the
outlining utility, there is no easy way to alter the function arguments
without meddling with the outlining itself. Hence the wrapper function
approach is taken.
2023-09-24 16:23:43 -05:00
Prabhdeep Singh Soni
9b57b167bb [OMPIRBuilder] Fix shared clause for task construct
This patch fixes the shared clause for the task construct with multiple
shared variables. The shareds field in the kmp_task_t is not an inline
array in the struct, rather it is a pointer to an array. With an inline
array, the pointer dereference to the outlined function body of the task
would segmentation fault when accessed by the runtime.

Reviewed By: kiranchandramohan, jdoerfert

Differential Revision: https://reviews.llvm.org/D158462
2023-09-15 12:19:47 -04:00
Arthur Eubanks
0a1aa6cda2
[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295)
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.

This matches other nearby enums.

For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
2023-09-14 14:10:14 -07:00
Sergio Afonso
9058762789
[OpenMP][Flang][MLIR] Lowering of requires directive from MLIR to LLVM IR
Default atomic ordering information is processed in the OpenMP dialect
to LLVM IR lowering stage at every spot where an operation can be
affected by it. The rest of clauses are stored globally in the
OpenMPIRBuilderConfig object before starting that lowering stage, so
that the OMPIRBuilder can conditionally modify code generation
depending on these. At the end of the process, the omp.requires
attribute is itself lowered into a global constructor that passes these
clauses as flags to the OpenMP runtime.

Depends on D147217, D147218 and D158278.

Differential Revision: https://reviews.llvm.org/D147219
2023-09-14 10:35:44 +01:00
Sergio Afonso
094a63a20b
[OpenMP][OMPIRBuilder] OpenMPIRBuilder support for requires directive
This patch updates the `OpenMPIRBuilderConfig` structure to hold all
available 'requires' clauses, and it replicates part of the code
generation for the 'requires' registration function from clang in the
`OMPIRBuilder`, to be used with flang.

Porting the rest of features of the clang implementation to the IRBuilder
and sharing it between clang and flang remains for a future patch, due to the
complexity of the logic selecting the attributes of the generated
registration function.

Differential Revision: https://reviews.llvm.org/D147217
2023-09-14 10:33:54 +01:00
Shraiysh
8929f38320
[nfc][OpenMPIRBuilder] Formatting OMPIRBuilder.cpp and OMPIRBuilder.h (#65772) 2023-09-08 11:54:52 -05:00
Jeremy Morse
4427407a29 [NFC][RemoveDIs] Create a new spelling of the moveBefore method
As outlined in my proposal of how to get rid of debug intrinsics, this
patch adds a moveBefore method that signals the caller /intends/ the order
of moved instructions is to stay the same. This semantic difference has an
effect on debug-info, as it signals whether debug-info needs to move with
instructions or not.

The patch just replaces a few calls to moveBefore with calls to
moveBeforePreserving -- and the latter just calls the former, so it's all
NFC right now. A future patch will add an implementation of
moveBeforePreserving that takes action to correctly preserve debug-info,
but that's tightly coupled with our non-instruction debug-info
representation that's still being reviewed.

[0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939

Differential Revision: https://reviews.llvm.org/D156369
2023-09-07 18:37:57 +01:00
Joseph Huber
9da61aed75 [OpenMP] Emit offloading entries for indirect target variables
OpenMP 5.1 allows emission of the `indirect` clause on declare target
functions, see https://www.openmp.org/spec-html/5.1/openmpsu70.html#x98-1080002.14.7.
The intended use of this is to permit calling device functions via their
associated host pointer. In order to do this the first step will be
building a map associating these variables. Doing this will require the
same offloading entry handling we use for other kernels and globals.

We intentionally emit a new global on the device side. Although it's
possible to look up the device function's address directly, this would
require changing the visibility and would prevent us from making static
functions indirect. Also, the CUDA toolchain will optimize out unused
functions and using a global prevents that. The downside is that the
runtime will need to read the global and copy its value, but there
shouldn't be any other costs.

Note that this patch just performs the codegen, currently this new
offloading entry type is unused and will be ignored by the runtime.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D157738
2023-08-24 18:21:13 -05:00
Johannes Doerfert
f3958ce008 [OpenMP] Add NVIDIA annotations for static grid thread limit
We already add AMD GPU annotations, the NVIDIA ones are just a little
more convoluted to add/update but otherwise the same.
We see again that the interplay of ompx_attribute and deduced value
needs to be improved, see the TODO.

Differential Revision: https://reviews.llvm.org/D158383
2023-08-23 11:12:04 -07:00
Johannes Doerfert
7481b465ae [OpenMP] Use default grid value for static grid size
If the user did not provide any static clause to override the grid size,
we assume the default grid size as upper bound and use it to improve
code generation through vendor specific attributes.

Fixes: https://github.com/llvm/llvm-project/issues/64816

Differential Revision: https://reviews.llvm.org/D158382
2023-08-23 11:12:03 -07:00
Johannes Doerfert
df8d33fa7a [OpenMP][AMDGPU] Add "amdgpu-flat-work-group-size" for known thread counts
If we know the thread count statically and it is a constant, we can set
the "amdgpu-flat-work-group-size" kernel attribute.

Fixes https://github.com/llvm/llvm-project/issues/64816 in parts.
2023-08-18 21:47:57 -07:00
Akash Banerjee
5d9ccd7a96 [OpenMP] Migrate dispatch related utility functions from Clang codegen to OMPIRBuilder
Migrate createForStaticInitFunction, createDispatchInitFunction, createDispatchNextFunction and createDispatchFiniFunction from Clang CodeGen to OMPIRBuilder.

Differential Revision: https://reviews.llvm.org/D157994
2023-08-16 16:35:28 +01:00
Soumi Manna
bd1ddc5850 [NFC][OpenMP] Initialize pointer field
Reviewed By: tahonermann

Differential Revision: https://reviews.llvm.org/D157989
2023-08-16 07:47:24 -07:00
Joseph Huber
43125b6392 [OpenMP] Use protected visibility for the kernel environment
Summary:
These new globals should use protected visibility. Visibility like this
helps certain checks on AMDGPU architecture and LTO.
2023-08-15 12:47:44 -05:00
Jan Sjodin
b7fcf51515 [OpenMP][OpenMPIRBuilder] Add kernel launch codegen to emitTargetCall
This patch adds code emission in emitTargetCall to call the OpenMP runtime to
launch an kernel, and to call the fallback host implementation if the launch
fails.

Reviewed By: TIFitis, kiranchandramohan, jdoerfert

Differential Revision: https://reviews.llvm.org/D155633
2023-08-15 10:03:06 -04:00
Hao Jin
c384e79675 [OpenMP][IR] Set correct alignment for internal variables
OpenMP runtime functions assume the pointers are aligned to sizeof(pointer),
but it is being aligned incorrectly. Fix with the proper alignment in the IR builder.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D157040
2023-08-11 15:20:15 -04:00
Akash Banerjee
1e92e25cb4 [MLIR][OpenMP] Added MLIR translation support for use_device clauses
Added MLIR support for translating use_device_ptr and use_device_addr clauses for LLVMIR lowering.

  - use_device_ptr: The mapped variables marked with use_device_ptr are accessed through a copy of the base pointer mappers. The mapper is copied onto a new temporary pointer variable.
  - use_device_addr: The mapped variables marked with use_device_addr are accessed directly through the base pointer mappers.
  - If mapping information is not provided explicitly then default map_type of alloc/release is assumed and the map_size is set to 0.

Depends on D152554

Reviewed By: kiranchandramohan, raghavendhra

Differential Revision: https://reviews.llvm.org/D146648
2023-08-04 15:38:50 +01:00
Shilei Tian
c171ed4693 [OpenMP] Fix the linkage for kernel environment global variable
This patch fixes the issue that multiple definition of kernel environment global
variables can occur because of wrong linkage.

Fixes #64284.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D156955
2023-08-03 21:33:31 -04:00
Bjorn Pettersson
e6e9a87534 Drop some typed pointer handling
Differential Revision: https://reviews.llvm.org/D156739
2023-08-02 12:08:37 +02:00
Shilei Tian
10068cd654 [OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depend on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-26 13:35:14 -04:00
Jan Sjodin
caa35a1ad9 [OpenMP][OpenMPIRBuilder] Make outlined function parameters i64 or ptr
This patch ensures that all outlined functions parameters are i64 or ptr when
compiling for a target device, which is what the OpenMP runtime expects. The
values are then cast to the correct type inside the kernel.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D155628
2023-07-25 13:01:40 -04:00
Shilei Tian
6bd74fd65f Revert commits for kernel environment
This reverts commits for kernel environments as they causes issues in AMD BB.
2023-07-23 23:32:31 -04:00
Shilei Tian
c5c8040390 [OpenMP] Introduce kernel environment
This patch introduces per kernel environment. Previously, flags such as execution mode are set through global variables with name like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not per kernel basis, preventing us applying per kernel optimization in the device runtime.

This is a combination and refinement of patch series D116908, D116909, and D116910.

Depend on D155886.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142569
2023-07-23 18:36:01 -04:00
Nikita Popov
8db30225ec [OpenMPIRBuilder] Check GV type instead of pointee type (NFC)
Change the assertion to check the value type instead of the pointee
type. To facilitate this, store GlobalVariable* instead of Constant*
in the map.
2023-07-18 11:31:41 +02:00
Akash Banerjee
227012cbd7 [OpenMP] Migrate device code privatisation from Clang CodeGen to OMPIRBuilder
This patch migrates the UseDevicePtr and UseDeviceAddr clause related code for handling privatisation from Clang codegen to the OMPIRBuilder

Depends on D150860

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D152554
2023-07-12 12:03:28 +01:00
Sergio Afonso
63ca93c7d1
[OpenMP][OMPIRBuilder] Rename IsEmbedded and IsTargetCodegen flags
This patch renames the `OpenMPIRBuilderConfig` flags to reduce confusion over
their meaning. `IsTargetCodegen` becomes `IsGPU`, whereas `IsEmbedded` becomes
`IsTargetDevice`. The `-fopenmp-is-device` compiler option is also renamed to
`-fopenmp-is-target-device` and the `omp.is_device` MLIR attribute is renamed
to `omp.is_target_device`. Getters and setters of all these renamed properties
are also updated accordingly. Many unit tests have been updated to use the new
names, but an alias for the `-fopenmp-is-device` option is created so that
external programs do not stop working after the name change.

`IsGPU` is set when the target triple is AMDGCN or NVIDIA PTX, and it is only
valid if `IsTargetDevice` is specified as well. `IsTargetDevice` is set by the
`-fopenmp-is-target-device` compiler frontend option, which is only added to
the OpenMP device invocation for offloading-enabled programs.

Differential Revision: https://reviews.llvm.org/D154591
2023-07-10 14:14:16 +01:00
Haojian Wu
9e362e5e3f Fix -Wunused-vairable on release build, NFC 2023-07-08 22:57:03 +02:00
Youngsuk Kim
f69b9b7cce [llvm] Remove uses of Type::getPointerTo() (NFC)
Partial progress towards removing in-tree uses of `getPointerTo()`,
by employing the following options:

* Drop the call entirely if the sole purpose of it is to support
  a no-op bitcast (remove the no-op bitcast as well).

* Replace with `PointerType::get()`/`PointerType::getUnqual()`.

Also, remove no-op function `EmitBitCastOfLValueToProperType()`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154392
2023-07-08 13:05:58 -04:00
Akash Banerjee
4a49226537 [OpenMP] Change clang emitTargetDataCalls to use OMPIRBuilder
This patch changes the emitTargetDataCalls function in clang to make use of the OpenMPIRBuilder::createTargetData function for Target Data directive code gen.
Reapplying commit 0d8d718171192301f2beb10bd08ce62e70281a5e after fixing libomptarget test failure related to debug info.

Depends on D146557

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D150860
2023-07-05 14:03:54 +01:00
Akash Banerjee
a5ea676067 Reverting commit 0d8d718171192301f2beb10bd08ce62e70281a5e as it broke libomptarget tests 2023-06-30 16:03:57 +01:00
Jan Sjodin
ac65cc1215 [OpenMP][OpenMPIRBuilder] Migrate kernel launch code and host fallback code generation from Clang to the OpenMPIRBuilder
This patch refactors the code generation that emits the offloading kernel
launch and moves the core portion to the OpenMPIRBuilder so that it can be used
from flang in the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D151035
2023-06-30 10:40:21 -04:00
Akash Banerjee
0d8d718171 [OpenMP] Change clang emitTargetDataCalls to use OMPIRBuilder
This patch changes the emitTargetDataCalls function in clang to make use of the OpenMPIRBuilder::createTargetData function for Target Data directive code gen.

Depends on D146557

Differential Revision: https://reviews.llvm.org/D150860
2023-06-30 15:12:28 +01:00
Akash Banerjee
a032dc139d [MLIR][OpenMP] Refactoring createTargetData in OMPIRBuilder
Key changes:
  - Refactor the createTargetData function to make use of the emitOffloadingArrays and emitOffloadingArraysArgument functions to generate code.
  - Added a new emitIfClause helper function to allow handling if clauses in a similar fashion to Clang.
  - Updated the MLIR side of code to account for changes to createTargetData.

Depends on D149872

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D146557
2023-06-19 13:09:35 +01:00
Sindhu Chittireddy
0e444c57c8 Remove dead conditionals 2023-06-13 09:48:20 -07:00
Akash Banerjee
3401a5f758 [OpenMP][OMPIRBuilder] Migrate emitOffloadingArrays and EmitNonContiguousDescriptor from Clang
This patch migrates the emitOffloadingArrays and EmitNonContiguousDescriptor functions from Clang codegen to OpenMPIRBuilder.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D149872
2023-06-12 15:43:09 +01:00
Andrew Gozillon
5fc1ed471d [Clang][OpenMP][OMPIRBuilder] Fix getTargetEntryUniqueInfo to fatal error instead of assert when it cannot generate a unique ID
This prevents access of uninitialized ID components later, or continuing progress
further making errors harder to track down.

Hoepfully this will be the last in my long series of mistakes in recent
commits relating to this area of code!
2023-06-07 19:28:00 -05:00
Andrew Gozillon
73104d4fbd [Clang][OpenMP][OMPIRBuilder] Change llvm_unreachable inside of getTargetEntryUniqueInfo to a more apt assert
However, with the correct condition this time and appropriate Clang
callback function to give the correct fallback Filename and Line Info.
2023-06-07 17:52:49 -05:00