Sometimes it might be beneficial to spawn more thread blocks instead of
reusing existing for multiple loop iterations.
**Alternatives considered:**
Make `DefaultNumBlocks` settable via an environment variable.
---------
Co-authored-by: Joseph Huber <huberjn@outlook.com>
This PR seeks to expand/replace the Constant -> Instruction conversion
that needs to occur inside of the OpenMP Target kernel generation to
allow kernel argument replacement of uses within the kernel (cannot
replace constant uses within constant expressions with non-constants).
It does so by making use of the new-ish utility
convertUsersOfConstantsToInstructions which is a much more expansive
version of what the smaller "version" of the function I wrote does,
effectively expanding uses of the input argument that are constant
expressions into instructions so that we can replace with the
appropriate kernel argument.
Also alters convertUsersOfConstantsToInstructions to optionally
restrict the replacement to a function and optionally leave
dead constants alone, the latter is necessary when lowering from
MLIR as we cannot be sure we can remove the constants at this
stage, even if rewritten to instructions the ModuleTranslation may
maintain links to the original constants and utilise them in further
lowering steps (as when we're lowering the kernel, the module is
still in the process of being lowered). This can result in unusual
ICEs later. These dead constants can be tidied up later (and
appear to be in subsequent lowering from checking with
emit-llvm).
The current test is not really correct because the mask is set to
0xffffffff
even if it is on an AMDGPU whose wavefront size is 64. Besides,
`__AMDGCN_WAVEFRONT_SIZE` is not set on host compilation so the
verification
happens to work.
This reverts commit 098c6dfa8157681699a71fce9e3d94515e66311f.
This reverts commit 8c718a3a91df4ab68dc3f1ca3887ea730c9aed84.
This reverts commit 4fb02de9d490d0773441aa30124bb4d1272230d3.
Named location attribute added to `tgt_offload_entry` shall be used by
runtime calls like `ompx_dump_mapping_tables` to print the information
of variables that are mapped to the device. `ompx_dump_mapping_tables`
was printing the wrong location information and this change fixes it.
A sample execution of example before the change:
```
omptarget device 0 info: OpenMP Host-Device pointer mappings after block at libomptarget:0:0:
omptarget device 0 info: Host Ptr Target Ptr Size (B) DynRefCount HoldRefCount Declaration
omptarget device 0 info: 0x0000000000206df0 0x00007f02cdc00000 20000000 1 0 <program-file-loc> at unknown:18:35
```
The change replaces unknown to the mapped symbol and location to the
declaration location.
This is a large series of runtime tests that help to add coverage for the specific cases intended to be supported by the PR stack
that extends derived type map support in Flang+OpenMP. Primarily this will add functionality coverage, there's cases where
things may work, but not optimally (or at least similarly to the status quo in Clang), addiitonal IR tests are added in the
relevant segments of the related PRs to test for breakages like that.
Pull Request: https://github.com/llvm/llvm-project/pull/82850
This is a large series of runtime tests that help to add coverage for
the specific cases intended to be supported by the PR stack
that extends derived type map support in Flang+OpenMP. Primarily this will add functionality coverage, there's cases where
things may work, but not optimally (or at least similarly to the status quo in Clang), additional IR tests are added in the
relevant segments of the related PRs to test for breakages like that.
In a nutshell, this moves our libomptarget code to populate the offload
subproject.
With this commit, users need to enable the new LLVM/Offload subproject
as a runtime in their cmake configuration.
No further changes are expected for downstream code.
Tests and other components still depend on OpenMP and have also not been
renamed. The results below are for a build in which OpenMP and Offload
are enabled runtimes. In addition to the pure `git mv`, we needed to
adjust some CMake files. Nothing is intended to change semantics.
```
ninja check-offload
```
Works with the X86 and AMDGPU offload tests
```
ninja check-openmp
```
Still works but doesn't build offload tests anymore.
```
ls install/lib
```
Shows all expected libraries, incl.
- `libomptarget.devicertl.a`
- `libomptarget-nvptx-sm_90.bc`
- `libomptarget.rtl.amdgpu.so` -> `libomptarget.rtl.amdgpu.so.18git`
- `libomptarget.so` -> `libomptarget.so.18git`
Fixes: https://github.com/llvm/llvm-project/issues/75124
---------
Co-authored-by: Saiyedul Islam <Saiyedul.Islam@amd.com>