31 Commits

Author SHA1 Message Date
Joseph Huber
4213f4a9ae [Libomptarget] Fix resizing the buffer of RPC handles
Summary:
The previous code would potentially make it smaller if a device with a
lower ID touched it later. Also we should minimize changes to the state
for multi-threaded reasons. This just sets up an owned slot for each device at
initialization time.
2024-04-01 07:29:57 -05:00
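As an illustration of the fix described in 4213f4a9ae, here is a minimal sketch of "one owned slot per device, sized once at initialization". `RPCHandle` and `RPCHandleTable` are hypothetical names, not the real libomptarget types; the point is only that per-device setup never resizes the buffer, so a lower device ID arriving later cannot shrink it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an RPC server handle (not the real type).
struct RPCHandle {
  std::size_t DeviceId;
};

// Hypothetical table that owns exactly one slot per device.
class RPCHandleTable {
public:
  // The buffer is sized once, up front. Nothing below ever resizes it.
  explicit RPCHandleTable(std::size_t NumDevices) : Slots(NumDevices) {}

  // Each device only writes to its own pre-allocated slot.
  void initDevice(std::size_t DeviceId) {
    assert(DeviceId < Slots.size() && "device id out of range");
    Slots[DeviceId] = std::make_unique<RPCHandle>(RPCHandle{DeviceId});
  }

  bool isInitialized(std::size_t DeviceId) const {
    return DeviceId < Slots.size() && Slots[DeviceId] != nullptr;
  }

  std::size_t size() const { return Slots.size(); }

private:
  std::vector<std::unique_ptr<RPCHandle>> Slots;
};
```

Initializing device 3 and then device 1 leaves the table at its original size in this sketch, which is the property the commit message asks for.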
Joseph Huber
a1a8bb1d3a
[libc] Change RPC interface to not use device ids (#87087)
Summary:
The current implementation of RPC tied everything to device IDs and
forced us to do init / shutdown to manage some global state. This turned
out to be a bad idea in situations where we want to track multiple
heterogeneous devices that may report the same device ID in the same
process.

This patch changes the interface to instead create an opaque handle to
the internal device and simply allocates it via `new`. The user will
then take this device and store it to interface with the attached
device. This interface puts the burden of tracking the device identifier
to mapped devices onto the user, but in return heavily simplifies the
implementation.
2024-03-29 12:49:16 -05:00
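The opaque-handle idea from a1a8bb1d3a can be sketched roughly as follows. All names here (`DeviceImpl`, `DeviceHandle`, `createDevice`, `portCount`, `destroyDevice`) are hypothetical and are not the actual libc RPC interface; the sketch only shows a handle allocated via `new`, with the user owning the mapping from their own identifiers to handles.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical internal device state, hidden behind the handle.
struct DeviceImpl {
  int PortCount;
};

// Opaque handle given to the user; it carries no device ID.
struct DeviceHandle {
  void *Impl;
};

// Allocate the internal device with `new` and return the opaque handle.
DeviceHandle createDevice(int PortCount) {
  return DeviceHandle{new DeviceImpl{PortCount}};
}

int portCount(DeviceHandle Handle) {
  return static_cast<DeviceImpl *>(Handle.Impl)->PortCount;
}

void destroyDevice(DeviceHandle Handle) {
  delete static_cast<DeviceImpl *>(Handle.Impl);
}

// The user decides how to identify devices. Here the key is a
// (backend, device id) pair, so two backends may both report ID 0.
using DeviceKey = std::pair<int, int>;
using DeviceMap = std::map<DeviceKey, DeviceHandle>;
```

With this shape, two heterogeneous devices that both report ID 0 simply live under different keys in the user's map, and no global init or shutdown state is needed.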
Joseph Huber
ed68aac9f2
[Libomptarget] Move API implementations into GenericPluginTy (#86683)
Summary:
The plan is to remove the entire plugin interface and simply use the
`GenericPluginTy` inside of `libomptarget` by statically linking against
it. This means that inside of `libomptarget` we will simply do
`Plugin.data_alloc` without the dynamically loaded interface. To reduce
the amount of code required, this patch simply moves all of the RTL
implementation functions inside of the Generic device. Now the
`__tgt_rtl_` interface is simply a shallow wrapper that will soon go
away. There is some redundancy here; this will be improved later. For
now what is important is minimizing the changes to the API.
2024-03-27 14:10:54 -05:00
Joseph Huber
4dc3225248
[Libomptarget] Replace global PluginTy::get interface with references (#86595)
Summary:
We have a plugin singleton that implements the Plugin interface. This
then spawns separate devices and kernels. Previously when these needed to
reach into the global singleton they would use the `PluginTy::get`
routine to get access to it. In the future we will move away from this
as the lifetime of the plugin will be handled by `libomptarget`
directly. This patch removes uses of this inside of the plugin
implementations themselves by simply keeping a reference to the plugin
inside of the device.

The external `__tgt_rtl` functions still use the global method, but will
be removed later.
2024-03-26 07:13:59 -05:00
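The change in 4dc3225248 follows a common pattern: replace calls to a global accessor with a reference stored at construction time. A minimal sketch, using hypothetical `Plugin` and `Device` types rather than the real `GenericPluginTy` and `GenericDeviceTy` classes:

```cpp
#include <cassert>
#include <string>

// Hypothetical plugin type; its lifetime is managed by whoever creates it.
struct Plugin {
  std::string Name;
  int NumDevices;
};

// Hypothetical device type. It never calls a global `get()`; it uses the
// reference it was constructed with.
class Device {
public:
  Device(Plugin &Owner, int DeviceId) : Owner(Owner), DeviceId(DeviceId) {}

  const std::string &pluginName() const { return Owner.Name; }
  bool isValid() const { return DeviceId >= 0 && DeviceId < Owner.NumDevices; }

private:
  Plugin &Owner; // must outlive this device
  int DeviceId;
};
```

Because the device holds a reference, the plugin no longer has to be a process-wide singleton; the only requirement in this sketch is that the plugin outlives the devices it spawns.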
Joseph Huber
2cad43c1ba
[Libomptarget] Factor functions out of 'Plugin' interface (#86528)
Summary:
This patch factors common functions out of the `Plugin` interface prior
to its removal in a future patch. This simply temporarily renames it to
`PluginTy` so that we could re-use `Plugin::check` internally as this
needs to be defined statically per plugin now. We can refactor this
later.

The future patch will delete `PluginTy` and `PluginTy::get` entirely.
This simply tries to minimize a few changes to make it easier to land.
2024-03-25 15:24:39 -05:00
dhruvachak
b5d02bbd0d
[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. (#85363)
A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.

If approved, this patch should be backported to release 18.
2024-03-19 16:40:22 -07:00
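The version check that b5d02bbd0d describes could look roughly like the sketch below. The struct, the function name, and in particular the constant's value are placeholders chosen for illustration; the commit message does not state the actual version number.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder value: the first kernel args version whose compiler emits
// the implicit dyn_ptr parameter. The real number is not given above.
constexpr std::uint32_t FirstVersionWithDynPtr = 3;

// Hypothetical, trimmed-down view of the kernel arguments header.
struct KernelArgsHeader {
  std::uint32_t Version;
};

// An older compiler (lower version) linked against a newer runtime did not
// generate dyn_ptr, so the runtime has to account for that itself.
bool compilerGeneratedDynPtr(const KernelArgsHeader &Args) {
  return Args.Version >= FirstVersionWithDynPtr;
}
```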
Joseph Huber
470040bd4d [Libomptarget][NFC] Remove warning on return value const 2024-03-15 18:50:33 -05:00
Ulrich Weigand
2210c85a66 Reapply [libomptarget] Support BE ELF files in plugins-nextgen (#85246)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-15 18:28:28 +01:00
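To illustrate what "templating over multiple ELF format variants" means in practice, here is a self-contained sketch that reads a 64-bit field in either byte order. It deliberately does not use LLVM's `ObjectFile` or `ELFSymbolRef`; `Endian`, `read64`, and `read64For` are hypothetical helpers. The only ELF facts assumed are that the `EI_DATA` byte of `e_ident` is 1 for little-endian and 2 for big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Endian { Little, Big };

// Read an 8-byte unsigned value stored in the given byte order.
template <Endian E> std::uint64_t read64(const unsigned char *Bytes) {
  std::uint64_t Value = 0;
  for (std::size_t I = 0; I < 8; ++I) {
    // Always consume the most significant byte first.
    std::size_t Index = (E == Endian::Little) ? 7 - I : I;
    Value = (Value << 8) | Bytes[Index];
  }
  return Value;
}

// Pick the template instance from the EI_DATA byte
// (1 = ELFDATA2LSB, 2 = ELFDATA2MSB).
std::uint64_t read64For(unsigned char EIData, const unsigned char *Bytes) {
  return EIData == 2 ? read64<Endian::Big>(Bytes)
                     : read64<Endian::Little>(Bytes);
}
```

The same bytes decode to different values depending on the variant, which is why an interface hard-coded to ELF64LE types cannot serve big-endian images.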
Ulrich Weigand
4c8714efc5 Revert "[libomptarget] Support BE ELF files in plugins-nextgen (#85246)"
This reverts commit 611c62b30d160375b46b7afedc04965ee6f67d1a.
2024-03-14 18:38:13 +01:00
Ulrich Weigand
611c62b30d
[libomptarget] Support BE ELF files in plugins-nextgen (#85246)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to use generic types, in
particular by using (a unique_ptr to) ObjectFile instead of
ELF64LEObjectFile, and ELFSymbolRef instead of ELF64LE::Sym.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-14 18:19:12 +01:00
Ulrich Weigand
fb7cc73975 Revert "[libomptarget] Support BE ELF files in plugins-nextgen (#83976)"
This reverts commit 15b7b3182cc28f4f0b950bd73d931caa27b833ec.
2024-03-06 21:37:45 +01:00
Ulrich Weigand
15b7b3182c
[libomptarget] Support BE ELF files in plugins-nextgen (#83976)
Code in plugins-nextgen reading ELF files is currently hard-coded to
assume a 64-bit little-endian ELF format. Unfortunately, this assumption
is even embedded in the interface between GlobalHandler and Utils/ELF
routines, which use ELF64LE types.

To fix this, I've refactored the interface to push all ELF specific
types into Utils/ELF. Specifically, this patch removes both the
getSymbol and getSymbolAddress routines and replaces them with a
single findSymbolInImage, which gets a StringRef identifying the
raw object file image as input, and returns a StringRef covering
the data addressed by the symbol (address and size) if found, or
std::nullopt otherwise.

This allows properly templating over multiple ELF format variants inside
Utils/ELF; specifically, this patch adds support for 64-bit big-endian
ELF files in addition to 64-bit little-endian files.
2024-03-06 20:49:12 +01:00
Daniel Martinez
aa6ebf9be1
Replace some C headers with C++ ones (#82697)
#81434

Replaced some C headers with C++ ones

Co-authored-by: Daniel Martinez <danielmartinez@cock.li>
2024-03-04 01:21:31 -05:00
Joseph Huber
0ac4438560
[Libomptarget] Remove unused 'SupportsEmptyImages' API function (#80316)
Summary:
This function is always false in the current implementation and is not
even considered required. Just remove it and if someone needs it in the
future they can add it back in. This is done to simplify the interface
prior to other changes
2024-02-05 10:00:09 -06:00
Joseph Huber
621bafd5c1
[Libomptarget] Move target table handling out of the plugins (#77150)
Summary:
This patch removes the bulk of the handling of the
`__tgt_offload_entries` out of the plugins themselves. The reason for this
is because the plugins themselves should not be handling this
implementation detail of the OpenMP runtime. Instead, we expose two new
plugin API functions to get a device pointer for a global
as well as a kernel type.

This required introducing a new type to represent a binary image that
has been loaded on a device. We can then use this to load the addresses
as needed. The creation of the mapping table is then handled just in
`libomptarget` where we simply look up each address individually. This
should allow us to expose these operations more generically when we
provide a separate API.
2024-01-22 11:06:47 -06:00
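The division of labour described in 621bafd5c1 can be sketched as follows: the plugin side only answers per-symbol address queries on a loaded image, and the runtime side builds the mapping table by looking up each entry individually. `LoadedImage`, `HostEntry`, and `buildMappingTable` are hypothetical names, not the real plugin API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a binary image loaded on a device. It only
// answers "what is the device address of this named symbol?".
struct LoadedImage {
  std::map<std::string, void *> DeviceSymbols;

  // Returns nullptr when the symbol is not present in the image.
  void *lookup(const std::string &Name) const {
    auto It = DeviceSymbols.find(Name);
    return It == DeviceSymbols.end() ? nullptr : It->second;
  }
};

// Hypothetical host-side offload entry: a name plus its host address.
struct HostEntry {
  std::string Name;
  void *HostAddr;
};

// Runtime-side construction of the host-to-device table, one lookup per
// entry. Returns false if any entry cannot be resolved.
bool buildMappingTable(const std::vector<HostEntry> &Entries,
                       const LoadedImage &Image,
                       std::vector<std::pair<void *, void *>> &Table) {
  Table.clear();
  for (const HostEntry &Entry : Entries) {
    void *DeviceAddr = Image.lookup(Entry.Name);
    if (!DeviceAddr)
      return false;
    Table.emplace_back(Entry.HostAddr, DeviceAddr);
  }
  return true;
}
```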
carlobertolli
ae99966a27
[OpenMP] Enable automatic unified shared memory on MI300A. (#77512)
This patch enables applications that did not request OpenMP
unified_shared_memory to run with the same zero-copy behavior, where
mapped memory does not result in extra memory allocations and memory
copies, but CPU-allocated memory is accessed from the device. The name
for this behavior is "automatic zero-copy" and it relies on detecting:
that the runtime is running on a MI300A, that the user did not select
unified_shared_memory in their program, and that XNACK (unified memory
support) is enabled in the current GPU configuration. If all these
conditions are met, then automatic zero-copy is triggered.

This patch also introduces an environment variable OMPX_APU_MAPS that,
if set, triggers automatic zero-copy also on non APU GPUs (e.g., on
discrete GPUs).
This patch is still missing support for global variables, which will be
provided in a subsequent patch.

Co-authored-by: Thorsten Blass <thorsten.blass@amd.com>
2024-01-22 10:30:22 -06:00
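The conditions listed in ae99966a27 amount to a small predicate. The sketch below encodes them with hypothetical names (`ZeroCopyInputs`, `useAutoZeroCopy`). It assumes that XNACK must still be enabled when `OMPX_APU_MAPS` is set on a discrete GPU; the commit message does not say so explicitly.

```cpp
#include <cassert>

// Hypothetical bundle of the facts the runtime would need to detect.
struct ZeroCopyInputs {
  bool IsMI300A;         // running on an MI300A APU
  bool UserRequestedUSM; // program selected unified_shared_memory
  bool XnackEnabled;     // unified memory support enabled on this GPU
  bool ApuMapsEnvSet;    // OMPX_APU_MAPS is set in the environment
};

// Automatic zero-copy only applies when the user did NOT ask for unified
// shared memory themselves and XNACK is available.
bool useAutoZeroCopy(const ZeroCopyInputs &In) {
  if (In.UserRequestedUSM)
    return false;
  if (!In.XnackEnabled) // assumption: required in both cases below
    return false;
  // Either we are on the APU, or the user opted in via OMPX_APU_MAPS.
  return In.IsMI300A || In.ApuMapsEnvSet;
}
```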
Joseph Huber
37c1a5e3f5
[Libomptarget] Fix GPU Dtors referencing possibly deallocated image (#77828)
Summary:
The constructors and destructors look up a symbol in the ELF quickly to
determine if they need to be run on the GPU. This allows us to avoid the
very slow actions required to do the slower lookup using the vendor API.

One problem occurs with how we handle the lifetime of these images.
Right now there is no invariant to specify the lifetime of the
underlying binary image that is loaded. In the typical case, this comes
from the binary itself in the `.llvm.offloading` section, meaning that
the lifetime of the binary should match the executable itself. This
would work fine, if it weren't for the fact that the plugin is loaded
via `dlopen` and can have a teardown order out of sync with the main
executable.

This was likely what was occurring when this failed on some systems but
not others. A potential solution would be to simply copy images into
memory so the runtime does not rely on external references. Another
would be to manually zero these out after initialization so as to prevent
this mistake from happening accidentally. The former has the benefit of
making some checks easier and allowing constant initialization to be
done on the ELF itself (normally we can't do this because writing to a
constant section, e.g. `.llvm.offloading`, is a segfault). The downside
would be the extra time required to copy the image in bulk (although we
are likely doing this in the vendor runtimes as well).

This patch went with a quick solution to simply set a boolean value at
initialization time if we need to call destructors.

Fixes: https://github.com/llvm/llvm-project/issues/77798
2024-01-11 15:00:53 -06:00
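The "quick solution" from 37c1a5e3f5 is to inspect the image once, while it is known to be alive, and remember the answer. A minimal sketch with hypothetical types (`ImageView`, `DeviceState`) and a made-up marker symbol name:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Made-up symbol name used only in this sketch.
const char *const DtorMarker = "hypothetical_dtor_marker";

// Hypothetical view of a binary image: just its symbol names. The image
// may be deallocated at any point after initialization.
struct ImageView {
  std::vector<std::string> Symbols;
};

class DeviceState {
public:
  // Look at the image now and cache the result in a plain boolean.
  void init(const ImageView &Image) {
    HasPendingDtors = std::find(Image.Symbols.begin(), Image.Symbols.end(),
                                DtorMarker) != Image.Symbols.end();
  }

  // Teardown consults only the cached flag and never touches the image.
  bool needsDtors() const { return HasPendingDtors; }

private:
  bool HasPendingDtors = false;
};
```

In this sketch the flag remains valid even after the `ImageView` has gone out of scope, which mirrors the teardown-order problem the commit describes.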
Joseph Huber
d03b8c3a04
[Libomptarget][NFC] Format in-line comments consistently (#77530)
Summary:
The LLVM style uses /*Foo=*/ when indicating the name of a constant. See
https://llvm.org/docs/CodingStandards.html#comment-formatting. This is
useful for consistency, as well as because `clang-format` understands
this syntax and formats it more cleanly. Do a bulk update of this
syntax.
2024-01-10 10:10:08 -06:00
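For readers unfamiliar with the `/*Foo=*/` convention that d03b8c3a04 applies, a short example follows. `paddedSize` is a hypothetical helper; only the call-site comment style is the point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper used only to illustrate the call-site comments.
std::size_t paddedSize(std::size_t Size, std::size_t Alignment,
                       bool AddGuardPage) {
  std::size_t Rounded = (Size + Alignment - 1) / Alignment * Alignment;
  return AddGuardPage ? Rounded + 4096 : Rounded;
}

std::size_t exampleCall() {
  // Each literal is labelled with the parameter name it binds to, written
  // as /*Name=*/ immediately before the value with no space in between.
  return paddedSize(100, /*Alignment=*/64, /*AddGuardPage=*/false);
}
```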
carlobertolli
ce4144406c
Revert "[OpenMP][libomptarget] Enable automatic unified shared memory executi…" (#77371)
Reverts llvm/llvm-project#75999

lit test is failing.
2024-01-08 14:38:29 -06:00
carlobertolli
22a73e7c46
[OpenMP][libomptarget] Enable automatic unified shared memory executi… (#75999)
…on (zero-copy) on MI300A.

This patch enables applications that did not request OpenMP
unified_shared_memory to run with the same zero-copy behavior, where
mapped memory does not result in extra memory allocations and memory
copies, but CPU-allocated memory is accessed from the device. The name
for this behavior is "automatic zero-copy" and it relies on detecting:
that the runtime is running on a MI300A, that the user did not select
unified_shared_memory in their program, and that XNACK (unified memory
support) is enabled in the current GPU configuration. If all these
conditions are met, then automatic zero-copy is triggered.

This patch is still missing support for global variables, which will be
provided in a subsequent patch.

Co-authored-by: Thorsten Blass <thorsten.blass@amd.com>
2024-01-08 14:17:28 -06:00
Joseph Huber
64f0681e97
[Libomptarget] Rework image checking further (#76120)
Summary:
In the future, we may have more checks for different kinds of inputs,
e.g. SPIR-V. This patch simply reworks the handling to be more generic
and do the magic detection up-front. The checks inside the routines are
now asserts so we don't spend time checking this stuff over and over
again.

This patch also tweaked the bitcode check. I used a different function
to get the Lazy-IR module now, as it returns the raw expected value
rather than the SM diagnostic.

No functionality change intended.
2023-12-29 15:14:39 -06:00
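"Magic detection up-front" (64f0681e97) can be sketched as a single classification step on the first bytes of the buffer, after which later routines may assert on the kind instead of re-checking it. `ImageKind` and `identifyImage` are hypothetical names and do not reflect LLVM's own magic-identification utilities. The byte sequences used are the standard ELF magic (`0x7f 'E' 'L' 'F'`) and the raw LLVM bitcode magic (`'B' 'C' 0xC0 0xDE`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

enum class ImageKind { ELF, Bitcode, Unknown };

// Classify a buffer once by looking at its first four bytes.
ImageKind identifyImage(const unsigned char *Data, std::size_t Size) {
  static const unsigned char ELFMagic[4] = {0x7f, 'E', 'L', 'F'};
  static const unsigned char BitcodeMagic[4] = {'B', 'C', 0xC0, 0xDE};
  if (Size < 4)
    return ImageKind::Unknown;
  if (std::memcmp(Data, ELFMagic, 4) == 0)
    return ImageKind::ELF;
  if (std::memcmp(Data, BitcodeMagic, 4) == 0)
    return ImageKind::Bitcode;
  return ImageKind::Unknown;
}
```

A further kind such as SPIR-V, which the commit mentions as a possible future input, would slot in as one more comparison here.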
Joseph Huber
f324584ae3
[Libomptarget][NFCI] Remove caching of created ELF files (#76080)
Summary:
We currently keep a cache of created ELF files from the relevant images.
This shouldn't be necessary as the entire ELF interface is generally
trivially constructable and extremely cheap. The cost of constructing
one of these objects is simply a size check and writing a pointer to the
underlying data. Given that, keeping a cache of these images should not
be necessary overall.
2023-12-20 17:13:41 -06:00
Joseph Huber
e4f4022b70 [Libomptarget][NFC] Fix linting warnings in the plugins
Summary:
Fix some linting warnings present in the plugins.
2023-12-20 10:07:34 -06:00
Joseph Huber
ac029e02a9
[Libomptarget] Remove __tgt_image_info and use the ELF directly (#75720)
Summary:
This patch reorganizes a lot of the code used to check for compatibility
with the current environment. The main bulk of this patch involves
moving from using a separate `__tgt_image_info` struct (which just
contains a string for the architecture) to instead simply checking this
information from the ELF directly. Checking information in the ELF is
very inexpensive as creating an ELF file is simply writing a base
pointer.

The main desire to do this was to reorganize everything into the ELF
image. We can then do the majority of these checks without first
initializing the plugin. A future patch will move the first ELF checks
to happen without initializing the plugin so we no longer need to
initialize any plugins that don't have needed images.

This patch also adds a lot more sanity checks for whether or not the ELF
is actually compatible, such as whether the images have a valid ABI, a
64-bit width, an executable type, etc.
2023-12-19 20:01:31 -06:00
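Some of the sanity checks mentioned in ac029e02a9 can be done from the ELF header alone, without initializing anything. The sketch below is a hypothetical helper (`looksLikeCompatibleELF64`), not the code from the patch. It relies only on the standard header layout: `e_ident[4]` is the class (2 means 64-bit), `e_ident[5]` is the data encoding (1 little-endian, 2 big-endian), and the 2-byte `e_type` at offset 16 is 2 for `ET_EXEC` and 3 for `ET_DYN`. Treating both of those types as "executable" is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Cheap header-only compatibility check on a raw image buffer.
bool looksLikeCompatibleELF64(const unsigned char *Data, std::size_t Size) {
  static const unsigned char ELFMagic[4] = {0x7f, 'E', 'L', 'F'};
  // Need e_ident (16 bytes) plus e_type (2 bytes).
  if (Size < 18 || std::memcmp(Data, ELFMagic, 4) != 0)
    return false;
  if (Data[4] != 2) // EI_CLASS must be ELFCLASS64
    return false;
  if (Data[5] != 1 && Data[5] != 2) // EI_DATA must be a known encoding
    return false;
  // Decode e_type in the byte order the header declares.
  unsigned Type = (Data[5] == 2) ? (Data[16] << 8) | Data[17]
                                 : (Data[17] << 8) | Data[16];
  return Type == 2 || Type == 3; // ET_EXEC or ET_DYN
}
```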
Shilei Tian
3768039913
[OpenMP] Directly use user's grid and block size in kernel language mode (#70612)
In kernel language mode, use the user's grid and block sizes directly. There
is no validity check, which means that if the user's values are too large, the
launch will fail, similar to what CUDA and HIP are doing right now.
2023-12-18 12:26:18 -05:00
Kazu Hirata
b8f89b84bc Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-16 15:02:17 -08:00
Joseph Huber
0ab663d202
[Libomptarget] Move ELF symbol extraction to the ELF utility (#74717)
Summary:
We shouldn't have the format specific ELF handling in the generic plugin
manager. This patch moves that out of the implementation and into the
ELF utilities. This patch changes the SHT_NOBITS case to be a hard
error, which should be correct as the existing use already seemed to
return an error if the result was a null pointer.

This also uses a `const_cast`, which is bad practice. However,
rebuilding the `constness` of all of this would be a massive overhaul,
and this matches the previous behaviour (We would take a pointer to the
image that is most likely read-only in the ELF).
2023-12-14 11:04:13 -06:00
Joseph Huber
6f3bd3a2f6
[Libomptarget] Add a utility function for checking existence of symbols (#74550)
Summary:
There are now a few cases that check if a symbol is present before
continuing, effectively making them optional features if present in the
image. This was done in at least three locations and required an ugly
operation to consume the error. This patch makes a utility function to
handle that instead.
2023-12-06 07:41:27 -06:00
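The utility described in 6f3bd3a2f6 wraps "look up a symbol, discard the error, report only presence". A rough sketch with hypothetical types follows; the real code deals with LLVM error values, which this example replaces with a plain struct to stay self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lookup result standing in for an error-carrying return.
struct SymbolLookup {
  bool Found;
  const void *Address;
  std::string Error;
};

// Hypothetical image: symbol name to address.
using Image = std::map<std::string, const void *>;

// Full lookup: callers that need the address must also handle the error.
SymbolLookup getSymbol(const Image &Img, const std::string &Name) {
  auto It = Img.find(Name);
  if (It == Img.end())
    return SymbolLookup{false, nullptr, "symbol '" + Name + "' not found"};
  return SymbolLookup{true, It->second, ""};
}

// Presence check for optional features: the error is dropped in one place
// instead of at every call site.
bool hasSymbol(const Image &Img, const std::string &Name) {
  return getSymbol(Img, Name).Found;
}
```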
Johannes Doerfert
5fe741f08e
[OpenMP] Separate Requirements into a standalone header (#74126)
This is not completely NFC since we now check all 4 requirements and the
test is checking the good and the bad case for combining flags.
2023-12-01 14:47:00 -08:00
Johannes Doerfert
148dec9fa4
[OpenMP][NFC] Separate Envar (environment variable) handling (#73994) 2023-11-30 15:23:34 -08:00
Johannes Doerfert
db96a9c3b7
[OpenMP][NFC] Flatten plugin-nextgen/common folder structure (#73725)
For historic reasons we had it set up so that there was
`  plugin-nextgen/common/PluginInterface/<sources + headers>`
which is not what we do anywhere else.
Now it looks like the rest:
```
  plugin-nextgen/common/include/<headers>
  plugin-nextgen/common/src/<sources>
```
As part of this, `dlwrap.h` was moved into common/include (as
`DLWrap.h`) since it is exclusively used by the plugins.
2023-11-29 07:57:01 -08:00