5 Commits

Author SHA1 Message Date
Joseph Huber
fb32977ac7
[Libomptarget] Fix RPC-based malloc on NVPTX (#72440)
Summary:
The device allocator on NVPTX architectures is enqueued to a stream that
the kernel is potentially executing on. This can lead to deadlocks as
the kernel will not proceed until the allocation is complete and the
allocation will not proceed until the kernel is complete. CUDA 11.2
introduced async allocations that we can manually place on separate
streams to combat this. This patch makes a new allocation type that's
guaranteed to be non-blocking so it will actually make progress, only
Nvidia needs to care about this as the others are not blocking in this
way by default.

I had originally tried to make the `alloc` and `free` methods take a
`__tgt_async_info`. However, I observed that with the large volume of
streams being created by a parallel test it quickly locked up the system
as presumably too many streams were being created. This implementation
not just creates a new stream and immediately destroys it. This
obviously isn't very fast, but it at least gets the cases to stop
deadlocking for now.
2024-01-02 16:53:53 -06:00
Johannes Doerfert
db96a9c3b7
[OpenMP][NFC] Flatten plugin-nextgen/common folder sturcture (#73725)
For historic reasons we had it setup that there was
`  plugin-nextgen/common/PluginInterface/<sources + headers>`
which is not what we do anywhere else.
Now it looks like the rest:
```
  plugin-nextgen/common/include/<headers>
  plugin-nextgen/common/src/<sources>
```
As part of this, `dlwrap.h` was moved into common/include (as
`DLWrap.h`)
since it is exclusively used by the plugins.
2023-11-29 07:57:01 -08:00
Johannes Doerfert
8327f4a851
[OpenMP][NFC] Move Utils.h and Debug.h into a "Shared" include folder (#73701)
Headers used throughout the different runtimes are different from the
internal headers. This is a first step to bring structure in into the
include folder.
2023-11-28 13:44:57 -08:00
Konstantinos Parasyris
d6a3d6b96d
[openmp] Fixed Support for VA for record-replay. (#70396)
The commit was discussed in phabricator
(https://reviews.llvm.org/D157186).

Record replay currently fails on AMD as it conflicts with the heap
memory allocator introduced in #69806. The workaround is setting
`LIBOMPTARGET_HEAP_SIZE=0` during both record and replay run.
2023-10-29 12:27:19 -07:00
Joseph Huber
e90ab9148b [OpenMP] Delete old plugins
It's time to remove the old plugins as the next-gen has already been set
to default in LLVM 16.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D142820
2023-07-05 17:39:47 -05:00