6 Commits

Author SHA1 Message Date
Joseph Huber
dfc162ad3f [libc] Free the GPU memory allocated in the device loaders
Summary:
This part was ignored and we just hoped that shutting down the runtime
freed these correctly. But it's best to be specific and free the memory
we've allocated.
2023-04-03 11:55:32 -05:00
Joseph Huber
2bef46d2ad [libc] Add a loader utility for NVPTX architectures for testing
This patch adds a loader utility targeting the CUDA driver API to launch
NVPTX images called `nvptx_loader`. This takes a GPU image on the
command line and launches the `_start` kernel with the appropriate
arguments. The `_start` kernel is provided by the already implemented
`nvptx/start.cpp`. So, an application with a `main` function can be
compiled and run as follows.

```
clang++ --target=nvptx64-nvidia-cuda main.cpp crt1.o -march=sm_70 -o image
./nvptx_loader image args to kernel
```

This implementation is not tested and does not yet support RPC. This
requires further development to work around NVIDIA specific limitations
in atomics and linking.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D146681
2023-03-24 20:04:42 -05:00
Joseph Huber
6bd4d717d5 [libc] Add environment variables to GPU libc test for AMDGPU
This patch performs the same operation to copy over the `argv` array to
the `envp` array. This allows the GPU tests to use environment
variables.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D146322
2023-03-20 13:16:58 -05:00
Joseph Huber
ae30ae23aa [libc][NFC] Add some missing comments to the RPC implementation
Summary:
These comments were accidentally dropped from the committed version. Add
them back in.
2023-03-20 09:30:12 -05:00
Joseph Huber
8e4f9b1fcb [libc] Add initial support for an RPC mechanism for the GPU
This patch adds initial support for an RPC client / server architecture.
The GPU is unable to perform several system utilities on its own, so in
order to implement features like printing or memory allocation we need
to be able to communicate with the executing process. This is done via a
buffer of "sharable" memory. That is, a buffer with a unified pointer
that both the client and server can use to communicate.

The implementation here is based off of Jon Chesterfields minimal RPC
example in his work. We use an `inbox` and `outbox` to communicate
between if there is an RPC request and to signify when work is done.
We use a fixed-size buffer for the communication channel. This is fixed
size so that we can ensure that there is enough space for all
compute-units on the GPU to issue work to any of the ports. Right now
the implementation is single threaded so there is only a single buffer
that is not shared.

This implementation still has several features missing to be complete.
Such as multi-threaded support and asynchrnonous calls.

Depends on D145912

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D145913
2023-03-17 12:55:31 -05:00
Joseph Huber
67d78e3c6f [libc] Add a loader utility for AMDHSA architectures for testing
This is the first attempt to get some testing support for GPUs in LLVM's
libc. We want to be able to compile for and call generic code while on
the device. This is difficult as most GPU applications also require the
support of large runtimes that may contain their own bugs (e.g. CUDA /
HIP / OpenMP / OpenCL / SYCL). The proposed solution is to provide a
"loader" utility that allows us to execute a "main" function on the GPU.

This patch implements a simple loader utility targeting the AMDHSA
runtime called `amdhsa_loader` that takes a GPU program as its first
argument. It will then attempt to load a predetermined `_start` kernel
inside that image and launch execution. The `_start` symbol is provided
by a `start` utility function that will be linked alongside the
application. Thus, this should allow us to run arbitrary code on the
user's GPU with the following steps for testing.

```
clang++ Start.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -ffreestanding -nogpulib -nostdinc -nostdlib -c
clang++ Main.cpp --target=amdgcn-amd-amdhsa -mcpu=<arch> -nogpulib -nostdinc -nostdlib -c
clang++ Start.o Main.o --target=amdgcn-amd-amdhsa -o image
amdhsa_loader image <args, ...>
```

We determine the `-mcpu` value using the `amdgpu-arch` utility provided
either by `clang` or `rocm`. If `amdgpu-arch` isn't found or returns an
error we shouldn't run the tests as the machine does not have a valid
HSA compatible GPU. Alternatively we could make this utility in-source
to avoid the external dependency.

This patch provides a single test for this untility that simply checks
to see if we can compile an application containing a simple `main`
function and execute it.

The proposed solution in the future is to create an alternate
implementation of the LibcTest.cpp source that can be compiled and
launched using this utility. This approach should allow us to use the
same test sources as the other applications.

This is primarily a prototype, suggestions for how to better integrate
this with the existing LibC infastructure would be greatly appreciated.
The loader code should also be cleaned up somewhat. An implementation
for NVPTX will need to be written as well.

Reviewed By: sivachandra, JonChesterfield

Differential Revision: https://reviews.llvm.org/D139839
2023-02-13 13:49:01 -06:00