46 Commits

Author SHA1 Message Date
serge-sans-paille
0ffaffcaac
Reapply 6fa2abf90886f18472c87bc9bffbcdf4f73c465e
Lazyly initialize uncommon toolchain detector

Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Reapplied since 77910ac374656319ff114ef251fda358d4aa166a landed and
fixes the test ordering issue.

Differential Revision: https://reviews.llvm.org/D142606
2023-02-06 16:44:11 +01:00
Jonas Hahnfeld
b5ee4f755f Revert "Lazyly initialize uncommon toolchain detector"
clang/test/Driver/rocm-detect.hip is failing for a number of
configurations, for example:

clang-x86_64-debian-fast
https://lab.llvm.org/buildbot/#/builders/109/builds/57270

clang-debian-cpp20
https://lab.llvm.org/buildbot/#/builders/249/builds/310

clang-with-lto-ubuntu
https://lab.llvm.org/buildbot/#/builders/124/builds/6693

This reverts commit 6fa2abf90886f18472c87bc9bffbcdf4f73c465e.
2023-02-06 15:39:33 +01:00
serge-sans-paille
6fa2abf908
Lazyly initialize uncommon toolchain detector
Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Differential Revision: https://reviews.llvm.org/D142606
2023-02-06 12:03:00 +01:00
Joseph Huber
db202286eb [Clang][NFC] Fix out-of-date comments on 'clang-offload-bundler'
Summary:
These comments are confusing as the `clang-offload-bundler` is no longer
used by these toolchains.
2023-01-26 13:03:01 -06:00
Joseph Huber
5d1dc9fa04 [OpenMP] Do not link the bitcode OpenMP runtime when targeting AMDGPU.
The AMDGPU target can only emit LLVM-IR, so we can always rely on LTO to
link the static version of the runtime optimally. Using the static
library only has a few advantages. Namely, it avoids several known bugs
and allows us to optimize out more functions. This is legal since the
changes in D142486 and D142484

Depends on D142486 D142484

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142491
2023-01-24 17:01:37 -06:00
Joseph Huber
255922be7f [OpenMP] Clean up AMD handling for -fopenmp-targets=amdgcn arch inference
Previously we had some special handling here that errored out if
multiple architectures were detected. This isn't a problem anymore as
the runtime can handle multi-archicture binaries automatically. So it's
safe to simply take the first architecture that we know works. If users
use `--offload-arch=native` instead it will build for all the
architectures at the same time rather than just picking one. This patch
makes it consisten with the NVPTX version.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142138
2023-01-20 17:33:56 -06:00
Joseph Huber
56ebfca4bc [CUDA][HIP] Add support for --offload-arch=native to CUDA and refactor
This patch adds basic support for `--offload-arch=native` to CUDA. This
is done using the `nvptx-arch` tool that was introduced previously. Some
of the logic for handling executing these tools was factored into a
common helper as well. This patch does not add support for OpenMP or the
"new" driver. That will be done later.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D141051
2023-01-11 10:30:30 -06:00
Joseph Huber
194ec844f5 [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU
Previously, we linked in the ROCm device libraries which provide math
and other utility functions late. This is not stricly correct as this
library contains several flags that are only set per-TU, such as fast
math or denormalization. This patch changes this to pass the bitcode
libraries per-TU using the same method we use for the CUDA libraries.
This has the advantage that we correctly propagate attributes making
this implementation more correct. Additionally, many annoying unused
functions were not being fully removed during LTO. This lead to
erroneous warning messages and remarks on unused functions.

I am not sure if not finding these libraries should be a hard error. let
me know if it should be demoted to a warning saying that some device
utilities will not work without them.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D133726
2022-09-14 09:42:06 -05:00
Joseph Huber
47166968db [OpenMP] Deprecate the old driver for OpenMP offloading
Recently OpenMP has transitioned to using the "new" driver which
primarily merges the device and host linking phases into a single
wrapper that handles both at the same time. This replaced a few tools
that were only used for OpenMP offloading, such as the
`clang-offload-wrapper` and `clang-nvlink-wrapper`. The new driver
carries some marked benefits compared to the old driver that is now
being deprecated. Things like device-side LTO, static library
support, and more compatible tooling. As such, we should be able to
completely deprecate the old driver, at least for OpenMP. The old driver
support will still exist for CUDA and HIP, although both of these can
currently be compiled on Linux with `--offload-new-driver` to use the new
method.

Note that this does not deprecate the `clang-offload-bundler`, although
it is unused by OpenMP now, it is still used by the HIP toolchain both
as their device binary format and object format.

When I proposed deprecating this code I heard some vendors voice
concernes about needing to update their code in their fork. They should
be able to just revert this commit if it lands.

Reviewed By: jdoerfert, MaskRay, ye-luo

Differential Revision: https://reviews.llvm.org/D130020
2022-08-26 13:47:09 -05:00
Kazu Hirata
f5ef2c5838 [clang] Convert for_each to range-based for loops (NFC) 2022-06-10 22:39:45 -07:00
Joseph Huber
8477a0d769 [OpenMP] Allow compiling multiple target architectures with OpenMP
This patch adds support for OpenMP to use the `--offload-arch` and
`--no-offload-arch` options. Traditionally, OpenMP has only supported
compiling for a single architecture via the `-Xopenmp-target` option.
Now we can pass in a bound architecture and use that if given, otherwise
we default to the value of the `-march` option as before.

Note that this only applies the basic support, the OpenMP target runtime
does not yet know how to choose between multiple architectures.
Additionally other parts of the offloading toolchain (e.g. LTO) require
the `-march` option, these should be worked out later.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D124721
2022-05-06 16:57:16 -04:00
Ron Lieberman
f1e7ecaa18 Revert "[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain"
This reverts commit cc2139524f77248c7e147d4cc3befb31fe3e6daa.

failed a few buildbots
2022-04-02 13:25:50 +00:00
Ron Lieberman
cc2139524f [AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain
authored by amit.pandey@amd.com  ampandey-AMD

Differential Revision: https://reviews.llvm.org/D122781
2022-04-02 11:01:09 +00:00
Joseph Huber
a0e8077d28 [OpenMP][NFC] Simplify identifying the device bitcode library
Now that the old device runtime has been deleted there is only a single
target that differs by the triple and the architecture. Simplify the
scheme for identifying the library but directly using the triple.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D119638
2022-02-12 14:55:47 -05:00
Joseph Huber
5966c2ec02 [OpenMP] Fix mismatched device runtime name
Summary:
The new runtime was deleted. AMD's old runtime used the triple name
`amdgcn` while the new runtime used `amdgpu`. This was not updated when
the old runtime was removed causing the library to not be found on
AMDGPU.
2022-02-04 16:54:31 -05:00
Joseph Huber
034adaf5be [OpenMP] Completely remove old device runtime
This patch completely removes the old OpenMP device runtime. Previously,
the old runtime had the prefix `libomptarget-new-` and the old runtime
was simply called `libomptarget-`. This patch makes the formerly new
runtime the only runtime available. The entire project has been deleted,
and all references to the `libomptarget-new` runtime has been replaced
with `libomptarget-`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D118934
2022-02-04 15:31:33 -05:00
Joseph Huber
3762111aa9 [OpenMP] Link the bitcode library late for device LTO
Summary:
This patch adds support for linking the OpenMP device bitcode library
late when doing LTO. This simply passes it in as an additional device
file when doing the final device linking phase with LTO. This has the
advantage that we don't link it multiple times, and the device
references do not get inlined and prevent us from doing needed OpenMP
optimizations when we have visiblity of the whole module.
Fix some failings where the implicit conversion of an Error to an
Expected triggered the deleted copy constructor.

Depends on D116675

Differential revision: https://reviews.llvm.org/D117048
2022-01-31 23:11:41 -05:00
Johannes Doerfert
6f2ee1ca5e [OpenMP][AMDGPU] Optimize the linked in math libraries
Once we linked in math files, potentially even if we link in only other
"system libraries", we want to optimize the code again. This is not only
reasonable but also helps to hide various problems with the missing
attribute annotations in the math libraries.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D116906
2022-01-19 23:36:36 -06:00
Joseph Huber
a9935b5db7 [openmp] Unconditionally set march commandline argument
Extracted from D117246. This reflects the march value used by the
compile back into the toolchain arguments, letting downstream processes
such as LTO rely on it being present. Subsequent patches should also be able
to remove the two other calls to checkSystemForAMDGPU.

Reviewed By: jonchesterfield

Differential Revision: https://reviews.llvm.org/D117706
2022-01-19 19:14:47 +00:00
Saiyedul Islam
32357266fd [Clang][NFC] Fix multiline comment prefixes in function headers
Cleanup of D105191 after latest clang-format changes.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D111545
2022-01-04 11:51:31 +00:00
Jon Chesterfield
6bb2a4f3e6 [openmp] Default to new rtl for amdgpu
Reverts D114965 as the compiler backend appears to be working again

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D115157
2021-12-06 16:56:14 +00:00
Joseph Huber
96ff74a0d5 [OpenMP] Remove the new runtime default for AMDGPU
The new runtime is currently broken for AMD offloading. This patch makes
the default the old runtime only for the AMD target.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D114965
2021-12-02 12:35:58 -05:00
Joseph Huber
c99407e31c [OpenMP] Make the new device runtime the default
This patch changes the `-fopenmp-target-new-runtime` option which controls if
the new or old device runtime is used to be true by default.  Disabling this to
use the old runtime now requires using `-fno-openmp-target-new-runtime`.

Reviewed By: JonChesterfield, tianshilei1992, gregrodgers, ronlieb

Differential Revision: https://reviews.llvm.org/D114890
2021-12-02 11:11:45 -05:00
Jon Chesterfield
0e738323a9 [openmp][amdgpu] Add comment warning that libm may be broken
Using llvm-link to add rocm device-libs probably doesn't work

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112639
2021-11-15 15:56:01 +00:00
Jon Chesterfield
4d50803ce4 [libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
CI is showing a different set of pass/fails to local, committing this
without the tests enabled by default while debugging that difference.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227
2021-10-28 12:34:01 +01:00
Jon Chesterfield
6c7b203d1d Revert "[libomptarget] Build DeviceRTL for amdgpu"
- more tests failing on CI than failed locally when writing this patch

This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.
2021-10-28 01:01:53 +01:00
Jon Chesterfield
33427fdb7b [libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227
2021-10-28 00:41:45 +01:00
Saiyedul Islam
35ebe4cc24 [Clang][OpenMP] Add partial support for Static Device Libraries
An archive containing device code object files can be passed to
clang command line for linking. For each given offload target
it creates a device specific archives which is either passed to llvm-link
if the target is amdgpu, or to clang-nvlink-wrapper if the target is
nvptx. -L/-l flags are used to specify these fat archives on the command
line. E.g.
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib

It currently doesn't support linking an archive directly, like:
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a

Linking with x86 offload also does not work.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D105191
2021-10-08 09:37:51 +00:00
Saiyedul Islam
94e2b0258a Revert "[Clang][OpenMP] Add partial support for Static Device Libraries"
This reverts commit 4c4117089599cb5b6c6fa5635c28462ffd1bddf4.
2021-10-07 14:13:24 +00:00
Saiyedul Islam
4c41170895 [Clang][OpenMP] Add partial support for Static Device Libraries
An archive containing device code object files can be passed to
clang command line for linking. For each given offload target
it creates a device specific archives which is either passed to llvm-link
if the target is amdgpu, or to clang-nvlink-wrapper if the target is
nvptx. -L/-l flags are used to specify these fat archives on the command
line. E.g.
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib

It currently doesn't support linking an archive directly, like:
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a

Linking with x86 offload also does not work.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D105191
2021-10-07 04:45:19 +00:00
Pushpinder Singh
60e07a9568 [AMDGPU][OpenMP] Use llvm-link to link ocml libraries
This fixes the 'unused linker option: -lm' warning when compiling
program with -c.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D107952
2021-08-13 13:36:57 +05:30
Pushpinder Singh
9830f902e4 [AMDGPU][OpenMP] Support linking of math libraries
Math libraries are linked only when -lm is specified. This is because
host system could be missing rocm-device-libs.

Reviewed By: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D105981
2021-07-30 13:53:44 +00:00
Jan Svoboda
60426f33b1 [clang][driver] NFC: Move InputInfo.h from lib to include
Moving `InputInfo.h` from `lib/Driver/` into `include/Driver` to be able to expose it in an API consumed from outside of `clangDriver`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106787
2021-07-27 09:17:39 +02:00
Joseph Huber
d297211692 [OpenMP] Add a driver flag to enable the new device runtime library
This patch adds a driver flag `-fopenmp-target-new-runtime` to optionally enable the new device runtime
bitcode library. This allows users to enable the new experimental runtime
before it becomes the default in the future.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106793
2021-07-26 16:35:56 -04:00
Pushpinder Singh
59ad4e0f01 Reapply "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed"
This reverts commit 93604305bb72201641f31cc50a6e7b2fe65d3af3.
2021-04-27 10:47:05 +00:00
Pushpinder Singh
93604305bb Revert "Reapply "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed""
This reverts commit 15be0c41d2e59fb4599c9aebf21ede498c61f51d.
2021-04-27 02:23:44 +00:00
Jon Chesterfield
15be0c41d2 Reapply "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed"
This reverts commit 24c1ed3b34f7602b955e52cd8a362f4e27eb5f20.
2021-04-23 01:07:16 +01:00
Jon Chesterfield
24c1ed3b34 Revert "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed"
This reverts commit 722d4d8e7585457d407d0639a4ae2610157e06a8.

Unclear where hsa.h should be included from, see report in D99949
2021-04-22 19:39:37 +01:00
Pushpinder Singh
722d4d8e75 [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed
This patch adds new clang tool named amdgpu-arch which uses
HSA to detect installed AMDGPU and report back latter's march.
This tool is built only if system has HSA installed.

The value printed by amdgpu-arch is used to fill -march when
latter is not explicitly provided in -Xopenmp-target.

Reviewed By: JonChesterfield, gregrodgers

Differential Revision: https://reviews.llvm.org/D99949
2021-04-22 05:20:28 +00:00
Pushpinder Singh
0ad50bf27f Revert "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed"
This reverts commit 3194761d2763a471dc6426a3e77c1445cb9ded3b.
2021-04-21 08:05:38 +00:00
Pushpinder Singh
3194761d27 [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed
This patch adds new clang tool named amdgpu-arch which uses
HSA to detect installed AMDGPU and report back latter's march.
This tool is built only if system has HSA installed.

The value printed by amdgpu-arch is used to fill -march when
latter is not explicitly provided in -Xopenmp-target.

Reviewed By: JonChesterfield, gregrodgers

Differential Revision: https://reviews.llvm.org/D99949
2021-04-21 05:05:49 +00:00
Pushpinder Singh
efc013ec4d Revert "[AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed"
This reverts commit 7029cffc4e78556cfe820791c612968bb15b2ffb.
2021-04-16 09:16:58 +00:00
Pushpinder Singh
7029cffc4e [AMDGPU][OpenMP] Add amdgpu-arch tool to list AMD GPUs installed
This patch adds new clang tool named amdgpu-arch which uses
HSA to detect installed AMDGPU and report back latter's march.
This tool is built only if system has HSA installed.

The value printed by amdgpu-arch is used to fill -march when
latter is not explicitly provided in -Xopenmp-target.

Reviewed By: JonChesterfield, gregrodgers

Differential Revision: https://reviews.llvm.org/D99949
2021-04-16 05:26:20 +00:00
Pushpinder Singh
fc12a64ecc [OpenMP][AMDGPU] Skip backend and assemble phases for amdgcn
Remove emit-llvm-bc from addClangTargetOptions as it conflicts with -E for save-temps.

AMDGCN does not yet support linking object files so backend and assemble actions are
skipped, leaving LLVM IR as the output format.

Reviewed By: JonChesterfield, ronlieb

Differential Revision: https://reviews.llvm.org/D96769
2021-03-16 04:58:14 +00:00
Pushpinder Singh
79401b43ce [OpenMP][AMDGPU] Add support for linking libomptarget bitcode
This patch uses the existing logic of CUDA for searching libomptarget
and extracts it to a common method.

Reviewed By: JonChesterfield, tianshilei1992

Differential Revision: https://reviews.llvm.org/D96248
2021-02-12 00:42:41 -05:00
Pushpinder Singh
fcf03e7280 [OpenMP] Add OpenMP offloading toolchain for AMDGPU
This patch adds AMDGPUOpenMPToolChain for supporting OpenMP
offloading to AMD GPU's.

Originally authored by Greg Rodgers

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94961
2021-02-03 00:42:52 -05:00