63 Commits

Author SHA1 Message Date
Jakub Chlanda
073460a2a3
[HIP][Clang][Driver] Move BC preference logic into ROCm detection (#149294)
This patch provides a single point for handling the logic behind
choosing common bitcode libraries. The intention is that the users of
ROCm installation detector will not have to rewrite options handling
code each time the bitcode libraries are queried. This is similar to
the detectors for other architectures, which encapsulate the same kind
of decision making and provide a cleaner interface. The only flag left
in `getCommonBitcodeLibs` (the main point of entry) is `NeedsASanRT`.
This is deliberate: calculating it requires consulting `ToolChain`.
2025-07-23 09:45:11 +02:00
Kazu Hirata
6c37341943
[Driver] Remove unused includes (NFC) (#141448)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-26 09:13:36 -07:00
Joseph Huber
f6e3d33c00
[Clang][NFC] Introduce --offloadlib positive flag for nogpulib and alias to --no-offloadlib (#126567)
Summary:
We support `nogpulib` to disable implicit libraries. In the future we
will want to change the default linking of these libraries based on the
user language. This patch just introduces a positive variant, so that
`-nogpulib -gpulib` re-enables the libraries.

A later patch will make the default a variable in the ROCmToolChain
depending on the target languages.
2025-02-13 07:59:08 -06:00
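The positive/negative flag pair introduced in f6e3d33c00 can be illustrated with a minimal sketch. It assumes the usual driver convention that the last occurrence of either spelling wins; `has_flag` and `links_offload_libs` are hypothetical names for illustration, not Clang functions.

```python
def has_flag(args, positive, negative, default):
    """Return True/False according to whichever of the two flags appears last.

    Falls back to `default` when neither flag is present.
    """
    result = default
    for arg in args:
        if arg == positive:
            result = True
        elif arg == negative:
            result = False
    return result


def links_offload_libs(args, default=True):
    # Hypothetical helper: treats -gpulib / -nogpulib as the positive and
    # negative spellings of one option. The default is a parameter because
    # the commit message says it may later depend on the target language.
    return has_flag(args, "-gpulib", "-nogpulib", default)
```

Under this assumption, `links_offload_libs(["-nogpulib", "-gpulib"])` evaluates to `True`, because the later positive flag overrides the earlier negative one.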
Amit Kumar Pandey
46f1bab793
Reapply "[Driver][ROCm][OpenMP] Fix default ockl linking for OpenMP."… (#126671)
- This reverts commit
0c6c4a9993.
  - Add '-mcode-object-version=5' to explicitly use code object
    version 5, matching the 'FAIL' diagnostic.
  - Add a REQUIRES directive so the lit test only runs on platforms
    registered with x86_64 and amdgpu.
2025-02-12 13:40:51 +05:30
Florian Mayer
0c6c4a9993
Revert "[Driver][ROCm][OpenMP] Fix default ockl linking for OpenMP." (#126628)
Reverts llvm/llvm-project#126186

This broke the sanitizer buildbot:
https://lab.llvm.org/buildbot/#/builders/55/builds/6846
2025-02-10 16:35:26 -08:00
Amit Kumar Pandey
c69be3fe4b
[Driver][ROCm][OpenMP] Fix default ockl linking for OpenMP. (#126186)
Linking the ASan GPU runtime (asanrtl.bc) depends on 'ockl.bc'. Link
'ockl.bc' only when ASan is enabled for an OpenMP AMDGPU offloading
application.
2025-02-10 21:41:49 +05:30
Amit Kumar Pandey
646d352ab0
[OpenMP][ASan] Enable ASan Instrumentation for AMDGPUOpenMPToolChain. (#124754)
Enable device code ASan instrumentation for openmp offload applications
using option '-fsanitize=address'.
2025-02-05 13:37:31 +05:30
Joseph Huber
272ce90ed4
[Clang] Make OpenMP offloading consistently use the bound architecture (#125135)
Summary:
OpenMP was weirdly split between using the bound architecture from
`--offload-arch=` and the old `-march=` option which only worked for
single jobs. This patch removes that special handling. The main benefit
here is that we can now use `getToolchainArgs` without it throwing an
error.

I'm assuming SYCL doesn't care about this because they don't use an
architecture.
2025-01-31 10:32:24 -06:00
Amit Kumar Pandey
b68b4f64a2
[Driver][ASan] Refactor Clang-Driver "Sanitizer Bitcode" linking. (#123922)
ASan bitcode linking is currently available for HIPAMD, OpenMP and
OpenCL. Move the sanitizer-specific common logic into appropriate APIs
to reduce code redundancy and improve maintainability.
2025-01-30 16:28:03 +05:30
Joseph Huber
12e8e0b10c
[AMDGPU] Correctly use the auxiliary toolchain to include libc++ (#109366)
Summary:
Now that we have a functional build for `libc++` on the GPU, the
target-specific headers in `include/amdgcn-amd-amdhsa` are found. This
is a problem for offloading via OpenMP because we need the CPU and GPU
headers to match exactly. All the other toolchains forward this
correctly except the AMDGPU OpenMP one. Fix this by overriding it to
use the host toolchain instead of the device one, so the triple is not
returned as `amdgcn-amd-amdhsa`.
2024-09-20 09:29:59 -07:00
macurtis-amd
13dd795ef1
[clang][NFC] Make OffloadLTOMode getter a separate method (#101200)
Minor readability improvement (IMHO). Also makes it easier to find the
places where we are getting the offload lto mode.
2024-08-05 10:06:51 -05:00
Dominik Adamski
14c323cfd6
[OpenMP][AMDGPU] Do not attach -fcuda-is-device (#99002)
-fcuda-is-device flag is not used for OpenMP offloading for AMD GPUs and
it does not need to be added as clang cc1 option for OpenMP code.

This PR has the same functionality as
https://github.com/llvm/llvm-project/pull/96909 but it doesn't introduce
a regression for virtual function support.
2024-07-18 09:00:09 +02:00
Dominik Adamski
5a1a467229
Revert "[AMDGPU][OpenMP] Do not attach -fcuda-is-device flag for AMDGPU OpenMP" (#97531)
Reverts llvm/llvm-project#96909 (commit ID: 8bb00cb160830ec8f6029c2aae79d3e46b04b99c)

It breaks OpenMP CI:
https://gitlab.e4s.io/uo-public/llvm-openmp-offloading/-/jobs/283716
2024-07-03 11:42:32 +02:00
Dominik Adamski
8bb00cb160
[AMDGPU][OpenMP] Do not attach -fcuda-is-device flag for AMDGPU OpenMP (#96909)
`-fcuda-is-device` flag is not used for OpenMP offloading for AMD GPUs
and it does not need to be added as clang cc1 option for OpenMP code.
2024-07-01 11:36:49 +02:00
Jakub Chlanda
ab20086422
[CUDA][NFC] CudaArch to OffloadArch rename (#97028)
Rename `CudaArch` to `OffloadArch` to better reflect its content and
use.
Apply a similar rename to helpers handling the enum.
2024-06-30 07:56:07 +02:00
Joseph Huber
374f6554c3
[OpenMP] Fix passing target id features to AMDGPU offloading (#94765)
Summary:
AMDGPU supports a `target-id` feature which is used to qualify targets
with different incompatible features. These act both as rules and as
target features. Currently, we pass `-target-cpu` twice when offloading
to OpenMP, and do not pass the target-id features at all. The effect was
that passing something like `--offload-arch=gfx90a:xnack+` would show up
as `-target-cpu=gfx90a:xnack+ -target-cpu=gfx90a`, so the xnack setting
was ignored completely and the CPU was passed twice. This patch fixes
that to pass the CPU once and then separate out the features, as HIP
does.
2024-06-07 11:14:16 -05:00
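The separation described in 374f6554c3 can be sketched as follows. This is a minimal illustration, not the driver's implementation: the function names are hypothetical, and forwarding each feature as `-target-feature +xnack` is an assumption, since the commit message only says the features are separated "like how HIP does".

```python
def split_target_id(target_id):
    """Split 'gfx90a:xnack+:sramecc-' into ('gfx90a', ['+xnack', '-sramecc'])."""
    processor, *settings = target_id.split(":")
    features = []
    for setting in settings:
        if len(setting) < 2 or setting[-1] not in "+-":
            raise ValueError(f"malformed target-id feature: {setting!r}")
        # 'xnack+' becomes '+xnack': the sign moves to the front.
        features.append(setting[-1] + setting[:-1])
    return processor, features


def cc1_target_args(target_id):
    # Assumed output shape: the processor is passed once, followed by one
    # -target-feature argument per target-id feature.
    processor, features = split_target_id(target_id)
    args = ["-target-cpu", processor]
    for feature in features:
        args += ["-target-feature", feature]
    return args
```

With this sketch, `cc1_target_args("gfx90a:xnack+")` yields a single `-target-cpu gfx90a` followed by the xnack feature, rather than two `-target-cpu` arguments.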
Yaxun (Sam) Liu
46b6756255
[AMDGPU] Diagnose unaligned atomic (#80322)
AMDGPU does not support unaligned atomics, therefore make the warning an
error.

This patch is transferred from

https://reviews.llvm.org/D99201
2024-02-02 10:41:47 -05:00
serge-sans-paille
0ffaffcaac
Reapply 6fa2abf90886f18472c87bc9bffbcdf4f73c465e
Lazyly initialize uncommon toolchain detector

Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Reapplied since 77910ac374656319ff114ef251fda358d4aa166a landed and
fixes the test ordering issue.

Differential Revision: https://reviews.llvm.org/D142606
2023-02-06 16:44:11 +01:00
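The lazy initialization described in 0ffaffcaac follows a common pattern, sketched here in minimal form. `LazyDetector` and `detect_rocm` are hypothetical stand-ins, and the returned path is made up; the real detectors probe the filesystem.

```python
class LazyDetector:
    """Runs an expensive detection routine on first use and caches the result."""

    def __init__(self, detect):
        self._detect = detect
        self._result = None
        self._ran = False

    def get(self):
        if not self._ran:
            self._result = self._detect()
            self._ran = True
        return self._result


calls = []


def detect_rocm():
    # Stand-in for the discovery code; records that it was invoked.
    calls.append("rocm")
    return "/opt/rocm"


# Constructing the wrapper does not run the detection.
rocm = LazyDetector(detect_rocm)
```

The discovery code runs only when `rocm.get()` is first called, so compilations that never query the detector pay nothing for it.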
Jonas Hahnfeld
b5ee4f755f Revert "Lazyly initialize uncommon toolchain detector"
clang/test/Driver/rocm-detect.hip is failing for a number of
configurations, for example:

clang-x86_64-debian-fast
https://lab.llvm.org/buildbot/#/builders/109/builds/57270

clang-debian-cpp20
https://lab.llvm.org/buildbot/#/builders/249/builds/310

clang-with-lto-ubuntu
https://lab.llvm.org/buildbot/#/builders/124/builds/6693

This reverts commit 6fa2abf90886f18472c87bc9bffbcdf4f73c465e.
2023-02-06 15:39:33 +01:00
serge-sans-paille
6fa2abf908
Lazyly initialize uncommon toolchain detector
Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Differential Revision: https://reviews.llvm.org/D142606
2023-02-06 12:03:00 +01:00
Joseph Huber
db202286eb [Clang][NFC] Fix out-of-date comments on 'clang-offload-bundler'
Summary:
These comments are confusing as the `clang-offload-bundler` is no longer
used by these toolchains.
2023-01-26 13:03:01 -06:00
Joseph Huber
5d1dc9fa04 [OpenMP] Do not link the bitcode OpenMP runtime when targeting AMDGPU.
The AMDGPU target can only emit LLVM-IR, so we can always rely on LTO to
link the static version of the runtime optimally. Using only the static
library has a few advantages: it avoids several known bugs and allows
us to optimize out more functions. This is legal since the changes in
D142486 and D142484.

Depends on D142486 D142484

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142491
2023-01-24 17:01:37 -06:00
Joseph Huber
255922be7f [OpenMP] Clean up AMD handling for -fopenmp-targets=amdgcn arch inference
Previously we had some special handling here that errored out if
multiple architectures were detected. This isn't a problem anymore as
the runtime can handle multi-architecture binaries automatically. So
it's safe to simply take the first architecture that we know works. If
users use `--offload-arch=native` instead it will build for all the
architectures at the same time rather than just picking one. This patch
makes it consistent with the NVPTX version.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D142138
2023-01-20 17:33:56 -06:00
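The selection rule in 255922be7f can be sketched as below. This is an illustration under stated assumptions: `select_offload_archs` is a hypothetical function, an empty `requested` list models the `-fopenmp-targets=amdgcn` inference path, and `detected` stands for whatever architectures the system query returns.

```python
def select_offload_archs(requested, detected):
    """Pick the architectures to compile for.

    requested: values given via --offload-arch (may be empty)
    detected:  architectures found on the system, in detection order
    """
    if not requested:
        # No explicit request: take the first detected architecture
        # instead of erroring out when several are present.
        return detected[:1]
    archs = []
    for arch in requested:
        # 'native' expands to every detected architecture.
        expanded = detected if arch == "native" else [arch]
        for a in expanded:
            if a not in archs:
                archs.append(a)
    return archs
```

On a machine with two different GPUs, the inference path picks only the first one, while an explicit `native` request builds for both.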
Joseph Huber
56ebfca4bc [CUDA][HIP] Add support for --offload-arch=native to CUDA and refactor
This patch adds basic support for `--offload-arch=native` to CUDA. This
is done using the `nvptx-arch` tool that was introduced previously. Some
of the logic for handling executing these tools was factored into a
common helper as well. This patch does not add support for OpenMP or the
"new" driver. That will be done later.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D141051
2023-01-11 10:30:30 -06:00
Joseph Huber
194ec844f5 [OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU
Previously, we linked in the ROCm device libraries which provide math
and other utility functions late. This is not strictly correct as this
library contains several flags that are only set per-TU, such as fast
math or denormalization. This patch changes this to pass the bitcode
libraries per-TU using the same method we use for the CUDA libraries.
This has the advantage that we correctly propagate attributes, making
this implementation more correct. Additionally, many annoying unused
functions were not being fully removed during LTO. This led to
erroneous warning messages and remarks on unused functions.

I am not sure if not finding these libraries should be a hard error. Let
me know if it should be demoted to a warning saying that some device
utilities will not work without them.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D133726
2022-09-14 09:42:06 -05:00
Joseph Huber
47166968db [OpenMP] Deprecate the old driver for OpenMP offloading
Recently OpenMP has transitioned to using the "new" driver which
primarily merges the device and host linking phases into a single
wrapper that handles both at the same time. This replaced a few tools
that were only used for OpenMP offloading, such as the
`clang-offload-wrapper` and `clang-nvlink-wrapper`. The new driver
carries some marked benefits compared to the old driver that is now
being deprecated, such as device-side LTO, static library support, and
more compatible tooling. As such, we should be able to
completely deprecate the old driver, at least for OpenMP. The old driver
support will still exist for CUDA and HIP, although both of these can
currently be compiled on Linux with `--offload-new-driver` to use the new
method.

Note that this does not deprecate the `clang-offload-bundler`, although
it is unused by OpenMP now, it is still used by the HIP toolchain both
as their device binary format and object format.

When I proposed deprecating this code I heard some vendors voice
concerns about needing to update their code in their fork. They should
be able to just revert this commit if it lands.

Reviewed By: jdoerfert, MaskRay, ye-luo

Differential Revision: https://reviews.llvm.org/D130020
2022-08-26 13:47:09 -05:00
Kazu Hirata
f5ef2c5838 [clang] Convert for_each to range-based for loops (NFC) 2022-06-10 22:39:45 -07:00
Joseph Huber
8477a0d769 [OpenMP] Allow compiling multiple target architectures with OpenMP
This patch adds support for OpenMP to use the `--offload-arch` and
`--no-offload-arch` options. Traditionally, OpenMP has only supported
compiling for a single architecture via the `-Xopenmp-target` option.
Now we can pass in a bound architecture and use that if given, otherwise
we default to the value of the `-march` option as before.

Note that this only adds the basic support; the OpenMP target runtime
does not yet know how to choose between multiple architectures.
Additionally, other parts of the offloading toolchain (e.g. LTO) require
the `-march` option. These should be worked out later.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D124721
2022-05-06 16:57:16 -04:00
Ron Lieberman
f1e7ecaa18 Revert "[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain"
This reverts commit cc2139524f77248c7e147d4cc3befb31fe3e6daa.

failed a few buildbots
2022-04-02 13:25:50 +00:00
Ron Lieberman
cc2139524f [AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain
authored by amit.pandey@amd.com  ampandey-AMD

Differential Revision: https://reviews.llvm.org/D122781
2022-04-02 11:01:09 +00:00
Joseph Huber
a0e8077d28 [OpenMP][NFC] Simplify identifying the device bitcode library
Now that the old device runtime has been deleted there is only a single
target that differs by the triple and the architecture. Simplify the
scheme for identifying the library by directly using the triple.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D119638
2022-02-12 14:55:47 -05:00
Joseph Huber
5966c2ec02 [OpenMP] Fix mismatched device runtime name
Summary:
The old runtime was deleted. AMD's old runtime used the triple name
`amdgcn` while the new runtime used `amdgpu`. This was not updated when
the old runtime was removed, causing the library to not be found on
AMDGPU.
2022-02-04 16:54:31 -05:00
Joseph Huber
034adaf5be [OpenMP] Completely remove old device runtime
This patch completely removes the old OpenMP device runtime. Previously,
the new runtime had the prefix `libomptarget-new-` and the old runtime
was simply called `libomptarget-`. This patch makes the formerly new
runtime the only runtime available. The entire project has been deleted,
and all references to the `libomptarget-new` runtime have been replaced
with `libomptarget-`.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D118934
2022-02-04 15:31:33 -05:00
Joseph Huber
3762111aa9 [OpenMP] Link the bitcode library late for device LTO
Summary:
This patch adds support for linking the OpenMP device bitcode library
late when doing LTO. This simply passes it in as an additional device
file when doing the final device linking phase with LTO. This has the
advantage that we don't link it multiple times, and the device
references do not get inlined and prevent us from doing needed OpenMP
optimizations when we have visibility of the whole module.
Fix some failures where the implicit conversion of an Error to an
Expected triggered the deleted copy constructor.

Depends on D116675

Differential revision: https://reviews.llvm.org/D117048
2022-01-31 23:11:41 -05:00
Johannes Doerfert
6f2ee1ca5e [OpenMP][AMDGPU] Optimize the linked in math libraries
Once we linked in math files, potentially even if we link in only other
"system libraries", we want to optimize the code again. This is not only
reasonable but also helps to hide various problems with the missing
attribute annotations in the math libraries.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D116906
2022-01-19 23:36:36 -06:00
Joseph Huber
a9935b5db7 [openmp] Unconditionally set march commandline argument
Extracted from D117246. This reflects the march value used by the
compile back into the toolchain arguments, letting downstream processes
such as LTO rely on it being present. Subsequent patches should also be able
to remove the two other calls to checkSystemForAMDGPU.

Reviewed By: jonchesterfield

Differential Revision: https://reviews.llvm.org/D117706
2022-01-19 19:14:47 +00:00
Saiyedul Islam
32357266fd [Clang][NFC] Fix multiline comment prefixes in function headers
Cleanup of D105191 after latest clang-format changes.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D111545
2022-01-04 11:51:31 +00:00
Jon Chesterfield
6bb2a4f3e6 [openmp] Default to new rtl for amdgpu
Reverts D114965 as the compiler backend appears to be working again

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D115157
2021-12-06 16:56:14 +00:00
Joseph Huber
96ff74a0d5 [OpenMP] Remove the new runtime default for AMDGPU
The new runtime is currently broken for AMD offloading. This patch makes
the default the old runtime only for the AMD target.

Reviewed By: ronlieb

Differential Revision: https://reviews.llvm.org/D114965
2021-12-02 12:35:58 -05:00
Joseph Huber
c99407e31c [OpenMP] Make the new device runtime the default
This patch changes the `-fopenmp-target-new-runtime` option which controls if
the new or old device runtime is used to be true by default.  Disabling this to
use the old runtime now requires using `-fno-openmp-target-new-runtime`.

Reviewed By: JonChesterfield, tianshilei1992, gregrodgers, ronlieb

Differential Revision: https://reviews.llvm.org/D114890
2021-12-02 11:11:45 -05:00
Jon Chesterfield
0e738323a9 [openmp][amdgpu] Add comment warning that libm may be broken
Using llvm-link to add rocm device-libs probably doesn't work

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D112639
2021-11-15 15:56:01 +00:00
Jon Chesterfield
4d50803ce4 [libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
CI is showing a different set of passes and failures from local runs, so
this is committed without the tests enabled by default while that
difference is debugged.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227
2021-10-28 12:34:01 +01:00
Jon Chesterfield
6c7b203d1d Revert "[libomptarget] Build DeviceRTL for amdgpu"
- more tests failing on CI than failed locally when writing this patch

This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.
2021-10-28 01:01:53 +01:00
Jon Chesterfield
33427fdb7b [libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D112227
2021-10-28 00:41:45 +01:00
Saiyedul Islam
35ebe4cc24 [Clang][OpenMP] Add partial support for Static Device Libraries
An archive containing device code object files can be passed on the
clang command line for linking. For each given offload target
the driver creates a device-specific archive which is either passed to
llvm-link if the target is amdgpu, or to clang-nvlink-wrapper if the
target is nvptx. -L/-l flags are used to specify these fat archives on
the command line. E.g.
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib

It currently doesn't support linking an archive directly, like:
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a

Linking with x86 offload also does not work.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D105191
2021-10-08 09:37:51 +00:00
Saiyedul Islam
94e2b0258a Revert "[Clang][OpenMP] Add partial support for Static Device Libraries"
This reverts commit 4c4117089599cb5b6c6fa5635c28462ffd1bddf4.
2021-10-07 14:13:24 +00:00
Saiyedul Islam
4c41170895 [Clang][OpenMP] Add partial support for Static Device Libraries
An archive containing device code object files can be passed on the
clang command line for linking. For each given offload target
the driver creates a device-specific archive which is either passed to
llvm-link if the target is amdgpu, or to clang-nvlink-wrapper if the
target is nvptx. -L/-l flags are used to specify these fat archives on
the command line. E.g.
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib

It currently doesn't support linking an archive directly, like:
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a

Linking with x86 offload also does not work.

Reviewed By: ye-luo

Differential Revision: https://reviews.llvm.org/D105191
2021-10-07 04:45:19 +00:00
Pushpinder Singh
60e07a9568 [AMDGPU][OpenMP] Use llvm-link to link ocml libraries
This fixes the 'unused linker option: -lm' warning when compiling a
program with -c.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D107952
2021-08-13 13:36:57 +05:30
Pushpinder Singh
9830f902e4 [AMDGPU][OpenMP] Support linking of math libraries
Math libraries are linked only when -lm is specified. This is because
the host system could be missing rocm-device-libs.

Reviewed By: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D105981
2021-07-30 13:53:44 +00:00
Jan Svoboda
60426f33b1 [clang][driver] NFC: Move InputInfo.h from lib to include
Moving `InputInfo.h` from `lib/Driver/` into `include/Driver` to be able to expose it in an API consumed from outside of `clangDriver`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D106787
2021-07-27 09:17:39 +02:00