118 Commits

Author SHA1 Message Date
Jun Wang
86842e1f72
[AMDGPU] New clang option for emitting a waitcnt instruction after each memory instruction (#79236)
This patch introduces a new command-line option for clang, namely,
amdgpu-precise-mem-op (or precise-memory in the backend). When this option is specified, a waitcnt
instruction is generated after each memory load/store instruction. The
counter values are always 0, but which counters are involved depends on
the memory instruction.

---------

Co-authored-by: Jun Wang <jun.wang7@amd.com>
2024-04-10 10:47:04 -07:00
Fangrui Song
2b5cd8be3a [Driver] Remove InstallDir and getInstalledDir. NFC
Follow-up to #80527.
2024-03-03 18:10:46 -08:00
Joseph Huber
99660082cb
[Clang] Append target search paths for direct offloading compilation (#82699)
Summary:
Recent changes to the `libc` project caused the headers to be installed
to `include/<triple>` for the GPU and the libraries to be in
`lib/<triple>`. This means we should automatically append these search
paths so they can be found by default. This allows the following to work
targeting AMDGPU.

```shell
$ clang foo.c -flto -mcpu=native --target=amdgcn-amd-amdhsa -lc <install>/lib/amdgcn-amd-amdhsa/crt1.o
$ amdhsa-loader a.out
```
2024-02-23 14:21:02 -06:00
Yaxun (Sam) Liu
46b6756255
[AMDGPU] Diagnose unaligned atomic (#80322)
AMDGPU does not support unaligned atomics, therefore make the warning an
error.

This patch is transferred from

https://reviews.llvm.org/D99201
2024-02-02 10:41:47 -05:00
Yaxun (Sam) Liu
fcd3752342
[HIP] fix HIP detection for /usr (#80190)
Skip checking HIP version file under parent directory for /usr/local
since /usr will be checked after /usr/local.

Fixes: https://github.com/llvm/llvm-project/issues/78344
2024-02-01 10:33:51 -05:00
Alex Voicu
907f2a0927
[HIP][Driver] Automatically include hipstdpar forwarding header (#78915)
The forwarding header used by `hipstdpar` on AMDGPU targets is now
packaged with `rocThrust`. This change augments the ROCm Driver
component so that it can automatically pick up the packaged header iff
the user hasn't overridden it via the dedicated flag.
2024-01-23 00:55:59 +00:00
Kazu Hirata
f3dcc2351c
[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 08:54:13 -08:00
Joseph Huber
5513d58ad5
[OpenMP][AMDGPU] Do not include 'ockl' implementations in OpenMP (#70462)
Summary:
The 'ockl' bitcode library from the ROCm device library contains several
implementations of functions like `printf` and `malloc`. We currently do
not depend on these in the OpenMP toolchain, so we shouldn't be linking
them. The primary motivation behind this change is the library rewriting
calls to `printf` and pulling in other unused 'hostcall' routines.
2023-10-27 14:56:29 -05:00
Alex Voicu
9a408588d1 [HIP][Clang][Driver] Add Driver support for hipstdpar
This patch adds the Driver changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. What this change does can be summed up as follows:

- add two flags, one for enabling `hipstdpar` compilation, the second enabling the optional allocation interposition mode;
- the flags correspond to new LangOpt members;
- if we are compiling or linking with --hipstdpar, we enable HIP; in the compilation case C and C++ inputs are treated as HIP inputs;
- the ROCm / AMDGPU driver is augmented to look for and include an implementation detail forwarding header; we error out if the user requested `hipstdpar` but the header or its dependencies cannot be found.

Tests for the behaviour described above are also added.

Reviewed by: MaskRay, yaxunl

Differential Revision: https://reviews.llvm.org/D155775
2023-10-03 13:14:46 +01:00
Jacob Lambert
0661533e41 [AMDGPU] Prepend --no-undefined option for linker instead of append
Previously, for linking in amdgpu contexts, the --no-undefined was appended to the options passed to lld,
overriding any user-supplied options via "-Wl," or "-Xlinker". We now prepend --no-undefined so that
the user options are respected.

Differential Revision: https://reviews.llvm.org/D158582
2023-08-23 12:25:01 -07:00
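The ordering change in 0661533e41 can be pictured with a small sketch. It assumes, as the commit message implies, that an option appearing later on the linker command line takes precedence over an earlier one. `build_lld_args` is a hypothetical helper for illustration only and is not the driver's actual C++ code.

```python
def build_lld_args(user_linker_args, prepend=True):
    """Place the driver's default --no-undefined before or after user options.

    Hypothetical helper: the real driver assembles its argument list in C++.
    """
    default = ["--no-undefined"]
    user = list(user_linker_args)
    # Prepending leaves the user's options last, so they can override the default.
    return default + user if prepend else user + default


# An option the user might forward via -Wl, or -Xlinker (illustrative value).
user_args = ["--unresolved-symbols=ignore-all"]
old_order = build_lld_args(user_args, prepend=False)
new_order = build_lld_args(user_args, prepend=True)
```

With `prepend=True` the user-supplied option ends up last in the list, which is the position the commit relies on for user options to be respected.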
Yaxun (Sam) Liu
91b9bdeb92 [AMDGPU] Support -mcpu=native for OpenCL
When -mcpu=native is specified, try detecting the GPU
on the system by using the amdgpu-arch tool. If it
fails to detect a GPU, emit an error about no GPU
being detected. If multiple GPUs are detected,
use the first GPU and emit a warning.

Reviewed by: Matt Arsenault, Fangrui Song

Differential Revision: https://reviews.llvm.org/D154531
2023-07-13 16:21:35 -04:00
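The selection rule described in 91b9bdeb92 (no GPU is an error, several GPUs means use the first and warn) can be sketched as follows. This is a minimal illustration under assumptions: `resolve_native_gpu` is a hypothetical name, the input list stands in for whatever the amdgpu-arch tool reports, and the message texts are invented for the example.

```python
import warnings


def resolve_native_gpu(detected_archs):
    """Pick the architecture to use for -mcpu=native from a list of detected GPUs.

    detected_archs: architecture names in detection order (hypothetical input;
    the real driver obtains them by running the amdgpu-arch tool).
    """
    if not detected_archs:
        # No GPU found: the commit describes this as a hard error.
        raise RuntimeError("cannot resolve -mcpu=native: no GPU detected")
    if len(detected_archs) > 1:
        # Several GPUs found: use the first one and warn.
        warnings.warn(
            "multiple GPUs detected (%s); using %s"
            % (", ".join(detected_archs), detected_archs[0])
        )
    return detected_archs[0]
```

For example, `resolve_native_gpu(["gfx90a"])` returns `"gfx90a"` without a warning, while a two-element list returns its first element and issues one warning.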
Fangrui Song
681cb54a54 [Driver] Fix duplicate -L after D150013
D150013 is to render -L for AMDGPU but updating tools::AddLinkerInputs is wrong
and causes many non-isCrossCompiling targets to have duplicate -L options
because they do `Args.AddAllArgs(CmdArgs, options::OPT_L);`.

Revert the change and add a `Args.AddAllArgs(CmdArgs, options::OPT_L);` instead.
2023-07-07 23:57:45 -07:00
Yaxun (Sam) Liu
41a1625e07 [HIP] Fix version detection for old HIP-PATH
ROCm used to install components under individual directories,
e.g. HIP installed to /opt/rocm/hip and rocblas installed to
/opt/rocm/rocblas. ROCm has transitioned to a flat directory
structure where all components are installed to /opt/rocm.
HIP-PATH and --hip-path are supposed to be /opt/rocm, as
clang detects the HIP version from /opt/rocm/share/hip/version.
However, some existing HIP apps still use HIP-PATH=/opt/rocm/hip.
To avoid regressions, clang will also try to detect share/hip/version
under the parent directory of HIP-PATH or --hip-path.
This way, the detection will work for both new HIP-PATH and
old HIP-PATH.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D154077

Fixes: SWDEV-407757
2023-06-29 14:57:26 -04:00
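The fallback lookup described in 41a1625e07 can be sketched as a two-step search: first under the given HIP path, then under its parent directory. `find_hip_version_file` is a hypothetical helper written for illustration; only the relative location `share/hip/version` comes from the commit message.

```python
from pathlib import Path


def find_hip_version_file(hip_path):
    """Return the first existing share/hip/version file, or None.

    Checks the given path first (new flat layout, e.g. /opt/rocm), then its
    parent directory (old layout, e.g. a HIP path of /opt/rocm/hip).
    """
    root = Path(hip_path)
    for candidate_root in (root, root.parent):
        version_file = candidate_root / "share" / "hip" / "version"
        if version_file.is_file():
            return version_file
    return None
```

With a flat install under /opt/rocm, both `/opt/rocm` and the legacy `/opt/rocm/hip` resolve to the same version file, which is the compatibility the commit aims for.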
Joseph Huber
765301183f [AMDGPU] Always pass -mcpu to the lld linker
Currently, AMDGPU more or less only supports linking with LTO. If the
user does not pass either `-flto` or `-Wl,-plugin-opt=mcpu=` manually,
linking will fail because the architectures aren't compatible. This
patch simply passes `-mcpu` by default if it was specified. Should be a
no-op if it's not actually used.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D153909
2023-06-28 08:52:37 -05:00
Elliot Goodrich
b0abd4893f [llvm] Add missing StringExtras.h includes
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.
2023-06-25 15:42:22 +01:00
Yaxun (Sam) Liu
e40e427a64 [HIP] Fix HIP path detection
Fix two issues:

- --hip-path should not do rigorous checking, i.e. if .hipVersion exists it
  will use it, otherwise it will not error out but assumes the default
  HIP version. This is to be consistent with --rocm-path behavior.

- When HIP_PATH is empty, it should be ignored. This is to be consistent
  with ROCM_PATH behavior.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D152734

Fixes: SWDEV-404771
2023-06-13 11:12:11 -04:00
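The second fix in e40e427a64 (an empty HIP_PATH is treated as unset) can be sketched as below. `select_hip_path` is a hypothetical helper, and the precedence of the command-line option over the environment variable is an assumption made for the example, not something the commit message states.

```python
import os


def select_hip_path(hip_path_option, environ=os.environ):
    """Choose the HIP installation path, or None if nothing usable is set.

    Assumption for this sketch: an explicit --hip-path value wins over the
    HIP_PATH environment variable.
    """
    if hip_path_option:
        return hip_path_option
    env_value = environ.get("HIP_PATH", "")
    # An empty HIP_PATH is ignored rather than treated as a path.
    return env_value if env_value else None
```

Passing a plain dict as `environ` makes the behaviour easy to check: `select_hip_path(None, {"HIP_PATH": ""})` returns `None`, the same as when the variable is absent.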
Yaxun (Sam) Liu
6adb9a0602 [AMDGPU] Emit predefined macro __AMDGCN_CUMODE__
Predefine __AMDGCN_CUMODE__ as 1 or 0 when compilation assumes CU or WGP modes.

If WGP mode is not supported, ignore -mno-cumode and emit a warning.

This is needed for implementing device functions like __smid
(312dff7b79/include/hip/amd_detail/amd_device_functions.h (L957))

Reviewed by: Matt Arsenault, Artem Belevich, Brian Sumner

Differential Revision: https://reviews.llvm.org/D145343
2023-05-12 18:50:52 -04:00
Cordell Bloor
f859835766 [HIP] Detect HIP for Ubuntu, Mint, Gentoo, etc.
HIP may be installed into /usr or /usr/local on a variety of Linux
operating systems. It may become unwieldy to list them all.

Reviewed by: Siu Chi Chan, Yaxun Liu

Differential Revision: https://reviews.llvm.org/D149110
2023-05-09 11:31:57 -04:00
Yaxun (Sam) Liu
6aa74ae29f [HIP] Supports env var HIP_PATH
Currently the HIP toolchain recognizes env var ROCM_PATH and option --rocm-path,
but for HIP it only recognizes --hip-path.

Some package management tools, e.g. Spack, rely on env var HIP_PATH to
be able to load different versions of HIP dynamically.
(https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/hip/package.py#L446)
Therefore add support of env var HIP_PATH.

Reviewed by: Artem Belevich, Fangrui Song

Differential Revision: https://reviews.llvm.org/D145391
2023-04-02 08:46:19 -04:00
Joseph Huber
55f38495e3 [Clang] Always use --no-undefined when linking AMDGPU images
AMDGPU uses ELF shared libraries to implement their executable device
images. One downside to this method is that it disables regular warnings
on undefined symbols. This is because shared libraries expect these to
be resolved by later loads. However, the GPU images do not support
dynamic linking so any undefined symbol is going to cause a runtime
error. This patch adds `--no-undefined` to the `ld.lld` invocation to guarantee
that undefined symbols are always caught as linking errors rather than
runtime errors.

Reviewed By: arsenm, MaskRay, #amdgpu

Differential Revision: https://reviews.llvm.org/D145941
2023-03-14 13:11:33 -05:00
Joseph Huber
c45d2df05e [Clang] Add options in LTO mode when cross compiling for AMDGPU
The AMDGPU toolchain supports directly compiling GPU images using
cross-compilation such as `clang --target=amdgcn-amd-amdhsa foo.c`.
However, when attempting to link bitcode this does not work because the
`-mcpu` options are not forwarded to the linker among others. This patch
simply adds them so that `clang --target=amdgcn-amd-amdhsa foo.c -flto`
works correctly.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D144505
2023-02-22 10:14:05 -06:00
Archibald Elliott
d768bf994f [NFC][TargetParser] Replace uses of llvm/Support/Host.h
The forwarding header is left in place because of its use in
`polly/lib/External/isl/interface/extract_interface.cc`, but I have
added a GCC warning about the fact it is deprecated, because it is used
in `isl` from where it is included by Polly.
2023-02-10 09:59:46 +00:00
serge-sans-paille
0ffaffcaac
Reapply 6fa2abf90886f18472c87bc9bffbcdf4f73c465e
Lazyly initialize uncommon toolchain detector

Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Reapplied since 77910ac374656319ff114ef251fda358d4aa166a landed and
fixes the test ordering issue.

Differential Revision: https://reviews.llvm.org/D142606
2023-02-06 16:44:11 +01:00
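The idea behind the lazy initialization in 0ffaffcaac can be sketched in a few lines: the expensive detection runs only on first use, and its result is cached. `LazyDetector` and the stand-in detection function are hypothetical names for illustration; the actual change is in the C++ driver.

```python
class LazyDetector:
    """Run a detection callable on first access and cache the result."""

    def __init__(self, detect):
        self._detect = detect  # callable performing the (costly) discovery
        self._result = None
        self._ran = False

    def get(self):
        # The discovery code is skipped entirely if get() is never called.
        if not self._ran:
            self._result = self._detect()
            self._ran = True
        return self._result


calls = []


def fake_rocm_detection():
    """Stand-in for a filesystem scan; records each invocation."""
    calls.append("scan")
    return {"found": False}


detector = LazyDetector(fake_rocm_detection)
# Nothing has been scanned yet; the scan happens on the first get() only.
first = detector.get()
second = detector.get()
```

Constructing the detector costs nothing, so compilations that never query it skip the discovery work altogether.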
Jonas Hahnfeld
b5ee4f755f Revert "Lazyly initialize uncommon toolchain detector"
clang/test/Driver/rocm-detect.hip is failing for a number of
configurations, for example:

clang-x86_64-debian-fast
https://lab.llvm.org/buildbot/#/builders/109/builds/57270

clang-debian-cpp20
https://lab.llvm.org/buildbot/#/builders/249/builds/310

clang-with-lto-ubuntu
https://lab.llvm.org/buildbot/#/builders/124/builds/6693

This reverts commit 6fa2abf90886f18472c87bc9bffbcdf4f73c465e.
2023-02-06 15:39:33 +01:00
serge-sans-paille
6fa2abf908
Lazyly initialize uncommon toolchain detector
Cuda and rocm toolchain detectors are currently run unconditionally,
while their result may not be used at all. Make their initialization
lazy so that the discovery code is not run in common cases.

Differential Revision: https://reviews.llvm.org/D142606
2023-02-06 12:03:00 +01:00
Joseph Huber
9271c5da43 [Clang] Adjust PIC handling for the AMDGPU ToolChain
The AMDGPU target only emits shared libraries currently. This patch
changes the handling of the PIC level to be managed in the
AMDGPUToolChain rather than having a special case for it. This causes
`--target=amdgcn--` to no longer set the PIC. This should be an
acceptable change since that doesn't use a correct toolchain anyway.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D142999
2023-01-31 14:31:10 -06:00
Joseph Huber
26d62674cf [Clang] Explicitly move returned values converted to expected
Summary:
These can cause failures on GCC-7 it seems. We should explicitly move
them to prevent this from causing build failures.
2023-01-12 14:38:03 -06:00
Siu Chi Chan
a18fe67b9f [AMDGCN] Update search path for device libraries
- Add support for finding device libraries in new ROCm directory
structure
- Simplify and remove the handling of legacy ROCm directory structure

Change-Id: I04da3bc9da85ced4b56b0225efb6b94448b8c5a1

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D140315
2023-01-11 11:51:30 -05:00
Joseph Huber
56ebfca4bc [CUDA][HIP] Add support for --offload-arch=native to CUDA and refactor
This patch adds basic support for `--offload-arch=native` to CUDA. This
is done using the `nvptx-arch` tool that was introduced previously. Some
of the logic for handling executing these tools was factored into a
common helper as well. This patch does not add support for OpenMP or the
"new" driver. That will be done later.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D141051
2023-01-11 10:30:30 -06:00
Krzysztof Parzyszek
3c255f679c Process: convert Optional to std::optional
This applies to GetEnv and FindInEnvPath.
2022-12-06 09:56:14 -08:00
Juan Manuel MARTINEZ CAAMAÑO
a446827249 [NFC][Clang][Driver][AMDGPU] Avoid temporary copies of std::string by using Twine and StringRef
Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D139023
2022-12-05 07:27:10 -06:00
Fangrui Song
0c2f6e36f9 [Driver] llvm::None => std::nullopt. NFC 2022-12-03 19:43:25 +00:00
Matt Arsenault
e748db0f7f Support: Convert Program APIs to std::optional 2022-12-01 17:00:44 -05:00
Matt Arsenault
840a793375 clang/AMDGPU: Use Support's wrapper around getenv
This does some extra stuff for Windows, so might as well
use it just in case.
2022-11-14 11:07:31 -08:00
Yaxun (Sam) Liu
082593ff7a [HIP] Detect HIP for Debian/Fedora
HIP is installed at /usr or /usr/local on Debian/Fedora,
and the version file is at {root}/share/hip/version.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D135796
2022-10-12 22:59:16 -04:00
Fangrui Song
1491282165 [clang] Change cc1 -fvisibility's canonical spelling to -fvisibility= 2022-09-02 11:49:38 -07:00
Kazu Hirata
a33ef8f2b7 Use llvm::all_equal (NFC) 2022-08-27 09:53:10 -07:00
Kazu Hirata
ca4af13e48 [clang] Don't use Optional::getValue (NFC) 2022-06-20 22:59:26 -07:00
Kazu Hirata
f5ef2c5838 [clang] Convert for_each to range-based for loops (NFC) 2022-06-10 22:39:45 -07:00
Ron Lieberman
f1e7ecaa18 Revert "[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain"
This reverts commit cc2139524f77248c7e147d4cc3befb31fe3e6daa.

failed a few buildbots
2022-04-02 13:25:50 +00:00
Ron Lieberman
cc2139524f [AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain
authored by amit.pandey@amd.com  ampandey-AMD

Differential Revision: https://reviews.llvm.org/D122781
2022-04-02 11:01:09 +00:00
Fangrui Song
c37accf0a2 [Option] Avoid using the default argument for the 3-argument hasFlag. NFC
The default argument true is error-prone: I think many would think the
default is false.
2022-03-26 00:57:06 -07:00
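The pitfall named in c37accf0a2 is easier to see with the semantics spelled out. The sketch below models a positive/negative flag pair where the last occurrence wins and `default` applies only when neither is present. `has_flag` is a Python stand-in written for illustration, not the LLVM API itself, and it deliberately has no default value for `default`.

```python
def has_flag(args, pos, neg, default):
    """Return True if the last of pos/neg in args is pos, False if it is neg.

    Falls back to `default` when neither option appears. Making the caller
    pass `default` explicitly avoids guessing what an implicit value would be.
    """
    for arg in reversed(args):
        if arg == pos:
            return True
        if arg == neg:
            return False
    return default


# Neither flag given: the answer is whatever the caller chose as default.
rdc_enabled = has_flag(["-O2"], "-fgpu-rdc", "-fno-gpu-rdc", False)
```

Requiring the fourth argument at every call site documents the intended fallback where it is used.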
Yaxun (Sam) Liu
6730b44480 [HIP] Fix HIP include path
The clang compiler prepends the HIP header include paths to the search
list using -internal-isystem when building for the HIP language. This
prevents warnings related to things like reserved identifiers when
including the HIP headers even when ROCm is installed in a non-system
directory, such as /opt/rocm.

However, when HIP is installed in /usr, then the prepended include
path would be /usr/include. That is a problem, because the C standard
library headers are stored in /usr/include and the C++ standard
library headers must come before the C library headers in the search
path list (because the C++ standard library headers use #include_next
to include the C standard library headers).

While the HIP wrapper headers _do_ need to be earlier in the search
than the C++ headers, those headers get their own subdirectory and
their own explicit -internal-isystem argument. This include path is for
<hip/hip_runtime_api.h> and <hip/hip_runtime.h>, which do not require a
particular search ordering with respect to the C or C++ headers. Thus,
HIP include path is added after other system include paths.

With contribution from Cordell Bloor.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120132
2022-03-09 20:57:27 -05:00
Yaxun (Sam) Liu
092f15ac40 [HIP] File device library ABI version file name
It should be oclc_abi_version* instead of abi_version*.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D120557
2022-02-28 16:24:50 -05:00
Yaxun (Sam) Liu
d4e4ef2e81 [HIP] Support code object v5
New device library supporting v4 and v5 has abi_version_400.bc and
abi_version_500.bc.

For v5, abi_version_500.bc is linked.

For v2-4, abi_version_400.bc is linked.

For old device library, for v2-4, none of the above is linked. For v5,
error is emitted about unsupported ABI version.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D118949

Fixes: SWDEV-321313
2022-02-04 09:55:08 -05:00
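The selection rules listed in d4e4ef2e81 can be written as a small decision function. This is a sketch under assumptions: `select_abi_bitcode` and its parameters are hypothetical, and the file names are taken from this commit message as written (a later commit in this log, 092f15ac40, states the files are named oclc_abi_version*).

```python
def select_abi_bitcode(code_object_version, library_has_abi_files):
    """Return the ABI bitcode file to link, or None if nothing is linked.

    code_object_version: integer 2 through 5.
    library_has_abi_files: True for a new device library that ships the
    abi_version_*.bc files, False for an old one.
    """
    if code_object_version not in (2, 3, 4, 5):
        raise ValueError("unsupported code object version: %r" % code_object_version)
    if library_has_abi_files:
        # New library: v5 gets the 500 file, v2-4 get the 400 file.
        return "abi_version_500.bc" if code_object_version == 5 else "abi_version_400.bc"
    if code_object_version == 5:
        # Old library cannot provide the v5 ABI.
        raise ValueError("device library does not support code object v5")
    return None  # old library with v2-4: link neither file
```

The four branches correspond one to one to the four cases in the commit message.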
Henry Linjamäki
4e94cba5b4 [HIPSPV][2/4] Add HIPSPV tool chain
This patch adds a new tool chain, HIPSPVToolChain, for emitting HIP
device code as SPIR-V binary. The SPIR-V binary is emitted by using an
external tool, SPIRV-LLVM-Translator, temporarily. We intend to switch
the translator to the llc tool when the SPIR-V backend lands on LLVM
and proves to work well on HIP implementations which consume SPIR-V.

Before the SPIR-V emission the tool chain loads an optional external
pass plugin, either automatically from a HIP installation or from a
path pointed by --hipspv-pass-plugin, and runs passes that are meant
to expand/lower HIP features that do not have direct counterpart in
SPIR-V (e.g. dynamic shared memory).

Code emission for SPIR-V will be enabled and HIPSPVToolChain tests
will be added in the follow up patch part 3.

Other changes: New option ‘-nohipwrapperinc’ is added to exclude HIP
include wrappers. The reason for the addition is that they cause
compile errors when compiling HIP sources for the host side for HIPCL
and HIPLZ implementations. New option is added to avoid this issue.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D110618
2021-12-14 10:22:38 -08:00
Kazu Hirata
16ceb44e62 [clang] Use llvm::{count,count_if,find_if,all_of,none_of} (NFC) 2021-10-25 09:14:45 -07:00
Kazu Hirata
d8e4170b0a Ensure newlines at the end of files (NFC) 2021-10-23 08:45:29 -07:00
Kazu Hirata
b8debabb77 [clang] Remove redundant calls to c_str() (NFC)
Identified with readability-redundant-string-cstr.
2021-08-31 08:53:51 -07:00
Pushpinder Singh
9830f902e4 [AMDGPU][OpenMP] Support linking of math libraries
Math libraries are linked only when -lm is specified. This is because
the host system could be missing rocm-device-libs.

Reviewed By: JonChesterfield, yaxunl

Differential Revision: https://reviews.llvm.org/D105981
2021-07-30 13:53:44 +00:00