llvm-project

Author	SHA1	Message	Date
Yaxun (Sam) Liu	46b6756255	[AMDGPU] Diagnose unaligned atomic (#80322) AMDGPU does not support unaligned atomics, therefore make the warning an error. This patch is transferred from https://reviews.llvm.org/D99201	2024-02-02 10:41:47 -05:00
Yaxun (Sam) Liu	7c2e32d6fe	Partial revert "[HIP] Fix -mllvm option for device lld linker" (#80202) This partially reverts commit aa964f157f9b50fab3895afbfda6e0915cf6bb4a because it caused perf regressions in rccl due to drop of -mllvm -amgpu-kernarg-preload-count=16 from the linker step. Potentially it could cause similar regressions for other HIP apps using -mllvm options with -fgpu-rdc. Fixes: SWDEV-443345	2024-01-31 17:51:42 -05:00
Alex Voicu	9a408588d1	[HIP][Clang][Driver] Add Driver support for `hipstdpar` This patch adds the Driver changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. What this change does can be summed up as follows: - add two flags, one for enabling `hipstdpar` compilation, the second enabling the optional allocation interposition mode; - the flags correspond to new LangOpt members; - if we are compiling or linking with --hipstdpar, we enable HIP; in the compilation case C and C++ inputs are treated as HIP inputs; - the ROCm / AMDGPU driver is augmented to look for and include an implementation detail forwarding header; we error out if the user requested `hipstdpar` but the header or its dependencies cannot be found. Tests for the behaviour described above are also added. Reviewed by: MaskRay, yaxunl Differential Revision: https://reviews.llvm.org/D155775	2023-10-03 13:14:46 +01:00
Joseph Huber	c1afed9f48	[Clang][HIP] Remove 'clangPostLink' from SDL handling (#67366) Summary: This feature is not needed anymore and is replaced by different implementations. The code guarded by this flag also causes us to emit an invalid argument to `-mlink-builtin-bitcode` that will cause errors if ever actually executed. Remove this feature.	2023-09-26 11:27:38 -05:00
Yaxun (Sam) Liu	e17882430e	[CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals` Rename -fcuda-approx-transcendentals as -fgpu-approx-transcendentals and pass it to both device and host clang -cc1. Fix its interaction with -ffast-math to allow -fno-gpu-approx-transcendentals to override the implicit -fcuda-approx-transcendentals due to -ffast-math. Rename the predefined macro to be __CLANG_GPU_APPROX_TRANSCENDENTALS__. Emit the macro for both device and host compilation. Reviewed by: Artem Belevich, Fangrui Song Differential Revision: https://reviews.llvm.org/D154797	2023-07-25 12:01:41 -04:00
Yaxun (Sam) Liu	aa964f157f	[HIP] Fix -mllvm option for device lld linker Currently clang passes -mllvm options to the device lld linker plugin when compiling HIP. This is against default clang behavior which is only passing -mllvm options to linker plugin specified through -Wl options. This patch lets clang only pass -Xoffload-linker -mllvm= options to device lld linker plugin. Fixes: https://github.com/llvm/llvm-project/issues/63604 Reviewed by: Joseph Huber, Matt Arsenault Differential Revision: https://reviews.llvm.org/D154145	2023-06-30 12:54:38 -04:00
Siu Chi Chan	f1aee32f1c	[HIP] Instruct lld to go through all archives Add the --whole-archive flag when linking HIP programs to instruct lld to go through every archive library to link in all the kernel functions (entry pointers to the GPU program); otherwise, lld may skip some library files if there are no more symbols that need to be resolved. Differential Revision: https://reviews.llvm.org/D152207 Change-Id: I084d3d606f9cee646f9adc65f4b648c9bcb252e6	2023-06-09 08:50:44 -04:00
Scott Linder	97ba3c2bec	[Clang][AMDGPU] Set LTO CG opt level based on Clang option For AMDGCN default to mapping --lto-O# to --lto-CGO# in a 1:1 manner (i.e. clang -O<N> implies --lto-O<N> and --lto-CGO<N>). Ensure there is a means to override this via -Xoffload-linker and begin to claim these arguments to avoid incorrect warnings that they are not used. Reviewed By: yaxunl, MaskRay Differential Revision: https://reviews.llvm.org/D142499	2023-02-15 17:34:35 +00:00
Archibald Elliott	8e3d7cf5de	[NFC][TargetParser] Remove llvm/Support/TargetParser.h	2023-02-07 11:08:21 +00:00
serge-sans-paille	0ffaffcaac	Reapply 6fa2abf90886f18472c87bc9bffbcdf4f73c465e Lazyly initialize uncommon toolchain detector Cuda and rocm toolchain detectors are currently run unconditionally, while their result may not be used at all. Make their initialization lazy so that the discovery code is not run in common cases. Reapplied since 77910ac374656319ff114ef251fda358d4aa166a landed and fixes the test ordering issue. Differential Revision: https://reviews.llvm.org/D142606	2023-02-06 16:44:11 +01:00
Jonas Hahnfeld	b5ee4f755f	Revert "Lazyly initialize uncommon toolchain detector" clang/test/Driver/rocm-detect.hip is failing for a number of configurations, for example: clang-x86_64-debian-fast https://lab.llvm.org/buildbot/#/builders/109/builds/57270 clang-debian-cpp20 https://lab.llvm.org/buildbot/#/builders/249/builds/310 clang-with-lto-ubuntu https://lab.llvm.org/buildbot/#/builders/124/builds/6693 This reverts commit 6fa2abf90886f18472c87bc9bffbcdf4f73c465e.	2023-02-06 15:39:33 +01:00
serge-sans-paille	6fa2abf908	Lazyly initialize uncommon toolchain detector Cuda and rocm toolchain detectors are currently run unconditionally, while their result may not be used at all. Make their initialization lazy so that the discovery code is not run in common cases. Differential Revision: https://reviews.llvm.org/D142606	2023-02-06 12:03:00 +01:00
Joseph Huber	9271c5da43	[Clang] Adjust PIC handling for the AMDGPU ToolChain The AMDGPU target only emits shared libraries currently. This patch changes the handling of the PIC level to be managed in the AMDGPUToolChain rather than having a special case for it. This causes `--target=amdgcn--` to no longer set the PIC. This should be an acceptable change since that doesn't use a correct toolchain anyway. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D142999	2023-01-31 14:31:10 -06:00
Juan Manuel MARTINEZ CAAMAÑO	a446827249	[NFC][Clang][Driver][AMDGPU] Avoid temporary copies of std::string by using Twine and StringRef Reviewed By: tra Differential Revision: https://reviews.llvm.org/D139023	2022-12-05 07:27:10 -06:00
Yaxun (Sam) Liu	056ebadf5c	[HIP] Fix lld failure when device object is empty When -fgpu-rdc is used for linking relocatable objects, clang driver launches clang-offload-bundler to extract a device relocatable object from each input relocatable object file and passes the extracted files to lld. The input relocatable object file could either come from HIP program or C++ program. The relocatable object file from C++ program does not contain device relocatable objects, therefore clang-offload-bundler extracts an empty file and passes it to lld. lld treats empty file as linker script. When there is no object input file to lld, lld will emit error: target emulation unknown: -m or at least one .o file required This patch adds "elf64_amdgpu" to lld so that lld always knows the target no matter whether there are object input files or not. Reviewed by: Artem Belevich, Fangrui Song Differential Revision: https://reviews.llvm.org/D138221	2022-11-22 10:38:42 -05:00
Joseph Huber	194ec844f5	[OpenMP][AMDGPU] Link bitcode ROCm device libraries per-TU Previously, we linked in the ROCm device libraries which provide math and other utility functions late. This is not strictly correct as this library contains several flags that are only set per-TU, such as fast math or denormalization. This patch changes this to pass the bitcode libraries per-TU using the same method we use for the CUDA libraries. This has the advantage that we correctly propagate attributes making this implementation more correct. Additionally, many annoying unused functions were not being fully removed during LTO. This led to erroneous warning messages and remarks on unused functions. I am not sure if not finding these libraries should be a hard error. Let me know if it should be demoted to a warning saying that some device utilities will not work without them. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D133726	2022-09-14 09:42:06 -05:00
Kazu Hirata	b7a7aeee90	[clang] Qualify auto in range-based for loops (NFC)	2022-09-03 23:27:27 -07:00
Fangrui Song	1491282165	[clang] Change cc1 -fvisibility's canonical spelling to -fvisibility=	2022-09-02 11:49:38 -07:00
Kazu Hirata	ca4af13e48	[clang] Don't use Optional::getValue (NFC)	2022-06-20 22:59:26 -07:00
Kazu Hirata	f5ef2c5838	[clang] Convert for_each to range-based for loops (NFC)	2022-06-10 22:39:45 -07:00
Yaxun (Sam) Liu	92a606f6de	[HIP] Pass -Xoffload-linker option to device linker Reuse -Xoffload-linker option for HIP toolchain. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D126704	2022-05-31 22:17:40 -04:00
Jacob Lambert	afcc6baac5	[clang][HIP] Updating driver to enable archive/bitcode to bitcode linking when targeting HIPAMD toolchain Differential Revision: https://reviews.llvm.org/D124151	2022-04-21 09:24:33 -07:00
Ron Lieberman	f1e7ecaa18	Revert "[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain" This reverts commit cc2139524f77248c7e147d4cc3befb31fe3e6daa. failed a few buildbots	2022-04-02 13:25:50 +00:00
Ron Lieberman	cc2139524f	[AMDPU][Sanitizer] Refactor sanitizer options handling for AMDGPU Toolchain authored by amit.pandey@amd.com ampandey-AMD Differential Revision: https://reviews.llvm.org/D122781	2022-04-02 11:01:09 +00:00
Fangrui Song	c37accf0a2	[Option] Avoid using the default argument for the 3-argument hasFlag. NFC The default argument true is error-prone: I think many would think the default is false.	2022-03-26 00:57:06 -07:00
Yaxun (Sam) Liu	da9a70313d	[HIP] Fix -fno-gpu-sanitize Fix a typo about -fno-gpu-sanitize handling and disable warnings when -fno-gpu-sanitize is specified. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D121302	2022-03-09 20:58:50 -05:00
Yaxun (Sam) Liu	fa0f90bc55	[HIP] Support linking archive of bundled bitcode HIP programs compiled with -c -fgpu-rdc generate clang-offload-bundler bundles which contain bitcode for different GPUs. Such files can be archived to an archive file which can be linked with HIP programs with -fgpu-rdc. This patch adds support for linking an archive of bundled bitcode. When an archive of bundled bitcode is passed to clang by -l, for each GPU specified through --offload-arch, clang extracts bitcode from the archive, creates a new archive for that GPU, and passes it to lld. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D120070 Fixes: SWDEV-321741, SWDEV-315773	2022-02-19 18:37:44 -05:00
Kazu Hirata	2d303e6781	Remove redundant return and continue statements (NFC) Identified with readability-redundant-control-flow.	2021-12-24 23:17:54 -08:00
Yaxun (Sam) Liu	240be6541d	Fix warning about unused variable in HIPAMD.cpp	2021-12-13 11:25:48 -05:00
Yaxun (Sam) Liu	78b0f3701d	[HIPSPV][1/4] Refactor HIP tool chain This patch refactors the HIP tool chain for new HIP tool chain, HIPSPV tool chain, which is added in the follow up patch part 2. Rename HIPToolChain to HIPAMDToolChain and rename HIP.* files to HIPAMD.*. Introduce HIPUtility.* file where common HIP utilities, shared among HIP tool chain implementations, are placed in. Move constructHIPFatbinCommand() and constructGenerateObjFileFromHIPFatBinary() to HIPUtility. HIPSPV tool chain is going to use them. Tweak bundle target ID in constructHIPFatbinCommand(): extra dashes are dropped if the Target ID is empty and 'hip' offload kind is made default for non-AMD targets. Patch by: Henry Linjamäki Reviewed by: Yaxun Liu, Artem Belevich, Eric Christopher Differential Revision: https://reviews.llvm.org/D110549	2021-12-13 10:50:25 -05:00

30 Commits