This patch implements pages 15-17 from
jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
Documentation:
jhauser.us/RISCV/ext-P/RVP-baseInstrs-014.pdf
jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
The Generic_GCC::GCCInstallationDetector class picks the GCC
installation directory with the largest version number. Since the
location of the libstdc++ include directories is tied to the GCC
version, this can break C++ compilation if the libstdc++ headers for
this particular GCC version are not available. Linux distributions tend
to package the libstdc++ headers separately from GCC. This frequently
leads to situations in which a newer version of GCC gets installed as a
dependency of another package without installing the corresponding
libstdc++ package. Clang then fails to compile C++ code because it
cannot find the libstdc++ headers. Since libstdc++ headers are in fact
installed on the system, the GCC installation continues to work, the
user may not be aware of the details of the GCC detection, and the
compiler does not recognize the situation and emit a warning, this
behavior can be hard to understand - as witnessed by many related bug
reports over the years.
The goal of this work is to change the GCC detection to prefer GCC
installations that contain libstdc++ include directories over those
which do not. This should happen regardless of the input language since
picking different GCC installations for a build that mixes C and C++
might lead to incompatibilities.
Any change to the GCC installation detection will probably have a
negative impact on some users. For instance, for a C user who relies on
using the GCC installation with the largest version number, it might
become necessary to use the --gcc-install-dir option to ensure that this
GCC version is selected.
This seems like an acceptable trade-off given that the situation for
users who do not have any special demands on the particular GCC
installation directory would be improved significantly.
This patch does not yet change the automatic GCC installation directory
choice. Instead, it does introduce a warning that informs the user about
the future change if the chosen GCC installation directory differs from
the one that would be chosen if the libstdc++ headers are taken into
account.
See also this related Discourse discussion:
https://discourse.llvm.org/t/rfc-take-libstdc-into-account-during-gcc-detection/86992.
This patch is part of a series to support driver managed module builds
for C++ named modules and Clang modules.
This introduces a scanner that detects C++ named module usage early in
the driver with only negligible overhead.
For now, it is enabled only with the `-fmodules-driver` flag and serves
solely diagnostic purposes. In the future, the scanner will be enabled
for any (modules-driver compatible) compilation with two or more inputs,
and will help the driver determine whether to implicitly enable the
modules driver.
Since the scanner adds very little overhead, we are also exploring
enabling it for compilations with only a single input. This approach
could allow us to detect `import std` usage in a single-file
compilation, which would then activate the modules driver. For
performance measurements on this, see
https://github.com/naveen-seth/llvm-dev-cxx-modules-check-benchmark.
RFC for driver managed module builds:
https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system
This patch relands the reland (2d31fc8) for commit ded1426. The earlier
reland failed due to a missing link dependency on `clangLex`. This
reland fixes the issue by adding the link dependency after discussing it
in the following RFC:
https://discourse.llvm.org/t/rfc-driver-link-the-driver-against-clangdependencyscanning-clangast-clangfrontend-clangserialization-and-clanglex
Summary:
Previously we extracted archives and did special symbol resolution on
them, this was mostly done because the nvlink executable couldn't handle
archives natively. Since I have added a wrapper around it that handles
that, we can now remove a lot of this complexity and just create a new
archive and pass it to the device linker.
The one issue here is that for offloading code, we often have references
to kernels that do not come from the device link job, but from the host
that wishes to call them. Under normal circumstances we would ignore
them because they are not referenced, but we need them. For now I am
just passing `--whole-archive`. In a future patch I will likely add some
logic to create a linker script that will tell the linker which symbols
we want to be extracted. That won't work for NVPTX but it's the most
canonical solution.
Also some ugliness in this patch where I need to juggle whether or not
the input is an archive and throw pairs around, I may merge that into
the `OffloadFile` struct in a later patch.
When determining what arguments to pass to `clang-linker-wrapper` as
device linker args, don't forward `-mllvm` args if the offloading
toolchain doesn't have native LLVM support.
I saw this when working with SPIR-V, we were trying to pass `-mllvm` to
`spirv-link`.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
iOS 26/watchOS 26 deprecate some devices, allowing the default CPU to be
raised based on the deployment target.
watchOS 26 also introduces arm64 (vs. arm64_32/arm64e).
The defaults are now:
- arm64-apple-ios26 apple-a12
- arm64-apple-watchos26 apple-s6
- arm64_32-apple-watchos26 apple-s6
- arm64e-apple-watchos26 apple-s6
Left unchanged are:
- arm64-apple-tvos26 apple-a7
- arm64e-apple-tvos26 apple-a12
- arm64e-apple-ios26 apple-a12
- arm64_32-apple-watchos11 apple-s4
While there, rewrite an outdated comment in a related Mac test.
Summary:
We always want to use the detected path. The clang driver's detection is
far more sophisticated so we should use that whenever possible. Also
update the usage so we properly fall back to path instead of incorrectly
using the `/bin` if not provided.
-mcpu is used to determine the ISA string if an explicit -march is not
present on the command line. If there is a -march present it always has
priority over -mcpu regardless of where it appears in the command line.
This can cause issues if -march appears in a Makefile and a user wants
to override it with an -mcpu on the command line. The user would need to
provide a potentially long ISA string to -march that matches the -mcpu
in order to override the MakeFile.
This issue can also be seen on Compiler Explorer where the rv64gc
toolchain is passed -march=rv64gc before any user command line options
are added. If you pass -mcpu=spacemit-x60, vectors will not be enabled
due to the hidden -march.
This patch adds a new option for -march, "unset" that makes the -march
ignored for purposes of prioritizing over -mcpu. Now a user can write
-march=unset -mcpu=<cpu_name>. This is also implemented by gcc for
ARM32.
An alternative would be to allow -march to take a cpu name, but that
requires "-march=<cpu_name> -mcpu=<cpu_name>" or "-march=<cpu_name>
-mtune=<cpu_name>" to ensure the tune cpu also gets updated. IMHO,
needing to repeat the CPU name twice isn't friendly and invites
mistakes.
gcc has implemented this as well.
Several Clang tests were failing on Cygwin, and were already marked as
requiring !system-windows, unsupported on system-windows, or xfail on
system-windows. Add system-cygwin to lit's llvm.config, and use it in
such tests in addition to system-windows.
HIP runtime support for compressed bundle format v3 is in place,
therefore switch the default compressed bundle format to v3 in compiler.
This allows both compressed and decompressed fat binary size to exceed
4GB by default.
Environment variable COMPRESSED_BUNDLE_FORMAT_VERSION=2 can be used for
backward compatibility for older HIP runtimes not supporting v3.
Fixes: SWDEV-548879
This commit adds driver support for linking libclc OpenCL libraries. It
takes the form of a new optional flag: --libclc-lib=namespec. Nothing is
linked unless this flag is specified.
Not all libclc targets have corresponding clang targets. For this reason
it is desirable for users to be able to specify a libclc library name.
We support this by taking both a library name (without the .bc suffix)
or a filename. Both of these are searched for in the clang resource
directory. Filenames are
also checked themselves so that absolute paths can be provided. The
syntax for specifying filenames (as opposed to library names) uses a
leading colon (:), inspired by the -l option.
To accommodate this option, libclc libraries are now placed into clang's
resource directory in an in-tree configuration. The libraries are all
placed in <resource-dir>/lib/libclc and
are not grouped under host-specific directories as some other runtime
libraries are; it is not expected that OpenCL libraries will differ
depending on the host toolchain.
Currently only the AMDGPU toolchain supports this option as a proof of
concept. Other targets such as NVPTX or SPIR/SPIR-V could support it
too. We could optionally let target toolchains search for libclc
libraries themselves, possibly when passed an empty --libclc-lib.
CUDA-12.9 has removed fatbinary tool's `--image` argument we've been
using till now.
--image3 has been supported since cuda-9, so we do not need CUDA SDK
version checks.
This patch removes %T from clang lit tests. %T has been deprecated for
about seven years and is not reccomended as it is not unique to each
test, which can lead to races. This patch is intended to remove usage in
tree with the end goal of removing support for %T within lit.
This patch removes the use of %T from crash-report-modules.m. %T has
been deprecated for about seven years, and is to be avoided as it can
lead to races between tests.
Posting this one as a separate patch given it was explicitly ported to
%T in 5c268f078ddbaa7cc1e9ea0ef658eb90563ff5db. Given
36bb47c70802629df98276ce90b0f8163f959e28 was landed seven years ago to
completely disable the test on Windows due to continued failures, I
think it should be fine to just leave this test unsupported on Windows.
Summary:
The new driver's behavior forwards all unrecognized command line
arguments to the host linker. It only knew `--compress` so when
`-compress` was passed it didn't forward it correctly. This patch
changes the spelling because multi word arguments should have two
dashes.
When building with -fdebug-compilation-dir/-fcoverige-compilation-dir,
infer the compilation directory in clang driver, rather than try to
fallback to VFS current working directory lookup during CodeGen. This
allows compilation directory to be used as it is passed via cc1 flag and
the value can be empty to remove dependency on CWD if needed.
Reviewers: adrian-prantl, dwblaikie
Reviewed By: adrian-prantl, dwblaikie
Pull Request: https://github.com/llvm/llvm-project/pull/150112
Some tests in LLVM-libc require this flag
(https://github.com/llvm/llvm-project/issues/145349), which requires
compiler-rt to be linked in, but not the C library. With this change,
the `-nolibc` flag will not be ignored.
As discussed in PR #145603, the following command seems to fail to
produce a YAML remarks file for offload LTO passes and thus for
kernel-info:
```
clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
-Rpass=kernel-info -fsave-optimization-record
```
The problem is that, in clang-linker-wrapper's clang call, clang names
the file based on clang's main output file (from `-o`). That is a
temporary file, so the YAML file becomes a temporary file, which the
user never sees.
This patch:
- Makes clang honor `-dumpdir` for the default YAML remarks file in the
case of LTO.
- Extends clang-linker-wrapper to specify that option to clang.
To demonstrate the appeal of the generality of `-dumpdir` (as opposed to
a one-off `-fsave-optimization-record` solution in
clang-linker-wrapper), this patch also fixes `-gsplit-dwarf`. Without
this patch, when using `-gsplit-dwarf` and later debugging using rocgdb,
the dwo directory for offload is a temporary file, so temporary file
cleanup causes rocgdb to lose debug symbols for offload code.
WARNING: The clang driver passes `-dumpdir` to various clang frontend
calls. For LTO, that was previously being ignored, and now it's not.
That changes some auxiliary file names, as revealed by changes in some
existing tests' expected output: `clang/test/Driver/opt-record.c` and
`clang/test/Driver/lto-dwo.c`. Hopefully this change does not introduce
a backward compatibility issue for users.
I was attempting to build openblas with clang in msys2's `ucrt64`
environment (I'm aware of the `clang64` environment, but I wanted
libstdc++). The openblas link failed with the following:
```
clang -march=native -mtune=native -m64 -O2 -fno-asynchronous-unwind-tables -O2 -DSMALL_MATRIX_OPT -DMS_ABI -DMAX_STACK_ALLOC=2048 -Wall -m64 -DF_INTERFACE_GFORT -DDYNAMIC_ARCH -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=512 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.29\" -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME= -DASMFNAME=_ -DNAME=_ -DCNAME= -DCHAR_NAME=\"_\" -DCHAR_CNAME=\"\" -DNO_AFFINITY -I.. libopenblas64_.def dllinit.obj \
-shared -o ../libopenblas64_.dll -Wl,--out-implib,../libopenblas64_.dll.a \
-Wl,--whole-archive ../libopenblas64_p-r0.3.29.a -Wl,--no-whole-archive -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0 -LC:/msys64/ucrt64/bin/../lib/gcc -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../x86_64-w64-mingw32/lib/../lib -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../x86_64-w64-mingw32/lib -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../.. -lgfortran -lmingwex -lmsvcrt -lquadmath -lm -lpthread -lmingwex -lmsvcrt -defaultlib:advapi32 -lgfortran -defaultlib:advapi32 -lgfortran
C:/msys64/ucrt64/bin/ld: C:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib/libmingw32.a(lib64_libmingw32_a-pseudo-reloc.o): in function `__report_error':
D:/W/B/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:157:(.text+0x59): undefined reference to `abort'
C:/msys64/ucrt64/bin/ld: C:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib/libmingw32.a(lib64_libmingw32_a-tlsthrd.o): in function `___w64_mingwthr_add_key_dtor':
D:/W/B/src/mingw-w64/mingw-w64-crt/crt/tlsthrd.c:48:(.text+0xa5): undefined reference to `calloc'
C:/msys64/ucrt64/bin/ld: C:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib/libmingw32.a(lib64_libmingw32_a-pesect.o): in function `_FindPESectionByName':
D:/W/B/src/mingw-w64/mingw-w64-crt/crt/pesect.c:79:(.text+0xfd): undefined reference to `strncmp'
```
These symbols come from the `-lmingw32` dep that the driver added and
are ordinarily found in `-lmsvcrt`, which got skipped here, because
openblas passed `-lmsvcrt` explicitly earlier in the link line. Since we
always add these libraries at the end here, I think that clang is "at
fault" (as opposed to a user or packaging mistake) and should have added
some crt here.
To preserve the intent of letting the user override which crt is chosen,
duplicate the (first) user chosen crt `-l` into this position, although
we should perhaps consider an explicit `-mcrtdll` like gcc has as well.
If a user passed an invalid value to `-march`, an assertion failure
happened in the AArch64 multilib logic.
But an invalid `-march` value is an expected case that should be handled
via error messages.
This patch removes the requirement that the `-march` value must be
valid.
I'm unsure if there is an official source for which targets use/support
which emulations, but for the baremetal GNU Arm/AArch64 toolchains or
binutils builds I've tried to use, GNU ld either did not support the
Linux emulations (resulting in errors unless overriding the emulation)
or the Linux emulations were supported but GCC passed the non-Linux
emulations by default.
These emulations all seem to be accepted by lld as well, so try to align
with what it seems GCC is doing and prefer the non-Linux emulations for
baremetal Arm/AArch64 targets.
Summary:
This is a bit of an awkward transition point for the new and old
drivers. Previously AMDGPU uses this to generate offloading bundles, but
the new driver much prefers to output the file itself. This patch
changes the behavior to always respect `--gpu-bundle-output` instead of
having it be the default behavior. This means that we effectively get to
override the default new driver behavior with this flag now. This should
hoepfully fix some errors in the downstream comgr tests.
The mentioned flag is already both a cc1 & driver flag; however, whether
it is respected was tied to either:
1. Whether it was passed as a cc1 option (`Xclang`)
2. or `-fmodules` accompanying it
This poses a consistency problem where `std=c++20` enables the modules
feature, independent of other module settings.
This patch resolves this issue by checking for the presence
unconditionally & passing it down to cc1 when applicable.
report_fatal_error is not a good way to report diagnostics to the users, so this switches to using actual diagnostic reporting mechanisms instead.
Fixes#147187
`libomp` is the default value when unconfigured in cmake, but llvm can
be configured to have `libgomp` be the default instead. Explicitly
specify this value so the test does not fail when it assumes libomp is
always the default.
While investigating PR #149990, I noticed that while both the Oracle
Studio compilers and GCC default to `-mv8plus` on 32-bit Solaris/SPARC,
Clang does not.
This patch fixes this by enabling the `v8plus` feature.
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu`.
Summary:
This is broken with the current target because it was not bundling the
output as HIP likes and this would fail if targeting both at the same
time.
Prompted by PR #149652, this patch changes the Solaris/SPARC default to
-mcpu, matching both the Oracle Studio 12.6 compilers and GCC 16:
[[PATCH] Default to -mcpu=ultrasparc3 on
Solaris/SPARC](https://gcc.gnu.org/pipermail/gcc-patches/2025-July/690191.html).
This is equivalent to enabling the `vis2` feature.
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu`.
Commit 597ee88 moved the -X flag to a new position in the baremetal
toolchain's linker job, but unintentionally left the original instance in place.
This patch removes the redundant flag, ensuring -X is passed only once.