The Generic_GCC::GCCInstallationDetector class picks the GCC
installation directory with the largest version number. Since the
location of the libstdc++ include directories is tied to the GCC
version, this can break C++ compilation if the libstdc++ headers for
this particular GCC version are not available. Linux distributions tend
to package the libstdc++ headers separately from GCC. This frequently
leads to situations in which a newer version of GCC gets installed as a
dependency of another package without installing the corresponding
libstdc++ package. Clang then fails to compile C++ code because it
cannot find the libstdc++ headers. Since libstdc++ headers are in fact
installed on the system, the GCC installation continues to work, the
user may not be aware of the details of the GCC detection, and the
compiler does not recognize the situation and emit a warning, this
behavior can be hard to understand - as witnessed by many related bug
reports over the years.
The goal of this work is to change the GCC detection to prefer GCC
installations that contain libstdc++ include directories over those
which do not. This should happen regardless of the input language since
picking different GCC installations for a build that mixes C and C++
might lead to incompatibilities.
Any change to the GCC installation detection will probably have a
negative impact on some users. For instance, for a C user who relies on
using the GCC installation with the largest version number, it might
become necessary to use the --gcc-install-dir option to ensure that this
GCC version is selected.
This seems like an acceptable trade-off given that the situation for
users who do not have any special demands on the particular GCC
installation directory would be improved significantly.
This patch does not yet change the automatic GCC installation directory
choice. Instead, it does introduce a warning that informs the user about
the future change if the chosen GCC installation directory differs from
the one that would be chosen if the libstdc++ headers are taken into
account.
See also this related Discourse discussion:
https://discourse.llvm.org/t/rfc-take-libstdc-into-account-during-gcc-detection/86992.
In relation to the approval and merge of the
[PRIF](https://github.com/llvm/llvm-project/pull/76088) specification
about multi-image features in Flang, here is a first PR to add support
for the `-fcoarray` compilation flag and the initialization of the PRIF
environment.
Other PRs will follow for adding support of lowering to PRIF.
Both clang and gfortran support the -fopenmp-simd flag, which enables
OpenMP support only for simd constructs, while disabling the rest of
OpenMP.
Implement the appropriate parse tree rewriting to remove non-SIMD OpenMP
constructs at the parsing stage.
Add a new SimdOnly flang OpenMP IR pass which rewrites generated OpenMP
FIR to handle untangling composite simd constructs, and clean up OpenMP
operations leftover after the parse tree rewriting stage.
With this approach, the two parts of the logic required to make the flag
work can be self-contained within the parse tree rewriter and the MLIR
pass, respectively. It does not need to be implemented within the core
lowering logic itself.
The flag is expected to have no effect if -fopenmp is passed explicitly,
and is only expected to remove OpenMP constructs, not things like OpenMP
library functions calls. This matches the behaviour of other compilers.
---------
Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
Introduces the use of pointer authentication to protect the invocation,
copy and dispose, reference, and descriptor pointers in Objective-C
block objects.
Resolves#141176
When determining what arguments to pass to `clang-linker-wrapper` as
device linker args, don't forward `-mllvm` args if the offloading
toolchain doesn't have native LLVM support.
I saw this when working with SPIR-V, we were trying to pass `-mllvm` to
`spirv-link`.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
iOS 26/watchOS 26 deprecate some devices, allowing the default CPU to be
raised based on the deployment target.
watchOS 26 also introduces arm64 (vs. arm64_32/arm64e).
The defaults are now:
- arm64-apple-ios26 apple-a12
- arm64-apple-watchos26 apple-s6
- arm64_32-apple-watchos26 apple-s6
- arm64e-apple-watchos26 apple-s6
Left unchanged are:
- arm64-apple-tvos26 apple-a7
- arm64e-apple-tvos26 apple-a12
- arm64e-apple-ios26 apple-a12
- arm64_32-apple-watchos11 apple-s4
While there, rewrite an outdated comment in a related Mac test.
Summary:
We always want to use the detected path. The clang driver's detection is
far more sophisticated so we should use that whenever possible. Also
update the usage so we properly fall back to path instead of incorrectly
using the `/bin` if not provided.
-mcpu is used to determine the ISA string if an explicit -march is not
present on the command line. If there is a -march present it always has
priority over -mcpu regardless of where it appears in the command line.
This can cause issues if -march appears in a Makefile and a user wants
to override it with an -mcpu on the command line. The user would need to
provide a potentially long ISA string to -march that matches the -mcpu
in order to override the MakeFile.
This issue can also be seen on Compiler Explorer where the rv64gc
toolchain is passed -march=rv64gc before any user command line options
are added. If you pass -mcpu=spacemit-x60, vectors will not be enabled
due to the hidden -march.
This patch adds a new option for -march, "unset" that makes the -march
ignored for purposes of prioritizing over -mcpu. Now a user can write
-march=unset -mcpu=<cpu_name>. This is also implemented by gcc for
ARM32.
An alternative would be to allow -march to take a cpu name, but that
requires "-march=<cpu_name> -mcpu=<cpu_name>" or "-march=<cpu_name>
-mtune=<cpu_name>" to ensure the tune cpu also gets updated. IMHO,
needing to repeat the CPU name twice isn't friendly and invites
mistakes.
gcc has implemented this as well.
This commit adds driver support for linking libclc OpenCL libraries. It
takes the form of a new optional flag: --libclc-lib=namespec. Nothing is
linked unless this flag is specified.
Not all libclc targets have corresponding clang targets. For this reason
it is desirable for users to be able to specify a libclc library name.
We support this by taking both a library name (without the .bc suffix)
or a filename. Both of these are searched for in the clang resource
directory. Filenames are
also checked themselves so that absolute paths can be provided. The
syntax for specifying filenames (as opposed to library names) uses a
leading colon (:), inspired by the -l option.
To accommodate this option, libclc libraries are now placed into clang's
resource directory in an in-tree configuration. The libraries are all
placed in <resource-dir>/lib/libclc and
are not grouped under host-specific directories as some other runtime
libraries are; it is not expected that OpenCL libraries will differ
depending on the host toolchain.
Currently only the AMDGPU toolchain supports this option as a proof of
concept. Other targets such as NVPTX or SPIR/SPIR-V could support it
too. We could optionally let target toolchains search for libclc
libraries themselves, possibly when passed an empty --libclc-lib.
CUDA-12.9 has removed fatbinary tool's `--image` argument we've been
using till now.
--image3 has been supported since cuda-9, so we do not need CUDA SDK
version checks.
Summary:
The new driver's behavior forwards all unrecognized command line
arguments to the host linker. It only knew `--compress` so when
`-compress` was passed it didn't forward it correctly. This patch
changes the spelling because multi word arguments should have two
dashes.
When building with -fdebug-compilation-dir/-fcoverige-compilation-dir,
infer the compilation directory in clang driver, rather than try to
fallback to VFS current working directory lookup during CodeGen. This
allows compilation directory to be used as it is passed via cc1 flag and
the value can be empty to remove dependency on CWD if needed.
Reviewers: adrian-prantl, dwblaikie
Reviewed By: adrian-prantl, dwblaikie
Pull Request: https://github.com/llvm/llvm-project/pull/150112
Some tests in LLVM-libc require this flag
(https://github.com/llvm/llvm-project/issues/145349), which requires
compiler-rt to be linked in, but not the C library. With this change,
the `-nolibc` flag will not be ignored.
As discussed in PR #145603, the following command seems to fail to
produce a YAML remarks file for offload LTO passes and thus for
kernel-info:
```
clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
-Rpass=kernel-info -fsave-optimization-record
```
The problem is that, in clang-linker-wrapper's clang call, clang names
the file based on clang's main output file (from `-o`). That is a
temporary file, so the YAML file becomes a temporary file, which the
user never sees.
This patch:
- Makes clang honor `-dumpdir` for the default YAML remarks file in the
case of LTO.
- Extends clang-linker-wrapper to specify that option to clang.
To demonstrate the appeal of the generality of `-dumpdir` (as opposed to
a one-off `-fsave-optimization-record` solution in
clang-linker-wrapper), this patch also fixes `-gsplit-dwarf`. Without
this patch, when using `-gsplit-dwarf` and later debugging using rocgdb,
the dwo directory for offload is a temporary file, so temporary file
cleanup causes rocgdb to lose debug symbols for offload code.
WARNING: The clang driver passes `-dumpdir` to various clang frontend
calls. For LTO, that was previously being ignored, and now it's not.
That changes some auxiliary file names, as revealed by changes in some
existing tests' expected output: `clang/test/Driver/opt-record.c` and
`clang/test/Driver/lto-dwo.c`. Hopefully this change does not introduce
a backward compatibility issue for users.
This patch moves `LazyDetector` and target specific (Cuda, Hip, SYCL)
installation detectors to clang's include directory. It was problematic
for downstream to use headers from clang's lib dir. The use of lib
headers could lead to subtle errors, as some of the symbols there are
annotated with `LLVM_LIBRARY_VISIBILITY`. For instance
[`ROCMToolChain::getCommonDeviceLibNames`](https://github.com/jchlanda/llvm-project/blob/jakub/installation_detectors/clang/lib/Driver/ToolChains/AMDGPU.h#L147)
is c++ public, but because of the annotation it ends up as ELF hidden
symbol, which causes errors when accessed from another shared library.
I was attempting to build openblas with clang in msys2's `ucrt64`
environment (I'm aware of the `clang64` environment, but I wanted
libstdc++). The openblas link failed with the following:
```
clang -march=native -mtune=native -m64 -O2 -fno-asynchronous-unwind-tables -O2 -DSMALL_MATRIX_OPT -DMS_ABI -DMAX_STACK_ALLOC=2048 -Wall -m64 -DF_INTERFACE_GFORT -DDYNAMIC_ARCH -DSMP_SERVER -DNO_WARMUP -DMAX_CPU_NUMBER=512 -DMAX_PARALLEL_NUMBER=1 -DBUILD_SINGLE=1 -DBUILD_DOUBLE=1 -DBUILD_COMPLEX=1 -DBUILD_COMPLEX16=1 -DVERSION=\"0.3.29\" -UASMNAME -UASMFNAME -UNAME -UCNAME -UCHAR_NAME -UCHAR_CNAME -DASMNAME= -DASMFNAME=_ -DNAME=_ -DCNAME= -DCHAR_NAME=\"_\" -DCHAR_CNAME=\"\" -DNO_AFFINITY -I.. libopenblas64_.def dllinit.obj \
-shared -o ../libopenblas64_.dll -Wl,--out-implib,../libopenblas64_.dll.a \
-Wl,--whole-archive ../libopenblas64_p-r0.3.29.a -Wl,--no-whole-archive -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0 -LC:/msys64/ucrt64/bin/../lib/gcc -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../x86_64-w64-mingw32/lib/../lib -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../x86_64-w64-mingw32/lib -LC:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../.. -lgfortran -lmingwex -lmsvcrt -lquadmath -lm -lpthread -lmingwex -lmsvcrt -defaultlib:advapi32 -lgfortran -defaultlib:advapi32 -lgfortran
C:/msys64/ucrt64/bin/ld: C:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib/libmingw32.a(lib64_libmingw32_a-pseudo-reloc.o): in function `__report_error':
D:/W/B/src/mingw-w64/mingw-w64-crt/crt/pseudo-reloc.c:157:(.text+0x59): undefined reference to `abort'
C:/msys64/ucrt64/bin/ld: C:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib/libmingw32.a(lib64_libmingw32_a-tlsthrd.o): in function `___w64_mingwthr_add_key_dtor':
D:/W/B/src/mingw-w64/mingw-w64-crt/crt/tlsthrd.c:48:(.text+0xa5): undefined reference to `calloc'
C:/msys64/ucrt64/bin/ld: C:/msys64/ucrt64/bin/../lib/gcc/x86_64-w64-mingw32/15.1.0/../../../../lib/libmingw32.a(lib64_libmingw32_a-pesect.o): in function `_FindPESectionByName':
D:/W/B/src/mingw-w64/mingw-w64-crt/crt/pesect.c:79:(.text+0xfd): undefined reference to `strncmp'
```
These symbols come from the `-lmingw32` dep that the driver added and
are ordinarily found in `-lmsvcrt`, which got skipped here, because
openblas passed `-lmsvcrt` explicitly earlier in the link line. Since we
always add these libraries at the end here, I think that clang is "at
fault" (as opposed to a user or packaging mistake) and should have added
some crt here.
To preserve the intent of letting the user override which crt is chosen,
duplicate the (first) user chosen crt `-l` into this position, although
we should perhaps consider an explicit `-mcrtdll` like gcc has as well.
The Cygwin target is generally very similar to the MinGW target. The
default auto-import behavior, the default calling convention, the
`.dll.a` import library extension, the `__GXX_TYPEINFO_EQUALITY_INLINE`
pre-define by `g++`, and the long double configuration.
Co-authored-by: Mateusz Mikuła <oss@mateuszmikula.dev>
I'm unsure if there is an official source for which targets use/support
which emulations, but for the baremetal GNU Arm/AArch64 toolchains or
binutils builds I've tried to use, GNU ld either did not support the
Linux emulations (resulting in errors unless overriding the emulation)
or the Linux emulations were supported but GCC passed the non-Linux
emulations by default.
These emulations all seem to be accepted by lld as well, so try to align
with what it seems GCC is doing and prefer the non-Linux emulations for
baremetal Arm/AArch64 targets.
The mentioned flag is already both a cc1 & driver flag; however, whether
it is respected was tied to either:
1. Whether it was passed as a cc1 option (`Xclang`)
2. or `-fmodules` accompanying it
This poses a consistency problem where `std=c++20` enables the modules
feature, independent of other module settings.
This patch resolves this issue by checking for the presence
unconditionally & passing it down to cc1 when applicable.
While investigating PR #149990, I noticed that while both the Oracle
Studio compilers and GCC default to `-mv8plus` on 32-bit Solaris/SPARC,
Clang does not.
This patch fixes this by enabling the `v8plus` feature.
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu`.
This patch provides a single point for handling the logic behind
choosing common bitcode libraries. The intention is that the users of
ROCm installation detector will not have to rewrite options handling
code each time the bitcode libraries are queried. This is not too
distant from detectors for other architecture that encapsulate the
similar decision making process, providing cleaner interface. The only
flag left in `getCommonBitcodeLibs` (main point of entry) is
`NeedsASanRT`, this is deliberate, as in order to calculate it we need
to consult `ToolChain`.
Prompted by PR #149652, this patch changes the Solaris/SPARC default to
-mcpu, matching both the Oracle Studio 12.6 compilers and GCC 16:
[[PATCH] Default to -mcpu=ultrasparc3 on
Solaris/SPARC](https://gcc.gnu.org/pipermail/gcc-patches/2025-July/690191.html).
This is equivalent to enabling the `vis2` feature.
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu`.
Commit 597ee88 moved the -X flag to a new position in the baremetal
toolchain's linker job, but unintentionally left the original instance in place.
This patch removes the redundant flag, ensuring -X is passed only once.
Summary:
This patch reworks how we create offloading toolchains. Previously we
would handle this separately for all the different kinds. This patch
instead changes this to use the target triple and the offloading kind to
determine the proper toolchain. In the old case where the user only
passes `--offload-arch` we instead infer the triple from the passed
arguments. This is a pretty major overhaul but currently passes all the
clang tests with only minor changes to error messages.
As documented in 20.x, we'd like to keep reduced BMI off by default for
1~2 versions. And now we're in 22.x.
I rarely receive bug reports for reduced BMI. I am not sure about the
reason. Maybe not a lot of people are using it. Or it is really stable
enough.
And also, we've been enabling the reduced BMI internally for roughly half a
year.
So I think it's the time to move on. See the document changes for other
information.
As we are now Sema-complete for OpenACC 3.4 (and thus have a conforming
implementation, in all modes), we can now set the _OPENACC macro
correctly.
Additionally, we remove the temporary 'override' functionality, which
was intended to allow people to experiment with this. We aren't having a
deprecation period as OpenACC support is still considered experimental.
This patch introduces support for Integrated Distributed ThinLTO (DTLTO)
in Clang.
DTLTO enables the distribution of ThinLTO backend compilations via
external distribution systems, such as Incredibuild, during the
traditional link step: https://llvm.org/docs/DTLTO.html.
Testing:
- `lit` test coverage has been added to Clang's Driver tests.
- The DTLTO cross-project tests will use this Clang support.
For the design discussion of the DTLTO feature, see:
https://github.com/llvm/llvm-project/pull/126654
Similar to how clang-cl driver does it, make it possible to build arm64x
binaries with a mingw-style invocation.
Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com>
This PR introduces the use of pointer authentication to objective-c[++].
This includes:
* __ptrauth qualifier support for ivars
* protection of isa and super fields
* protection of SEL typed ivars
* protection of class_ro_t data
* protection of methodlist pointers and content
Summary:
The semantics of static libraries only extract stuff that's used. We
somewhat extend this behavior with the linker wrapper only doing this to
fat binaries that match any found architectures. However, this has some
unfortunate effects when the user uses static libraries.
This is somewhat of a hack, but we now assume that if the user specified
`--offload-arch=` on the link job, they *definitely* want that
architecture to be used if it exists. This patch just forces extraction
of those libraries which resolves an issue observed with some customers.
The old behavior will still be present if the user does `--offload-link`
with no offloading architecture present, and for the vast, vast majority
of cases this will change nothing.
Fixes: https://github.com/llvm/llvm-project/issues/147788
Swift-versioned API notes get applied at PCM constrution time relying on
'-fapinotes-swift-version=X' argument to pick the appropriate version.
This change adds a new APINotes application mode with
'-fswift-version-independent-apinotes' which causes *all* versioned API
notes to get recorded into the PCM wrapped in 'SwiftVersionedAttr'
instances. The expectation in this mode is that the Swift client will
perform the required transformations as per the API notes on the client
side, when loading the PCM, instead of them getting applied on the
producer side. This will allow the same PCM to be usable by Swift
clients building with different language versions.
In addition to versioned-wrapping the various existing API notes
annotations which are carried in declaration attributes, this change
adds a new attribute for two annotations which were previously applied
directly to the declaration at the PCM producer side: 1) Type and 2)
Nullability annotations with 'SwiftTypeAttr' and 'SwiftNullabilityAttr',
respectively. The logic to apply these two annotations to a declaration
is refactored into API.
This patch adds an option to select the method for computing complex
number division. It uses `LoweringOptions` to determine whether to lower
complex division to a runtime function call or to MLIR's `complex.div`,
and `CodeGenOptions` to select the computation algorithm for
`complex.div`. The available option values and their corresponding
algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.
See also the discussion in the following discourse post:
https://discourse.llvm.org/t/optimization-of-complex-number-division/83468
---------
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
Re-land #146582 now that the Flang bugs have been fixed.
There is no way in Arm64 Windows to indicate that a given function has
used the Frame Pointer as a General Purpose Register, as such stack
walks will always assume that the frame chain is valid and will follow
whatever value has been saved for the Frame Pointer (even if it is
pointing to data, etc.).
This change makes the Frame Pointer always reserved when building for
Arm64 Windows to avoid this issue.
We will be updating the official Windows ABI documentation to reflect
this requirement, and I will provide a link once it's available.
Summary:
These commands both do the same thing and behave like the same tool.
Now, the `nvptx-arch` and `amdgpu-arch` tools cause it to only emit
architectures for that name.
cl.exe has /std:clatest, analogous to /std:c++latest, which sets the
language mode to the latest standard. For C, that's C23. This adds
support for the option.
Fixes#147233
The RISCVToolChain, which was removed in commit f8cb798,
used a different compiler-rt path relative to the resource-dir
when GCCInstallation was valid. However, this behavior was
unintentionally overridden when it was merged into the
BareMetal toolchain. This patch restores the correct path
handling logic.
After this fix -
With valid GCCInstallation
`<resource-dir>/<llvm-rel-ver>/lib/<target-triple>libclang_rt.builtins.a`
Without valid GCCInstallation
`<resource-dir>/<llvm-rel-ver>/lib/baremetal/libclang_rt.builtins-<target>.a`
This change removes unnecessary tune args to the AArch64 backend. The
AArch64 backend automatically handles `tune-cpu` and adds the necessar
y features based on the models from TableGen.
It follows this fix: https://github.com/llvm/llvm-project/pull/146260
where updating a subtarget feature didn't fail the frontend test because
both the toolchain and the test suffered from a coordinated error.