A CMake change included in CMake 4.0 makes `AIX` into a variable
(similar to `APPLE`, etc.)
ff03db6657
However, `${CMAKE_SYSTEM_NAME}` unfortunately also expands exactly to
`AIX` and `if` auto-expands variable names in CMake. That means you get
a double expansion if you write:
`if (${CMAKE_SYSTEM_NAME} MATCHES "AIX")`
which becomes:
`if (AIX MATCHES "AIX")`
which is as if you wrote:
`if (ON MATCHES "AIX")`
You can prevent this by quoting the expansion of "${CMAKE_SYSTEM_NAME}",
due to policy
[CMP0054](https://cmake.org/cmake/help/latest/policy/CMP0054.html#policy:CMP0054)
which is on by default in 4.0+. Most of the LLVM CMake already does
this, but this PR fixes the remaining cases where we do not.
The new builtin `__builtin_dedup_pack` removes duplicates from list of
types.
The added builtin is special in that they produce an unexpanded pack
in the spirit of P3115R0 proposal.
Produced packs can be used directly in template argument lists and get
immediately expanded as soon as results of the computation are
available.
It allows to easily combine them, e.g.:
```cpp
template <class ...T>
struct Normalize {
// Note: sort is not included in this PR, it illustrates the idea.
using result = std::tuple<
__builtin_sort_pack<
__builtin_dedup_pack<int, double, T...>...
>...>;
}
;
```
Limitations:
- only supported in template arguments and bases,
- can only be used inside the templates, even if non-dependent,
- the builtins cannot be assigned to template template parameters.
The actual implementation proceeds as follows:
- When the compiler encounters a `__builtin_dedup_pack` or other
type-producing
builtin with dependent arguments, it creates a dependent
`TemplateSpecializationType`.
- During substitution, if the template arguments are non-dependent, we
will produce: a new type `SubstBuiltinTemplatePackType`, which stores
an argument pack that needs to be substituted. This type is similar to
the existing `SubstTemplateParmPack` in that it carries the argument
pack that needs to be expanded further. The relevant code is shared.
- On top of that, Clang also wraps the resulting type into
`TemplateSpecializationType`, but this time only as a sugar.
- To actually expand those packs, we collect the produced
`SubstBuiltinTemplatePackType` inside `CollectUnexpandedPacks`.
Because we know the size of the produces packs only after the initial
substitution, places that do the actual expansion will need to have a
second run over the substituted type to finalize the expansions (in
this patch we only support this for template arguments, see
`ExpandTemplateArgument`).
If the expansion are requested in the places we do not currently
support, we will produce an error.
More follow-up work will be needed to fully shape this:
- adding the builtin that sorts types,
- remove the restrictions for expansions,
- implementing P3115R0 (scheduled for C++29, see
https://github.com/cplusplus/papers/issues/2300).
This patch does the bare minimum to start setting up the reduction
recipe support, including adding a type to the AST to store it. No real
additional work is done, and a bunch of static_asserts are left around
to allow us to do this properly.
Summary:
Previously we extracted archives and did special symbol resolution on
them, this was mostly done because the nvlink executable couldn't handle
archives natively. Since I have added a wrapper around it that handles
that, we can now remove a lot of this complexity and just create a new
archive and pass it to the device linker.
The one issue here is that for offloading code, we often have references
to kernels that do not come from the device link job, but from the host
that wishes to call them. Under normal circumstances we would ignore
them because they are not referenced, but we need them. For now I am
just passing `--whole-archive`. In a future patch I will likely add some
logic to create a linker script that will tell the linker which symbols
we want to be extracted. That won't work for NVPTX but it's the most
canonical solution.
Also some ugliness in this patch where I need to juggle whether or not
the input is an archive and throw pairs around, I may merge that into
the `OffloadFile` struct in a later patch.
Summary:
We always want to use the detected path. The clang driver's detection is
far more sophisticated so we should use that whenever possible. Also
update the usage so we properly fall back to path instead of incorrectly
using the `/bin` if not provided.
This is a major change on how we represent nested name qualifications in
the AST.
* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.
This patch offers a great performance benefit.
It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.
This has great results on compile-time-tracker as well:

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.
It has some other miscelaneous drive-by fixes.
About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.
There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.
How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.
The rest and bulk of the changes are mostly consequences of the changes
in API.
PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.
Fixes#136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
This was left over from implementation and shouldn't have been left in,
but in the end 'bind' doesn't require any additional work here, so this
patch removes the assert.
`Parser->Run(Opts.NoInitialTextSection)` calls initSections. Remove a
redundant initSections to remove an extra FT_Align fragment, observed
when investigating a missing MCOrgFragment relaxation issue
https://github.com/ClangBuiltLinux/linux/issues/2116
This patch adds the 'init recipes' to firstprivate like I did for
'private', so that we can properly init these types. At the moment,
the recipe init isn't generated (just the VarDecl), and this isn't
really used anywhere as it will be used exclusively in Codegen.
1. Added %commands to documentation
2. Added %help command to clang repl
3. Expanded parsing to throw unique errors in the case of users entering
an invalid %command or using %lib without an argument
4. Added tests to check the behvaior of %help, %lib, and bad %commands
Previously, #151360 implemented 'private' clause lowering, but didn't
properly initialize the variables. This patch adds that behavior to make
sure we correctly get the constructor or other init called.
Summary:
This tool tries to print the offloading architectures. If the user
doesn't have the HIP libraries or the CUDA libraries it will print and
error.
This isn't super helpful since we're just querying things. So, for
example, if we're on an AMD machine with no CUDA you'll get the AMD GPUs
and then an error message saying that CUDA wasn't found. This silences
the error just on failing to find the library unless verbose is on.
This commit handles the following types:
- clang::ExternalASTSource
- clang::TargetInfo
- clang::ASTContext
- clang::SourceManager
- clang::FileManager
Part of cleanup #151026
Handles clang::DiagnosticsEngine and clang::DiagnosticIDs.
For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to
convenience method `DiagnosticIDs::create()`.
Part of cleanup https://github.com/llvm/llvm-project/issues/151026
These changes allow LLVM and Clang to be built with Clang targeting
Arm64EC using the MSVC linker.
Built with these options:
```
-DLLVM_ENABLE_PROJECTS="clang"
-DLLVM_HOST_TRIPLE=arm64ec-pc-windows-msvc
-DCMAKE_C_COMPILER=clang-cl.exe
-DCMAKE_C_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_CXX_COMPILER=clang-cl.exe
-DCMAKE_CXX_COMPILER_TARGET=arm64ec-pc-windows-msvc
-DCMAKE_LINKER_TYPE=MSVC
```
This switches to `makeIntrusiveRefCnt<FileSystem>` where creating a new
object, and to passing/returning by `IntrusiveRefCntPtr<FileSystem>`
instead of `FileSystem*` or `FileSystem&`, when dealing with existing
objects.
Part of cleanup #151026.
As discussed in PR #145603, the following command seems to fail to
produce a YAML remarks file for offload LTO passes and thus for
kernel-info:
```
clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
-Rpass=kernel-info -fsave-optimization-record
```
The problem is that, in clang-linker-wrapper's clang call, clang names
the file based on clang's main output file (from `-o`). That is a
temporary file, so the YAML file becomes a temporary file, which the
user never sees.
This patch:
- Makes clang honor `-dumpdir` for the default YAML remarks file in the
case of LTO.
- Extends clang-linker-wrapper to specify that option to clang.
To demonstrate the appeal of the generality of `-dumpdir` (as opposed to
a one-off `-fsave-optimization-record` solution in
clang-linker-wrapper), this patch also fixes `-gsplit-dwarf`. Without
this patch, when using `-gsplit-dwarf` and later debugging using rocgdb,
the dwo directory for offload is a temporary file, so temporary file
cleanup causes rocgdb to lose debug symbols for offload code.
WARNING: The clang driver passes `-dumpdir` to various clang frontend
calls. For LTO, that was previously being ignored, and now it's not.
That changes some auxiliary file names, as revealed by changes in some
existing tests' expected output: `clang/test/Driver/opt-record.c` and
`clang/test/Driver/lto-dwo.c`. Hopefully this change does not introduce
a backward compatibility issue for users.
Reland https://github.com/llvm/llvm-project/pull/150805
Shared libs build was broken.
Add `${dialect_libs}` and `${conversion_libs}` to
`MLIRRegisterAllExtensions` because it depends on
`registerConvert***ToLLVMInterface` functions.
`InitAll***` functions are used by `opt`-style tools to init all MLIR
dialects/passes/extensions. Currently they are implemeted as inline
functions and include essentially the entire MLIR header tree. Each file
which includes this header (~10 currently) takes 10+ sec and multiple GB
of ram to compile (tested with clang-19), which limits amount of
parallel compiler jobs which can be run. Also, flang just includes this
file into one of its headers.
Move the actual registration code to the static library, so it's
compiled only once.
Discourse thread
https://discourse.llvm.org/t/rfc-moving-initall-implementation-into-static-library/87559
As reported in #138173, enabling `-ftime-report` adds pass timing info
to the stats file if `-stats-file` is specified. This was determined to
be WAI. However, if one intentionally wants to put timer information in
the stats file, using `-ftime-report` may lead to a lot of logspam (that
can't be removed by directing stderr to `/dev/null` as that would also
redirect compiler errors). To address this, this PR adds a flag
`-stats-file-timers` that adds timer data to the stats file without
outputting to stderr.
The checks for the 'z' and 't' format specifiers added in the original
PR #143653 had some issues and were overly strict, causing some build
failures and were consequently reverted at
4c85bf2fe8.
In the latest commit
27c58629ec,
I relaxed the checks for the 'z' and 't' format specifiers, so warnings
are now only issued when they are used with mismatched types.
The original intent of these checks was to diagnose code that assumes
the underlying type of `size_t` is `unsigned` or `unsigned long`, for
example:
```c
printf("%zu", 1ul); // Not portable, but not an error when size_t is unsigned long
```
However, it produced a significant number of false positives. This was
partly because Clang does not treat the `typedef` `size_t` and
`__size_t` as having a common "sugar" type, and partly because a large
amount of existing code either assumes `unsigned` (or `unsigned long`)
is `size_t`, or they define the equivalent of size_t in their own way
(such as
sanitizer_internal_defs.h).2e67dcfdcd/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h (L203)
Including the results of `sizeof`, `sizeof...`, `__datasizeof`,
`__alignof`, `_Alignof`, `alignof`, `_Countof`, `size_t` literals, and
signed `size_t` literals, the results of pointer-pointer subtraction and
checks for standard library functions (and their calls).
The goal is to enable clang and downstream tools such as clangd and
clang-tidy to provide more portable hints and diagnostics.
The previous discussion can be found at #136542.
This PR implements this feature by introducing a new subtype of `Type`
called `PredefinedSugarType`, which was considered appropriate in
discussions. I tried to keep `PredefinedSugarType` simple enough yet not
limited to `size_t` and `ptrdiff_t` so that it can be used for other
purposes. `PredefinedSugarType` wraps a canonical `Type` and provides a
name, conceptually similar to a compiler internal `TypedefType` but
without depending on a `TypedefDecl` or a source file.
Additionally, checks for the `z` and `t` format specifiers in format
strings for `scanf` and `printf` were added. It will precisely match
expressions using `typedef`s or built-in expressions.
The affected tests indicates that it works very well.
Several code require that `SizeType` is canonical, so I kept `SizeType`
to its canonical form.
The failed tests in CI are allowed to fail. See the
[comment](https://github.com/llvm/llvm-project/pull/135386#issuecomment-3049426611)
in another PR #135386.
Add `NamespaceBaseDecl` as common base class of `NamespaceDecl` and
`NamespaceAliasDecl`. This simplifies `NestedNameSpecifier` a bit.
Co-authored-by: Matheus Izvekov <mizvekov@gmail.com>
For more context, see
https://github.com/llvm/llvm-project/pull/119269#issuecomment-3075444493,
but briefly, when removing ARCMigrate, I also removed some symbols in
libclang, which constitutes an ABI break that we don’t want, so this pr
reintroduces the removed symbols; the declarations are marked as
deprecated for future removal, and the implementations print an error
and do nothing, which is what we used to do when ARCMigrate was
disabled.
Clients of the dependency scanning service may need to add dependencies
based on the visibility of importing modules, for example, when
determining whether a Swift overlay dependency should be brought in
based on whether there's a corresponding **visible** clang module for
it.
This patch introduces a new field `VisibleModules` that contains all the
visible top-level modules in a given TU.
Because visibility is determined by which headers or (sub)modules were
imported, and not top-level module dependencies, the scanner now
performs a separate DFS starting from what was directly imported for
this computation.
In my local performance testing, there was no observable performance
impact.
resolves: rdar://151416358
---------
Co-authored-by: Jan Svoboda <jan@svoboda.ai>
Summary:
The semantics of static libraries only extract stuff that's used. We
somewhat extend this behavior with the linker wrapper only doing this to
fat binaries that match any found architectures. However, this has some
unfortunate effects when the user uses static libraries.
This is somewhat of a hack, but we now assume that if the user specified
`--offload-arch=` on the link job, they *definitely* want that
architecture to be used if it exists. This patch just forces extraction
of those libraries which resolves an issue observed with some customers.
The old behavior will still be present if the user does `--offload-link`
with no offloading architecture present, and for the vast, vast majority
of cases this will change nothing.
Fixes: https://github.com/llvm/llvm-project/issues/147788
This patch fixes an issue where Microsoft-specific layout attributes,
such as __declspec(empty_bases), were ignored during CUDA/HIP device
compilation on a Windows host. This caused a critical memory layout
mismatch between host and device objects, breaking libraries that rely
on these attributes for ABI compatibility.
The fix introduces a centralized hasMicrosoftRecordLayout() check within
the TargetInfo class. This check is aware of the auxiliary (host) target
and is set during TargetInfo::adjust if the host uses a Microsoft ABI.
The empty_bases, layout_version, and msvc::no_unique_address attributes
now use this centralized flag, ensuring device code respects them and
maintains layout consistency with the host.
Fixes: https://github.com/llvm/llvm-project/issues/146047
Add explicit dependency for gen_vt to the CMakeLists.txt for
clang/tools/clang-fuzzer/handle-llvm/handle_llvm.cpp to prevent race
condition on generation of llvm/CodeGen/GenVT.inc This explicit
dependency was added in other CMakeLists.txt when the tablegen was added
for GenVT.inc file in https://reviews.llvm.org/D148770, but not for
handle-llvm
A similar fix was made in
https://github.com/llvm/llvm-project/pull/109306
rdar://151325382
This PR introduces out-of-process (OOP) execution support for
Clang-Repl. With this enhancement, two new flags, oop-executor and
oop-executor-connect, are added to the Clang-Repl interface. These flags
enable the launch of an external executor (llvm-jitlink-executor), which
handles code execution in a separate process.
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
[Discourse
link](https://discourse.llvm.org/t/a-small-proposal-for-extraction-of-inline-assembly-block-information/86658)
We strive for exposing such information using existing stable ABIs. In
doing so, queries are limited to what the original source holds or the
LLVM IR `asm` block would expose in connection with attributes that the
queries are concerned.
These APIs opens new opportunities for `rust-bindgen` to translate
inline assemblies in reasonably cases into Rust inline assembly blocks,
which would further aid better interoperability with other existing
code.
---------
Signed-off-by: Xiangfei Ding <dingxiangfei2009@protonmail.ch>
## Problem
When using `-fsave-optimization-record` with offloading, the Clang
driver passes optimization record options like
`-plugin-opt=opt-remarks-format=yaml` to `clang-nvlink-wrapper`.
However, the wrapper doesn't recognize these options, causing
compilation to fail.
## Solution
This patch adds support for the standard optimization record command
line options to `clang-nvlink-wrapper`, matching the interface provided
by LLD and gold-plugin as documented in the [LLVM Remarks
documentation](https://llvm.org/docs/Remarks.html).
## Changes
- **NVLinkOpts.td**: Added definitions for `opt-remarks-filename`,
`opt-remarks-format`, `opt-remarks-filter`, and
`opt-remarks-with-hotness` options
- **NVLinkOpts.td**: Added `plugin-opt=` aliases for these options to
match what the Clang driver sends
- **ClangNVLinkWrapper.cpp**: Updated `createLTO()` to use command line
arguments when available, falling back to existing global variables
## Testing
The fix allows `-fsave-optimization-record` to work correctly with
offloading, generating optimization records during the LTO phase without
throwing unknown argument errors.
This change maintains backward compatibility and follows the existing
pattern used by other LLVM linkers.
While the source code isn't supposed to change during a build, in some
environments it does. This adds an option that disables caching of stat
failures, meaning that source files can be added to the build during
scanning.
This adds a `-no-cache-negative-stats` option to clang-scan-deps to
enable this behavior. There are no tests for clang-scan-deps as there's
no reliable way to do so from it. A unit test has been added that
modifies the filesystem between scans to test it.
Summary:
The logic here is flawed, it was only intended to apply to the CPU case
where we use the linker passed in on the command line. This was falsely
applying to SPIR-V which caused issues.