130 Commits

Author SHA1 Message Date
Cyndy Ishida
1365b5b1ad
[clang][DependencyScanning] Track dependencies from prebuilt modules to determine IsInStableDir (#132237)
When a module is being scanned, it can depend on modules that have
already been built from a pch dependency. When this happens, the pcm
files are reused for the module dependencies. When this is the case,
check if input files recorded from the PCMs come from the provided
stable directories transitively since the scanner will not have access
to the full set of file dependencies from prebuilt modules.
2025-04-08 15:48:25 -07:00
Cyndy Ishida
9aecbdf8ed
[clang][DepScan] Allow ModuleDep to be const (#132968)
This type can be exposed from C APIs, where instantiations of this type
are not expected to mutate after creation. To support this, mark the
lazy computation of build arguments mutable, as that is not intended to
otherwise mutate the state of these objects.


This was reviewed separately by @jansvoboda11
2025-03-25 13:13:58 -07:00
Cyndy Ishida
ad8f0e2760
[clang][DepScan] Pass references to ModuleDeps instead of ModuleID in lookupModuleOutput callbacks, NFCI (#131688)
This allows clients to reference more read-only attributes, like IsInStableDirectories.
2025-03-17 16:30:09 -07:00
Cyndy Ishida
584f8cc305
[clang][DependencyScanning] Track modules that resolve from "stable" locations (#130634)
That patch tracks whether all the file & module dependencies of a module
resolve to a stable location. This information will later be queried by
build systems for determining where to store the accompanying pcms.
2025-03-17 15:33:23 -07:00
Jan Svoboda
d2e66625bc
[clang][deps] Propagate the entire service (#128959)
Shared state between dependency scanning workers is managed by the
dependency scanning service.

Right now, the members are individually threaded through the worker,
action, and collector. This makes any change to the service and its
members a very laborious process. Moreover, this situation causes
frequent merge conflicts in our downstream repo where the service does
have some extra members that need to be passed around.

To ease the maintenance burden, this PR starts passing a reference to
the entire service.
2025-02-27 10:06:26 -08:00
Qiongsi Wu
7f482aa848
[clang modules] Setting DebugCompilationDir when it is safe to ignore current working directory (#128446)
This PR explicitly sets `DebugCompilationDir` to the system's root
directory if it is safe to ignore the current working directory.

This fixes a problem where a PCM file's embedded debug information can
lead to compilation failure. The compiler may have decided it is indeed
safe to ignore the current working directory. In this case, the PCM
file's content is functionally correct regardless of the current working
directory because no inputs use relative paths (see
https://github.com/llvm/llvm-project/pull/124786). However, a PCM may
contain debug info. If debug info is requested, the compiler uses the
current working directory value to set `DW_AT_comp_dir`. This may lead
to the following situation:
1. Two different compilations need the same PCM file. 
2. The PCM file is compiled assuming a working directory, which is
embedded in the debug info, but otherwise has no effect.
3. The second compilation assumes a different working directory, and
expects an identically-sized pcm file. However, it cannot find such a
PCM, because the existing PCM file has been compiled assuming a
different `DW_AT_comp_dir `, which is embedded in the debug info.

This PR resets the `DebugCompilationDir` if it is functionally safe to
ignore the working directory so the above situation is avoided, since
all debug information will share the same working directory.

rdar://145249881
2025-02-26 13:21:14 -08:00
Qiongsi Wu
54acda2e0e
[clang module] Current Working Directory Pruning (#124786)
When computing the context hash, `clang` always includes the compiler's
working directory. This can lead to situations when the only difference
between two compilations is the working directory, different module
variants are generated. These variants are redundant. This PR implements
an optimization that ignores the working directory when computing the
context hash when safe.

Specifically, `clang` checks if it is safe to ignore the working
directory in `isSafeToIgnoreCWD`. The check involves going through
compile command options to see if any paths specified are relative. The
definition of relative path used here is that the input path is not
empty, and `llvm::sys::path::is_absolute` is false. If all the paths
examined are not relative, `clang` considers it safe to ignore the
current working directory and does not consider the working directory
when computing the context hash.
2025-02-04 20:04:39 -08:00
Sirraide
c4a019747c
[Clang] Remove ARCMigrate (#119269)
In the discussion around #116792, @rjmccall mentioned that ARCMigrate
has been obsoleted and that we could go ahead and remove it from Clang,
so this patch does just that.
2025-01-30 05:32:25 +01:00
Jan Svoboda
9d4837f47c
[clang][deps][modules] Allocate input file paths lazily (#114457)
This PR builds on top of #113984 and attempts to avoid allocating input
file paths eagerly. Instead, the `InputFileInfo` type used by
`ASTReader` now only holds `StringRef`s that point into the PCM file
buffer, and the full input file paths get resolved on demand.

The dependency scanner makes use of this in a bit of a roundabout way:
`ModuleDeps` now only holds (an owning copy of) the short unresolved
input file paths, which get resolved lazily. This can be a big win, I'm
seeing up to a 5% speedup.
2024-11-11 09:46:50 -08:00
Jan Svoboda
da1a16ae10
[clang][modules] Preserve the module map that allowed inferring (#113389)
With inferred modules, the dependency scanner takes care to replace the
fake "__inferred_module.map" path with the file that allowed the module
to be inferred. However, this only worked when such a module was
imported directly in the TU. Whenever such module got loaded
transitively, the scanner would fail to perform the replacement. This is
caused by the fact that PCM files are lossy and drop this information.

This patch makes sure that PCMs include this file for each submodule (in
the `SUBMODULE_DEFINITION` record), fixes one existing test with an
incorrect assertion, and does a little drive-by refactoring of
`ModuleMap`.
2024-10-28 11:24:27 -07:00
Kazu Hirata
c7895f0d72
[DependencyScanning] Avoid repeated hash lookups (NFC) (#111088) 2024-10-04 07:37:54 -07:00
Jan Svoboda
b1aea98cfa
[clang] Make deprecations of some FileManager APIs formal (#110014)
Some `FileManager` APIs still return `{File,Directory}Entry` instead of
the preferred `{File,Directory}EntryRef`. These are documented to be
deprecated, but don't have the attribute that warns on their usage. This
PR marks them as such with `LLVM_DEPRECATED()` and replaces their usage
with the recommended counterparts. NFCI.
2024-09-25 10:36:44 -07:00
Jan Svoboda
5acd9d1137
[clang][scan] Report module dependencies in topological order (#107474) 2024-09-05 19:13:08 -07:00
Artem Chikin
68eb3b202f
[clang][deps] Collect discovered module dependencies' Link Libraries (#93588)
This will allow scanner clients to be able to compute e.g. auto-linking
dependencies of the scanned translation unit.
2024-06-04 09:16:49 -07:00
Jan Svoboda
3ea5dff0ef
[clang][modules] Only avoid pruning module maps when asked to (#89428)
Pruning non-affecting module maps is useful even when passing module
maps explicitly via `-fmodule-map-file=<path>`. For this situation, this
patch reinstates the behavior we had prior to #87849. For the situation
where the explicit module map file arguments were generated by the
dependency scanner (which already pruned the non-affecting ones), this
patch introduces new `-cc1` flag
`-fno-modules-prune-non-affecting-module-map-files` that avoids the
extra work.
2024-04-19 12:45:40 -07:00
Argyrios Kyrtzidis
6331024353
[clang/DependencyScanning/ModuleDepCollector] Refactor part of makeCommonInvocationForModuleBuild into its own function (#88447)
The new function is about clearing out benign codegen options and can be
applied for PCH invocations as well.
2024-04-15 15:05:55 -07:00
Argyrios Kyrtzidis
c7eede5d4c
[clang][deps] Remove pgo profile flags from modules (#87724)
These are not necessary when not performing codegen.
2024-04-05 10:31:16 -07:00
Michael Spencer
de3b2c293b
[clang][ScanDeps] Allow PCHs to have different VFS overlays (#82294)
It turns out it's not that uncommon for real code to pass a different
set of VFSs while building a PCH than while using the PCH. This can
cause problems as seen in `test/ClangScanDeps/optimize-vfs-pch.m`. If
you scan `compile-commands-tu-no-vfs-error.json` without -Werror and run
the resulting commands, Clang will emit a fatal error while trying to
emit a note saying that it can't find a remapped header.

This also adds textual tracking of VFSs for prebuilt modules that are
part of an included PCH, as the same issue can occur in a module we are
building if we drop VFSs. This has to be textual because we have no
guarantee the PCH had the same list of VFSs as the current TU.

This uses the `PrebuiltModuleListener` to collect `VFSOverlayFiles`
instead of trying to extract it out of a `serialization::ModuleFile`
each time it's needed. There's not a great way to just store a pointer
to the list of strings in the serialized AST.
2024-02-23 17:48:58 -08:00
Jan Svoboda
da95d926f6
[clang][lex] Always pass suggested module to InclusionDirective() callback (#81061)
This patch provides more information to the
`PPCallbacks::InclusionDirective()` hook. We now always pass the
suggested module, regardless of whether it was actually imported or not.
The extra `bool ModuleImported` parameter then denotes whether the
header `#include` will be automatically translated into import the the
module.

The main change is in `clang/lib/Lex/PPDirectives.cpp`, where we take
care to not modify `SuggestedModule` after it's been populated by
`LookupHeaderIncludeOrImport()`. We now exclusively use the `SM`
(`ModuleToImport`) variable instead, which has been equivalent to
`SuggestedModule` until now. This allows us to use the original
non-modified `SuggestedModule` for the callback itself.

(This patch turns out to be necessary for
https://github.com/apple/llvm-project/pull/8011).
2024-02-08 10:19:18 -08:00
Michael Spencer
c003d851f5
[clang][DependencyScanner] Remove unused -fmodule-map-file arguments (#80090)
Since we already add a `-fmodule-map-file=` argument for every used
modulemap, we can remove all `ModuleMapFiles` entries before adding
them.

This reduces the number of module variants when `-fmodule-map-file=`
appears on the original command line.
2024-01-31 12:28:27 -08:00
Michael Spencer
7847e44594
[clang][DependencyScanner] Remove unused -ivfsoverlay files (#73734)
`-ivfsoverlay` files are unused when building most modules. Enable
removing them by,
* adding a way to visit the filesystem tree with extensible RTTI to
  access each `RedirectingFileSystem`.
* Adding tracking to `RedirectingFileSystem` to record when it
  actually redirects a file access.
* Storing this information in each PCM.

Usage tracking is only enabled when iterating over the source manager
and affecting modulemaps. Here each path is stated to cause an access.
During scanning these stats all hit the cache.
2024-01-30 15:39:18 -08:00
Juergen Ributzka
e007551b10
[clang][modules] Strip LLVM options (#75405)
Currently, the dep scanner does not remove LLVM options from the
argument list.
Since LLVM options shouldn't affect the AST, it is safe to remove them
all.
2023-12-14 09:21:18 -08:00
Kazu Hirata
f3dcc2351c
[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 08:54:13 -08:00
Jan Svoboda
0cb0a48cde
[clang] NFC: Remove OptionalFileEntryRefDegradesToFileEntryPtr (#74899) 2023-12-08 18:22:41 -08:00
Michael Spencer
13386c608e
[clang][DependencyScanner] Include the working directory in the context hash (#73719)
The working directory is included in the PCM, but is not currently part
of the context hash. This causes problems because different builds of a
PCM with exactly the same command line can end up with different binary
content for a PCM. If a build system tracks tasks by both working
directory and command line, it may build a given PCM multiple times,
causing a "module file out of date" error when loading the PCM due to
different sizes.
2023-11-30 12:47:27 -08:00
Michael Spencer
731152e18f
[clang][DependencyScanner] Remove all warning flags when suppressing warnings (#71612)
Since system modules don't emit most warnings, remove the warning flags
to increase module reuse.
2023-11-13 16:45:38 -08:00
Michael Spencer
fb07d9cc09
[clang][DepScan] Make OptimizeArgs a bit mask enum and enable by default (#71588)
Make it easier to control which optimizations are enabled by making
OptimizeArgs a bit masked enum. There's currently only one such
optimization, but more will be added in followup commits.
2023-11-07 16:06:59 -08:00
Chuanqi Xu
e107c9468b
[clang-scan-deps] [P1689] Keep consistent behavior for make dependencies with clang (#69551)
Close https://github.com/llvm/llvm-project/issues/69439.

This patch tries to reuse the codes to generate make style dependencies
information with P1689 format directly.
2023-10-31 23:59:47 +08:00
Kazu Hirata
d7b18d5083 Use llvm::endianness{,::little,::native} (NFC)
Now that llvm::support::endianness has been renamed to
llvm::endianness, we can use the shorter form.  This patch replaces
llvm::support::endianness with llvm::endianness.
2023-10-09 00:54:47 -07:00
Jan Svoboda
b19fe81eb5 [clang][deps] NFCI: Use FileEntryRef in ModuleDepCollectorPP 2023-09-10 19:53:54 -07:00
Jan Svoboda
3b1a68655e
[clang][deps] Generate command lines lazily (#65691)
This patch makes the generation of command lines for modular
dependencies lazy/on-demand. That operation is somewhat expensive and
prior to this patch used to be performed multiple times for the
identical `ModuleDeps` (i.e. when they were imported from multiple
different TUs).
2023-09-07 17:45:53 -07:00
Jan Svoboda
9208065a7b
[clang][deps] Store common, partially-formed invocation (#65677)
We create one `CompilerInvocation` for each modular dependency we
discover. This means we create a lot of copies, even though most of the
invocation is the same between modules. This patch makes use of the
copy-on-write flavor of `CompilerInvocation` to share the common parts,
reducing memory usage and speeding up the scan.
2023-09-07 15:49:11 -07:00
Jan Svoboda
5746002ebb [clang] NFCI: Change returned LanguageOptions pointer to reference 2023-09-05 13:23:53 -07:00
Ben Langmuir
dc5cbba319 [clang][modules] Add -Wsystem-headers-in-module=
Add a way to enable -Wsystem-headers only for a specific module. This is
useful for validating a module that would otherwise not see system
header diagnostics without being flooded by diagnostics for unrelated
headers/modules. It's relatively common for a module to be marked
[system] but still wish to validate itself explicitly.

rdar://113401565

Differential Revision: https://reviews.llvm.org/D156948
2023-08-09 10:40:53 -07:00
Jan Svoboda
dcd3a0c9f1 [clang][modules][deps] Create more efficient API for visitation of ModuleFile inputs
The current `ASTReader::visitInputFiles()` function calls into `FileManager` to create `FileEntryRef` objects. This ends up being fairly costly in `clang-scan-deps`, where we mostly only care about file paths.

This patch introduces new `ASTReader` API that gives clients access to just the serialized paths. Since the scanner needs both the as-requested path and the on-disk one (and doesn't want to transform the former into the latter via `FileManager`), this patch starts serializing both of them into the PCM file if they differ.

This increases the size of scanning PCMs by 0.1% and speeds up scanning by 5%.

Reviewed By: benlangmuir, vsapsai

Differential Revision: https://reviews.llvm.org/D157066
2023-08-09 10:19:36 -07:00
Jan Svoboda
8fd56ea112 [clang][deps] NFC: Speed up canonical context hash computation
This patch makes use of the infrastructure established in D157046 to speed up computation of the canonical context hash in the dependency scanner. This is somewhat hot code, since it's ran for all modules in the dependency graph of every TU.

I also tried an alternative approach that tried to avoid allocations as much as possible (essentially doing `HashBuilder.add(Arg.toStringRef(ArgVec))`), but that turned out to be slower than approach in this patch.

Note that this is not problematic in the same way command-line hashing used to be prior D143027. The lambda is now being called even for constant strings.

Depends on D157046.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D157052
2023-08-03 20:36:34 -07:00
Jan Svoboda
c75b331fc2 [clang][deps] Remove ModuleDeps::ImportedByMainFile
This information is already exposed via `TranslationUnitDeps::ClangModuleDeps` on the `DependencyScanningTool` level, and this patch also adds it on the `DependencyScanningWorker` level via `DependencyConsumer::handleDirectModuleDependency()`.

Besides being redundant, this bit of information is misleading for clients that share single `ModuleDeps` instance between multiple TUs (by using the `AlreadySeen` set). The module can be imported directly in some TUs but transitively in others.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D156563
2023-07-28 12:04:35 -07:00
Stoorx
40136ecefc [clang] Make access to submodules via iterator_range
In file `clang/lib/Basic/Module.cpp` the `Module` class had `submodule_begin()` and `submodule_end()` functions to retrieve corresponding iterators for private vector of Modules. This commit removes mentioned functions, and replaces all of theirs usages with `submodules()` function and range-based for-loops.

Differential Revision: https://reviews.llvm.org/D148954
2023-04-24 12:05:59 +03:00
Ben Langmuir
758bca6483 [clang][deps] Remove -coverage-data-file and -coverage-notes-file from modules
When not performing codegen, we can strip the coverage-data-file and
coverage-notes-file options to improve canonicalization.

rdar://107443796

Differential Revision: https://reviews.llvm.org/D147282
2023-03-31 09:43:22 -07:00
Ben Langmuir
296ba5bbd3 [clang][deps] Split lookupModuleOutput out of DependencyConsumer NFC
The idea is to split the callbacks that are used to consume dependency
information (DependencyConsumer) from callbacks that modify the scan
behaviour itself in any way (DependencyActionController). Currently this
is just lookupModuleOutput, but we have additional callbacks related to
CAS support that we intend to upstream in the future.

Differential Revision: https://reviews.llvm.org/D144058
2023-03-10 13:14:49 -08:00
Chuanqi Xu
eb70b38f83 Recommit [C++20] [Modules] [ClangScanDeps] Add ClangScanDeps support for C++20 Named Modules in P1689 format (2/4)
Close https://github.com/llvm/llvm-project/issues/51792
Close https://github.com/llvm/llvm-project/issues/56770

This patch adds ClangScanDeps support for C++20 Named Modules in P1689
format. We can find the P1689 format at:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html.
After we land the patch, we're able to compile C++20 Named
Modules with CMake! And although P1689 is written by kitware people,
other build systems should be able to use the format to compile C++20
Named Modules too.

TODO: Support header units in P1689 Format.
TODO2: Support C++20 Modules in the full dependency format of
ClangScanDeps. We also want to support C++20 Modules and clang modules
together according to
https://discourse.llvm.org/t/how-should-we-support-dependency-scanner-for-c-20-modules/66027.
But P1689 format cares about C++20 Modules only for now. So let's focus
on C++ Modules and P1689 format. And look at the full dependency format
later.

I'll add the ReleaseNotes and Documentations after the patch get landed.

Reviewed By: jansvoboda11

Differential Revision: https://reviews.llvm.org/D137527
2023-02-13 10:42:35 +08:00
NAKAMURA Takumi
069dd8768a Revert "[C++20] [Modules] [ClangScanDeps] Add ClangScanDeps support for C++20 Named Modules in P1689 format (2/4)"
This reverts commit de17c665e3f995c7f5a0e453461ce3a1b8aec196.

See also D137527
2023-02-12 18:38:25 +09:00
Chuanqi Xu
de17c665e3 [C++20] [Modules] [ClangScanDeps] Add ClangScanDeps support for C++20 Named Modules in P1689 format (2/4)
Close https://github.com/llvm/llvm-project/issues/51792
Close https://github.com/llvm/llvm-project/issues/56770

This patch adds ClangScanDeps support for C++20 Named Modules in P1689
format. We can find the P1689 format at:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html.
After we land the patch, we're able to compile C++20 Named
Modules with CMake! And although P1689 is written by kitware people,
other build systems should be able to use the format to compile C++20
Named Modules too.

TODO: Support header units in P1689 Format.
TODO2: Support C++20 Modules in the full dependency format of
ClangScanDeps. We also want to support C++20 Modules and clang modules
together according to
https://discourse.llvm.org/t/how-should-we-support-dependency-scanner-for-c-20-modules/66027.
But P1689 format cares about C++20 Modules only for now. So let's focus
on C++ Modules and P1689 format. And look at the full dependency format
later.

I'll add the ReleaseNotes and Documentations after the patch get landed.

Reviewed By: jansvoboda11

Differential Revision: https://reviews.llvm.org/D137527
2023-02-10 10:26:43 +08:00
Ben Langmuir
8711120e8b [clang][deps] Migrate ModuleDepCollector to LexedFileChanged NFCI
LexedFileChanged has the semantics we want of ignoring #line/etc. It's
also consistent with other dep collectors like DependencyFileGenerator.

Differential Revision: https://reviews.llvm.org/D143613
2023-02-09 11:42:03 -08:00
Ben Langmuir
223e99fb69 [clang][deps] Fix module context hash for constant strings
We were not hashing constant strings in the command-line, only ones that
required allocations. This was causing us to get the same hash across
different flag options.

rdar://101053855

Differential Revision: https://reviews.llvm.org/D143027
2023-02-03 15:10:19 -08:00
Michael Spencer
ffa2a647c6 [Clang][DependencyScanner] Remove secondary actions from -cc1
The -arcmt-action= and -objcmt-migrate* actions were being passed to
module builds. This caused these builds to fail, as they are secondary
actions that suppress emitting modules.

Differential Revision: https://reviews.llvm.org/D143040
2023-02-01 15:17:32 -08:00
Ben Langmuir
73dba2f649 [clang][deps] Fix modulemap file path for implementation of module
Use the name "as requested" for the path of the implemented module's
modulemap file, just as we do for other modulemap file paths. This fixes
fatal errors with modules where we tried to find framework headers
relative to the wrong directory when imported by an implementation file
of the same module.

rdar://104619123

Differential Revision: https://reviews.llvm.org/D142501
2023-01-27 11:37:35 -08:00
Jan Svoboda
beebad9a9b [clang][deps] NFC: Remove dead code
This patch removes some dead code in the dependency scanner.

The `ModuleDeps::ImplicitModulePCMPath` member stopped being used in D131934.

The strict context hash was replaced in D129884 by hash of the canonical command line.

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D142416
2023-01-24 09:49:22 -08:00
Jan Svoboda
96a54b2258 [clang][deps] Account for transitive spurious dependencies
In D106100, we started guarding against spurious dependencies on modules that ended up being textual includes and thus didn't have any AST file associated. That patch accounted only for direct dependencies. There's a way how to get spurious dependencies for modules that are transitive. This patch guards against that scenario and adds a test case.

(Note that since D142167, we don't allow `@import FW_Private` with `-fmodule-name=FW` anymore. However, that check lives in sema, which the scanner doesn't run. Being defensive in this patch therefore still makes sense.)

rdar://104324602

Reviewed By: benlangmuir

Differential Revision: https://reviews.llvm.org/D142165
2023-01-24 09:48:58 -08:00
Ben Langmuir
43854fa263 [clang][deps] Add module files for input dependencies earlier
I originally thought we needed to add module file inputs for modular
deps at the same time as outputs because they depend on the
lookupModuleOutput callback, but this is not the case: they only depend
on the callback results for other modules, which have already been
computed by this point. So move them earlier so that they're set in the
CompilerInvocation at the same time as other inputs. This makes the
code easier to understand.

This change is effectively NFC, though it technically changes the module
exact value of the context hash.

Differential Revision: https://reviews.llvm.org/D142392
2023-01-24 08:45:20 -08:00