111 Commits

Author SHA1 Message Date
Jan Svoboda
1e25ff84d8 [clang][deps] Fix traversal of precompiled dependencies
The code for traversing precompiled dependencies is somewhat complicated and contains a dangling iterator bug.

This patch simplifies the code and fixes the bug.
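
The commit message does not say where the dangling iterator was, so the following is only a generic sketch of the hazard and one common way to avoid it: a worklist that grows during traversal is walked by index, and the current element is copied rather than referenced. `DepGraph` and `transitiveDeps` are hypothetical names, not the scanner's API.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical graph: module name -> names of its direct precompiled deps.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Collects everything reachable from Root (Root itself is only included
// if some dependency leads back to it).
std::set<std::string> transitiveDeps(const DepGraph &Graph,
                                     const std::string &Root) {
  std::set<std::string> Seen;
  std::vector<std::string> Worklist = {Root};
  // Index-based loop: push_back below may reallocate the vector, which
  // would invalidate any iterator or reference into Worklist.
  for (std::size_t I = 0; I < Worklist.size(); ++I) {
    const std::string Current = Worklist[I]; // copy, not a reference
    auto It = Graph.find(Current);
    if (It == Graph.end())
      continue;
    for (const std::string &Dep : It->second)
      if (Seen.insert(Dep).second)
        Worklist.push_back(Dep);
  }
  return Seen;
}
```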

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121533
2022-03-16 12:17:53 +01:00
Jan Svoboda
d73daa9135 [clang][deps] Don't prune search paths used by dependencies
When pruning header search paths (to reduce the number of modules we need to build explicitly), we can't prune the search paths used in (transitive) dependencies of a module. Otherwise, we could end up with either of the following dependency graphs:

```
X:<hash1> -> Y:<hash2>
X:<hash1> -> Y:<hash3>
```

depending on the search paths of the translation unit we discovered `X` and `Y` from.

This patch fixes that.

Depends on D121295.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121303
2022-03-16 12:17:53 +01:00
Jan Svoboda
cf4a31fc0f [clang][deps] Remove '-fmodules-cache-path=' arguments
With explicit modules build, the '-fmodules-cache-path=' argument is unused.

This patch removes the argument to avoid warnings or errors (with '-Werror') stemming from that.
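
As a rough illustration of the effect on a command line, the sketch below filters arguments by prefix. It works on plain strings and uses a hypothetical helper, `dropArgsWithPrefix`; the scanner itself may well do this at a different level.

```cpp
#include <algorithm>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: returns Args without the arguments that start with
// Prefix, preserving the order of the remaining ones.
std::vector<std::string> dropArgsWithPrefix(std::vector<std::string> Args,
                                            std::string_view Prefix) {
  Args.erase(std::remove_if(Args.begin(), Args.end(),
                            [Prefix](const std::string &Arg) {
                              return std::string_view(Arg).substr(
                                         0, Prefix.size()) == Prefix;
                            }),
             Args.end());
  return Args;
}
```

Calling it with the prefix `-fmodules-cache-path=` leaves every other argument untouched.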

Depends on D118915.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120474
2022-03-12 11:42:07 +01:00
Jan Svoboda
7f6af60746 [clang][deps] Generate '-fmodule-file=' only for direct dependencies
The `clang-scan-deps` tool currently generates `-fmodule-file=` command-line arguments for the whole transitive closure of modular dependencies. This is not necessary: we only need to provide the direct dependencies on the command line, because information about transitive dependencies is stored within the `.pcm` files of direct dependencies. This makes the command lines shorter, but should otherwise be NFC (unless there are bugs in the loading mechanism for explicit modules).
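
A minimal sketch of the idea, assuming a toy module table: only the names a translation unit imports directly produce `-fmodule-file=` arguments, and their own imports are never visited. `ModuleInfo` and `directModuleFileArgs` are hypothetical and do not mirror the scanner's data structures.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical description of a module: where its PCM lives and which
// modules it imports directly.
struct ModuleInfo {
  std::string PCMPath;
  std::vector<std::string> DirectImports;
};

// Emits one '-fmodule-file=' argument per direct import. The imports of
// those modules are deliberately not followed.
std::vector<std::string>
directModuleFileArgs(const std::vector<std::string> &DirectImports,
                     const std::map<std::string, ModuleInfo> &Modules) {
  std::vector<std::string> Args;
  for (const std::string &Name : DirectImports) {
    auto It = Modules.find(Name);
    if (It != Modules.end())
      Args.push_back("-fmodule-file=" + It->second.PCMPath);
  }
  return Args;
}
```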

Depends on D120465.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D118915
2022-03-12 11:32:51 +01:00
Jan Svoboda
a6ef363546 [clang][deps] Disable implicit module maps
Since D113473, we don't report any module map files via `-fmodule-map-file=` in explicit builds. The ultimate goal here is to make sure Clang doesn't open/read/parse/evaluate unnecessary module maps.

However, implicit module maps still end up reading all reachable module maps. This patch disables implicit module maps in explicit builds.

Unfortunately, we still need to report some module map files that aren't encoded in PCM files of dependencies: module maps that are necessary to correctly evaluate includes in modules marked as `[no_undeclared_includes]`.

Depends on D120464.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D120465
2022-03-12 11:07:21 +01:00
Jan Svoboda
19017c2435 [clang][deps] Return the whole TU command line
The dependency scanner already generates canonical -cc1 command lines that can be used to compile discovered modular dependencies.

For translation unit command lines, the scanner only generates additional driver arguments the build system is expected to append to the original command line.

While this works most of the time, there are situations where that's not the case. For example with `-Wunused-command-line-argument`, Clang will complain about the `-fmodules-cache-path=` argument that's not being used in explicit modular builds. Combine that with `-Werror` and the build outright fails.

To prevent such failures, this patch changes the dependency scanner to return the full driver command line to compile the original translation unit. This gives us more opportunities to massage the arguments into something reasonable.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D118986
2022-02-23 15:46:20 +01:00
Jan Svoboda
80a696898c [clang][deps] NFC: Update documentation
In D113473, the dependency scanner stopped emitting "-fmodule-map-file=" arguments. Potential build systems are expected to not add any such arguments on their own. This commit removes mentions of such arguments to avoid confusion.
2022-02-23 15:46:20 +01:00
Jan Svoboda
c6f8704053 [clang][deps] Disable global module index
While scanning dependencies of a TU that depends on a PCH, the scanner basically performs mixed implicit/explicit modular compilation. (Explicit modules come from the PCH.) This seems to trip up the global module index.

This patch disables global module index in the dependency scanner.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D118890
2022-02-15 09:51:23 +01:00
Jan Svoboda
8cc2a13727 [clang][deps] Handle symlinks in minimizing FS
The minimizing and caching filesystem used by the dependency scanner can be configured to **not** minimize some files. That's necessary when scanning a TU with prebuilt inputs (i.e. PCH) that refer to the original (non-minimized) files. Minimizing such files in the dependency scanner would cause discrepancy between the current perceived state of the filesystem and the file sizes stored in the AST file. By not minimizing such files, we avoid creating the discrepancy.

The problem with the current approach is that files that should not be minimized are identified by their path. This breaks down when the prebuilt input (PCH) and the current TU refer to the same file via different paths (i.e. symlinks). This patch switches from paths to `llvm::sys::fs::UniqueID` when identifying ignored files. This is consistent with how the rest of Clang treats files.
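
The difference between path-based and identity-based lookup can be sketched with a toy filesystem. Here `FileID` stands in for `llvm::sys::fs::UniqueID`, and `ToyFS` and `NotMinimizedSet` are hypothetical; a real implementation would obtain the ID from a stat call instead of a table.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

// Stand-in for llvm::sys::fs::UniqueID: a (device, file) pair.
using FileID = std::pair<std::uint64_t, std::uint64_t>;

// Hypothetical toy filesystem: several paths (e.g. symlinks) may map to
// the same file identity.
struct ToyFS {
  std::map<std::string, FileID> PathToID;
};

// Set of files that must not be minimized, keyed by identity, not by path.
class NotMinimizedSet {
  const ToyFS &FS;
  std::set<FileID> IDs;

public:
  explicit NotMinimizedSet(const ToyFS &FS) : FS(FS) {}

  // Path must exist in the toy filesystem; at() throws otherwise.
  void add(const std::string &Path) { IDs.insert(FS.PathToID.at(Path)); }

  bool contains(const std::string &Path) const {
    auto It = FS.PathToID.find(Path);
    return It != FS.PathToID.end() && IDs.count(It->second) != 0;
  }
};
```

A file registered through one path is then recognized through any other path that resolves to the same identity.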

Depends on D114966.

Reviewed By: dexonsmith, arphaman

Differential Revision: https://reviews.llvm.org/D114971
2022-01-21 13:04:25 +01:00
Jan Svoboda
5daeada330 [clang][deps] Ensure filesystem cache consistency
The minimizing filesystem used by the dependency scanner isn't great when it comes to the consistency of its caches. There are two problems that can be exposed by a filesystem that changes during dependency scan:
1. In-memory cache entries for original and minimized files are distinct, populated at different times using separate stat/open syscalls. This means that when a file is read with minimization disabled, its contents might be inconsistent when the same file is read with minimization enabled at a later point (and vice versa).
2. In-memory cache entries are indexed by filename. This is problematic for symlinks, where the contents of the symlink might be inconsistent with contents of the original file (for the same reason as in problem 1).

This patch ensures consistency by always stating/reading a file exactly once. The original contents are always cached and minimized contents are derived from that on demand. The cache entries are now indexed by their `UniqueID`, ensuring consistency for symlinks too. Moreover, the stat/read syscalls are now issued outside of the critical section.

Depends on D115935.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D114966
2022-01-21 13:04:25 +01:00
Jan Svoboda
ced077e1ba [clang][deps] NFC: Simplify handling of cached FS errors
The return types of some `CachedFileSystemEntry` member functions are needlessly complex.

This patch attempts to simplify the code by unwrapping cached entries that represent errors early, and then asserting `!isError()`.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115935
2022-01-21 13:04:25 +01:00
Michael Spencer
37e6e022d2 Re-land "[Clang][ScanDeps] Use the virtual path for module maps"
This re-lands:
- 04192422c4e3b730c580498b8e948088cb15580b
- 015e08c6badad6b27404d6f94569e25c18d79049

Which I reverted in ea835171389aa356b865bf9cb72ca8f4f84b64fd in error.

Differential Revision: https://reviews.llvm.org/D114206
2022-01-06 21:05:05 +00:00
Archibald Elliott
ea83517138 Revert "[Clang][ScanDeps] Use the virtual path for module maps"
This reverts commits:
- 04192422c4e3b730c580498b8e948088cb15580b.
- 015e08c6badad6b27404d6f94569e25c18d79049

D114206 was landed before it was approved - and was landed knowing that
the test crashed on windows, without an xfail. The promised follow-up
commit with fixes has not appeared since it was promised on December 14th.
2022-01-05 12:17:06 +00:00
Jan Svoboda
3f3b5c3ec0 [clang][deps] NFC: Unify ErrorOr patterns
This patch canonicalizes some code into a repetitive ErrorOr pattern. This will make refactoring easier if we ever come up with a way to simplify this.
2021-12-17 14:00:20 +01:00
Jan Svoboda
bcdf7f5e91 [clang][deps] NFC: Take and store entry as reference
2021-12-17 14:00:20 +01:00
Jan Svoboda
af7a421ef4 [clang][deps] NFC: Remove explicit call to implicit constructor
2021-12-17 14:00:20 +01:00
Jan Svoboda
195a5294c2 [clang][deps] NFC: Rename member variable
2021-12-17 14:00:20 +01:00
Jan Svoboda
4170ea9445 [clang][deps] NFC: Fix whitespace formatting
2021-12-17 14:00:20 +01:00
Jan Svoboda
f66803457e [clang][deps] Squash caches for original and minimized files
The minimizing and caching filesystem used by the dependency scanner keeps minimized and original files in separate caches.

This setup is not well suited for dealing with files that are sometimes minimized and sometimes not. Such files are being stat-ed and read twice, which is wasteful and also means the two versions of the file can get "out of sync".

This patch squashes the two caches together. When a file is stat-ed or read, its original contents are populated. If a file needs to be minimized, we give the minimizer the already loaded contents instead of reading the file again.
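
The squashed layout can be sketched as a single entry per file that always holds the original contents and derives the minimized form lazily. `Entry`, `SquashedCache` and the two callbacks are hypothetical stand-ins for the real filesystem read and the source minimizer; locking is omitted.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>

using StringFn = std::function<std::string(const std::string &)>;

// One entry per file: original contents are always populated, the
// minimized contents only once somebody asks for them.
struct Entry {
  std::string Original;
  std::optional<std::string> Minimized;
};

class SquashedCache {
  StringFn ReadFile; // hypothetical: reads a file from disk
  StringFn Minimize; // hypothetical: minimizes already loaded contents
  std::map<std::string, Entry> Entries;

  // Reads the file at most once, no matter which form is requested first.
  Entry &load(const std::string &Path) {
    auto It = Entries.find(Path);
    if (It == Entries.end())
      It = Entries.emplace(Path, Entry{ReadFile(Path), std::nullopt}).first;
    return It->second;
  }

public:
  SquashedCache(StringFn ReadFile, StringFn Minimize)
      : ReadFile(std::move(ReadFile)), Minimize(std::move(Minimize)) {}

  const std::string &original(const std::string &Path) {
    return load(Path).Original;
  }

  const std::string &minimized(const std::string &Path) {
    Entry &E = load(Path);
    if (!E.Minimized)
      E.Minimized = Minimize(E.Original); // no second read of the file
    return *E.Minimized;
  }
};
```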

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115346
2021-12-16 09:57:21 +01:00
Jan Svoboda
da920c3bcc [clang][deps] NFC: Move entry initialization into member functions
This is a prep-patch for making `CachedFileSystemEntry` initialization more lazy.
2021-12-15 16:39:29 +01:00
Jan Svoboda
3031fd71b9 [clang][deps] NFC: Use clearer wording around entry initialization
The code and documentation around `CachedFileSystemEntry` use the following terms:
* "invalid stat" for `llvm::ErrorOr<llvm::vfs::Status>` that is *not* an error and contains an unknown status,
* "initialized entry" for an entry that contains "invalid stat",
* "valid entry" for an entry that contains "invalid stat", synonymous to "initialized" entry.

Having an entry be "valid" while it contains an "invalid" status object is counter-intuitive.
This patch cleans up the wording by referring to the status as "unknown" and to the entry as either "initialized" or "uninitialized".
2021-12-15 16:14:44 +01:00
Michael Spencer
04192422c4 [Clang][ScanDeps] Use the virtual path for module maps
Make clang-scan-deps use the virtual path for module maps instead of the on disk
path. This is needed so that modulemap relative lookups are done correctly in
the actual module builds. The file dependencies still use the on disk path as
that's what matters for build invalidation.

Differential Revision: https://reviews.llvm.org/D114206
2021-12-14 11:21:42 -07:00
Jan Svoboda
13a351e862 [clang][deps] Use MemoryBuffer in minimizing FS
This patch avoids unnecessarily copying contents of `mmap`-ed files into `CachedFileSystemEntry` by storing `MemoryBuffer` instead. The change leads to ~50% reduction of peak memory footprint when scanning LLVM+Clang via `clang-scan-deps`.

Depends on D115331.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115043
2021-12-09 11:32:13 +01:00
Jan Svoboda
58822837cd [clang][deps] Use lock_guard instead of unique_lock
This patch changes uses of `std::unique_lock` to `std::lock_guard`.

The `std::unique_lock` template provides some advanced capabilities (deferred locking, time-constrained locking attempts, etc.) we don't use in the caching filesystem. Plain `std::lock_guard` will do here.
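
For a plain scoped critical section the two are interchangeable, as in this small sketch. `SharedCache` is a hypothetical class, not the scanner's caching filesystem.

```cpp
#include <map>
#include <mutex>
#include <string>

class SharedCache {
  std::mutex Mutex;
  std::map<std::string, std::string> Entries;

public:
  // Returns the cached value for Key, inserting Value if Key is absent.
  std::string getOrInsert(const std::string &Key, const std::string &Value) {
    // Locks on construction and unlocks on scope exit; no deferred or
    // timed locking is needed here, so std::lock_guard is sufficient.
    std::lock_guard<std::mutex> Lock(Mutex);
    return Entries.try_emplace(Key, Value).first->second;
  }
};
```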

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D115332
2021-12-09 10:42:50 +01:00
Jan Svoboda
5b6c08379b [clang][deps] Reset some benign codegen options
Some command-line codegen arguments are likely to differ between identical modules discovered from different translation units. This patch removes them to make builds deterministic and/or reduce the number of built modules.

Reviewed By: Bigcheese

Differential Revision: https://reviews.llvm.org/D112923
2021-12-08 11:53:50 +01:00
Jan Svoboda
97e504cff9 [clang][deps] NFC: Extract function
This commit extracts a couple of nested conditions into a separate function with early returns, making the control flow easier to understand.
2021-11-26 14:01:24 +01:00
Jan Svoboda
12eafd944e [clang][deps] NFC: Clean up wording (ignored vs minimized)
The filesystem used during dependency scanning does two things: it caches file entries and minimizes source file contents. We use the term "ignored file" in a couple of places, but it's not clear what exactly that means. This commit clears up the semantics, explicitly spelling out this relates to minimization.
2021-11-26 12:18:37 +01:00
Jan Svoboda
d8a3538788 [clang][deps] NFC: Remove else after early return
2021-11-26 12:18:37 +01:00
Jan Svoboda
17ec9d1f6b [clang][deps] Don't emit -fmodule-map-file=
During explicit modules build, when all modules are provided via `-fmodule-file=<path>` and implicit modules and implicit module maps are disabled (`-fno-implicit-modules`, `-fno-implicit-module-maps`), we don't need to load the original module map files at all. This patch stops emitting the `-fmodule-map-file=` arguments we don't need, saving some compilation time due to avoiding parsing such module maps and making the command line shorter.

Reviewed By: bnbarham

Differential Revision: https://reviews.llvm.org/D113473
2021-11-18 12:31:24 +01:00
Jan Svoboda
c62220f962 [clang][deps] NFC: Rename building CompilerInvocation
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives a descriptive name to the generated `CompilerInvocation` that can be used to derive the command line to build a modular dependency.

Depends on D111725.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111728
2021-10-21 13:51:27 +02:00
Jan Svoboda
207e9fdea7 [clang][deps] NFC: Rename scanning CompilerInstance
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives a distinct name to the `CompilerInstance` that's used to run the implicit build during dependency scan.

Depends on D111724.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111725
2021-10-21 13:51:00 +02:00
Jan Svoboda
24616664af [clang][deps] NFC: Remove redundant CompilerInstance reference
The `ModuleDepCollectorPP` class holds a reference to `ModuleDepCollector` as well as `ModuleDepCollector`'s `CompilerInstance`. The fact that these refer to the same object is non-obvious.

This patch removes the `CompilerInstance` reference from `ModuleDepCollectorPP` and accesses it through `ModuleDepCollector` instead.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111724
2021-10-21 13:50:46 +02:00
Jan Svoboda
954d77b98d [clang][deps] Ensure reported context hash is strict
One of the main goals of the dependency scanner is to be strict about module compatibility. This is achieved through a strict context hash. This patch ensures that the strict context hash is enabled not only during the scan itself (and its minimized implicit build), but also when actually reporting the dependency.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111720
2021-10-21 13:49:47 +02:00
Jan Svoboda
08c8016cfb [clang][modules] Cache loads of modules imported by PCH
During explicit modular build, PCM files are typically specified via the `-fmodule-file=<path>` command-line option. Early during the compilation, Clang uses the `ASTReader` to read their contents and caches the result so that the module isn't loaded implicitly later on. A listener is attached to the `ASTReader` to collect names of the modules read from the PCM files. However, if the PCM has already been loaded previously via PCH:
1. the `ASTReader` doesn't do anything for the second time,
2. the listener is not invoked at all,
3. the module load result is not cached,
4. the compilation fails when attempting to load the module implicitly later on.

This patch solves this problem by attaching the listener to the `ASTReader` for PCH reading as well.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111560
2021-10-13 18:09:52 +02:00
Jan Svoboda
6a1f50b84a [clang][deps] Prune unused header search paths
To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.

In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
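
The pruning step itself amounts to filtering the search path list by a usage record. The sketch below assumes that record is a per-path flag; `pruneSearchPaths` and its parameters are hypothetical, and the serialized format in the PCM is not shown.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Keeps only the search paths marked as used, preserving their order.
// Used[I] says whether Paths[I] was hit during header lookup; paths
// without a usage record are treated as unused.
std::vector<std::string>
pruneSearchPaths(const std::vector<std::string> &Paths,
                 const std::vector<bool> &Used) {
  std::vector<std::string> Kept;
  for (std::size_t I = 0; I < Paths.size(); ++I)
    if (I < Used.size() && Used[I])
      Kept.push_back(Paths[I]);
  return Kept;
}
```

Two command lines that differ only in unused search paths then reduce to the same pruned list, which is what lets their context hashes coincide.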

We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we could do it only once instead of every load of the PCM file by dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which is probably highly surprising.

There is an alternative approach to storing extra information into the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. The second run of dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, which would result in different results between runs. We can revisit this approach when we stop building implicitly during dependency scanning.

Depends on D102923.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102488
2021-10-12 12:39:23 +02:00
Jan Svoboda
993f60ae32 [clang][deps] Sanitize both instances of DiagnosticOptions
During dependency scanning, we generally want to suppress -Werror. Apply the same logic to the DiagnosticOptions instance used for command-line parsing.

This fixes a test failure on the PS4 bot, where the system header directory could not be found, which was reported due to -Werror being on the command line and not being sanitized.
2021-09-10 14:47:21 +02:00
Jan Svoboda
1e760b5902 [clang][deps] Use correct DiagnosticOptions for command-line handling
In this patch, the dependency scanner starts using proper `DiagnosticOptions` parsed from the actual TU command line in order to mimic what the actual compiler would do. The actual functionality will be enabled and tested in follow-up patches. (This split is necessary to avoid a temporary regression.)

Depends on D108976.

Reviewed By: dexonsmith, arphaman

Differential Revision: https://reviews.llvm.org/D108982
2021-09-10 13:44:35 +02:00
Jan Svoboda
0ebf61963b [clang][deps] NFC: Remove CompilationDatabase from DependencyScanningTool API
This patch simplifies the dependency scanner API. Depends on D108980.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108981
2021-09-10 12:31:27 +02:00
Jan Svoboda
729f7b1220 [clang][deps] NFC: Remove CompilationDatabase from DependencyScanningWorker API
This patch simplifies the dependency scanner API. Depends on D108979.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108980
2021-09-10 11:23:12 +02:00
Jan Svoboda
146ec74a83 [clang][deps] NFC: Stop going through ClangTool
The dependency scanner currently uses `ClangTool` to invoke the dependency scanning action.

However, `ClangTool` seems to be the wrong level of abstraction. It's intended to be run over a collection of compile commands, which we actively avoid via `SingleCommandCompilationDatabase`. It automatically injects `-fsyntax-only` and other flags, which we avoid by calling `clearArgumentsAdjusters()`. It deduces the resource directory based on the current executable path, which we'd like to change to deducing from `argv[0]`.

Internally, `ClangTool` uses `ToolInvocation` which seems to be more in line with what the dependency scanner tries to achieve. This patch switches to directly using `ToolInvocation` instead. NFC.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108979
2021-09-10 11:02:41 +02:00
Diana Picus
b2528fc490 [clang][deps] Stop using ClangTool for virtual files
This patch changes how the dependency scanner creates the fake input file when scanning dependencies of a single module (introduced in D109485). The scanner now has its own `InMemoryFilesystem` which sits under the minimizing FS (when that's requested). This makes it possible to drop the duplicate work in `DependencyScanningActions::runInvocation` that sets up the main file ID. Besides that, this patch makes it possible to land D108979, where we drop `ClangTool` entirely.

Depends on D109485.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D109498
2021-09-10 10:19:27 +02:00
Akira Hatanaka
17c2948d04 [clang-scan-deps] Add an API for clang dependency scanner to perform module lookup by name alone

This removes the need to create a fake source file that imports a
module.

rdar://64538073

Differential Revision: https://reviews.llvm.org/D109485
2021-09-09 08:52:50 -07:00
Jan Svoboda
6da811fd5c [clang][deps] Reset non-modular language and preprocessor options
There are a number of language and preprocessor options that are reset in the `CompilerInvocation` that describes the build of an implicit module. This patch uses the logic for explicit modules as well.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108710
2021-08-26 08:43:21 +02:00
Jan Svoboda
b5088cb408 [clang][deps] Ensure deterministic order of TU '-fmodule-file=' arguments
Translation units with multiple direct modular dependencies trigger a non-deterministic ordering in `clang-scan-deps`. This boils down to usage of `std::unordered_map`, which gets replaced by `std::map` in this patch.
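
The effect of the container swap can be seen in a small sketch: iterating a `std::map` visits keys in sorted order regardless of insertion order, so the generated arguments come out the same on every run. The map from module name to PCM path and the function `orderedModuleFileArgs` are hypothetical; the scanner's actual key type is not stated in the commit message.

```cpp
#include <map>
#include <string>
#include <vector>

// Builds '-fmodule-file=' arguments in key order. With std::map the
// iteration order is the sorted order of the module names.
std::vector<std::string>
orderedModuleFileArgs(const std::map<std::string, std::string> &PCMPathByModule) {
  std::vector<std::string> Args;
  for (const auto &KV : PCMPathByModule)
    Args.push_back("-fmodule-file=" + KV.second);
  return Args;
}
```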

Depends on D103526.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D103807
2021-08-25 11:14:16 +02:00
Jan Svoboda
3b8f536fec [clang][deps] Use top-level modules as precompiled dependencies
The `ASTReader` populates `Module::PresumedModuleMapFile` only for top-level modules, not submodules. To avoid generating empty `-fmodule-map-file=` arguments, make discovered modules depend on top-level precompiled modules. The granularity of submodules is not important here.

The documentation of `Module::PresumedModuleMapFile` says this field is non-empty only when building from preprocessed source. This means there can still be cases where the dependency scanner generates empty `-fmodule-map-file=` arguments. That's being addressed in separate patch: D108544.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108647
2021-08-25 10:51:34 +02:00
Jan Svoboda
83c633ea1a [clang][deps] Collect precompiled deps from submodules too
In this patch, the dependency scanner starts collecting precompiled dependencies from all encountered submodules, not only from top-level modules.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108540
2021-08-25 10:35:34 +02:00
Christopher Di Bella
c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Alex Lorenz
c68f247275 [clang-scan-deps] ignore top-level module dependencies that aren't actually imported
Whenever -fmodule-name=top_level_module name is parsed, and clang actually tries to
import top_level_module, the headers are imported textually and the module isn't actually
built. However, the dependency scanner could still record it as a potential dependency
if the module was reimported and thus recorded by the preprocessor callbacks.
This change avoids collecting this kind of module as a dependency by verifying that we don't
collect top level modules without actual PCM files.

Differential Revision: https://reviews.llvm.org/D106100
2021-07-20 11:11:28 -07:00
Jan Svoboda
c94a345a5c [clang][deps] Fix test by checking ignored files correctly
After a rebase, bc1a2979fc70d954ae97122205c71c8404a1b17e accidentally changed `shouldIgnoreFile(Filename)` to incorrect `IgnoredFiles.count(Filename)`. This avoided using native filenames, which the patch intended to solve in the first place.

Failing Windows builds:
* https://lab.llvm.org/buildbot#builders/123/builds/5147
* https://lab.llvm.org/buildbot#builders/86/builds/17177
2021-07-20 13:20:56 +02:00
Jan Svoboda
e564fd93ab [clang][deps] Avoid minimizing PCH input files
This patch avoids minimizing input files that contributed to a PCH or its modules. This prevents the implicit modular build from failing on an unexpected file size. Depends on D106146.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D104536
2021-07-20 12:20:10 +02:00