llvm-project

Author	SHA1	Message	Date
Qinkun Bao	18b885f66b	Revert "[clang][modules] Timestamp-less validation API" (#139987 ) Reverts llvm/llvm-project#138983	2025-05-14 21:02:57 -04:00
Jan Svoboda	960afcc90e	[clang][modules] Timestamp-less validation API (#138983 ) Timestamps are an implementation detail of the cross-process module cache implementation. This PR hides it from the `ModuleCache` API, which simplifies the in-process implementation.	2025-05-14 14:31:23 -07:00
Jan Svoboda	49c513844d	[clang][modules] Allow not forcing validation of user headers (#139091 ) Force-validation of user headers was implemented in acb803e8 to deal with files changing during a build. The dependency scanner guarantees an immutable file system during a single build session, so the validation is unnecessary. (We don't hit the disk too often due to the caching VFS, but even avoiding going to the cache and deserializing the input files makes sense.)	2025-05-09 08:33:28 -07:00
Jan Svoboda	1698beb542	[clang][modules][deps] Optimize in-process timestamping of PCMs (#137363 ) In the past, timestamps used for `-fmodules-validate-once-per-build-session` were found to be a source of contention in the dependency scanner ([D149802](https://reviews.llvm.org/D149802), https://github.com/llvm/llvm-project/pull/112452). This PR is yet another attempt to optimize these. We now make use of the new `ModuleCache` interface to implement the in-process version in terms of atomic `std::time_t` variables rather than the mtime attribute on `.timestamp` files.	2025-05-07 14:02:40 -07:00
Jan Svoboda	b69dcb8734	[clang][frontend] Require invocation to construct `CompilerInstance` (#137668 ) This PR makes it so that `CompilerInvocation` needs to be provided to `CompilerInstance` on construction. There are a couple of benefits in my view: * Making it impossible to mis-use some `CompilerInstance` APIs. For example there are cases, where `createDiagnostics()` was called before `setInvocation()`, causing the `DiagnosticEngine` to use the default-constructed `DiagnosticOptions` instead of the intended ones. * This shrinks `CompilerInstance`'s state space. * This makes it possible to access the invocation in `CompilerInstance`'s constructor (to be used in a follow-up).	2025-05-01 07:31:30 -07:00
Jan Svoboda	060f3f0dd1	[clang][deps] Make dependency directives getter thread-safe (#136178 ) This PR fixes two issues in one go: 1. The dependency directives getter (a `std::function`) was being stored in `PreprocessorOptions`. This goes against the principle where the options classes are supposed to be value-objects representing the `-cc1` command line arguments. This is fixed by moving the getter directly to `CompilerInstance` and propagating it explicitly. 2. The getter was capturing the `ScanInstance` VFS. That's fine in synchronous implicit module builds where the same VFS instance is used throughout, but breaks down once you try to build modules asynchronously (which forces the use of separate VFS instances). This is fixed by explicitly passing a `FileManager` into the getter and extracting the right instance of the scanning VFS out of it.	2025-04-23 10:33:12 -07:00
Kazu Hirata	c2d6c7cea7	[clang] Use llvm::append_range (NFC) (#136448 )	2025-04-19 12:21:14 -07:00
Cyndy Ishida	40050888a1	[clang][depscan] Centralize logic for populating StableDirs, NFC (#135704 ) Pass a reference to `StableDirs` when creating ModuleDepCollector. This avoids needing to create one from the same ScanInstance for each call to `handleTopLevelModule` & reduces the amount of potential downstream changes needed for handling StableDirs.	2025-04-15 09:59:23 -07:00
Cyndy Ishida	1365b5b1ad	[clang][DependencyScanning] Track dependencies from prebuilt modules to determine IsInStableDir (#132237 ) When a module is being scanned, it can depend on modules that have already been built from a pch dependency. When this happens, the pcm files are reused for the module dependencies. When this is the case, check if input files recorded from the PCMs come from the provided stable directories transitively since the scanner will not have access to the full set of file dependencies from prebuilt modules.	2025-04-08 15:48:25 -07:00
Kazu Hirata	7cc17fb085	[ADT] Remove old range constructors of SmallSet and StringSet (#133205 ) This patch removes the old range constructors of SmallSet and StringSet that do not take the llvm::from_range tag. Since there are so few uses, this patch directly removes them without going through the deprecation process.	2025-03-27 07:52:13 -07:00
Jan Svoboda	056264b838	[clang][deps] Implement efficient in-process `ModuleCache` (#129751 ) The dependency scanner uses implicitly-built Clang modules under the hood. This system was originally designed to handle multiple concurrent processes working on the same module cache, and mutual exclusion was implemented using file locks. The scanner, however, runs within a single process, making file locks unnecessary. This patch virtualizes the interface for module cache locking and provides an implementation based on `std::shared_mutex`. This reduces `clang-scan-deps` runtime by ~17% on my benchmark. Note that even when multiple processes run a scan on the same module cache (and therefore don't coordinate efficiently), this should still be correct due to the strict context hash, the write-through `InMemoryModuleCache` and the logic for rebuilding out-of-date or incompatible modules.	2025-03-18 14:01:04 -07:00
Jan Svoboda	d2e66625bc	[clang][deps] Propagate the entire service (#128959 ) Shared state between dependency scanning workers is managed by the dependency scanning service. Right now, the members are individually threaded through the worker, action, and collector. This makes any change to the service and its members a very laborious process. Moreover, this situation causes frequent merge conflicts in our downstream repo where the service does have some extra members that need to be passed around. To ease the maintenance burden, this PR starts passing a reference to the entire service.	2025-02-27 10:06:26 -08:00
Ben Langmuir	e3cab30ab9	[clang][deps] Ensure DiagnosticConsumer::finish is always called (#127110 ) When using the clang dependency scanner with an arbitrary DiagnosticConsumer, it is important that we always call finish(). Previously, if there was an error preventing us from reaching the scanning action, or if the command line contained no scannable actions we would fail to finish(), which would break some consumers (e.g. serialized diag consumer).	2025-02-13 14:06:17 -08:00
Steven Wu	7a52b93837	[DependencyScanning] Add ability to scan TU with a buffer input (#125111 ) Update Dependency scanner so it can scan the dependency of a TU with a provided buffer rather than relying on the on disk file system to provide the input file.	2025-02-04 16:37:29 -08:00
Kadir Cetinkaya	df9a14d7bb	Reapply "[NFC] Explicitly pass a VFS when creating DiagnosticsEngine (#115852 )" This reverts commit a1153cd6fedd4c906a9840987934ca4712e34cb2 with fixes to lldb breakages. Fixes https://github.com/llvm/llvm-project/issues/117145.	2024-11-21 14:55:30 +01:00
Sylvestre Ledru	a1153cd6fe	Revert "[NFC] Explicitly pass a VFS when creating DiagnosticsEngine (#115852 )" Reverted for causing: https://github.com/llvm/llvm-project/issues/117145 This reverts commit bdd10d9d249bd1c2a45e3de56a5accd97e953458.	2024-11-21 13:04:30 +01:00
kadir çetinkaya	bdd10d9d24	[NFC] Explicitly pass a VFS when creating DiagnosticsEngine (#115852 ) Starting with 41e3919ded78d8870f7c95e9181c7f7e29aa3cc4 DiagnosticsEngine creation might perform IO. It was implicitly defaulting to getRealFileSystem. This patch makes it explicit by pushing the decision making to callers. It uses the ambient VFS if one is available, and keeps using `getRealFileSystem` if no VFS is available.	2024-11-21 12:11:41 +01:00
Jan Svoboda	25d1ac11d5	[clang][deps] Only write preprocessor info into PCMs (#115239 ) This patch builds on top of https://github.com/llvm/llvm-project/pull/115237 and https://github.com/llvm/llvm-project/pull/115235, only passing the `Preprocessor` object to `ASTWriter`. This reduces the size of scanning PCM files by 1/3 and speeds up scans by 16%.	2024-11-11 13:07:08 -08:00
Jan Svoboda	a6637ae2cc	[clang][deps] Share `FileManager` between modules (#115065 ) The `FileManager` sharing between module-building `CompilerInstance`s was disabled a while ago due to `FileEntry::getName()` being unreliable. Now that we use `FileEntryRef::getNameAsRequested()` in places where it matters, re-enabling `FileManager` is sound and improves performance of `clang-scan-deps` by ~6.2%.	2024-11-06 14:21:01 -08:00
Jan Svoboda	6e4dcbb21d	[clang][deps] Print tracing VFS data (#108056 ) Clang's `-cc1 -print-stats` shows lots of useful internal data including basic `FileManager` stats. Since this layer caches some results, it is unclear how that information translates to actual filesystem accesses. This PR uses `llvm::vfs::TracingFileSystem` to provide that missing information. A similar mechanism is implemented for `clang-scan-deps`'s verbose mode (`-v`). IO contention proved to be a real bottleneck a couple of times already and this new feature should make those easier to detect in the future. The tracing VFS is inserted below the caching FS and above the real FS.	2024-09-11 16:04:56 -07:00
Chuanqi Xu	62fec3d23d	[NFCI] [ClangScanDeps] [P1689] Use PreprocessorOnly Action for P1689 It is sufficient to use the PreprocessorOnly action for the P1689 format. We don't need to read any PCH or module files.	2024-09-06 15:20:59 +08:00
Jan Svoboda	55323ca6c8	[clang][deps] Only bypass scanning VFS for the module cache (#88800 ) The scanning VFS doesn't cache stat failures of paths with no extension. This was originally implemented to avoid caching the non-existence of the modules cache directory that the modular scanner will eventually create if it does not exist. However, this prevents caching of the non-existence of all directories and notably also header files from the standard C++ library, which can lead to sub-par performance. This patch adds an API to the scanning VFS that allows clients to configure a path prefix for which to bypass the scanning VFS and use the underlying VFS directly.	2024-08-13 08:41:39 -07:00
Chuanqi Xu	d64eccf433	[clang] Split ObjectFilePCHContainerReader from ObjectFilePCHContainerWriter (#99599 ) Close https://github.com/llvm/llvm-project/issues/99479 See https://github.com/llvm/llvm-project/issues/99479 for details	2024-07-23 23:55:31 +08:00
Nishith Kumar M Shah	0559eaff5a	Revert "Pass LangOpts from CompilerInstance to DependencyScanningWorker (#93753 )" (#94488 ) This reverts commit 9862080b1cbf685c0d462b29596e3f7206d24aa2.	2024-06-05 11:42:13 -07:00
Nishith Kumar M Shah	9862080b1c	Pass LangOpts from CompilerInstance to DependencyScanningWorker (#93753 ) This commit fixes https://github.com/llvm/llvm-project/issues/88896 by passing LangOpts from the CompilerInstance to DependencyScanningWorker so that the original LangOpts are preserved/respected. This makes for more accurate parsing/lexing when certain language versions or features specific to versions are to be used.	2024-06-03 17:20:43 +02:00
Kazu Hirata	197c3a3efc	Use llvm::less_first (NFC) (#94136 )	2024-06-02 07:45:50 -07:00
Alexandre Ganea	39ed3c68e5	[clang-scan-deps] Fix contention when updating `TrackingStatistic`s in hot code paths in `FileManager`. (#88427 ) `FileManager::getDirectoryRef()` and `FileManager::getFileRef()` are hot code paths in `clang-scan-deps`. On every call, these functions update a few atomics related to printing statistics, which causes contention on high core count machines. After this patch we make the variables local to the `FileManager`. In our test case, this saves about 49 sec over 1 min 47 sec of `clang-scan-deps` run time (1 min 47 sec before, 58 sec after). These figures are after applying my suggestion in https://github.com/llvm/llvm-project/pull/88152#issuecomment-2049803229, that is: `static bool shouldCacheStatFailures(StringRef Filename) { return true; }` Without the above, there's just too much OS noise from the high volume of `status()` calls with regular non-modules C++ code. Tested on Windows with clang-cl.	2024-04-25 10:31:45 -04:00
Jan Svoboda	2248164a9a	Revert "[clang] Move state out of `PreprocessorOptions` (1/n) (#86358 )" This reverts commit 407a2f23 which stopped propagating the callback to module compiles, effectively disabling dependency directive scanning for all modular dependencies. Also added a regression test.	2024-04-09 13:26:45 -07:00
Jan Svoboda	407a2f231a	[clang] Move state out of `PreprocessorOptions` (1/n) (#86358 ) An instance of `PreprocessorOptions` is part of `CompilerInvocation` which is supposed to be a value type. The `DependencyDirectivesForFile` member is problematic, since it holds an owning reference of the scanning VFS. This makes it not a true value type, and it can keep a potentially large chunk of memory (the local cache in the scanning VFS) alive for longer than clients might expect. Let's move it into the `Preprocessor` instead.	2024-03-29 11:20:55 -07:00
Jan Svoboda	b768a8c1db	[clang][deps] Lazy dependency directives (#86347 ) Since b4c83a13f664582015ea22924b9a0c6290d41f5b, `Preprocessor` and `Lexer` are aware of the concept of scanning dependency directives. This makes it possible to scan for them on-demand rather than eagerly on the first filesystem operation (open, or even just stat). This might improve performance, but is also necessary for the "PCH as module" mode. Some precompiled header sources use the ".pch" file extension, which means they were not getting scanned for dependency directives. This was okay when the PCH was the main input file in a separate scan step, because there we just lex the file in a scanning-specific frontend action. But when such a source gets treated as a module implicitly loaded from a TU, it will get compiled as any other module - with Sema - which will result in compilation errors. (See attached test case.) rdar://107663951	2024-03-22 16:09:34 -07:00
Ben Langmuir	083da46ff0	[clang][deps] Fix dependency scanning with -working-directory (#84525 ) Stop overriding -working-directory to CWD during argument parsing, which should no longer be necessary after we set the VFS working directory, and set FSOpts correctly after parsing arguments so that working-directory behaves correctly.	2024-03-12 08:02:54 -07:00
Michael Spencer	de3b2c293b	[clang][ScanDeps] Allow PCHs to have different VFS overlays (#82294 ) It turns out it's not that uncommon for real code to pass a different set of VFSs while building a PCH than while using the PCH. This can cause problems as seen in `test/ClangScanDeps/optimize-vfs-pch.m`. If you scan `compile-commands-tu-no-vfs-error.json` without -Werror and run the resulting commands, Clang will emit a fatal error while trying to emit a note saying that it can't find a remapped header. This also adds textual tracking of VFSs for prebuilt modules that are part of an included PCH, as the same issue can occur in a module we are building if we drop VFSs. This has to be textual because we have no guarantee the PCH had the same list of VFSs as the current TU. This uses the `PrebuiltModuleListener` to collect `VFSOverlayFiles` instead of trying to extract it out of a `serialization::ModuleFile` each time it's needed. There's not a great way to just store a pointer to the list of strings in the serialized AST.	2024-02-23 17:48:58 -08:00
Michael Spencer	d42de86eb3	reland: [clang][ScanDeps] Canonicalize -D and -U flags (#82568 ) Canonicalize `-D` and `-U` flags by sorting them and only keeping the last instance of a given name. This optimization will only fire if all `-D` and `-U` flags start with a simple identifier, so that a simple analysis is guaranteed to determine whether two flags refer to the same identifier. See the comment on `getSimpleMacroName()` for details of what the issues are. The previous version of this had issues with sed differences between macOS, Linux, and Windows. This test doesn't check paths, so just don't run sed. Other tests should use `sed -E 's:\\\\?:/:g'` to get portable behavior. Windows has different command line parsing behavior than Linux for compilation databases, so the test has been adjusted to ignore that difference.	2024-02-23 17:44:32 -08:00
Nico Weber	84ed55e11f	Revert "[clang][ScanDeps] Canonicalize -D and -U flags (#82298 )" This reverts commit 3ff805540173b83d73b673b39ac5760fc19bac15. Test is failing on bots, see https://github.com/llvm/llvm-project/pull/82298#issuecomment-1955664462	2024-02-20 20:24:32 -05:00
Michael Spencer	3ff8055401	[clang][ScanDeps] Canonicalize -D and -U flags (#82298 ) Canonicalize `-D` and `-U` flags by sorting them and only keeping the last instance of a given name. This optimization will only fire if all `-D` and `-U` flags start with a simple identifier, so that a simple analysis is guaranteed to determine whether two flags refer to the same identifier. See the comment on `getSimpleMacroName()` for details of what the issues are.	2024-02-20 15:20:40 -08:00
Michael Spencer	b21a2f9365	[clang][scan-deps] Stop scanning if any scanning setup emits an error. Without this scanning will continue and later hit an assert that the number of `RedirectingFileSystem`s matches the number of -ivfsoverlay arguments.	2024-01-30 17:03:13 -08:00
Michael Spencer	7847e44594	[clang][DependencyScanner] Remove unused -ivfsoverlay files (#73734 ) `-ivfsoverlay` files are unused when building most modules. Enable removing them by, * adding a way to visit the filesystem tree with extensible RTTI to access each `RedirectingFileSystem`. * Adding tracking to `RedirectingFileSystem` to record when it actually redirects a file access. * Storing this information in each PCM. Usage tracking is only enabled when iterating over the source manager and affecting modulemaps. Here each path is stated to cause an access. During scanning these stats all hit the cache.	2024-01-30 15:39:18 -08:00
Kazu Hirata	9b2c25c704	[clang] Use SmallString::operator std::string (NFC)	2024-01-20 18:57:30 -08:00
Jan Svoboda	22c68511ac	[clang][deps] Skip writing `DIAG_PRAGMA_MAPPINGS` record (#70874 ) Following up on #69975, this patch skips writing `DIAG_PRAGMA_MAPPINGS` as well. Deserialization of this PCM record is still showing up in profiles, since it needs to be VBR-decoded for every transitively loaded PCM file. The scanner doesn't make any guarantees about diagnostic accuracy (and it even disables all warnings), so skipping this record should be safe.	2023-11-10 07:04:43 -08:00
Michael Spencer	fb07d9cc09	[clang][DepScan] Make OptimizeArgs a bit mask enum and enable by default (#71588 ) Make it easier to control which optimizations are enabled by making OptimizeArgs a bit masked enum. There's currently only one such optimization, but more will be added in followup commits.	2023-11-07 16:06:59 -08:00
Jan Svoboda	6c465a201b	[clang][deps] Skip slow `UNHASHED_CONTROL_BLOCK` records (#69975 ) Deserialization of the `DIAGNOSTIC_OPTIONS` and `HEADER_SEARCH_PATHS` records is slow and done for every transitively loaded PCM. Deserialization of these records cannot be skipped, because the words are VBR6-encoded and we don't store the length of the entire record. We could either turn them into binary blobs that can be skipped during deserialization, or skip writing them altogether. This patch takes the latter approach, since these records are not necessary in scanning PCMs. The scanner doesn't make any guarantees about the accuracy of diagnostics, and we always have the same header search paths due to strict context hashing. The commit that makes the `DIAGNOSTIC_OPTIONS` record skippable was originally implemented by @benlangmuir in a downstream repo.	2023-11-02 15:07:58 -07:00
Connor Sughrue	6b4de7b1c7	[clang][deps] add support for dependency scanning with cc1 command line Allow users to run a dependency scan with a cc1 command line in addition to a driver command line. DependencyScanningAction was already being run with a cc1 command line, but DependencyScanningWorker::computeDependencies assumed that it was always provided a driver command line. Now DependencyScanningWorker::computeDependencies can handle cc1 command lines too. Reviewed By: jansvoboda11 Differential Revision: https://reviews.llvm.org/D156234	2023-08-04 14:13:18 -07:00
Jan Svoboda	227f719958	[clang][modules][deps] Avoid checks for relocated modules Currently, `ASTReader` performs some checks to diagnose relocated modules. This can add quite a bit of overhead to the scanner: it requires looking up, parsing and resolving module maps for all transitively loaded module files (and all the module maps encountered in the search paths on the way). Most of those checks are not really useful in the scanner anyway, since it uses strict context hash and immutable filesystem, which prevent those scenarios in the first place. This can speed up scanning by up to 30%. Depends on D150292. Reviewed By: benlangmuir Differential Revision: https://reviews.llvm.org/D150320	2023-07-17 13:50:24 -07:00
Ben Langmuir	8fe8d69ddf	[clang][deps] Make clang-scan-deps write modules in raw format We have no use for debug info for the scanner modules, and writing raw ast files speeds up scanning ~15% in some cases. Note that the compile commands produced by the scanner will still build the obj format (if requested), and the scanner can read obj format pcms, e.g. from a PCH. rdar://108807592 Differential Revision: https://reviews.llvm.org/D149693	2023-05-03 12:07:46 -07:00
Jan Svoboda	34f143988f	[clang][deps] NFC: Don't collect PCH input files Since b4c83a13, PCH input files are no longer necessary.	2023-04-05 12:29:03 -07:00
Ben Langmuir	fcab930cd3	[clang][deps] Handle response files in dep scanner Extract the code the driver uses to expand response files and reuse it in the dependency scanner. rdar://106155880 Differential Revision: https://reviews.llvm.org/D145838	2023-03-13 15:47:35 -07:00
Ben Langmuir	296ba5bbd3	[clang][deps] Split lookupModuleOutput out of DependencyConsumer NFC The idea is to split the callbacks that are used to consume dependency information (DependencyConsumer) from callbacks that modify the scan behaviour itself in any way (DependencyActionController). Currently this is just lookupModuleOutput, but we have additional callbacks related to CAS support that we intend to upstream in the future. Differential Revision: https://reviews.llvm.org/D144058	2023-03-10 13:14:49 -08:00
Chuanqi Xu	eb70b38f83	Recommit [C++20] [Modules] [ClangScanDeps] Add ClangScanDeps support for C++20 Named Modules in P1689 format (2/4) Close https://github.com/llvm/llvm-project/issues/51792 Close https://github.com/llvm/llvm-project/issues/56770 This patch adds ClangScanDeps support for C++20 Named Modules in P1689 format. We can find the P1689 format at: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html. After we land the patch, we're able to compile C++20 Named Modules with CMake! And although P1689 is written by Kitware people, other build systems should be able to use the format to compile C++20 Named Modules too. TODO: Support header units in P1689 Format. TODO2: Support C++20 Modules in the full dependency format of ClangScanDeps. We also want to support C++20 Modules and clang modules together according to https://discourse.llvm.org/t/how-should-we-support-dependency-scanner-for-c-20-modules/66027. But P1689 format cares about C++20 Modules only for now. So let's focus on C++ Modules and P1689 format. And look at the full dependency format later. I'll add the ReleaseNotes and documentation after the patch lands. Reviewed By: jansvoboda11 Differential Revision: https://reviews.llvm.org/D137527	2023-02-13 10:42:35 +08:00
NAKAMURA Takumi	069dd8768a	Revert "[C++20] [Modules] [ClangScanDeps] Add ClangScanDeps support for C++20 Named Modules in P1689 format (2/4)" This reverts commit de17c665e3f995c7f5a0e453461ce3a1b8aec196. See also D137527	2023-02-12 18:38:25 +09:00
Archibald Elliott	d768bf994f	[NFC][TargetParser] Replace uses of llvm/Support/Host.h The forwarding header is left in place because of its use in `polly/lib/External/isl/interface/extract_interface.cc`, but I have added a GCC warning about the fact it is deprecated, because it is used in `isl` from where it is included by Polly.	2023-02-10 09:59:46 +00:00


121 Commits
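
The `-D`/`-U` canonicalization described in 3ff8055401 (sort the flags, keep only the last instance of each name, and give up unless every flag starts with a simple identifier) could be sketched roughly as follows. This is not the code from the patch: `MacroFlag`, `simpleMacroName()` and `canonicalizeMacroFlags()` are hypothetical names, and the identifier rule used here is an assumed simplification of whatever `getSimpleMacroName()` actually checks.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation of one macro flag on the command line.
struct MacroFlag {
  bool IsUndef;    // true for -U, false for -D
  std::string Arg; // text after the flag, e.g. "NAME" or "NAME=value"
};

// Hypothetical stand-in for getSimpleMacroName(): succeeds only when Arg is
// a plain identifier, optionally followed by "=value". Function-like macros
// such as "F(x)=1" are rejected (assumed rule, not taken from the patch).
static bool simpleMacroName(const std::string &Arg, std::string &Name) {
  std::size_t I = 0;
  while (I < Arg.size() &&
         (std::isalnum(static_cast<unsigned char>(Arg[I])) || Arg[I] == '_'))
    ++I;
  if (I == 0 || std::isdigit(static_cast<unsigned char>(Arg[0])))
    return false;
  if (I != Arg.size() && Arg[I] != '=')
    return false;
  Name = Arg.substr(0, I);
  return true;
}

// Sorts flags by macro name and keeps only the last flag given for each
// name. Returns false and leaves Flags untouched if any flag is not simple.
static bool canonicalizeMacroFlags(std::vector<MacroFlag> &Flags) {
  struct Entry {
    std::string Name;
    std::size_t Index; // position on the original command line
  };
  std::vector<Entry> Entries;
  for (std::size_t I = 0; I < Flags.size(); ++I) {
    std::string Name;
    if (!simpleMacroName(Flags[I].Arg, Name))
      return false; // the optimization does not fire
    Entries.push_back({Name, I});
  }
  // Order by name, then by original position, so the last flag for a name
  // ends up at the end of that name's run.
  std::sort(Entries.begin(), Entries.end(),
            [](const Entry &A, const Entry &B) {
              return A.Name != B.Name ? A.Name < B.Name : A.Index < B.Index;
            });
  std::vector<MacroFlag> Result;
  for (std::size_t I = 0; I < Entries.size(); ++I) {
    bool IsLastForName =
        I + 1 == Entries.size() || Entries[I + 1].Name != Entries[I].Name;
    if (IsLastForName)
      Result.push_back(Flags[Entries[I].Index]);
  }
  Flags = std::move(Result);
  return true;
}
```

Under these assumptions, `-DB=1 -DA -UB -DA=2` would collapse to `-DA=2 -UB`, while a command line containing `-DF(x)=1` would be left unchanged.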