Resolves: #70930 (and probably latest comments from
https://github.com/clangd/clangd/issues/251)
by fixing racing for the shared `DiagStorage` value which caused messing
with args inside the storage and then formatting the following message
with `getArgSInt(1)` == 2:
```
def err_module_odr_violation_function : Error<
"%q0 has different definitions in different modules; "
"%select{definition in module '%2'|defined here}1 "
"first difference is "
```
which causes `HandleSelectModifier` to go beyond the `ArgumentLen` so
the recursive call to `FormatDiagnostic` was made with `DiagStr` >
`DiagEnd` that leads to infinite `while (DiagStr != DiagEnd)`.
**The Main Idea:**
Reuse the existing `DiagStorageAllocator` logic to make all
`DiagnosticBuilder`s having independent states.
Also, encapsulating the rest of state (e.g. ID and Loc) into
`DiagnosticBuilder`.
**TODO (if it will be requested by reviewer):**
- [x] add a test (I have no idea how to turn a whole bunch of my
proprietary code which leads `clangd` to OOM into a small public
example.. probably I must try using
[this](https://github.com/llvm/llvm-project/issues/70930#issuecomment-2209872975)
instead)
- [x] [`Diag.CurDiagID !=
diag::fatal_too_many_errors`](https://github.com/llvm/llvm-project/pull/108187#pullrequestreview-2296395489)
- [ ] ? get rid of `DiagStorageAllocator` at all and make
`DiagnosticBuilder` having they own `DiagnosticStorage` coz it seems
pretty small so should fit the stack for short-living
`DiagnosticBuilder` instances
CompilerInstance can re-use same SourceManager across multiple
frontendactions. During this process it calls
`SourceManager::clearIDTables` to reset any caches based on FileIDs.
It didn't reset IncludeLocMap, resulting in wrong include locations for
workflows that triggered multiple frontend-actions through same
CompilerInstance.
Add libclang function `clang_isBeforeInTranslationUnit` to allow checking the order between two source locations.
Simplify the `SourceRange.__contains__` implementation using this new function.
Add tests for `SourceRange.__contains__` and the newly added functionality.
Fixes#22617Fixes#52827
We have been running into source location exhaustion recently and want
to use the statistics to monitor the usage in various files to be able
to anticipate where the next problem will happen.
I picked `Statistic` because it can be written into a structured JSON
file and is easier to consume by further automation.
This commit does not change any existing per-source-manager metrics
exposed via `SourceManager::PrintStats()`. This does create some
redundancy, but I also expect to be non-controversial because it aligns
with the intended use of `Statistic`.
The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.
(Background if helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)
The reported issue at upstream: https://github.com/llvm/llvm-project/issues/90501
rdar://124035402
This patch adds several const-qualified variants of existing member
functions to `SourceManager`.
I started with removing const qualification from
`setNumCreatedFIDsForFileID`, and removing `const_cast` in the body of
this function, as I think it doesn't make sense to const-qualify
setters.
This fixes a typo ("SLocEntry's" -> "SLocEntries"), fixes capitalization ("Sloc" -> "SLoc") and adds extra information (capacity in bytes of `LoadedSLocEntryTable`).
`createExpansionLocImpl` has an assert that checks if we ran out of
source locations. We have observed this happening on real code and in
release builds the assertion does not fire and the compiler just keeps
running indefinitely without giving any indication that something went
wrong.
Diagnose this problem and reliably crash to make sure the problem is
easy to detect.
I have also tried:
- returning invalid source locations,
- reporting sloc address space usage on error.
Both caused the compiler to run indefinitely. It would be nice to dig
further why that happens, but until then crashing seems like a better
alternative.
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
In `SourceManager::getFileID()`, Clang performs binary search over its
buffer of `SLocEntries`. For modules, this binary search fully
deserializes the entire `SLocEntry` block for each visited entry. For
some entries, that includes decompressing the associated buffer (e.g.
the predefines buffer, macro expansion buffers, contents of volatile
files), which shows up in profiles of the dependency scanner.
This patch moves the binary search over loaded entries into `ASTReader`,
which can perform cheaper partial deserialization during the binary
search, reducing the wall time of dependency scans by ~3%. This also
reduces the number of retired instructions by ~1.4% on regular
(implicit) modules compilation.
Note that this patch drops the optimizations based on the last lookup ID
(pruning the search space and performing linear search before resorting
to the full binary search). Instead, it reduces the search space by
asking `ASTReader::GlobalSLocOffsetMap` for the containing `ModuleFile`
and only does binary search over entries of single module file.
This patch fixes:
clang/lib/Basic/SourceManager.cpp:1979:64: error: 'greater' may not
intend to support class template argument deduction
[-Werror,-Wctad-maybe-unsupported]
This commit removes the list of SLocEntry offsets to preload eagerly
from PCM files. Commit introducing this functionality (258ae54a) doesn't
clarify why this would be more performant than the lazy approach used
regularly.
Currently, the only SLocEntry the reader is supposed to preload is the
predefines buffer, but in my experience, it's not actually referenced in
most modules, so the time spent deserializing its SLocEntry is wasted.
This is especially noticeable in the dependency scanner, where this
change brings 4.56% speedup on my benchmark.
This patch removes the `SourceManager` APIs that create `FileID` from a
`const FileEntry *` in favor of APIs that take `FileEntryRef`. This also
removes a misleading documentation that claims `nullptr` file entry
represents stdin. I don't think that's right, since we just try to
dereference that pointer anyways.
The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.
By doing so, sparsely accessed PagedVector can profit from reduced
memory footprint.
This reapplies ddbcc10b9e26b18f6a70e23d0611b9da75ffa52f, except for a tiny part that was reverted separately: 65331da0032ab4253a4bc0ddcb2da67664bd86a9. That will be reapplied later on, since it turned out to be more involved.
This commit is enabled by 5523fefb01c282c4cbcaf6314a9aaf658c6c145f and f0f548a65a215c450d956dbcedb03656449705b9, specifically the part that makes 'clang-tidy/checkers/misc/header-include-cycle.cpp' separator agnostic.
This commit replaces some calls to the deprecated `FileEntry::getName()` with `FileEntryRef::getName()` by swapping current usages of `SourceManager::getFileEntryForID()` with `SourceManager::getFileEntryRefForID()`. This lowers the number of usages of the deprecated `FileEntry::getName()` from 95 to 50.
The needed tweaks are mostly trivial, the one nasty bit is Clang's usage
of OptionalStorage. To keep this working old Optional stays around as
clang::CustomizableOptional, with the default Storage removed.
Optional<File/DirectoryEntryRef> is replaced with a typedef.
I tested this with GCC 7.5, the oldest supported GCC I had around.
Differential Revision: https://reviews.llvm.org/D140332
This reverts commit 8f0df9f3bbc6d7f3d5cbfd955c5ee4404c53a75d.
The Optional*RefDegradesTo*EntryPtr types want to keep the same size as
the underlying type, which std::optional doesn't guarantee. For use with
llvm::Optional, they define their own storage class, and there is no way
to do that in std::optional.
On top of that, that commit broke builds with older GCCs, where
std::optional was not trivially copyable (static_assert in the clang
sources was failing).
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This patch fixes:
clang/lib/Basic/SourceManager.cpp:1292:19: error: comparison of
integers of different signs: 'long' and 'unsigned long'
[-Werror,-Wsign-compare]
This revision fixes typos where there are 2 consecutive words which are
duplicated. There should be no code changes in this revision (only
changes to comments and docs). Do let me know if there are any
undesirable changes in this revision. Thanks.
When CurrentLoadedOffset is less than TotalSize, current code will
trigger unsigned overflow and will not return an "allocation failed"
indicator.
Google ref: b/248613299
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D135192
Prune the search space -- If we know offset(LastFileIDLookup) < SearchOffset, we
can prune the initial binary-search range from [0, end) to [LastFileIDlookup, end).
It reduces the binary search scan by ~30%.
SemaExpr.cpp: 1393437 -> 1035426
FindTarget.cpp: 1275930 -> 920087
Linux kernel:
getFileIDLocal: 2.45% -> 2.15%
Differential Revision: https://reviews.llvm.org/D135132
Similar to getFileIDLocal patch, but for the version for load module.
Test with clangd (building AST with preamble), FileID scans in binary
search is reduced:
SemaExpr.cpp: 142K -> 137K (-3%)
FindTarget.cpp: 368K -> 343K (-6%)
Differential Revision: https://reviews.llvm.org/D135258
Update SourceManager::ContentCache::OrigEntry to keep the original
FileEntryRef, and use that to enable ModuleMap::getModuleMapFile* to
return the original FileEntryRef. This change should be NFC for
most users of SourceManager::ContentCache, but it could affect behaviour
for users of getNameAsRequested such as in compileModuleImpl. I have not
found a way to detect that difference without additional functional
changes, other than incidental cases like changes from / to \ on
Windows so there is no new test.
Differential Revision: https://reviews.llvm.org/D135220
isBeforeInTranslationUnit compares SourceLocations across FileIDs by
mapping them onto a common ancestor file, following include/expansion edges.
It is possible to get a tie in the common ancestor, because multiple
"chunks" of a macro arg will expand to the same macro param token in the body:
#define ID(X) X
#define TWO 2
ID(1 TWO)
Here two FileIDs both expand into `X` in ID's expansion:
- one containing `1` and spelled on line 3
- one containing `2` and spelled by the macro expansion of TWO
isBeforeInTranslationUnit breaks this tie by comparing the two FileIDs:
the one "on the left" is always created first and is numerically smaller.
This seems correct so far.
Prior to this patch it also takes a shortcut (unclear if intentionally).
Instead of comparing the two FileIDs that directly expand to the same location,
it compares the original FileIDs being compared. These may not be the
same if there are multiple macro expansions in between.
This *almost* always yields the right answer, because macro expansion
yields "trees" of FileIDs allocated in a contiguous range: when comparing tree A
to tree B, it doesn't matter what representative you pick.
However, the splitting of >> tokens is modeled as macro expansion (as if
the first '>' was a macro that expands to a '>' spelled a scratch buffer).
This splitting occurs retroactively when parsing, so the FileID allocated is
larger than expected if it were a real macro expansion performed during lexing.
As a result, macro tree A can be on the left of tree B, and yet contain
a token-split FileID whose numeric value is *greator* than those in B.
In this case the tiebreak gives the wrong answer.
Concretely:
#define ID(X) X
template <typename> class S{};
ID(
ID(S<S<int>> x);
int y;
)
Given Greater = (typeloc of S<int>).getEndLoc();
Y = (decl of y).getLocation();
isBeforeInTranslationUnit(Greater, Y) should return true, but returns false.
Here the common FileID of (Greater, Y) is the body of the outer ID
expansion, and they both expand to X within it.
With the current tiebreak rules, we compare the FileID of Greater (a split)
to the FileID of Y (a macro arg expansion into X of the outer ID).
The former is larger because the token split occurred relatively late.
This patch fixes the issue by removing the shortcut. It tracks the immediate
FileIDs used to reach the common file, and uses these IDs to break ties.
In the example, we now compare the macro arg expansion of the inner ID()
to the macro arg expansion of Y, and find that it is smaller.
This requires some changes to the InBeforeInTUCacheEntry (sic).
We store a little more data so it's probably slightly slower.
It was difficult to resist more invasive changes:
- performance: the sizing is very suspicious, and once the cache "fills up"
we're thrashing a single entry
- API: the class seems to be needlessly complicated
However I tried to avoid mixing these with subtle behavior changes, and
will send a followup instead.
Differential Revision: https://reviews.llvm.org/D134685
There is a fix for the search procedure at `SourceManager::getFileIDLoaded`. It might return an invalid (not loaded) entry. That might cause clang/clangd crashes. At my case the scenario was the following:
- `SourceManager::getFileIDLoaded` returned an invalid file id for a non loaded entry and incorrectly set `SourceManager::LastFileIDLookup` to the value
- `getSLocEntry` returned `SourceManager::FakeSLocEntryForRecovery` introduced at [D89748](https://reviews.llvm.org/D89748).
- The result entry is not tested at `SourceManager::isOffsetInFileID`and as result the call `return SLocOffset < getSLocEntryByID(FID.ID+1).getOffset();` returned `true` value because `FID.ID+1` pointed to a non-fake entry
- The tested offset was marked as one that belonged to the fake `SLockEntry`
Such behaviour might cause a weird clangd crash when preamble contains some header files that were removed just after the preamble created. Unfortunately it's not so easy to reproduce the crash in the form of a LIT test thus I provided the fix only.
Test Plan:
```
ninja check-clang
```
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130847