418 Commits

Author SHA1 Message Date
James Y Knight
c7f3437507
NFC: Clean up of IntrusiveRefCntPtr construction from raw pointers. (#151545)
Handles clang::DiagnosticsEngine and clang::DiagnosticIDs.

For DiagnosticIDs, this mostly migrates from `new DiagnosticIDs` to
convenience method `DiagnosticIDs::create()`.

Part of cleanup https://github.com/llvm/llvm-project/issues/151026
2025-07-31 15:07:35 -04:00
James Y Knight
9ddbb478ce
NFC: Clean up construction of IntrusiveRefCntPtr from raw pointers for llvm::vfs::FileSystem. (#151407)
This switches to `makeIntrusiveRefCnt<FileSystem>` where creating a new
object, and to passing/returning by `IntrusiveRefCntPtr<FileSystem>`
instead of `FileSystem*` or `FileSystem&`, when dealing with existing
objects.

Part of cleanup #151026.
2025-07-31 09:57:13 -04:00
Sirraide
7b43c6c6a7
Revert "[Clang] [Diagnostics] Simplify filenames that contain '..'" (#148367)
Revert llvm/llvm-project#143520 for now since it’s causing issues for
people who are using symlinks and prefer to preserve the original path
(i.e. looks like we’ll have to make this configurable after all; I just
need to figure out how to pass `-no-canonical-prefixes` down through the
driver); I’m planning to refactor this a bit and reland it in a few
days.
2025-07-12 15:13:22 +02:00
Sirraide
e3e7393c46
[Clang] [Diagnostics] Simplify filenames that contain '..' (#143520)
This can significantly shorten file paths to standard library headers,
e.g. on my system, `<ranges>` is currently printed as
```console
/usr/lib/gcc/x86_64-redhat-linux/15/../../../../include/c++/15/ranges
```
but with this change, we instead print
```console
/usr/include/c++/15/ranges
```

This is of course just a heuristic; there are paths that would get longer
as a result of this, so we use whichever path ends up being shorter.
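
A minimal sketch of the "keep whichever spelling is shorter" rule, not the code from this patch. It assumes purely lexical normalization via `std::filesystem::path::lexically_normal`, which does not consult the file system or resolve symlinks, so its results can differ from what Clang computes. The helper name `shorterSpelling` is hypothetical.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: return the lexically normalized spelling of Path if it
// is strictly shorter than the original, otherwise keep the original.
// Assumption: lexical normalization only; symlinks are not resolved.
std::string shorterSpelling(const std::string &Path) {
  std::string Normalized =
      std::filesystem::path(Path).lexically_normal().string();
  return Normalized.size() < Path.size() ? Normalized : Path;
}
```

On a POSIX system, passing the `<ranges>` path quoted above should yield the short `/usr/include/c++/15/ranges` form, while a path without `..` components comes back unchanged.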

@AaronBallman pointed out that this might be problematic for network
file systems since path resolution might take a while, so this is enabled 
only for paths that are part of a local filesystem—though not on Windows
since there we noticed that the check itself is slow.

The file names are cached in `SourceManager`.
2025-07-08 01:02:19 +02:00
Haojian Wu
7c2182a132 NFC, use structured binding to simplify the code in SourceManager.cpp. 2025-07-07 16:43:46 +02:00
Haojian Wu
7fea83e314 [clang] NFC, use LocalLocOffsetTable in getFileIDLocal. 2025-07-07 09:57:41 +02:00
Haojian Wu
784bd61fc4
[clang] Speedup getFileIDLocal with a separate offset table. (#146604)
The `SLocEntry` structure is 24 bytes, and the binary search only needs
the offset. Loading an entry's offset might pull the entire SLocEntry
object into the CPU cache.

To make the binary search much more cache-efficient, we use a separate
offset table.
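
The idea can be illustrated with a small standalone sketch. It is not Clang's implementation: `Entry`, `EntryTable`, `add`, and `find` are hypothetical names, and `Entry` is only a stand-in for a larger record such as `SLocEntry`. The search touches just the densely packed offset array, so more search keys fit in each cache line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a larger per-entry record (not Clang's SLocEntry).
struct Entry {
  std::uint32_t Offset;     // start offset of this entry
  std::uint64_t Payload[2]; // data the binary search never needs
};

struct EntryTable {
  std::vector<Entry> Entries;
  std::vector<std::uint32_t> Offsets; // parallel: Offsets[i] == Entries[i].Offset

  void add(std::uint32_t Offset) {
    assert(Offsets.empty() || Offsets.back() < Offset); // keep the table sorted
    Entries.push_back(Entry{Offset, {0, 0}});
    Offsets.push_back(Offset);
  }

  // Index of the last entry whose start offset is <= Pos.
  // Precondition: the table is non-empty and Offsets[0] <= Pos.
  std::size_t find(std::uint32_t Pos) const {
    assert(!Offsets.empty() && Offsets.front() <= Pos);
    auto It = std::upper_bound(Offsets.begin(), Offsets.end(), Pos);
    return static_cast<std::size_t>(It - Offsets.begin()) - 1;
  }
};
```

The cost of this layout is one extra 4-byte value per entry and the need to keep both vectors in sync on every insertion.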

See
https://llvm-compile-time-tracker.com/compare.php?from=650d0151c623c123e4e9736fe50421624a329260&to=6af564c0d75aff28a2784a8554448c0679877792&stat=instructions:u.
2025-07-07 09:42:38 +02:00
Haojian Wu
e2510b189d
[clang] SourceManager: Cache offsets for LastFileIDLookup to speed up getFileID (#146782)
`getFileID` is a hot method. By caching the offset range in
`LastFileIDLookup`, we can more quickly check whether a given offset
falls within it, avoiding calling `isOffsetInFileID`.
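
A rough sketch of this kind of one-entry cache, assuming a sorted table of start offsets. The class `OffsetIndex` and its members are hypothetical and much simpler than `SourceManager`; `SlowLookups` exists only to make the effect of the cache observable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical index over sorted start offsets with a one-entry lookup cache.
class OffsetIndex {
  std::vector<std::uint32_t> Starts; // sorted, Starts[0] == 0
  // Half-open offset range [LastBegin, LastEnd) covered by the last result.
  // It starts out empty, so the first lookup always takes the slow path.
  mutable std::uint32_t LastBegin = 0, LastEnd = 0;
  mutable std::size_t LastIndex = 0;

public:
  mutable unsigned SlowLookups = 0; // number of binary searches performed

  explicit OffsetIndex(std::vector<std::uint32_t> S) : Starts(std::move(S)) {
    assert(!Starts.empty() && Starts.front() == 0);
  }

  std::size_t lookup(std::uint32_t Pos) const {
    // Fast path: two comparisons against the cached range.
    if (Pos >= LastBegin && Pos < LastEnd)
      return LastIndex;
    // Slow path: binary search, then refresh the cached range.
    ++SlowLookups;
    auto It = std::upper_bound(Starts.begin(), Starts.end(), Pos);
    LastIndex = static_cast<std::size_t>(It - Starts.begin()) - 1;
    LastBegin = Starts[LastIndex];
    LastEnd = It == Starts.end() ? std::numeric_limits<std::uint32_t>::max()
                                 : *It;
    return LastIndex;
  }
};
```

Consecutive lookups that land in the same range skip the search entirely, which is the access pattern a hot query such as `getFileID` tends to see.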

https://llvm-compile-time-tracker.com/compare.php?from=0588e8188c647460b641b09467fe6b13a8d510d5&to=64843a500f0191b79a8109da9acd7e80d961c7a3&stat=instructions:u
2025-07-04 22:11:59 +02:00
Abhina Sree
a9ee1797b7
Remove helper function and use target agnostic needConversion function (#146680)
This patch adds back the needed AutoConvert.h header and removes the
unneeded include guard of MVS to prevent this header from being removed
in the future
2025-07-02 10:02:46 -04:00
Haojian Wu
650d0151c6
[clang] Improve getFileIDLocal binary search. (#146510)
Avoid reading the `LocalSLocEntryTable` twice per loop iteration. NFC.

https://llvm-compile-time-tracker.com/compare.php?from=0b6ddb02efdcbdac9426e8d857499ea0580303cd&to=1aa335ccfb07ba96177b89b1933aa6b980fa14f6&stat=instructions:u
2025-07-01 21:59:09 +02:00
Kazu Hirata
c9cdc33dd6
[clang] Remove unused includes (NFC) (#146254)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-06-28 20:41:46 -07:00
Haojian Wu
0b6ddb02ef
[clang] NFC: Add alias for std::pair<FileID, unsigned> used in SourceLocation (#145711)
Introduce a type alias for the commonly used `std::pair<FileID,
unsigned>` to improve code readability, and make it easier for future
updates (64-bit source locations).
2025-06-26 14:12:51 +02:00
kadir çetinkaya
4551e50355
[clang] Reset FileID based diag state mappings (#143695)
When sharing the same compiler instance across multiple compilations, we
reset the source manager's file ID tables between runs. The diagnostics
engine keeps a cache keyed on these file IDs, and its entries became
dangling references across compilations.

This patch makes sure we reset that cache whenever the source manager
discards its FileIDs.
2025-06-12 10:49:23 +02:00
Abhina Sreeskantharajan
cda5ca8792 Add back AutoConvert.h header that is used for autoconversion on MVS 2025-06-02 08:28:50 -04:00
Kazu Hirata
cd9fe8a34c
[Basic] Remove unused includes (NFC) (#142295)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-05-31 19:00:31 -07:00
Jan Svoboda
13e1a2cb22 Reapply "[clang] Remove intrusive reference count from DiagnosticOptions (#139584)"
This reverts commit e2a885537f11f8d9ced1c80c2c90069ab5adeb1d. Build failures were fixed right away and reverting the original commit without the fixes breaks the build again.
2025-05-22 12:52:03 -07:00
Kazu Hirata
e2a885537f Revert "[clang] Remove intrusive reference count from DiagnosticOptions (#139584)"
This reverts commit 9e306ad4600c4d3392c194a8be88919ee758425c.

Multiple buildbot failures have been reported:
https://github.com/llvm/llvm-project/pull/139584
2025-05-22 12:44:20 -07:00
Jan Svoboda
9e306ad460
[clang] Remove intrusive reference count from DiagnosticOptions (#139584)
The `DiagnosticOptions` class is currently intrusively
reference-counted, which makes reasoning about its lifetime very
difficult in some cases. For example, `CompilerInvocation` owns the
`DiagnosticOptions` instance (wrapped in `llvm::IntrusiveRefCntPtr`) and
only exposes an accessor returning `DiagnosticOptions &`. One would
think this gives `CompilerInvocation` exclusive ownership of the object,
but that's not the case:

```c++
void shareOwnership(CompilerInvocation &CI) {
  llvm::IntrusiveRefCntPtr<DiagnosticOptions> CoOwner = &CI.getDiagnosticOptions();
  // ...
}
```

This is a perfectly valid pattern that is being actually used in the
codebase.

I would like to ensure the ownership of `DiagnosticOptions` by
`CompilerInvocation` is guaranteed to be exclusive. This can be
leveraged for a copy-on-write optimization later on. This PR changes
usages of `DiagnosticOptions` across `clang`, `clang-tools-extra` and
`lldb` to not be intrusively reference-counted.
2025-05-22 12:33:52 -07:00
Kazu Hirata
325281631a
[clang] Use *Map::try_emplace (NFC) (#140477)
We can simplify the code with *Map::try_emplace where we need
default-constructed values while avoiding calling constructors when
keys are already present.
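
For illustration only, the same pattern with the standard library's `std::map::try_emplace` (C++17), used here as a stand-in for the LLVM map types this commit touches. `LinesByFile` and `recordLine` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical example: collect line numbers per file name.
using LinesByFile = std::map<std::string, std::vector<unsigned>>;

// Returns true if this call created the entry for File.
bool recordLine(LinesByFile &Map, const std::string &File, unsigned Line) {
  // try_emplace default-constructs the vector only when File is absent;
  // an existing entry is left untouched and returned as-is.
  auto [It, Inserted] = Map.try_emplace(File);
  It->second.push_back(Line);
  return Inserted;
}
```

A single call covers both the lookup and the conditional insertion, so there is no separate `find` followed by `insert`.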
2025-05-19 06:19:53 -07:00
Sean Perry
84666d6874
Add back include for AutoConvert.h as it's needed for z/OS (#135430)
The commit
a1935fd380
removed an include that is needed when building on z/OS.
2025-04-13 03:04:58 -07:00
Reid Kleckner
a1935fd380 [clang] Remove unused SourceManager.cpp includes, NFC (trying out clangd) 2025-04-04 22:10:19 -07:00
Jay Foad
e87f94a6a8
[llvm-project] Fix typos mutli and mutliple. NFC. (#122880) 2025-01-14 11:59:41 +00:00
Abhina Sreeskantharajan
6edd867e43 [SystemZ][z/OS] Replace assert with updated return statement to check if a file size will grow due to conversion 2024-12-12 11:56:08 -05:00
Abhina Sree
04379c9863
[SystemZ][z/OS] Update autoconversion functions to improve support for UTF-8 (#98652)
This fixes the following error when reading source and header files on
z/OS: error: source file is not valid UTF-8
2024-12-11 07:46:51 -05:00
Boaz Brickner
bc7f24cd8d
[clang] [NFC] Remove SourceLocation() parameter from Diag.Report() calls in SourceManager, and use the equivalent Report() overload instead (#116937) 2024-11-21 09:41:09 +01:00
Boaz Brickner
9a365bc9a0
[Clang] [NFC] Add "human" diagnostic argument format (#115835)
This allows formatting large integers in a human friendly way. Example:
"5321584" -> "5.32M".
Use it where such human numbers are generated manually today.
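
A standalone sketch of such a formatter, written to reproduce the example above. The function name `formatHuman`, the set of suffixes (k, M, G, T), and the two-decimal rounding are assumptions for illustration, not a description of the diagnostic engine's exact behavior.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical formatter: scale Value by the largest power-of-1000 unit it
// reaches and print two decimals; smaller values are printed unchanged.
std::string formatHuman(std::int64_t Value) {
  struct Unit {
    std::int64_t Size;
    char Suffix;
  };
  static constexpr Unit Units[] = {{1000000000000, 'T'},
                                   {1000000000, 'G'},
                                   {1000000, 'M'},
                                   {1000, 'k'}};
  for (const Unit &U : Units) {
    if (Value >= U.Size) {
      char Buf[32];
      std::snprintf(Buf, sizeof(Buf), "%.2f%c",
                    static_cast<double>(Value) / static_cast<double>(U.Size),
                    U.Suffix);
      return Buf;
    }
  }
  return std::to_string(Value);
}
```

With these assumptions, `formatHuman(5321584)` gives `5.32M`, and values below 1000 pass through as plain integers.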
2024-11-13 07:58:11 +01:00
Boaz Brickner
8431494094
[clang] Make source locations space usage diagnostics numbers easier to read (#114999)
Instead of writing "12345678B", write "12345678B (12.34MB)".
2024-11-06 09:45:16 +01:00
Abhina Sreeskantharajan
efdb3ae232 Revert "[SystemZ][z/OS] Propagate IsText parameter to open text files as text (#107906)"
This reverts commit edf3b277a5f2ebe144827ed47463c22743cac5f9.
2024-09-20 08:18:16 -04:00
Abhina Sree
edf3b277a5
[SystemZ][z/OS] Propagate IsText parameter to open text files as text (#107906)
This patch adds an IsText parameter to the following functions
openFileForRead, getBufferForFile, getBufferForFileImpl and determines
whether a file is text by querying the file tag on z/OS. The default is
set to OF_Text instead of OF_None, this change in value does not affect
any other platforms other than z/OS.
2024-09-19 14:30:10 -04:00
Vakhurin Sergei
eda72fac54
Fix OOM in FormatDiagnostic (2nd attempt) (#108866)
Resolves: #70930 (and probably latest comments from clangd/clangd#251)
by fixing racing for the shared DiagStorage value which caused messing with args inside the storage and then formatting the following message with getArgSInt(1) == 2:

def err_module_odr_violation_function : Error<
  "%q0 has different definitions in different modules; "
  "%select{definition in module '%2'|defined here}1 "
  "first difference is "

which causes HandleSelectModifier to go beyond the ArgumentLen so the recursive call to FormatDiagnostic was made with DiagStr > DiagEnd that leads to infinite while (DiagStr != DiagEnd).

The Main Idea:
Reuse the existing DiagStorageAllocator logic to make all DiagnosticBuilders having independent states.
Also, encapsulating the rest of state (e.g. ID and Loc) into DiagnosticBuilder.

The last attempt failed -
https://github.com/llvm/llvm-project/pull/108187#issuecomment-2353122096
so was reverted - #108838
2024-09-18 11:46:25 -04:00
Aaron Ballman
5cead0cb0b
Revert "Fix OOM in FormatDiagnostic" (#108838)
Reverting due to build failures found in #108187
2024-09-16 10:49:17 -04:00
Vakhurin Sergei
e5d255607d
Fix OOM in FormatDiagnostic (#108187)
Resolves: #70930 (and probably latest comments from
https://github.com/clangd/clangd/issues/251)
by fixing racing for the shared `DiagStorage` value which caused messing
with args inside the storage and then formatting the following message
with `getArgSInt(1)` == 2:
```
def err_module_odr_violation_function : Error<
  "%q0 has different definitions in different modules; "
  "%select{definition in module '%2'|defined here}1 "
  "first difference is "
```
which causes `HandleSelectModifier` to go beyond the `ArgumentLen` so
the recursive call to `FormatDiagnostic` was made with `DiagStr` >
`DiagEnd` that leads to infinite `while (DiagStr != DiagEnd)`.

**The Main Idea:**
Reuse the existing `DiagStorageAllocator` logic to make all
`DiagnosticBuilder`s having independent states.
Also, encapsulating the rest of state (e.g. ID and Loc) into
`DiagnosticBuilder`.

**TODO (if it will be requested by reviewer):**
- [x] add a test (I have no idea how to turn a whole bunch of my
proprietary code which leads `clangd` to OOM into a small public
example.. probably I must try using
[this](https://github.com/llvm/llvm-project/issues/70930#issuecomment-2209872975)
instead)
- [x] [`Diag.CurDiagID !=
diag::fatal_too_many_errors`](https://github.com/llvm/llvm-project/pull/108187#pullrequestreview-2296395489)
- [ ] ? get rid of `DiagStorageAllocator` altogether and make each
`DiagnosticBuilder` own its `DiagnosticStorage`, since it seems
pretty small and should fit on the stack for short-lived
`DiagnosticBuilder` instances
2024-09-16 10:30:53 -04:00
kadir çetinkaya
a2a93f0293
[clang] Cleanup IncludeLocMap (#106241)
CompilerInstance can re-use the same SourceManager across multiple
frontend actions. During this process it calls
`SourceManager::clearIDTables` to reset any caches based on FileIDs.

It didn't reset IncludeLocMap, resulting in wrong include locations for
workflows that triggered multiple frontend actions through the same
CompilerInstance.
2024-08-30 11:57:37 +02:00
Jannick Kremer
c5b611a419
[libclang/python] Expose clang_isBeforeInTranslationUnit for SourceRange.__contains__
Add libclang function `clang_isBeforeInTranslationUnit` to allow checking the order between two source locations.
Simplify the `SourceRange.__contains__` implementation using this new function.
Add tests for `SourceRange.__contains__` and the newly added functionality.

Fixes #22617 
Fixes #52827
2024-08-16 00:32:58 +02:00
Ilya Biryukov
dfbfb6c5c6
[SourceManager] Expose max usage of source location space as a Statistic (#96292)
We have been running into source location exhaustion recently and want
to use the statistics to monitor the usage in various files to be able
to anticipate where the next problem will happen.

I picked `Statistic` because it can be written into a structured JSON
file and is easier to consume by further automation.

This commit does not change any existing per-source-manager metrics
exposed via `SourceManager::PrintStats()`. This does create some
redundancy, but I also expect it to be non-controversial because it
aligns with the intended use of `Statistic`.
2024-06-24 11:57:36 +02:00
Ziqing Luo
2e7b95e4c0
[Safe Buffers] Serialize unsafe_buffer_usage pragmas (#92031)
The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.

(Background, if it helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)

The reported issue at upstream: https://github.com/llvm/llvm-project/issues/90501
rdar://124035402
2024-06-13 22:44:24 -07:00
Vlad Serebrennikov
f7b0b99c52 [clang][NFC] Further improvements to const-correctness 2024-05-18 12:10:39 +03:00
Vlad Serebrennikov
ee54c86ef7
[clang][NFC] Improve const-correctness in SourceManager (#92436)
This patch adds several const-qualified variants of existing member
functions to `SourceManager`.
I started with removing const qualification from
`setNumCreatedFIDsForFileID`, and removing `const_cast` in the body of
this function, as I think it doesn't make sense to const-qualify
setters.
2024-05-17 13:01:37 +04:00
Jan Svoboda
4f31d328aa [clang] Improve SourceManager::PrintStats()
This fixes a typo ("SLocEntry's" -> "SLocEntries"), fixes capitalization ("Sloc" -> "SLoc") and adds extra information (capacity in bytes of `LoadedSLocEntryTable`).
2023-11-06 14:45:04 -08:00
Ilya Biryukov
324d1bb35a
[Clang] Report an error and crash on source location exhaustion in macros (#69908)
`createExpansionLocImpl` has an assert that checks if we ran out of
source locations. We have observed this happening on real code and in
release builds the assertion does not fire and the compiler just keeps
running indefinitely without giving any indication that something went
wrong.

Diagnose this problem and reliably crash to make sure the problem is
easy to detect.

I have also tried:
- returning invalid source locations,
- reporting sloc address space usage on error.

Both caused the compiler to run indefinitely. It would be nice to dig
further why that happens, but until then crashing seems like a better
alternative.
2023-10-23 14:29:00 +02:00
Kazu Hirata
b8885926f8 Use llvm::endianness::{big,little,native} (NFC)
Note that llvm::support::endianness has been renamed to
llvm::endianness while becoming an enum class as opposed to an enum.
This patch replaces llvm::support::{big,little,native} with
llvm::endianness::{big,little,native}.
2023-10-10 22:54:51 -07:00
Jan Svoboda
537344fc50
[clang][modules] Move SLocEntry search into ASTReader (#66966)
In `SourceManager::getFileID()`, Clang performs binary search over its
buffer of `SLocEntries`. For modules, this binary search fully
deserializes the entire `SLocEntry` block for each visited entry. For
some entries, that includes decompressing the associated buffer (e.g.
the predefines buffer, macro expansion buffers, contents of volatile
files), which shows up in profiles of the dependency scanner.

This patch moves the binary search over loaded entries into `ASTReader`,
which can perform cheaper partial deserialization during the binary
search, reducing the wall time of dependency scans by ~3%. This also
reduces the number of retired instructions by ~1.4% on regular
(implicit) modules compilation.

Note that this patch drops the optimizations based on the last lookup ID
(pruning the search space and performing linear search before resorting
to the full binary search). Instead, it reduces the search space by
asking `ASTReader::GlobalSLocOffsetMap` for the containing `ModuleFile`
and only does binary search over entries of a single module file.
2023-10-06 14:52:19 -07:00
Kazu Hirata
5009d249a5 [Basic] Fix a warning
This patch fixes:

  clang/lib/Basic/SourceManager.cpp:1979:64: error: 'greater' may not
  intend to support class template argument deduction
  [-Werror,-Wctad-maybe-unsupported]
2023-10-06 13:07:53 -07:00
Jan Svoboda
0dfb5dadc6
[clang][modules] Remove preloaded SLocEntries from PCM files (#66962)
This commit removes the list of SLocEntry offsets to preload eagerly
from PCM files. The commit introducing this functionality (258ae54a)
doesn't clarify why this would be more performant than the lazy approach
used regularly.

Currently, the only SLocEntry the reader is supposed to preload is the
predefines buffer, but in my experience, it's not actually referenced in
most modules, so the time spent deserializing its SLocEntry is wasted.
This is especially noticeable in the dependency scanner, where this
change brings 4.56% speedup on my benchmark.
2023-10-06 12:50:16 -07:00
Jan Svoboda
27254ae511
[clang] NFCI: Use FileEntryRef for FileID creation (#67838)
This patch removes the `SourceManager` APIs that create `FileID` from a
`const FileEntry *` in favor of APIs that take `FileEntryRef`. This also
removes misleading documentation that claims a `nullptr` file entry
represents stdin. I don't think that's right, since we just try to
dereference that pointer anyway.
2023-10-03 13:07:46 -07:00
Giulio Eulisse
4ae5157080
Introduce paged vector (#66430)
The goal of the class is to be an (almost) drop in replacement for
SmallVector and std::vector when those are presized and filled later, as
it happens in SourceManager and ASTReader.

By doing so, sparsely accessed PagedVector can profit from reduced 
memory footprint.
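
To make the idea concrete, here is a deliberately small sketch of a lazily paged container. It is not the class added by this commit: `SimplePagedVector`, its default page size, and `allocatedPages` are hypothetical, and it assumes the container is presized once up front.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch: storage is split into fixed-size pages, and a page is
// allocated only when one of its elements is first accessed.
template <typename T, std::size_t PageSize = 1024>
class SimplePagedVector {
  std::vector<std::unique_ptr<T[]>> Pages; // null until first access
  std::size_t Size = 0;

public:
  // Presize the container; only the (small) page table is allocated here.
  void resize(std::size_t N) {
    Size = N;
    Pages.resize((N + PageSize - 1) / PageSize);
  }

  std::size_t size() const { return Size; }

  T &operator[](std::size_t I) {
    assert(I < Size);
    std::unique_ptr<T[]> &Page = Pages[I / PageSize];
    if (!Page)
      Page = std::make_unique<T[]>(PageSize); // value-initializes the page
    return Page[I % PageSize];
  }

  // Number of pages that have actually been materialized.
  std::size_t allocatedPages() const {
    std::size_t Count = 0;
    for (const auto &Page : Pages)
      if (Page)
        ++Count;
    return Count;
  }
};
```

Presizing to a large element count costs one pointer per page, and element storage is paid for only in the pages that are actually touched.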
2023-09-30 08:26:19 +03:00
Jan Svoboda
3661a48a84 [clang] NFCI: Use FileEntryRef in SourceManager::getMemoryBufferForFileOr{None,Fake}() 2023-09-29 10:31:42 -07:00
Jan Svoboda
2da8f30c5e [clang] NFCI: Use FileEntryRef in SourceManager::overrideFileContents() 2023-09-29 09:30:21 -07:00
Jan Svoboda
8a2fb1391b
[clang] NFCI: Use FileEntryRef in SourceManager::FileInfos (#67742) 2023-09-29 08:04:34 -07:00
Jan Svoboda
3e9c36303c [clang] NFCI: Use FileEntryRef in SourceManager::setFileIsTransient() 2023-09-28 14:50:00 -07:00