llvm-project

shylie/llvm-project

Fork 0

Commit Graph

Author	SHA1	Message	Date
Jonas Devlieghere	51944e78bb	[lldb] Add two-level caching in the source manager We recently saw an uptick in internal reports complaining that LLDB is slow when sources on network file systems are inaccessible. I looked at the SourceManger and its cache and I think there’s some room for improvement in terms of reducing file system accesses: 1. We always resolve the path. 2. We always check the timestamp. 3. We always recheck the file system for negative cache hits. D153726 fixes (1) but (2) and (3) are necessary because of the cache’s current design. Source files are cached at the debugger level which means that the source file cache can span multiple targets and processes. It wouldn't be correct to not reload a modified or new file from disk. We can however significantly reduce the number of file system accesses by using a two level cache design: one cache at the debugger level and one at the process level: - The cache at the debugger level works the way it does today. There is no negative cache: if we can't find the file on disk, we'll try again next time the cache is queried. If a cached file's timestamp changes or if its path remapping changes, the cached file is evicted and we reload it from disk. - The cache at the process level is design to avoid accessing the file system. It doesn't check the file's modification time. It caches negative results, so if a file didn't exist, it doesn't try to reread it from disk. Checking if the path remapping changed is cheap (doesn't involve checking the file system) and is the only way for a file to get evicted from the process cache. The result of this patch is that LLDB will not show you new content if a file is modified or created while a process is running. I would argue that this is what most people would expect, but it is a change from how LLDB behaves today. For an average stop, we query the source cache 4 times. With the current implementation, that's 4 stats to get the modification time, If the file doesn't exist on disk, that's an additional 4 stats. Before D153726, if the path starts with a ~ there are another additional 4 calls to realpath. When debugging sources on a slow (network) file system, this quickly adds up. In addition to the two level caching, this patch also adds a source logging channel and synchronization to the source file cache. The logging was helpful during development and hopefully will help us triage issues in the future. The synchronization isn't a new requirement: as the cache is shared across targets, there is no guarantees that it can't be accessed concurrently. The patch also fixes a bug where we would only set the source remapping ID if the un-remapped file didn't exist, which led to the file getting evicted from the cache on every access. rdar://110787562 Differential revision: https://reviews.llvm.org/D153834	2023-07-03 14:12:39 -07:00
Jeffrey Tan	7b81192d46	Introduce new symbol on-demand for debug info This diff introduces a new symbol on-demand which skips loading a module's debug info unless explicitly asked on demand. This provides significant performance improvement for application with dynamic linking mode which has large number of modules. The feature can be turned on with: "settings set symbols.load-on-demand true" The feature works by creating a new SymbolFileOnDemand class for each module which wraps the actual SymbolFIle subclass as member variable. By default, most virtual methods on SymbolFileOnDemand are skipped so that it looks like there is no debug info for that module. But once the module's debug info is explicitly requested to be enabled (in the conditions mentioned below) SymbolFileOnDemand will allow all methods to pass through and forward to the actual SymbolFile which would hydrate module's debug info on-demand. In an internal benchmark, we are seeing more than 95% improvement for a 3000 modules application. Currently we are providing several ways to on demand hydrate a module's debug info: * Source line breakpoint: matching in supported files * Stack trace: resolving symbol context for an address * Symbolic breakpoint: symbol table match guided promotion * Global variable: symbol table match guided promotion In all above situations the module's debug info will be on-demand parsed and indexed. Some follow-ups for this feature: * Add a command that allows users to load debug info explicitly while using a new or existing command when this feature is enabled * Add settings for "never load any of these executables in Symbols On Demand" that takes a list of globs * Add settings for "always load the the debug info for executables in Symbols On Demand" that takes a list of globs * Add a new column in "image list" that shows up by default when Symbols On Demand is enable to show the status for each shlib like "not enabled for this", "debug info off" and "debug info on" (with a single character to short string, not the ones I just typed) Differential Revision: https://reviews.llvm.org/D121631	2022-04-26 10:42:06 -07:00
Pavel Labath	c34698a811	[lldb] Rename Logging.h to LLDBLog.h and clean up includes Most of our code was including Log.h even though that is not where the "lldb" log channel is defined (Log.h defines the generic logging infrastructure). This worked because Log.h included Logging.h, even though it should. After the recent refactor, it became impossible the two files include each other in this direction (the opposite inclusion is needed), so this patch removes the workaround that was put in place and cleans up all files to include the right thing. It also renames the file to LLDBLog to better reflect its purpose.	2022-02-03 14:47:01 +01:00

Author

SHA1

Message

Date

Jonas Devlieghere

51944e78bb

[lldb] Add two-level caching in the source manager

We recently saw an uptick in internal reports complaining that LLDB is
slow when sources on network file systems are inaccessible. I looked at
the SourceManger and its cache and I think there’s some room for
improvement in terms of reducing file system accesses:

 1. We always resolve the path.
 2. We always check the timestamp.
 3. We always recheck the file system for negative cache hits.

D153726 fixes (1) but (2) and (3) are necessary because of the cache’s
current design. Source files are cached at the debugger level which
means that the source file cache can span multiple targets and
processes. It wouldn't be correct to not reload a modified or new file
from disk.

We can however significantly reduce the number of file system accesses
by using a two level cache design: one cache at the debugger level and
one at the process level:

 - The cache at the debugger level works the way it does today. There is
   no negative cache: if we can't find the file on disk, we'll try again
   next time the cache is queried. If a cached file's timestamp changes
   or if its path remapping changes, the cached file is evicted and we
   reload it from disk.
 - The cache at the process level is design to avoid accessing the file
   system. It doesn't check the file's modification time. It caches
   negative results, so if a file didn't exist, it doesn't try to reread
   it from disk. Checking if the path remapping changed is cheap
   (doesn't involve checking the file system) and is the only way for a
   file to get evicted from the process cache.

The result of this patch is that LLDB will not show you new content if a
file is modified or created while a process is running. I would argue
that this is what most people would expect, but it is a change from how
LLDB behaves today.

For an average stop, we query the source cache 4 times. With the current
implementation, that's 4 stats to get the modification time, If the file
doesn't exist on disk, that's an additional 4 stats. Before D153726, if
the path starts with a ~ there are another additional 4 calls to
realpath. When debugging sources on a slow (network) file system, this
quickly adds up.

In addition to the two level caching, this patch also adds a source
logging channel and synchronization to the source file cache. The
logging was helpful during development and hopefully will help us triage
issues in the future. The synchronization isn't a new requirement: as
the cache is shared across targets, there is no guarantees that it can't
be accessed concurrently. The patch also fixes a bug where we would only
set the source remapping ID if the un-remapped file didn't exist, which
led to the file getting evicted from the cache on every access.

rdar://110787562

Differential revision: https://reviews.llvm.org/D153834

2023-07-03 14:12:39 -07:00

Jeffrey Tan

7b81192d46

Introduce new symbol on-demand for debug info

This diff introduces a new symbol on-demand which skips
loading a module's debug info unless explicitly asked on
demand. This provides significant performance improvement
for application with dynamic linking mode which has large
number of modules.
The feature can be turned on with:
"settings set symbols.load-on-demand true"

The feature works by creating a new SymbolFileOnDemand class for
each module which wraps the actual SymbolFIle subclass as member
variable. By default, most virtual methods on SymbolFileOnDemand are
skipped so that it looks like there is no debug info for that module.
But once the module's debug info is explicitly requested to
be enabled (in the conditions mentioned below) SymbolFileOnDemand
will allow all methods to pass through and forward to the actual SymbolFile
which would hydrate module's debug info on-demand.

In an internal benchmark, we are seeing more than 95% improvement
for a 3000 modules application.

Currently we are providing several ways to on demand hydrate
a module's debug info:
* Source line breakpoint: matching in supported files
* Stack trace: resolving symbol context for an address
* Symbolic breakpoint: symbol table match guided promotion
* Global variable: symbol table match guided promotion

In all above situations the module's debug info will be on-demand
parsed and indexed.

Some follow-ups for this feature:
* Add a command that allows users to load debug info explicitly while using a
  new or existing command when this feature is enabled
* Add settings for "never load any of these executables in Symbols On Demand"
  that takes a list of globs
* Add settings for "always load the the debug info for executables in Symbols
  On Demand" that takes a list of globs
* Add a new column in "image list" that shows up by default when Symbols On
  Demand is enable to show the status for each shlib like "not enabled for
  this", "debug info off" and "debug info on" (with a single character to
  short string, not the ones I just typed)

Differential Revision: https://reviews.llvm.org/D121631

2022-04-26 10:42:06 -07:00

Pavel Labath

c34698a811

[lldb] Rename Logging.h to LLDBLog.h and clean up includes

Most of our code was including Log.h even though that is not where the
"lldb" log channel is defined (Log.h defines the generic logging
infrastructure). This worked because Log.h included Logging.h, even
though it should.

After the recent refactor, it became impossible the two files include
each other in this direction (the opposite inclusion is needed), so this
patch removes the workaround that was put in place and cleans up all
files to include the right thing. It also renames the file to LLDBLog to
better reflect its purpose.

2022-02-03 14:47:01 +01:00

3 Commits