372 Commits

Author SHA1 Message Date
Greg Clayton
c8164d0276 Create synthetic symbol names on demand to improve memory consumption and startup times.
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.

Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.

This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool

Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)

After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)

The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.

Differential Revision: https://reviews.llvm.org/D105160
2021-06-29 17:44:33 -07:00
Stella Stamenova
bb2cfca2f3 Revert D104488 and friends since it broke the windows bot
Reverts commits:
"Fix failing tests after https://reviews.llvm.org/D104488."
"Fix buildbot failure after https://reviews.llvm.org/D104488."
"Create synthetic symbol names on demand to improve memory consumption and startup times."

This series of commits broke the windows lldb bot and then failed to fix all of the failing tests.
2021-06-29 12:58:55 -07:00
Greg Clayton
d77ccfdc72 Create synthetic symbol names on demand to improve memory consumption and startup times.
This fix was created after profiling the target creation of a large C/C++/ObjC application that contained almost 4,000,000 redacted symbol names. The symbol table parsing code was creating names for each of these synthetic symbols and adding them to the name indexes. The code was also adding the object file basename to the end of the symbol name which doesn't allow symbols from different shared libraries to share the names in the constant string pool.

Prior to this fix this was creating 180MB of "___lldb_unnamed_symbol" symbol names and was taking a long time to generate each name, add them to the string pool and then add each of these names to the name index.

This patch fixes the issue by:
- not adding a name to synthetic symbols at creation time, and allows name to be dynamically generated when accessed
- doesn't add synthetic symbol names to the name indexes, but catches this special case as name lookup time. Users won't typically set breakpoints or lookup these synthetic names, but support was added to do the lookup in case it does happen
- removes the object file baseanme from the generated names to allow the names to be shared in the constant string pool

Prior to this fix the startup times for a large application was:
12.5 seconds (cold file caches)
8.5 seconds (warm file caches)

After this fix:
9.7 seconds (cold file caches)
5.7 seconds (warm file caches)

The names of the symbols are auto generated by appending the symbol's UserID to the end of the "___lldb_unnamed_symbol" string and is only done when the name is requested from a synthetic symbol if it has no name.

Differential Revision: https://reviews.llvm.org/D104488
2021-06-28 18:04:51 -07:00
Jason Molenda
af913881e3 Try to unbreak the windows CI
MSVC and clang seem to disagree with whether I can do this.
2021-06-20 13:13:46 -07:00
Jason Molenda
9ea6dd5cfa Add a corefile style option to process save-core; skinny corefiles
Add a new feature to process save-core on Darwin systems -- for
lldb to create a user process corefile with only the dirty (modified
memory) pages included.  All of the binaries that were used in the
corefile are assumed to still exist on the system for the duration
of the use of the corefile.  A new --style option to process save-core
is added, so a full corefile can be requested if portability across
systems, or across time, is needed for this corefile.

debugserver can now identify the dirty pages in a memory region
when queried with qMemoryRegionInfo, and the size of vm pages is
given in qHostInfo.

Create a new "all image infos" LC_NOTE for Mach-O which allows us
to describe all of the binaries that were loaded in the process --
load address, UUID, file path, segment load addresses, and optionally
whether code from the binary was executing on any thread.  The old
"read dyld_all_image_infos and then the in-memory Mach-O load
commands to get segment load addresses" no longer works when we
only have dirty memory.

rdar://69670807
Differential Revision: https://reviews.llvm.org/D88387
2021-06-20 12:26:54 -07:00
Adrian Prantl
035217ff51 Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.

This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.

This reapplies the previously reverted patch with additional include
order fixes for non-modular builds of LLDB.

Differential Revision: https://reviews.llvm.org/D103575
2021-06-14 16:53:41 -07:00
Adrian Prantl
7a7c00761f Revert "Allow signposts to take advantage of deferred string substitution"
This reverts commit 03841edde7eee21d1d450041ab9a113a7e1be869.

Unfortunately this still breaks the LLDB standalone bot.
2021-06-14 16:09:04 -07:00
Adrian Prantl
03841edde7 Allow signposts to take advantage of deferred string substitution
One nice feature of the os_signpost API is that format string
substitutions happen in the consumer, not the logging
application. LLVM's current Signpost class doesn't take advantage of
this though and instead always uses a static "Begin/End %s" format
string.

This patch uses variadic macros to allow the API to be used as
intended. Unfortunately, the primary use-case I had in mind (the
LLDB_SCOPED_TIMER() macro) does not get much better from this, because
__PRETTY_FUNCTION__ is *not* a macro, but a static string, so
signposts created by LLDB_SCOPED_TIMER() still use a static "%s"
format string. At least LLDB_SCOPED_TIMERF() works as intended.

This reapplies the previsously reverted patch with additional MachO.h
macro #undefs.

Differential Revision: https://reviews.llvm.org/D103575
2021-06-14 14:19:41 -07:00
Adrian Prantl
635b72136e Disambiguate usage of struct mach_header and other MachO definitions.
Unfortunately the Darwin signpost header also pulls in the system
MachO header and so we need to make sure to use the LLVM versions of
those definitions.
2021-06-11 16:35:43 -07:00
Greg Clayton
b0572abf72 Improve performance when parsing symbol tables in mach-o files.
Some larger projects were loading quite slowly with the current LLDB on macOS and macOS simulator builds. I did some instrument traces and found 3 main culprits:
- a LLDB timer that was put into a function that was called too often
- a std::set that was keeping track of the address of symbols that were already added
- a unnamed function generator in ObjectFile that was going slow due to allocations

In order to see this in action I ran the latest LLDB on a large application with many frameworks using the following method:

(lldb) script import time; start_time = time.perf_counter()
(lldb) file Large.app
(lldb) script print(time.perf_counter() - start_time)

I first range "sudo purge" to clear the system file caches to simulate a cold startup of the debugger, followed by two iterations with warm file caches.

Prior to this fix I was seeing the following timings:

17.68 (cold)
14.56 (warm 1)
14.52 (warm 2)

After this fix I was seeing:

11.32 (cold)
8.43 (warm 1)
8.49 (warm 2)

Differential Revision: https://reviews.llvm.org/D103504
2021-06-02 10:31:37 -07:00
Bruce Mitchener
36597e4719 [lldb] Fix typos. NFC.
Differential Revision: https://reviews.llvm.org/D103381
2021-05-31 06:48:57 +07:00
Greg Clayton
e122877f10 Add a progress class that can track long running operations in LLDB.
LLDB can often appear deadlocked to users that use IDEs when it is indexing DWARF, or parsing symbol tables. These long running operations can make a debug session appear to be doing nothing even though a lot of work is going on inside LLDB. This patch adds a public API to allow clients to listen to debugger events that report progress and will allow UI to create an activity window or display that can show users what is going on and keep them informed of expensive operations that are going on inside LLDB.

Differential Revision: https://reviews.llvm.org/D97739
2021-03-24 12:58:13 -07:00
Jonas Devlieghere
732534ed64 [lldb] s/TARGET_OS_EMBEDDED/TARGET_OS_IPHONE/
TARGET_OS_EMBEDDED is deprecated, use TARGET_OS_IPHONE and/or
TARGET_OS_SIMULATOR instead.
2021-02-11 20:40:59 -08:00
Jonas Devlieghere
5c1c8443eb [lldb] Abstract scoped timer logic behind LLDB_SCOPED_TIMER (NFC)
This patch introduces a LLDB_SCOPED_TIMER macro to hide the needlessly
repetitive creation of scoped timers in LLDB. It's similar to the
LLDB_LOG(F) macro.

Differential revision: https://reviews.llvm.org/D93663
2020-12-22 09:10:27 -08:00
Raphael Isemann
ffb3fd8f18 Partially revert '[MachO] Update embedded part of ObjectFileMachO for Mangled API change'
Commit f3aa9e36d91b7b0f4f24f7a3b13cf80c11356e5e fixed the embedded OS
build by removing all passed args for `GetName`/`GetDemangledName`. The motivation
for this was that these arguments were apparently removed in
commit 22b044877d239c40c9a932d1ea47d489c507000f. However, only `GetName`'s language
argument was removed but the mangling preference argument was *not* removed
(and unfortunately had a default argument). So when that commit removed all
the args it didn't just fix the build but it also changed all the mangling
preferences to 'demangled' for all `GetName` calls.

Also some `GetName` calls were outside the TARGET_OS_EMBEDDED ifdef, so
this change ended up breaking the following tests on macOS:

  lldb-api :: lang/objc/objc-static-method-stripped/TestObjCStaticMethodStripped.py
  lldb-api :: lang/objc/objc-super/TestObjCSuper.py

From what I can see f3aa9e36d91b7 removed 12 ePreferMangled args and this patch
re-adds 12 args with roughly the same line numbers, so this *should* restore the
old behaviour and also keep the embedded build working. On the other hand,
ObjectFileMachO::ParseSymtab is a very successful attempt at writing
the longest possible function within LLVM, so this fix is partly based
on the engineering principle known as "hoping for the best".
2020-11-20 13:31:36 +01:00
Jonas Devlieghere
f3aa9e36d9 [MachO] Update embedded part of ObjectFileMachO for Mangled API change
Mangled::GetName and Mangled::GetDemangledName no longer take any
arguments.
2020-11-18 14:47:31 -08:00
Jason Molenda
1bec6eb3f5 Add support for firmware/standalone LC_NOTE "main bin spec" corefiles
When a Mach-O corefile has an LC_NOTE "main bin spec" for a
standalone binary / firmware, with only a UUID and no load
address, try to locate the binary and dSYM by UUID and if
found, load it at offset 0 for the user.

Add a test case that tests a firmware/standalone corefile
with both the "kern ver str" and "main bin spec" LC_NOTEs.

<rdar://problem/68193804>

Differential Revision: https://reviews.llvm.org/D88282
2020-09-25 15:19:22 -07:00
Greg Clayton
08748d15b8 Fix a check that was attempting to see if an object file was in memory.
Checking if an object file is in memory should use the ObjectFile::IsInMemory(), not test ObjectFile::BaseAddress(). ObjectFile::BaseAddress() is designed to be overridden by all classes and is for mach-o, ELF and COFF plug-ins. They find the header base adddress and return that as a section offset address. The default implementation of ObjectFile::BaseAddress() does try and make an Address() from the ObjectFile::m_memory_addr, but I switched it to a correct function call.

Differential Revision: https://reviews.llvm.org/D86122
2020-08-18 13:24:22 -07:00
Adrian Prantl
05df9cc703 Correctly detect legacy iOS simulator Mach-O objectfiles
The code in ObjectFileMachO didn't disambiguate between ios and
ios-simulator object files for Mach-O objects using the legacy
ambiguous LC_VERSION_MIN load commands. This used to not matter before
taught ArchSpec that ios and ios-simulator are no longer compatible.

rdar://problem/66545307

Differential Revision: https://reviews.llvm.org/D85358
2020-08-06 12:40:45 -07:00
Jim Ingham
1c1ffa6a30 GetPath() returns a std::string temporary. You can't reference just the c_str.
Found by the static analyzer.
2020-08-05 19:12:15 -07:00
Fred Riss
22c16360dd [lldb/ObjectFileMachO] Correctly account for resolver symbols
Summary:
The resolver addresses stored in the dyld trie are relative to the base
of the __TEXT segment. This is usually 0 in a dylib, so this was never
noticed, but it is not 0 for most dylibs integrated in the shared cache.
As we started using the shared cache images recently as symbol source,
this causes LLDB to fail to resolve symbols which go through a runtime
resolver.

Reviewers: jasonmolenda, jingham

Subscribers: lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D84083
2020-07-24 09:19:17 -07:00
Fred Riss
8113a8bb79 [lldb/ObjectFileMachO] Fetch shared cache images from our own shared cache
Summary:
On macOS 11, the libraries that have been integrated in the system
shared cache are not present on the filesystem anymore. LLDB was
using those files to get access to the symbols of those libraries.
LLDB can get the images from the target process memory though.

This has 2 consequences:
 - LLDB cannot load the images before the process starts, reporting
   an error if someone tries to break on a system symbol.
 - Loading the symbols by downloading the data from the inferior
   is super slow. It takes tens of seconds at the start of the
   debug session to populate the Module list.

To fix this, we can use the library images LLDB has in its own
mapping of the shared cache. Shared cache images are somewhat
special as their LINKEDIT segment is moved to the end of the cache
and thus the images are not contiguous in memory. All of this can
hidden in ObjectFileMachO.

This patch fixes a number of test failures on macOS 11 due to the
first problem described above and adds some specific unittesting
for the new SharedCache Host utilities.

Reviewers: jasonmolenda, labath

Subscribers: llvm-commits, lldb-commits

Tags: #lldb, #llvm

Differential Revision: https://reviews.llvm.org/D83023
2020-07-16 10:37:37 -07:00
Fred Riss
e529d774c4 [lldb] Use enum constant instead of raw value 2020-07-09 09:43:50 -07:00
Jonas Devlieghere
06412dae82 [lldb] Use std::make_unique<> (NFC)
Update the rest of lldb to use std::make_unique<>. I used clang-tidy to
automate this, which probably missed cases that are wrapped in ifdefs.
2020-06-24 17:48:40 -07:00
Davide Italiano
63d597093c [ObjectFileMachO] Check for TARGET_EMBEDDED instead of listing architectures.
Now that Apple Silicon is a thing, we need to generalize the check.
2020-06-23 12:37:45 -07:00
Pavel Labath
3a16829748 [lldb] Switch Section-dumping code to raw_ostream
Also, add a basic test for dumping sections.
2020-05-14 11:59:18 +02:00
Jason Molenda
836534f997 Add more detailed symbol type categorization, based on a swift patch by
Greg Clayton a few years ago.

My patch to augment the symbol table in Mach-O files with the
dyld trie exports data structure only categorized symbols as code
or data, but Greg Clayton had a patch to do something similar to
swift a few years ago that had a more extensive categorization of
symbols, as well as extracting some objc class/ivar names from the
entries. This patch is basically just Greg's, updated a bit and
with a test case added to it.

<rdar://problem/50791451>

Differential Revision: https://reviews.llvm.org/D77369
2020-04-06 14:05:33 -07:00
Benjamin Kramer
e8f13f4f62 Replace std::string::find == 0 with StringRef::startswith
This is both more readable and faster. Found by clang-tidy's
abseil-string-find-startswith.
2020-03-31 21:01:09 +02:00
Jason Molenda
f0a5af906b Merge in symbols from Mach-O dyld trie to the symbol table
In ObjectFileMachO we construct the symbol table from multiple
sources -- primarily the binary's nlist records, but when the nlist
symbols have been stripped, we would augment those with function
start address from the LC_FUNCTION_STARTS or eh_frame.  This patch
adds another source of symbols - the exported symbols that the
dynamic linker, dyld, uses at runtime from its trie structure.  This
provides us names and addresses for these functions/data.

This patch removes the code from ParseSymtab that would reject an
empty symbol table / nlist source.  It adds a new symbols_added
set which tracks the address of every symbol we've added to the
symtab.  We add symbols in most-information-ful order, and before
adding a symbol from less-informational-ful source (e.g.
LC_FUNCTION_STARTS with no function name), we check if that symbol
has already been added.

On targets with thumb code generation, instead of using the 0th bit
in these addresses in FunctionStarts (or now the trie entries), we
use the data field of FunctionStarts (formerly used to track if the
func_start should be added) and a flag for the trie entries to
encode this, and only store the actual addresses in the symbols_seen
and these vectors.

<rdar://problem/50791451>

Differential revision: https://reviews.llvm.org/D76758
2020-03-27 22:53:15 -07:00
Davide Italiano
34ee941f6d [ObjectFileMachO] Fix a build error on embedded. 2020-02-26 14:31:48 -08:00
Pavel Labath
7b59ff2fa0 [lldb] Add boilerplate to recognize the .debug_tu_index section
It's just like debug_cu_index, only for type units.
2020-02-20 13:44:21 +01:00
Jonas Devlieghere
bba9ba8d95 [lldb/Plugin] s/LLDB_PLUGIN/LLDB_PLUGIN_DEFINE/ (NFC)
Rename LLDB_PLUGIN to LLDB_PLUGIN_DEFINE as Pavel suggested in D73067 to
avoid name conflict.
2020-02-14 09:58:24 -08:00
Martin Storsjö
6115bd9ba2 [LLDB] Fix GCC warnings about extra semicolons. NFC. 2020-02-10 11:20:44 +02:00
Jonas Devlieghere
fbb4d1e43d [lldb/Plugins] Use external functions to (de)initialize plugins
This is a step towards making the initialize and terminate calls be
generated by CMake, which in turn is towards making it possible to
disable plugins at configuration time.

Differential revision: https://reviews.llvm.org/D74245
2020-02-07 15:28:27 -08:00
Alex Langford
22b044877d [lldb][NFCI] Remove unused LanguageType parameters
These parameters are unused in these methods, and some of them only had a
LanguageType parameter to pipe to other methods that don't use it
either.
2020-01-30 21:57:23 -08:00
Raphael Isemann
808142876c [lldb][NFC] Fix all formatting errors in .cpp file headers
Summary:
A *.cpp file header in LLDB (and in LLDB) should like this:
```
//===-- TestUtilities.cpp -------------------------------------------------===//
```
However in LLDB most of our source files have arbitrary changes to this format and
these changes are spreading through LLDB as folks usually just use the existing
source files as templates for their new files (most notably the unnecessary
editor language indicator `-*- C++ -*-` is spreading and in every review
someone is pointing out that this is wrong, resulting in people pointing out that this
is done in the same way in other files).

This patch removes most of these inconsistencies including the editor language indicators,
all the different missing/additional '-' characters, files that center the file name, missing
trailing `===//` (mostly caused by clang-format breaking the line).

Reviewers: aprantl, espindola, jfb, shafik, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: dexonsmith, wuzish, emaste, sdardis, nemanjai, kbarton, MaskRay, atanasyan, arphaman, jfb, abidh, jsji, JDevlieghere, usaxena95, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D73258
2020-01-24 08:52:55 +01:00
Pavel Labath
4b5bc38802 [lldb/DWARF] Move location list sections into DWARFContext
These are the last sections not managed by the DWARFContext object. I
also introduce separate SectionType enums for dwo section variants, as
this is necessary for proper handling of single-file split dwarf.
2020-01-14 15:19:29 +01:00
Jason Molenda
7026b34702 make err msg in MachSymtabSectionInfo::GetSection clear about the file
This error message didn't specify which file was malformed, so
there's some hunting-around required if it comes up.  We have the
filename; include it in the error message.
2019-12-18 16:13:17 -08:00
Jonas Devlieghere
e194d89012 [lldb/MachO] "Fix" intentional out-of-bounds error (NFC)
Remove the hack that populates the cpsr register in the gpr struct by
writing past the end of the array. This was tripping up ASan.

Patch by: Reva Cuthbertson
2019-12-18 12:54:04 -08:00
Pavel Labath
4023bd05fc [lldb] Add boilerplate to recognize the .debug_rnglists.dwo section 2019-11-26 13:58:26 +01:00
Davide Italiano
96915495f9 [ObjectFileMachO] Fix the build for __arm64__.
Catch up with an API change.
2019-11-12 11:10:36 -08:00
Vedant Kumar
72105b9dcd Fix compilation error in ObjectFileMachO::ParseSymtab 2019-10-25 11:16:51 -07:00
Adrian Prantl
1ad655e255 Modernize the rest of the Find.* API (NFC)
This patch removes the size_t return value and the append parameter
from the remainder of the Find.* functions in LLDB's internal API. As
in the previous patches, this is motivated by the fact that these
parameters aren't really used, and in the case of the append parameter
were frequently implemented incorrectly.

Differential Revision: https://reviews.llvm.org/D69119

llvm-svn: 375160
2019-10-17 19:56:40 +00:00
Jason Molenda
7dd7a36075 Add arm64_32 support to lldb, an ILP32 codegen
that runs on arm64 ISA targets, specifically 
Apple watches.


Differential Revision: https://reviews.llvm.org/D68858

llvm-svn: 375032
2019-10-16 19:14:49 +00:00
Adrian Prantl
8db229e287 ObjectFileMachO: Replace std::map with llvm::DenseMap (NFC)
This makes parsing the symbol table of clang marginally faster. (Hashtable versus tree).

Differential Revision: https://reviews.llvm.org/D68605

llvm-svn: 374084
2019-10-08 16:59:24 +00:00
Adrian Prantl
917b8df0e5 Replace static const StringRef with StringRef (NFC)
Differential Revision: https://reviews.llvm.org/D68597

llvm-svn: 374081
2019-10-08 16:29:36 +00:00
Adrian Prantl
bde5a6a45a Remove constructor and unused method (NFC).
Differential Revision: https://reviews.llvm.org/D68595

llvm-svn: 374080
2019-10-08 16:29:33 +00:00
Jonas Devlieghere
4bddca306a [MachO] Fix symbol merging during symtab parsing.
The symtab parser in ObjectFileMachO has logic to coalesce debug (STAB)
and non-debug symbols, based on the address and the symbol name for
static (STSYM) and global symbols (GSYM) respectively. It makes the
assumption that the debug variant is always encountered first. Rather
than creating a second entry in the symbol table for the non-debug
symbol, the latter gets merged into the existing debug symbol.

This breaks when the linker emits the non-debug symbol first. We'd end
up with two entries in the symbol table, each containing part of the
information LLDB relies on. Indeed, commenting out the merging logic
breaks the test suite spectacularly.

This patch solves that problem by always parsing the debug symbols
first. This guarantees that the assumption for merging holds.

I'm not particularly happy with  adding a lambda, but after numerous
attempts this is the best solution I could come up with. The symtab
parsing logic is pretty complex in that it touches a lot of things. I've
experienced first hand that it's very easy to break things. I believe
this approach strikes a balance between fixing the issue while limiting
the risk of regressions.

Differential revision: https://reviews.llvm.org/D68536

llvm-svn: 373994
2019-10-08 00:13:59 +00:00
Jonas Devlieghere
369407fc52 [MachO] Shuffle some things around in ParseSymtab (NFC)
llvm-svn: 373954
2019-10-07 20:31:22 +00:00
Jonas Devlieghere
4e5d9e120b [MachO] Reduce indentation further in ParseSymtab (NFC)
llvm-svn: 373810
2019-10-04 23:09:55 +00:00