536 Commits

Author SHA1 Message Date
Greg Clayton
3475b3f97b
Enable LLDB to load large dSYM files. (#164471)
llvm-dsymutil can produce mach-o files where some sections in __DWARF
exceed the 4GB barrier and subsequent sections in the dSYM will be
inaccessible because the mach-o section_64 structure only has a 32 bit
file offset. This patch enables LLDB to load a large dSYM file by
figuring out when this happens and properly adjusting the file offset of
the LLDB sections.

I was unable to add a test as obj2yaml and yaml2obj are broken for
mach-o files and they can't convert a yaml file back into a valid mach-o
object file. Any suggestions for adding a test would be appreciated.
2025-10-30 14:08:44 -07:00
Michael Buch
07974fe2b1
Reland "[lldb][MachO][NFC] Extract ObjC metadata symbol parsing into helper function" (#161655)
This reverts `5a80fb9177e3c831c9c574400a13d77393397f2a`. The original
change got reverted because of failing tests on macOS.

The issue was that I changed the scope of setting `type =
eSymbolTypeData` during the cleanup. This patch relands the original
patch but doesn't change the `else` branch to an `else if` branch.

Tested that macOS test-suite passes.
2025-10-02 22:45:17 +01:00
Augusto Noronha
5a80fb9177 Revert "[lldb][MachO][NFC] Extract ObjC metadata symbol parsing into helper function (#161536)"
This reverts commit 23e081524fd9f64fb3430822e879b6dc36a1d3f1.
2025-10-01 10:50:59 -07:00
Michael Buch
23e081524f
[lldb][MachO][NFC] Extract ObjC metadata symbol parsing into helper function (#161536)
Just a simple de-duplication of the same code. We saw a bug here
recently (https://github.com/llvm/llvm-project/pull/161521). Might as
well isolate this all in one place.

rdar://158159242
2025-10-01 15:58:09 +00:00
Michael Buch
9f7e7f7d9c
[lldb][MachO] Fix inspection of global variables that start with 'O' (#161521)
On Darwin C-symbols are prefixed with a '_'. The LLDB Macho-O parses
handles Objective-C metadata symbols starting with '_OBJC' specially.
Previously global symbols starting with a '_O' prefix were lost because
of incorrectly scoped if-guards. This patch removes those checks.

There is more cleanup that can be done in this file because there's a
bunch of duplicated checks for these ObjC symbols. I decided to leave
that for an NFC follow-up.

Depends on https://github.com/llvm/llvm-project/pull/161520

rdar://158159242
2025-10-01 16:47:15 +01:00
Jason Molenda
3e57a0d01c
[lldb][MachO] Local structs for larger VA offsets (#159849)
The Mach-O file format has several load commands which specify the
location of data in the file in UInt32 offsets. lldb uses these same
structures to track the offsets of the binary in virtual address space
when it is running. Normally a binary is loaded in memory contiguously,
so this is fine, but on Darwin systems there is a "system shared cache"
where all system libraries are combined into one region of memory and
pre-linked. The shared cache has the TEXT segments for every binary
loaded contiguously, then the DATA segments, and finally a shared common
LINKEDIT segment for all binaries. The virtual address offset from the
TEXT segment for a libray to the LINKEDIT may exceed 4GB of virtual
address space depending on the structure of the shared cache, so this
use of a UInt32 offset will not work.

There was an initial instance of this issue that I fixed last November
in https://github.com/llvm/llvm-project/pull/117832 where I fixed this
issue for the LC_SYMTAB / `symtab_command` structure. But we have the
same issue now with three additional structures;
`linkedit_data_command`, `dyld_info_command`, and `dysymtab_command`.
For all of these we can see the pattern of `dyld_info.export_off +=
linkedit_slide` applied to the offset fields in ObjectFileMachO.

This defines local structures that mirror the Mach-O structures, except
that it uses UInt64 offset fields so we can reuse the same field for a
large virtual address offset at runtime. I defined ctor's from the
genuine structures, as well as operator= methods so the structures can
be read from the Mach-O binary into the standard object, then copied
into our local expanded versions of them. These structures are ABI in
Mach-O and cannot change their layout.

The alternative is to create local variables alongside these Mach-O load
command objects for the offsets that we care about, adjust those by the
correct VA offsets, and only use those local variables instead of the
fields in the objects. I took the approach of the local enhanced
structure in November and I think it is the cleaner approach.

rdar://160384968
2025-09-19 19:53:26 -07:00
Alex Langford
8b544f3639
[lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as symbols are created (#155282)
Note: This is a resubmission of #106791. I had to revert this a year ago
for a failing test that I could not understand. I have time now to try
and get this in again.

Summary:
This improves the performance of ObjectFileMacho::ParseSymtab by
removing eager and expensive work in favor of doing it later in a
less-expensive fashion.

Experiment:
My goal was to understand LLDB's startup time.
First, I produced a Debug build of LLDB (no dSYM) and a
Release+NoAsserts build of LLDB. The Release build debugged the Debug
build as it debugged a small C++ program. I found that
ObjectFileMachO::ParseSymtab accounted for somewhere between 1.2 and 1.3
seconds consistently. After applying this change, I consistently
measured a reduction of approximately 100ms, putting the time closer to
1.1s and 1.2s on average.

Background:
ObjectFileMachO::ParseSymtab will incrementally create symbols by
parsing nlist entries from the symtab section of a MachO binary. As it
does this, it eagerly tries to determine the size of symbols (e.g. how
long a function is) using LC_FUNCTION_STARTS data (or eh_frame if
LC_FUNCTION_STARTS is unavailable). Concretely, this is done by
performing a binary search on the function starts array and calculating
the distance to the next function or the end of the section (whichever
is smaller).

However, this work is unnecessary for 2 reasons:
1. If you have debug symbol entries (i.e. STABs), the size of a function
is usually stored right after the function's entry. Performing this work
right before parsing the next entry is unnecessary work.
2. Calculating symbol sizes for symbols of size 0 is already performed
in `Symtab::InitAddressIndexes` after all the symbols are added to the
Symtab. It also does this more efficiently by walking over a list of
symbols sorted by address, so the work to calculate the size per symbol
is constant instead of O(log n).
2025-08-26 13:21:28 -07:00
Jonas Devlieghere
5be2063e10
[lldb] Support parsing the Wasm symbol table (#153093)
This PR adds support for parsing the WebAssembly symbol table. The
symbol table is encoded in the "names" section and contains names and
indexes into other sections. For now we only support parsing function
(code) symbols. The result is that you can set breakpoints by symbol
name, while previously breakpoints by name required debug info (DWARF).

This is also necessary for Swift, which checks for the presence of
`swift_release` as a heuristic to determine if there's a static Swift
stdlib.
2025-08-12 15:12:30 -05:00
Jason Molenda
00e071d690 [lldb] remove do-nothing defaults in case statements,
unbreak gcc CI bots.
2025-07-02 11:16:04 -07:00
Kazu Hirata
1626867ccd [lldb] Fix warnings
This patch fixes:

  lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:415:7:
  error: label at end of compound statement is a C++23 extension
  [-Werror,-Wc++23-extensions]

  lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:536:7:
  error: label at end of compound statement is a C++23 extension
  [-Werror,-Wc++23-extensions]

  lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp:672:7:
  error: label at end of compound statement is a C++23 extension
  [-Werror,-Wc++23-extensions]
2025-07-02 10:32:47 -07:00
Jason Molenda
dfcef35ff1
[lldb][NFC][MachO] Clean up LC_THREAD reading code, remove i386 corefile (#146480)
While fixing bugs in the x86_64 LC_THREAD parser in ObjectFileMachO, I
noticed that the other LC_THREAD parsers are all less clear than they
should be.

To recap, a Mach-O LC_THREAD load command has a byte size for the entire
payload. Within the payload, there will be one or more register sets
provided. A register set starts with a UInt32 "flavor", the type of
register set defined in the system headers, and a UInt32 "count", the
number of UInt32 words of memory for this register set. After one
register set, there may be additional sets. A parser can skip an unknown
register set flavor by using the count field to get to the next register
set. When the total byte size of the LC_THREAD load command has been
parsed, it is completed.

This patch fixes the riscv/arm/arm64 LC_THREAD parsers to use the total
byte size as the exit condition, and to skip past unrecognized register
sets, instead of stopping parsing.

Instead of fixing the i386 corefile support, I removed it. The last
macOS that supported 32-bit Intel code was macOS 10.14 in 2018. I also
removed i386 KDP support, 32-bit intel kernel debugging hasn't been
supported for even longer than that.

It would be preferable to do these things separately, but I couldn't
bring myself to update the i386 LC_THREAD parser, and it required very
few changes to remove this support entirely.
2025-07-02 10:21:38 -07:00
Jason Molenda
e94c6091c9
[lldb][Mach-O] Fix several bugs in x86_64 Mach-O corefile (#146460)
reading, and one bug in the new RegisterContextUnifiedCore class.

The PR I landed a few days ago to allow Mach-O corefiles to augment
their registers with additional per-thread registers in metadata exposed
a few bugs in the x86_64 corefile reader when running under different CI
environments. It also showed a bug in my RegisterContextUnifiedCore
class where I wasn't properly handling lookups of unknown registers
(e.g. the LLDB_GENERIC_RA when debugging an intel target).

The Mach-O x86_64 corefile support would say that it had fpu & exc
registers available in every corefile, regardless of whether they were
actually present. It would only read the bytes for the first register
flavor in the LC_THREAD, the GPRs, but it read them incorrectly, so
sometimes you got more register context than you'd expect. The LC_THREAD
register context specifies a flavor and the number of uint32_t words;
the ObjectFileMachO method would read that number of uint64_t's,
exceeding the GPR register space, but it was followed by FPU and then
EXC register space so it didn't crash. If you had a corefile with GPR
and EXC register bytes, it would be written into the GPR and then FPU
register areas, with zeroes filling out the rest of the context.
2025-06-30 21:27:53 -07:00
Jason Molenda
a64db49371
[lldb][Mach-O] Allow "process metadata" LC_NOTE to supply registers (#144627)
The "process metadata" LC_NOTE allows for thread IDs to be specified in
a Mach-O corefile. This extends the JSON recognzied in that LC_NOTE to
allow for additional registers to be supplied on a per-thread basis.

The registers included in a Mach-O corefile LC_THREAD load command can
only be one of the register flavors that the kernel (xnu) defines in
<mach/arm/thread_status.h> for arm64 -- the general purpose registers,
floating point registers, exception registers.

JTAG style corefile producers may have access to many additional
registers beyond these that EL0 programs typically use, for instance
TCR_EL1 on AArch64, and people developing low level code need access to
these registers. This patch defines a format for including these
registers for any thread.

The JSON in "process metadata" is a dictionary that must have a
`threads` key. The value is an array of entries, one per LC_THREAD in
the Mach-O corefile. The number of entries must match the LC_THREADs so
they can be correctly associated.

Each thread's dictionary must have two keys, `sets`, and `registers`.
`sets` is an array of register set names. If a register set name matches
one from the LC_THREAD core registers, any registers that are defined
will be added to that register set. e.g. metadata can add a register to
the "General Purpose Registers" set that lldb shows users.

`registers` is an array of dictionaries, one per register. Each register
must have the keys `name`, `value`, `bitsize`, and `set`. It may provide
additional keys like `alt-name`, that
`DynamicRegisterInfo::SetRegisterInfo` recognizes.

This `sets` + `registers` formatting is the same that is used by the
`target.process.python-os-plugin-path` script interface uses, both are
parsed by `DynamicRegisterInfo`. The one addition is that in this
LC_NOTE metadata, each register must also have a `value` field, with the
value provided in big-endian base 10, as usual with JSON.

In RegisterContextUnifiedCore, I combine the register sets & registers
from the LC_THREAD for a specific thread, and the metadata sets &
registers for that thread from the LC_NOTE. Even if no LC_NOTE is
present, this class ingests the LC_THREAD register contexts and
reformats it to its internal stores before returning itself as the
RegisterContex, instead of shortcutting and returning the core's native
RegisterContext. I could have gone either way with that, but in the end
I decided if the code is correct, we should live on it always.

I added a test where we process save-core to create a userland corefile,
then use a utility "add-lcnote" to strip the existing "process metadata"
LC_NOTE that lldb put in it, and adds a new one from a JSON string.

rdar://74358787

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
2025-06-27 18:43:41 -07:00
Jason Molenda
f5733b0b85 [lldb][Mach-O] Fix DWARF5 debugging regression for Mach-O
A unification of the DWARF section names,
https://github.com/llvm/llvm-project/pull/141344
broke dwarf5 debugging with Mach-O files.  The str_offset and
str_offset.dwo names are different in Mach-O from other object
files.
2025-06-09 17:42:31 -07:00
nerix
c6670fa20d
[LLDB] Unify DWARF section name matching (#141344)
Different object file formats support DWARF sections (COFF, ELF, MachO,
PE/COFF, WASM). COFF and PE/COFF only matched a subset. This caused some
GCC executables produced on MinGW to have issue later on when debugging.
One example is that `.debug_rnglists` was not matched, which caused
range-extraction to fail when printing a backtrace.

This unifies the parsing of section names in
`ObjectFile::GetDWARFSectionTypeFromName`, so all file formats can use
the same naming convention. Since the prefixes are different,
`GetDWARFSectionTypeFromName` only matches the suffixes (i.e. `.debug_`
needs to be stripped before).

I added two tests to ensure the sections are correctly identified on
Windows executables.
2025-06-09 09:46:50 +01:00
Jason Molenda
f961d6a89a Revert "[lldb] Set default object format to MachO in ObjectFileMachO (#142704)"
This reverts commit d4d2f069dec4fb8b13447f52752d4ecd08d976d6.

Temporarily reverting until we can find a way to get the correct
ObjectFile set in Module's Triples without adding "-macho" to the
triple string for each Module.  This is breaking TestUniversal.py
on the x86_64 macOS CI bots.
2025-06-05 16:24:31 -07:00
royitaqi
d4d2f069de
[lldb] Set default object format to MachO in ObjectFileMachO (#142704)
# The Change

This patch sets the **default** object format of `ObjectFileMachO` to be
`MachO` (instead of what currently ends up to be `ELF`, see below). This
should be **the correct thing to do**, because the code before the line
of change has already verified the Mach-O header.

The existing logic:
* In `ObjectFileMachO`, the object format is unassigned by default. So
it's `UnknownObjectFormat` (see
[code](54d544b831/llvm/lib/TargetParser/Triple.cpp (L1024))).
* The code then looks at load commands like `LC_VERSION_MIN_*`
([code](54d544b831/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (L5180-L5217)))
and `LC_BUILD_VERSION`
([code](54d544b831/lldb/source/Plugins/ObjectFile/Mach-O/ObjectFileMachO.cpp (L5231-L5252)))
and assign the Triple's OS and Environment if they exist.
* If the above sets the Triple's OS to macOS, then the object format
defaults to `MachO`; otherwise it is `ELF`
([code](54d544b831/llvm/lib/TargetParser/Triple.cpp (L936-L937)))

# Impact

For **production usage** where Mach-O files have the said load commands
(which is
[expected](https://www.google.com/search?q=Are+mach-o+files+expected+to+have+the+LC_BUILD_VERSION+load+command%3F)),
this patch won't change anything.
* **Important note**: It's not clear if there are legitimate production
use cases where the Mach-O files don't have said load commands. If there
is, the exiting code think they are `ELF`. This patch changes it to
`MachO`. This is considered a fix for such files.

For **unit tests**, this patch will simplify the yaml data by not
requiring the said load commands.

# Test

See PR.
2025-06-04 17:07:07 -07:00
Kazu Hirata
7ffa49111a
[lldb] Remove unused local variables (NFC) (#140989) 2025-05-21 20:33:35 -07:00
Jason Molenda
096ab51de0
[lldb][MachO] MachO corefile support for riscv32 binaries (#137092)
Add support for reading a macho corefile with CPU_TYPE_RISCV and the
riscv32 general purpose register file. I added code for the floating
point and exception registers too, but haven't exercised this. If we
start putting the full CSR register bank in a riscv corefile, it'll be
in separate 4k byte chunks, but I don't have a corefile to test against
that so I haven't written the code to read it.

The RegisterContextDarwin_riscv32 is copied & in the style of the other
RegisterContextDarwin classes; it's not the first choice I would make
for representing this, but it wasn't worth changing for this cputype.

rdar://145014653
2025-04-23 22:10:15 -07:00
Jason Molenda
1a31bb38a4 [lldb][Mach-O] Don't read symbol table of specially marked binary (#129967)
We have a binary image on Darwin that has no code, only metadata. It has
a large symbol table with many external symbol names that will not be
needed in the debugger. And it is possible to not have this binary on
the debugger system - so lldb must read all of the symbol names out of
memory, one at a time, which can be quite slow.

We're adding a section __TEXT,__lldb_no_nlist, to this binary to
indicate that lldb should not read the nlist symbols for it when we are
reading out of memory. If lldb is run with an on-disk version of the
binary, we will load the symbol table as we normally would, there's no
benefit to handling this binary differently.

I added a test where I create a dylib with this specially named section,
launch the process. The main binary deletes the dylib from the disk so
lldb is forced to read it out of memory. lldb attaches to the binary,
confirms that the dylib is present in the process and is a memory
Module. If the binary is not present, or lldb found the on-disk copy
because it hasn't been deleted yet, we delete the target, flush the
Debugger's module cache, sleep and retry, up to ten times. I create the
specially named section by compiling an assembly file that puts a byte
in the section which makes for a bit of a messy Makefile (the pre-canned
actions to build a dylib don't quite handle this case) but I don't think
it's much of a problem. This is a purely skipUnlessDarwin test case.

Relanding this change with a restructured Makefiles for the test case
that should pass on the CI bots.

rdar://146167816
2025-03-06 21:29:38 -08:00
Jason Molenda
82af9888db Revert "[lldb][Mach-O] Don't read symbol table of specially marked binary (#129967)"
This reverts commit 397696bb3d26c1298bf265e4907b0b6416ad59c9.

This breaks the macOS CI bots, I need to use $LDFLAGS in the $LD
invocation when building the dylib to get the dylibs to build on
the CI bots.  But I've added "-lno-nlists -lhas-nlists" to the LDFLAGS
for the main binary in the same directory, so using LDFLAGS will
result in a compile error for the dylibs.  I'll need to build the
dylibs in a subdir with a different Makefile, will reland with that
change in a bit.
2025-03-06 17:19:43 -08:00
Jason Molenda
397696bb3d
[lldb][Mach-O] Don't read symbol table of specially marked binary (#129967)
We have a binary image on Darwin that has no code, only metadata. It has
a large symbol table with many external symbol names that will not be
needed in the debugger. And it is possible to not have this binary on
the debugger system - so lldb must read all of the symbol names out of
memory, one at a time, which can be quite slow.

We're adding a section __TEXT,__lldb_no_nlist, to this binary to
indicate that lldb should not read the nlist symbols for it when we are
reading out of memory. If lldb is run with an on-disk version of the
binary, we will load the symbol table as we normally would, there's no
benefit to handling this binary differently.

I added a test where I create a dylib with this specially named section,
launch the process. The main binary deletes the dylib from the disk so
lldb is forced to read it out of memory. lldb attaches to the binary,
confirms that the dylib is present in the process and is a memory
Module. If the binary is not present, or lldb found the on-disk copy
because it hasn't been deleted yet, we delete the target, flush the
Debugger's module cache, sleep and retry, up to ten times. I create the
specially named section by compiling an assembly file that puts a byte
in the section which makes for a bit of a messy Makefile (the pre-canned
actions to build a dylib don't quite handle this case) but I don't think
it's much of a problem. This is a purely skipUnlessDarwin test case.

rdar://146167816
2025-03-06 16:34:13 -08:00
Jason Molenda
1f5edb17b2
[lldb][Mach-O] Read dyld_all_image_infos addr from main bin spec LC_NOTE (#127156)
Mach-O corefiles have LC_NOTE metadata, one LC_NOTE that lldb recognizes
is `main bin spec` which can specify that this is a kernel corefile,
userland corefile, or firmware/standalone corefile. With a userland
corefile, the LC_NOTE would specify the virtual address of the dyld
binary's Mach-O header. lldb would create a Module from that in-memory
binary, find the `dyld_all_image_infos` object in dyld's DATA segment,
and use that object to find all of the binaries present in the corefile.

ProcessMachCore takes the metadata from this LC_NOTE and passes the
address to the DynamicLoader plugin via its `GetImageInfoAddress()`
method, so the DynamicLoader can find all of the binaries and load them
in the Target at their correct virtual addresses.

We have a corefile creator who would prefer to specify the address of
`dyld_all_image_infos` directly, instead of specifying the address of
dyld and parsing that to find the object. DynamicLoaderMacOSX, the
DynamicLoader plugin being used here, will accept either a dyld virtual
address or a `dyld_all_image_infos` virtual address from
ProcessMachCore, and do the correct thing with either value.

lldb's process save-core mach-o corefile reader will continue to specify
the virtual address of the dyld binary.

rdar://144322688
2025-02-18 12:40:54 -08:00
Jason Molenda
fb623a3524 [lldb] Fix two old UUID method calls in ObjectFileMachO
A section of ObjectFileMachO is ifdef compiled only when
building to run on iOS etc natively, so this old method
call rename wasn't detected by normal on-mac building.
2025-02-10 15:08:03 -08:00
Jason Molenda
fec6d168bb
[lldb] Upstream a few remaining Triple::XROS patches (#126335)
Recognize the visionOS Triple::OSType::XROS os type. Some of these have
already been landed on main, but I reviewed the downstream sources and
there were a few that still needed to be landed upstream.
2025-02-08 15:50:52 -08:00
Greg Clayton
c4fb7180cb
[lldb][NFC] Make the target's SectionLoadList private. (#113278)
Lots of code around LLDB was directly accessing the target's section
load list. This NFC patch makes the section load list private so the
Target class can access it, but everyone else now uses accessor
functions. This allows us to control the resolving of addresses and will
allow for functionality in LLDB which can lazily resolve addresses in
JIT plug-ins with a future patch.
2025-01-14 20:12:46 -08:00
Adrian Prantl
87659a17d0 Reland: [lldb] Implement a formatter bytecode interpreter in C++
Compared to the python version, this also does type checking and error
handling, so it's slightly longer, however, it's still comfortably
under 500 lines.

Relanding with more explicit type conversions.
2024-12-10 16:37:53 -08:00
Sylvestre Ledru
a2fb70523a Revert "[lldb] Add cast to fix compile error on 32-bit platforms"
This reverts commit f6012a209dca6b1866d00e6b4f96279469884320.

Revert "[lldb] Add cast to fix compile error on 32-but platforms"

This reverts commit d300337e93da4ed96b044557e4b0a30001967cf0.

Revert "[lldb] Improve log message to include missing strings"

This reverts commit 0be33484853557bc0fd9dfb94e0b6c15dda136ce.

Revert "[lldb] Add comment"

This reverts commit e2bb47443d2e5c022c7851dd6029e3869fc8835c.

Revert "[lldb] Implement a formatter bytecode interpreter in C++"

This reverts commit 9a9c1d4a6155a96ce9be494cec7e25731d36b33e.
2024-12-11 00:00:44 +01:00
Adrian Prantl
9a9c1d4a61 [lldb] Implement a formatter bytecode interpreter in C++
Compared to the python version, this also does type checking and error
handling, so it's slightly longer, however, it's still comfortably
under 500 lines.
2024-12-10 09:36:38 -08:00
Dave Lee
1a650fde4a [lldb] Load embedded type summary section (#7859) (#8040)
Add support for type summaries embedded into the binary.

These embedded summaries will typically be generated by Swift macros,
but can also be generated by any other means.

rdar://115184658
2024-12-10 09:36:38 -08:00
Michael Buch
9a4c5a59d4 Revert "Re-apply [lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as symbols are created (#117079)"
This reverts commit ba668eb99c5dc37d3c5cf2775079562460fd7619.

Below test started failing again on x86_64 macOS CI. We're unsure
if this patch is the exact cause, but since this patch has broken
this test before, we speculatively revert it to see if it was indeed
the root cause.
```
FAIL: lldb-shell :: Unwind/trap_frame_sym_ctx.test (1692 of 2162)
******************** TEST 'lldb-shell :: Unwind/trap_frame_sym_ctx.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 7: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/clang --target=specify-a-target-or-use-a-_host-substitution --target=x86_64-apple-darwin22.6.0 -isysroot /Applications/Xcode-beta.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk -fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/call-asm.c /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/Inputs/trap_frame_sym_ctx.s -o /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp
clang: warning: argument unused during compilation: '-fmodules-cache-path=/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/lldb-test-build.noindex/module-cache-clang/lldb-shell' [-Wunused-command-line-argument]
RUN: at line 8: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit | /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/lldb --no-lldbinit -S /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/lit-lldb-init-quiet /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/tools/lldb/test/Shell/Unwind/Output/trap_frame_sym_ctx.test.tmp -s /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test -o exit
+ /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/lldb-build/bin/FileCheck /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test
/Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input
         ^
<stdin>:26:64: note: scanning from here
 frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
                                                               ^
<stdin>:27:2: note: possible intended match here
 frame #2: 0x00007ff7bfeff6c0
 ^

Input file: <stdin>
Check file: /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
            .
            .
            .
           21:  0x100003ed1 <+0>: pushq %rbp
           22:  0x100003ed2 <+1>: movq %rsp, %rbp
           23: (lldb) thread backtrace -u
           24: * thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
           25:  * frame #0: 0x0000000100003ecc trap_frame_sym_ctx.test.tmp`bar
           26:  frame #1: 0x0000000100003ee9 trap_frame_sym_ctx.test.tmp`tramp
check:21'0                                                                    X error: no match found
           27:  frame #2: 0x00007ff7bfeff6c0
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:21'1      ?                             possible intended match
           28:  frame #3: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  frame #4: 0x0000000100003ec6 trap_frame_sym_ctx.test.tmp`main + 22
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  frame #5: 0x00007ff8193cc41f dyld`start + 1903
check:21'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: (lldb) exit
check:21'0     ~~~~~~~~~~~~
>>>>>>
```
2024-12-03 11:04:04 +00:00
Jason Molenda
448ac7d341
[lldb][Mach-O] Handle shared cache binaries correctly (#117832)
The Mach-O load commands have an LC_SYMTAB / struct symtab_command which
represents the offset of the symbol table (nlist records) and string
table for this binary. In a mach-o binary on disk, these are file
offsets. If a mach-o binary is loaded in memory with all segments
consecutive, the `symoff` and `stroff` are the offsets from the TEXT
segment (aka the mach-o header) virtual address to the virtual address
of the start of these tables.

However, if a Mach-O binary is a part of the shared cache, then the
segments will be separated -- they will have different slide values. And
it is possible for the LINKEDIT segment to be greater than 4GB away from
the TEXT segment in the virtual address space, so these 32-bit offsets
cannot express the offset from TEXT segment to these tables.

Create separate uint64_t variables to track the offset to the symbol
table and string table, instead of reusing the 32-bit ones in the
symtab_command structure.

rdar://140432279
2024-11-28 10:31:57 -08:00
Jason Molenda
f2129ca94c [lldb][NFC] Whitespace fix for mis-indented block
This mis-indented block makes a FC change I'm about
to propose look larger than it is when clang-formatted.
2024-11-26 17:00:35 -08:00
Alex Langford
ba668eb99c
Re-apply [lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as symbols are created (#117079)
I backed this out due to a problem on one of the bots that myself and
others have problems reproducing locally. I'd like to try to land it
again, at least to gain more information.

Summary:
This improves the performance of ObjectFileMacho::ParseSymtab by
removing eager and expensive work in favor of doing it later in a
less-expensive fashion.

Experiment:
My goal was to understand LLDB's startup time.
First, I produced a Debug build of LLDB (no dSYM) and a
Release+NoAsserts build of LLDB. The Release build debugged the Debug
build as it debugged a small C++ program. I found that
ObjectFileMachO::ParseSymtab accounted for somewhere between 1.2 and 1.3
seconds consistently. After applying this change, I consistently
measured a reduction of approximately 100ms, putting the time closer to
1.1s and 1.2s on average.

Background:
ObjectFileMachO::ParseSymtab will incrementally create symbols by
parsing nlist entries from the symtab section of a MachO binary. As it
does this, it eagerly tries to determine the size of symbols (e.g. how
long a function is) using LC_FUNCTION_STARTS data (or eh_frame if
LC_FUNCTION_STARTS is unavailable). Concretely, this is done by
performing a binary search on the function starts array and calculating
the distance to the next function or the end of the section (whichever
is smaller).

However, this work is unnecessary for 2 reasons:
1. If you have debug symbol entries (i.e. STABs), the size of a function
is usually stored right after the function's entry. Performing this work
right before parsing the next entry is unnecessary work.
2. Calculating symbol sizes for symbols of size 0 is already performed
in `Symtab::InitAddressIndexes` after all the symbols are added to the
Symtab. It also does this more efficiently by walking over a list of
symbols sorted by address, so the work to calculate the size per symbol
is constant instead of O(log n).
2024-11-21 13:06:15 -08:00
Jason Molenda
3d0469516c [lldb] fix one-off error in vformat specifier
Results in an assert at runtime, when run on an improperly formed
corefile.

rdar://136659551
2024-09-25 21:36:51 -07:00
Jonas Devlieghere
fa478bd275
Revert "[lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as symbols are created" (#108715)
Reverts llvm/llvm-project#106791 because it breaks
`trap_frame_sym_ctx.test ` on x86_64.

https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake/5745/
2024-09-14 10:50:44 -07:00
Alex Langford
0351dc522a
[lldb] Do not use LC_FUNCTION_STARTS data to determine symbol size as symbols are created (#106791)
Summary:
This improves the performance of ObjectFileMacho::ParseSymtab by
removing eager and expensive work in favor of doing it later in a
less-expensive fashion.

Experiment:
My goal was to understand LLDB's startup time.
First, I produced a Debug build of LLDB (no dSYM) and a
Release+NoAsserts build of LLDB. The Release build debugged the Debug
build as it debugged a small C++ program. I found that
ObjectFileMachO::ParseSymtab accounted for somewhere between 1.2 and 1.3
seconds consistently. After applying this change, I consistently
measured a reduction of approximately 100ms, putting the time closer to
1.1s and 1.2s on average.

Background:
ObjectFileMachO::ParseSymtab will incrementally create symbols by
parsing nlist entries from the symtab section of a MachO binary. As it
does this, it eagerly tries to determine the size of symbols (e.g. how
long a function is) using LC_FUNCTION_STARTS data (or eh_frame if
LC_FUNCTION_STARTS is unavailable). Concretely, this is done by
performing a binary search on the function starts array and calculating
the distance to the next function or the end of the section (whichever
is smaller).

However, this work is unnecessary for 2 reasons:
1. If you have debug symbol entries (i.e. STABs), the size of a function
is usually stored right after the function's entry. Performing this work
right before parsing the next entry is unnecessary work.
2. Calculating symbol sizes for symbols of size 0 is already performed
in `Symtab::InitAddressIndexes` after all the symbols are added to the
Symtab. It also does this more efficiently by walking over a list of
symbols sorted by address, so the work to calculate the size per symbol
is constant instead of O(log n).
2024-09-13 10:33:43 -07:00
Jacob Lalonde
96b7c64b8a
[LLDB] Reapply SBSaveCore Add Memory List (#107937)
Recently in #107731 this change was revereted due to excess memory size
in `TestSkinnyCore`. This was due to a bug where a range's end was being
passed as size. Creating massive memory ranges.

Additionally, and requiring additional review, I added more unit tests
and more verbose logic to the merging of save core memory regions.

@jasonmolenda as an FYI.
2024-09-11 10:33:19 -07:00
Jonas Devlieghere
bb343468ff
Revert "[LLDB] Reappply SBSaveCore AddMemoryList" (#107731)
Reverts llvm/llvm-project#107159 as this is still causing
`TestSkinnyCorefile.py` to time out.


https://ci.swift.org/view/all/job/llvm.org/view/LLDB/job/as-lldb-cmake/11099/

https://ci.swift.org/view/all/job/llvm.org/view/LLDB/job/lldb-cmake/5544/
2024-09-07 17:10:20 -07:00
Jacob Lalonde
d4d4e77918
[LLDB] Reappply SBSaveCore AddMemoryList (#107159)
Reapplies #106293, testing identified issue in the merging code. I used
this opportunity to strip CoreFileMemoryRanges to it's own file and then
add unit tests on it's behavior.
2024-09-06 09:04:33 -07:00
Adrian Prantl
a0dd90eb7d
[lldb] Make conversions from llvm::Error explicit with Status::FromEr… (#107163)
…ror() [NFC]
2024-09-05 12:19:31 -07:00
Jacob Lalonde
b959532484
Revert "[LLDB][SBSaveCore] Add selectable memory regions to SBSaveCor… (#106293)
Reverts #105442. Due to `TestSkinnyCoreFailing` and root causing of the
failure will likely take longer than EOD.
2024-08-27 14:23:00 -07:00
Adrian Prantl
0642cd768b
[lldb] Turn lldb_private::Status into a value type. (#106163)
This patch removes all of the Set.* methods from Status.

This cleanup is part of a series of patches that make it harder use the
anti-pattern of keeping a long-lives Status object around and updating
it while dropping any errors it contains on the floor.

This patch is largely NFC, the more interesting next steps this enables
is to:
1. remove Status.Clear()
2. assert that Status::operator=() never overwrites an error
3. remove Status::operator=()

Note that step (2) will bring 90% of the benefits for users, and step
(3) will dramatically clean up the error handling code in various
places. In the end my goal is to convert all APIs that are of the form

`    ResultTy DoFoo(Status& error)
`
to

`    llvm::Expected<ResultTy> DoFoo()
`
How to read this patch?

The interesting changes are in Status.h and Status.cpp, all other
changes are mostly

` perl -pi -e 's/\.SetErrorString/ = Status::FromErrorString/g' $(git
grep -l SetErrorString lldb/source)
`
plus the occasional manual cleanup.
2024-08-27 10:59:31 -07:00
Jacob Lalonde
d517b22411
[LLDB][SBSaveCore] Add selectable memory regions to SBSaveCore (#105442)
This patch adds the option to specify specific memory ranges to be
included in a given core file. The current implementation lets user
specified ranges either be in addition to a certain save style, or
independent of them via the newly added custom enum.

To achieve being inclusive of save style, I've moved from a std::vector
of ranges to a RangeDataVector, and to join overlapping ranges to
prevent duplication of memory ranges in the core file.

As a non function bonus, when SBSavecore was initially created, the
header was included in the lldb-private interfaces, and I've fixed that
and moved it the forward declare as an oversight. CC @bulbazord in case
we need to include that into swift.
2024-08-27 07:33:12 -07:00
Dhruv Srivastava
b804516dc5
[lldb][AIX] 1. Avoid namespace collision on other platforms (#104679)
This PR is in reference to porting LLDB on AIX.

Link to discussions on llvm discourse and github:
1.  https://discourse.llvm.org/t/port-lldb-to-ibm-aix/80640
2.  #101657 

The complete changes for porting are present in this draft PR:
https://github.com/llvm/llvm-project/pull/102601

The changes on this PR are intended to avoid namespace collision for
certain typedefs between lldb and other platforms:
1. tid_t --> lldb::tid_t
2. offset_t --> lldb::offset_t
2024-08-20 10:19:32 +01:00
Jacob Lalonde
572943e790
[LLDB] Reapply #100443 SBSaveCore Thread list (#104497)
Reapply #100443 and #101770. These were originally reverted due to a
test failure and an MSAN failure. I changed the test attribute to
restrict to x86 (following the other existing tests). I could not
reproduce the test or the MSAN failure and no repo steps were provided.
2024-08-15 16:29:59 -07:00
Jacob Lalonde
accf5c9bb3
Revert "[LLDB][SBSaveCore] Implement a selectable threadlist for Core… (#102018)
… Options.  (#100443)"

This reverts commit 3e4af616334eae532f308605b89ff158dd195180.

@adrian-prantl FYI

Reverts #100443
2024-08-05 10:17:25 -07:00
Haojian Wu
86f7374078 Revert "[LLDB][SBSaveCore] Fix bug where default values are not propagated. (#101770)"
This reverts commit 34766d0d488ba2fbefa80dcd0cc8720a0e753448 which
caused a msan failure, see comment https://github.com/llvm/llvm-project/pull/101770#issuecomment-2268373325 for details.
2024-08-05 09:37:36 +02:00
Jacob Lalonde
34766d0d48
[LLDB][SBSaveCore] Fix bug where default values are not propagated. (#101770)
In #100443, Mach-o and Minidump now only call process API's that take a
`SaveCoreOption` as the container for the style and information if a
thread should be included in the core or not. This introduced a bug
where in subsequent method calls we were not honoring the defaults of
both implementations.

~~To solve this I have made a copy of each SaveCoreOptions that is
mutable by the respective plugin. Originally I wanted to leave the
SaveCoreOptions as non const so these default value mutations could be
shown back to the user. Changing that behavior is outside of the scope
of this bugfix, but is context for why we are making a copy.~~

Removed const on the savecoreoptions so defaults can be inspected by the
user

CC: @Michael137
2024-08-02 18:38:05 -07:00
Jacob Lalonde
3e4af61633
[LLDB][SBSaveCore] Implement a selectable threadlist for Core Options. (#100443)
In #98403 I enabled the SBSaveCoreOptions object, which allows users via
the scripting API to define what they want saved into their core file.
As the first option I've added a threadlist, so users can scan and
identify which threads and corresponding stacks they want to save.

In order to support this, I had to add a new method to `Process.h` on
how we identify which threads are to be saved, and I had to change the
book keeping in minidump to ensure we don't double save the stacks.

Important to @jasonmolenda I also changed the MachO coredump to accept
these new APIs.
2024-08-02 13:35:05 -07:00