This patch adds support for the "Last Exception Backtrace" to the
`crashlog` command.
This metadata is homologous to the "Application Specific Backtrace",
however the format is closer to a regular stack frame.
Since the thread that "contains" the "Last Exception Backtrace" doesn't
really exist, this information is displayed when requesting an extended
backtrace of the crashed thread, similarly to the "Application Specific
Backtrace".
To achieve that, this patch includes some refactors and fixes to the
existing "Application Specific Backtrace" handling.
rdar://113046509
Differential Revision: https://reviews.llvm.org/D157851
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Before 27f27d15f, the `crashlog` command would always load images even
if `-a` or `-c` was not set by the user.
Since that change, images are loaded only when one of these 2 flags is
set; otherwise, we fall back to parsing the symbols from the report and
loading them into a `SymbolFileJSON`.
Although that makes it much faster than pulling binaries and debug
symbols from build records, it causes a degraded experience since none
of our users are used to setting these 2 flags. For instance, that would
symbolicate the backtraces, but the users wouldn't see sources.
To address that change of behavior, this patch changes the default value
for the `-c|--crash-only` flag to `true`. On the other hand, thanks to
the move to `argparse`, we introduced a new `--no-only-crashed` flag
that will let the user force skipping loading any images, relying only
on the `SymbolFileJSON`.
This gives the users a good compromise since they would be able to see
sources for the crashed thread if they're available, otherwise, they'll
only get a symbolicated backtrace.
Differential Revision: https://reviews.llvm.org/D157850
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch replaces the deprecated `optparse` module used for the
`crashlog` & `save_crashlog` commands with the newer `argparse` from the
Python standard library. This provides many benefits such as showing the
default values for each option in the help description, and also greatly
improves the handling of positional arguments.
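As a minimal sketch of those two benefits, the snippet below builds a
small parser with `argparse.ArgumentDefaultsHelpFormatter`. The option
names (`--depth`, `reports`) are hypothetical and are not the actual
`crashlog` options.

```python
import argparse

# Hypothetical parser: the options below are for illustration only.
parser = argparse.ArgumentParser(
    prog="crashlog",
    # Appends "(default: ...)" to the help string of each option.
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument(
    "reports", metavar="FILE", nargs="+", help="crash report(s) to load"
)
parser.add_argument(
    "--depth", type=int, default=8, help="number of frames to show"
)

# Positional arguments are collected into a list, after the options.
args = parser.parse_args(["--depth", "4", "a.crash", "b.crash"])
help_text = parser.format_help()
```

Here `args.reports` holds both file names, and `help_text` mentions the
default value of `--depth`.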
Differential Revision: https://reviews.llvm.org/D157849
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Prior to this patch, when a user loaded multiple crash reports in lldb,
they could get in a situation where all the targets would keep the same
architecture and executable path as the first one that we've created.
The reason behind this was that even if we created a new CrashLog
object, which is derived from a Symbolicator class that has a newly
constructed image list as a default argument, because that default
argument is only created once when the function is defined, every CrashLog
object would share the same list.
That would cause us to append newly parsed images to the same
Symbolicator image list across multiple CrashLog objects.
To address this, this patch changes the default argument value for the
image parameter to `None` and only initializes it as an empty list when
no argument was passed.
This also removes the image list stored in each CrashLog parser since
parsers shouldn't have any state and should be re-usable. So now, the only
source of truth is stored in the CrashLog object.
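The pitfall is general Python behavior: default argument values are
evaluated once, when the function is defined. A minimal sketch, using
hypothetical class names (`SharedImages`, `OwnImages`) rather than the
actual Symbolicator code:

```python
class SharedImages:
    """Buggy pattern: the default list is created once and then shared."""

    def __init__(self, images=[]):
        self.images = images


class OwnImages:
    """Fixed pattern: default to None and create a fresh list per object."""

    def __init__(self, images=None):
        self.images = images if images is not None else []


first, second = SharedImages(), SharedImages()
first.images.append("libfoo.dylib")
shared_leak = second.images  # the second object sees the first one's image

third, fourth = OwnImages(), OwnImages()
third.images.append("libfoo.dylib")
own_leak = fourth.images  # stays empty
```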
rdar://84984949
Differential Revision: https://reviews.llvm.org/D157044
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch changes the parsing logic for the legacy crash report format
to avoid interrupting the parsing if there are new lines in the middle
of a section.
To do so, the parser starts by skipping all consecutive empty lines. If the
number of lines skipped is greater than 1, the parser considers that it
reached a new section of the report and should reset the parsing mode
back to normal.
Otherwise, it tries to parse the next line in the current parsing mode.
If it succeeds, the parser will also skip that line since it has already
been parsed and continue the parsing.
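The empty-line rule can be sketched as follows. This is a simplified
illustration, not the parser from this patch: the hypothetical
`split_sections` helper only groups lines into sections and leaves out
the per-mode parsing.

```python
def split_sections(lines):
    """Group lines into sections; a single blank line does not end one."""
    sections, current = [], []
    idx = 0
    while idx < len(lines):
        if lines[idx].strip():
            current.append(lines[idx])
            idx += 1
            continue
        # Count the run of consecutive empty lines.
        skipped = 0
        while idx < len(lines) and not lines[idx].strip():
            skipped += 1
            idx += 1
        # More than one empty line means a new section starts.
        if skipped > 1 and current:
            sections.append(current)
            current = []
    if current:
        sections.append(current)
    return sections


sections = split_sections(["a", "", "b", "", "", "c"])
```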
rdar://107022595
Differential Revision: https://reviews.llvm.org/D157043
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch changes the way we dump the registers from the legacy
crashlog command to make sure that the ordering matches the one from lldb.
rdar://109172073
Differential Revision: https://reviews.llvm.org/D156919
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
It can be tricky to troubleshoot why the crashlog script can't show
inline sources. The two most common causes are that we couldn't find the
dSYM or, if we find the dSYM, that the path remapping included in the
dsymForUUID output isn't accessible. The former is already easy to
diagnose, but the latter is harder because you'd have to manually invoke
dsymForUUID on the UUID and check the remapped path. This patch
automates that process and prints a warning if the remapped path doesn't
exist or is not accessible.
Differential revision: https://reviews.llvm.org/D152886
Run crashlog inside lldb when invoked in interactive mode from the
command line. Currently, when passing -i to crashlog from the command
line, we symbolicate in LLDB and immediately exit right after. This
pretty much defeats the purpose of interactive mode. That said, we
wouldn't want to re-implement the driver from the crashlog script.
Re-invoking the crashlog command from inside LLDB solves the issue.
rdar://97801509
Differential revision: https://reviews.llvm.org/D152319
This fixes a regression introduced by 27f27d15f6c9 that results in a
NameError: (name 'self' is not defined) when using crashlog with the -c
option.
rdar://110007391
This patch should fix a crash when opening a crash report that was
passed with a relative path.
This patch expands the crash report path before parsing it and raises a
`FileNotFoundError` exception if the file doesn't exist.
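A minimal sketch of that check, assuming a hypothetical helper named
`resolve_report_path` (the error message is also illustrative):

```python
import os
import tempfile


def resolve_report_path(path):
    """Expand `~` and relative components, then check the file exists."""
    expanded = os.path.abspath(os.path.expanduser(path))
    if not os.path.isfile(expanded):
        raise FileNotFoundError(f"crash report {expanded} does not exist")
    return expanded


with tempfile.TemporaryDirectory() as tmp_dir:
    report = os.path.join(tmp_dir, "example.crash")
    open(report, "w").close()
    resolved = resolve_report_path(report)
    try:
        resolve_report_path(os.path.join(tmp_dir, "missing.crash"))
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```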
Differential Revision: https://reviews.llvm.org/D152012
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch should allow the user to set a specific auto-completion type
for their custom commands.
To do so, we had to hoist the `CompletionType` enum so the user can
access it and add a new completion type flag to the CommandScriptAdd
Command Object.
So now, the user can specify which completion type will be used with
their custom command, when they register it.
This also makes the `crashlog` custom commands use the disk-file completion
type, to browse through the user file system and load the report.
Differential Revision: https://reviews.llvm.org/D152011
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch changes the way we load a crash report into a scripted
process by creating an empty target.
To do so, it parses the architecture information from the report (for
both the legacy and json format) and uses that to create a target that
doesn't have any executable, like what we do when attaching to a process.
For the legacy format, we mostly rely on the `Code Type` line, since the
architecture is an optional field in the `Binary Images` section.
However, for the json format, we first try to get the architecture while
parsing the image dictionary. If we couldn't find it, we try to infer it
using the "flavor" key when parsing the frame's registers.
If the architecture is still not set after parsing the report, we raise
an exception.
rdar://107850263
Differential Revision: https://reviews.llvm.org/D151849
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch should address the crashes when parsing the crash report
frame dictionary.
If the crash report is not symbolicated, the `symbolLocation` key will
be missing. In that case, we should just use the `imageOffset`.
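A minimal sketch of that fallback, assuming a hypothetical `frame_offset`
helper and made-up frame dictionaries. What the returned value is used
for afterwards is left out.

```python
def frame_offset(json_frame):
    """Pick the offset to use for a frame of a JSON crash report."""
    # `symbolLocation` is missing when the report is not symbolicated.
    if "symbolLocation" in json_frame:
        return json_frame["symbolLocation"]
    return json_frame["imageOffset"]


# Hypothetical frames, for illustration only.
symbolicated = {"imageOffset": 4660, "symbol": "main", "symbolLocation": 12}
unsymbolicated = {"imageOffset": 4660}
offsets = [frame_offset(symbolicated), frame_offset(unsymbolicated)]
```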
rdar://109836386
Differential Revision: https://reviews.llvm.org/D151844
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This is an ongoing series of commits that are reformatting our Python
code. Reformatting is done with `black` (23.1.0).
If you end up having problems merging this commit because you have made
changes to a python file, the best way to handle that is to run `git
checkout --ours <yourfile>` and then reformat it with black.
RFC: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style
Differential revision: https://reviews.llvm.org/D151460
Following abba5de72466, some tests started failing on green-dragon:
https://green.lab.llvm.org/green/job/lldb-cmake/55460/console
Looking at the backtrace, there seems to be a race condition when deleting
the temporary directory containing all the JSON object files:
```
Traceback (most recent call last):
  File "/Users/buildslave/jenkins/workspace/lldb-cmake/lldb-build/lib/python3.10/site-packages/lldb/macosx/crashlog.py", line 1115, in __call__
    SymbolicateCrashLogs(debugger, shlex.split(command), result)
  File "/Users/buildslave/jenkins/workspace/lldb-cmake/lldb-build/lib/python3.10/site-packages/lldb/macosx/crashlog.py", line 1457, in SymbolicateCrashLogs
    SymbolicateCrashLog(crash_log, options)
  File "/Users/buildslave/jenkins/workspace/lldb-cmake/lldb-build/lib/python3.10/site-packages/lldb/macosx/crashlog.py", line 1158, in SymbolicateCrashLog
    with tempfile.TemporaryDirectory() as obj_dir:
  File "/usr/local/opt/python@3.10/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tempfile.py", line 869, in __exit__
    self.cleanup()
  File "/usr/local/opt/python@3.10/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tempfile.py", line 873, in cleanup
    self._rmtree(self.name, ignore_errors=self._ignore_cleanup_errors)
  File "/usr/local/opt/python@3.10/Frameworks/Python.framework/Versions/3.10/lib/python3.10/tempfile.py", line 855, in _rmtree
    _shutil.rmtree(name, onerror=onerror)
  File "/usr/local/opt/python@3.10/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 731, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/usr/local/opt/python@3.10/Frameworks/Python.framework/Versions/3.10/lib/python3.10/shutil.py", line 729, in rmtree
    os.rmdir(path)
OSError: [Errno 66] Directory not empty: '/var/folders/09/r4vw4v8n5kb67jl66zvlbljw0000gn/T/tmp6qfifxk7'
```
This patch should fix that issue since it won't delete the object file
directory until we're sure that the module-adding tasks have completed.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
This patch changes the way we generate the ObjectFileJSON files
containing the inlined symbols from the crash report to remove the
tempfile prefix from the object file name.
To do so, instead of creating a new tempfile for each module, we create a
temporary directory that contains each module object file with the same
name as the module.
This makes the backtraces only contain the module name without the
tempfile prefix, which makes it look like a regular stackframe.
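A minimal sketch of the directory-based layout, assuming a hypothetical
`write_symbol_files` helper. The JSON content written here is a
placeholder and not the actual ObjectFileJSON schema.

```python
import json
import os
import tempfile


def write_symbol_files(symbols_by_module, obj_dir):
    """Write one file per module, named exactly after the module."""
    paths = {}
    for module_name, symbols in symbols_by_module.items():
        path = os.path.join(obj_dir, module_name)
        with open(path, "w", encoding="utf-8") as f:
            # Placeholder payload, not the real object file format.
            json.dump({"symbols": symbols}, f)
        paths[module_name] = path
    return paths


with tempfile.TemporaryDirectory() as obj_dir:
    paths = write_symbol_files(
        {"MyApp": [{"name": "main", "address": 4096}]}, obj_dir
    )
    # The file name is the bare module name, with no random prefix.
    file_names = sorted(os.path.basename(p) for p in paths.values())
```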
Differential Revision: https://reviews.llvm.org/D151045
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Sometimes, crash reports come with inlined symbols. These provide the
exact stacktrace from the user binary.
However, when investigating a crash, it's very likely that the images related
to the crashed thread are not available on the debugging user system or
that the versions don't match. This causes interactive crashlog to show
a degraded backtrace in lldb.
This patch aims to address that issue by parsing the inlined symbols
from the crash report and loading them into lldb's target.
This patch is a follow-up to 27f27d1, focusing on inlined symbols
loading from legacy (non-json) crash reports.
To do so, it updates the stack frame regular expression to make the
capture groups more granular, to be able to extract the symbol name, the
offset and the source location if available, while making it more
maintainable.
So now, when parsing the crash report, we build a data structure
containing all the symbol information for each stackframe. Then, after
launching the scripted process for interactive mode, we write a JSON
symbol file for each module, containing only the symbols that belong to it.
Finally, we load the json symbol file into lldb, before showing the user
the process status and backtrace.
rdar://97345586
Differential Revision: https://reviews.llvm.org/D146765
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Create an artificial module using a JSON object file when we can't
locate the module and dSYM through dsymForUUID (or however
locate_module_and_debug_symbols is implemented). By parsing the symbols
from the crashlog and making them part of the JSON object file, LLDB can
symbolicate frames it otherwise wouldn't be able to, as there is no
module for it.
For non-interactive crashlogs, that never was a problem because we could
simply show the "pre-symbolicated" frame from the input. For interactive
crashlogs, we need a way to pass the symbol information to LLDB so that
it can symbolicate the frames, which is what motivated the JSON object
file format.
Differential revision: https://reviews.llvm.org/D148172
Now that we can pass Python objects to the scripted process instance, we
don't need to parse the crashlog twice anymore.
Differential revision: https://reviews.llvm.org/D148063
When using interactive crashlog from an IDE, it can happen that the user
already has the `command script import lldb.macosx.crashlog` command in
their `lldbinit` file.
That leads to the following error messages:
```
error: cannot add command: user command exists and force replace not set
error: cannot add command: user command exists and force replace not set
```
This leads to confusion because the crashlog symbolication continues and
succeeds even after these errors.
To address that, the crashlog commands get overridden every time the
script gets re-imported.
rdar://103403943
Differential Revision: https://reviews.llvm.org/D140113
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch should fix an undefined behaviour that's happening when
parsing a crash report from an IDE. In the previous implementation, the
CrashLogParser base class would use the `__new__` static method to
create the right parser instance depending on the crash report type.
For some reason, the derived parser initializer wouldn't be called when
running the command from an IDE, so this patch refactors the
CrashLogParser code to replace the use of the `__new__` method with a
factory `create` static method.
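A minimal sketch of the factory pattern, with an assumed format check:
here a report counts as JSON when `json.loads` accepts it, which is a
simplification and may not match the detection used by the script.

```python
import json


class CrashLogParser:
    @staticmethod
    def create(contents):
        """Return the parser matching the report format."""
        try:
            data = json.loads(contents)
        except json.JSONDecodeError:
            return TextCrashLogParser(contents)
        return JSONCrashLogParser(data)

    def parse(self):
        raise NotImplementedError


class JSONCrashLogParser(CrashLogParser):
    def __init__(self, data):
        self.data = data


class TextCrashLogParser(CrashLogParser):
    def __init__(self, contents):
        self.contents = contents


json_parser = CrashLogParser.create('{"procName": "MyApp"}')
text_parser = CrashLogParser.create("Process: MyApp [123]")
```

Because `create` calls the derived class directly, its `__init__` always
runs, unlike the `__new__` approach, which depends on the object returned
by `__new__` being an instance of the class that was called.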
rdar://100527640
Differential Revision: https://reviews.llvm.org/D139951
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch renames the backing file path key to "file_path" to keep it
consistent.
rdar://101652618
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
It can happen that the originator of a crash report doesn't have access
to certain images. When that's the case, ReportCrash won't show the
source info in the crash report stack frames, but only the stack address
and image name.
This patch fixes a bug in the crashlog stackframe parser regular
expression to optionally match the source info group.
rdar://101934135
Differential Revision: https://reviews.llvm.org/D137466
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
For an exception crashlog, the thread backtraces aren't usually very helpful
and instead, developers look at the "Application Specific Backtrace" that
was generated by `objc_exception_throw`.
LLDB could already parse and symbolicate these Application Specific Backtraces
for regular text-based crashlogs, so this patch adds support to parse them
in JSON crashlogs, and materialize them as a HistoryThread extending the
crashed ScriptedThread.
This patch also includes the Application Specific Information messages
as part of the process extended crash information log. To do so, the
ScriptedProcess Python interface has a new GetMetadata method that
returns an arbitrary dictionary with data related to the process.
rdar://93207586
Differential Revision: https://reviews.llvm.org/D126260
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch updates the image_regex_uuid matcher to match null-UUID
images in the plain text crashlog parser.
It updates the regex to match one or more '?' characters or the image
full path.
rdar://100904019
Differential Revision: https://reviews.llvm.org/D135482
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch adds support for 32-bit stack frame addresses in the `crashlog`
command.
For crash reports that are generated from an arm64_32 process, `PAGEZERO`
is loaded at 0x00004000 so no code address will be less than 0x4000.
This patch changes the crashlog frame address regex group to match
addresses as small as 4 hex characters.
rdar://100805026
Differential Revision: https://reviews.llvm.org/D135310
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
The Python "open" function will use the default encoding for the
locale (the result of "locale.getpreferredencoding()"). Explicitly set
the encoding to utf-8 when opening the crashlog for writing, as there may
be non-ascii symbols in there (for example, Swift uses "τ" to indicate
generic parameters).
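A minimal sketch of writing such text with an explicit encoding. The
file name and the sample line are made up for illustration.

```python
import os
import tempfile

# Made-up sample containing a non-ascii character.
text = "generic parameter: τ_0_0\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "saved.crash")
    # Passing `encoding` avoids depending on the locale's default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()
```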
rdar://101402755
Differential Revision: https://reviews.llvm.org/D136798
This patch parses CrashLog exception data from the raw
text format and adapts it to the new JSON format.
This is necessary for feature parity between the 2 formats.
Differential Revision: https://reviews.llvm.org/D131719
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
A spiritual follow-up to D131032. I noticed some regexes could be simplified.
This does some of the following:
1. Removes unused capture groups
2. Uses non-capturing `(?:...)` groups where grouping is needed but capturing isn't
3. Removes trailing `.*`
4. Uses `\d` over `[0-9]`
5. Uses raw strings
6. Uses `{N,}` to indicate N-or-more
Also improves the call site of a `re.findall`.
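To illustrate why capture groups matter at a `re.findall` call site,
here is a small sketch with a made-up version-number pattern (not one of
the patterns touched by this patch):

```python
import re

text = "1.2.3 and 10"

# With capture groups, findall returns one tuple of groups per match.
with_groups = re.findall(r"([0-9]+)(\.[0-9]+)*", text)

# With a non-capturing group and `\d`, findall returns the whole match.
without_groups = re.findall(r"\d+(?:\.\d+)*", text)
```

The first call yields tuples holding only the last repetition of each
group, while the second yields the full matched strings.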
Differential Revision: https://reviews.llvm.org/D131305
This patch introduces a new option to the crashlog command to get
the script version.
Since `crashlog.py` is not actually versioned, this returns lldb's
version instead.
rdar://98392669
Differential Revision: https://reviews.llvm.org/D131542
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch changes the CrashLogParser class to be both the base class
and a Factory for the JSONCrashLogParser & TextCrashLogParser.
That should help remove some code duplication and ensure both classes
have a parse method.
Differential Revision: https://reviews.llvm.org/D131085
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces a new option for the interactive crashlog mode,
that will prevent it from dumping the `process status` & `thread backtrace`
output to the debugger console.
This is necessary when lldb is running from an IDE, to prevent flooding
the console with information that should be already present in the UI.
rdar://96813296
Differential Revision: https://reviews.llvm.org/D131036
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
It can happen when creating a stackshot crash report that the `process_path`/`procPath` key is missing.
Moreover, we try to parse that key but don't use it, or need it, since we
fetch images and symbolicate the stackframes using the binaries UUIDs.
This is why this patch removes everything that is related to the
`process_path`/`procPath` parsing.
rdar://95054188
Differential Revision: https://reviews.llvm.org/D131033
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch updates the regular expression matching stackframes in
crashlog to allow addresses that are 7 characters long and more (vs. 8
characters previously).
It replaces `0x[0-9a-fA-F]{7}[0-9a-fA-F]+` with `0x[0-9a-fA-F]{7,}`.
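The difference between the two patterns can be checked on two made-up
addresses:

```python
import re

old_address = re.compile(r"0x[0-9a-fA-F]{7}[0-9a-fA-F]+")
new_address = re.compile(r"0x[0-9a-fA-F]{7,}")

# Made-up addresses with 7 and 8 hex digits.
seven_digits = "0x1234abc"
eight_digits = "0x1234abcd"

old_matches = [
    bool(old_address.fullmatch(a)) for a in (seven_digits, eight_digits)
]
new_matches = [
    bool(new_address.fullmatch(a)) for a in (seven_digits, eight_digits)
]
```

The old pattern needs at least 8 hex digits (7 plus one or more), so it
rejects the first address, while the new one accepts both.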
rdar://97684839
Differential Revision: https://reviews.llvm.org/D131032
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch allows the crashlog script to surface its errors to lldb by
using the provided SBCommandReturnObject argument.
rdar://95048193
Differential Revision: https://reviews.llvm.org/D129614
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch introduces a new flag for the interactive crashlog mode that
allows the user to specify which target to use to create the scripted
process.
This can be very useful when lldb already has a few targets created:
Instead of taking the first one (zeroth index), we will use that flag to
create a new target. If the user didn't provide a target path, we will rely
on the symbolicator to create a target. If that fails and there are already
some targets loaded in lldb, we use the first one.
rdar://94682869
Differential Revision: https://reviews.llvm.org/D129611
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch changes the `crashlog` command behavior to print the help
message if no argument was provided with the command.
rdar://94576026
Differential Revision: https://reviews.llvm.org/D127362
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch subtracts 1 from the pc of any frame above frame 0 to get the
previous line entry and display the right line in the debugger.
This also rephrases an old comment from `48d157dd4`.
rdar://92686666
Differential Revision: https://reviews.llvm.org/D125928
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
Avoid an OverflowError (an underflow really) when the pc is zero. This
can happen for "unknown frames" where the crashlog generator reports a
zero pc. We could omit them altogether, but if they're part of the
crashlog it seems fair to display them in lldb as well.
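A minimal sketch of the guard, combined with the `pc - 1` adjustment
applied to frames above frame 0. The `lookup_address` helper is
hypothetical and only shows the arithmetic.

```python
def lookup_address(pc, frame_index):
    """Address to use when looking up the line entry of a frame."""
    # Frames above frame 0 hold a return address, which points after the
    # call, so step back by one. Skip that for a zero pc, where the
    # result would be negative and overflow an unsigned address.
    if frame_index > 0 and pc > 0:
        return pc - 1
    return pc


addresses = [
    lookup_address(0x1000, 0),  # frame 0: unchanged
    lookup_address(0x1000, 1),  # caller frame: pc - 1
    lookup_address(0, 3),  # "unknown frame": left at zero
]
```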
rdar://92686666
Differential revision: https://reviews.llvm.org/D125716
crashlog.py catches every exception in order to format them. This
results in both the exception name as well as the backtrace getting
swallowed.
Here's an example of the current output:
error: python exception: in method 'SBTarget_ResolveLoadAddress', argument 2 of type 'lldb::addr_t'
Compare this to the output without the custom exception handling:
Traceback (most recent call last):
  File "[...]/site-packages/lldb/macosx/crashlog.py", line 929, in __call__
    SymbolicateCrashLogs(debugger, shlex.split(command))
  File "[...]/site-packages/lldb/macosx/crashlog.py", line 1239, in SymbolicateCrashLogs
    SymbolicateCrashLog(crash_log, options)
  File "[...]/site-packages/lldb/macosx/crashlog.py", line 1006, in SymbolicateCrashLog
    thread.dump_symbolicated(crash_log, options)
  File "[...]/site-packages/lldb/macosx/crashlog.py", line 124, in dump_symbolicated
    symbolicated_frame_addresses = crash_log.symbolicate(
  File "[...]/site-packages/lldb/utils/symbolication.py", line 540, in symbolicate
    if symbolicated_address.symbolicate(verbose):
  File "[...]/site-packages/lldb/utils/symbolication.py", line 98, in symbolicate
    sym_ctx = self.get_symbol_context()
  File "[...]/site-packages/lldb/utils/symbolication.py", line 77, in get_symbol_context
    sb_addr = self.resolve_addr()
  File "[...]/site-packages/lldb/utils/symbolication.py", line 69, in resolve_addr
    self.so_addr = self.target.ResolveLoadAddress(self.load_addr)
  File "[...]/site-packages/lldb/__init__.py", line 10675, in ResolveLoadAddress
    return _lldb.SBTarget_ResolveLoadAddress(self, vm_addr)
OverflowError: in method 'SBTarget_ResolveLoadAddress', argument 2 of type 'lldb::addr_t'
This patch removes the custom exception handling and lets LLDB or the
default exception handler deal with it instead.
Differential revision: https://reviews.llvm.org/D125589
When using dsymForUUID, the majority of the time symbolicating a crashlog
with crashlog.py is spent waiting for it to complete. Currently, we're
calling dsymForUUID sequentially when iterating over the modules. We can
drastically cut down this time by calling dsymForUUID in parallel. This
patch uses Python's ThreadPoolExecutor (introduced in Python 3.2) to
parallelize this IO-bound operation.
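A minimal sketch of the approach, where `locate_symbols` is a
hypothetical stand-in for the slow lookup (it only sleeps and returns a
made-up path):

```python
import time
from concurrent.futures import ThreadPoolExecutor


def locate_symbols(uuid):
    """Hypothetical stand-in for a slow, IO-bound dsymForUUID call."""
    time.sleep(0.01)
    return f"/tmp/{uuid}.dSYM"


uuids = ["A1", "B2", "C3", "D4"]

# Threads suit this workload because each call mostly waits on IO.
with ThreadPoolExecutor() as executor:
    # `map` returns results in the same order as the inputs.
    results = dict(zip(uuids, executor.map(locate_symbols, uuids)))
```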
The performance improvement is hard to benchmark, because even with an
empty local cache, consecutive calls to dsymForUUID for the same UUID
complete faster. With warm caches, I'm seeing a ~30% performance
improvement (~90s -> ~60s). I suspect the gains will be much bigger for
a cold cache.
dsymForUUID supports batching up multiple UUIDs. I considered going that
route, but that would require more intrusive changes. It would require
hoisting the logic out of locate_module_and_debug_symbols which we
explicitly document [1] as a feature of Symbolication.py to locate
symbol files.
[1] https://lldb.llvm.org/use/symbolication.html
Differential revision: https://reviews.llvm.org/D125107
On arm64 targets, when the crashing pc is 0, the caller
frame can be found by looking at $lr, but the crash
reports don't use that trick to show the actual crashing
frame. This patch adds that stack frame that lldb shows.
Also fix an issue where some register names were printed
as having a prefix of 'None'.
Differential Revision: https://reviews.llvm.org/D125042
rdar://92631787
Previously, the ScriptedThread used the thread index as the thread id.
This patch parses the crashlog json to extract the actual thread "id" value,
and passes this information to the Crashlog ScriptedProcess blueprint,
to create a higher fidelity ScriptedThread.
It also updates the blueprint to show the thread name and thread queue.
Finally, this patch updates the interactive crashlog test to reflect
these changes.
rdar://90327854
Differential Revision: https://reviews.llvm.org/D122422
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
This patch pipes down the `-a|--load-all` crashlog command option to the
Scripted Process initializer to load all the images used by the crashed
process instead of only loading the images related to the crashed
thread.
This allows us to recreate artificial frames also for the non-crashed
scripted threads.
rdar://90396265
Differential Revision: https://reviews.llvm.org/D121826
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>