If llvm-symbolizer finds a malformed command, it echoes it to the
standard output. New versions of binutils (starting from 2.39) allow to
specify an address by a symbols. Implementation of this feature in
llvm-symbolizer makes the current reaction on invalid input
inappropriate. Almost any invalid command may be treated as a symbol
name, so the right reaction should be "symbol not found" in such case.
The exception are commands that are recognized but have incorrect
syntax, like "FILE:FILE:". The utility must produce descriptive
diagnostic for such input and route it to the stderr.
This change implements the new reaction on invalid input and is a
prerequisite for implementation of symbol lookup in llvm-symbolizer.
Differential Revision: https://reviews.llvm.org/D157210
This reverts commit 4e3b89483a6922d3f48670bb1c50a37f342918c6, with
fixes for places I'd missed updating in lld and lldb. I've also
renamed OptionVisibility::Default to "DefaultVis" to avoid ambiguity
since the undecorated name has to be available anywhere Options.inc is
included.
Original message follows:
This splits OptTable's "Flags" field into "Flags" and "Visibility",
updates the places where we instantiate Option tables, and adds
variants of the OptTable APIs that use Visibility mask instead of
Include/Exclude flags.
We need to do this to clean up a bunch of complexity in the clang
driver's option handling - there's a whole slew of flags like
CoreOption, NoDriverOption, and FlangOnlyOption there today to try to
handle all of the permutations of flags that the various drivers need,
but it really doesn't scale well, as can be seen by things like the
somewhat recently introduced CLDXCOption.
Instead, we'll provide an additive model for visibility that's
separate from the other flags. For things like "HelpHidden", which is
used as a "subtractive" modifier for option visibility, we leave that
in "Flags" and handle it as a special case.
Note that we don't actually update the users of the Include/Exclude
APIs here or change the flags that exist in clang at all - that will
come in a follow up that refactors clang's Options.td to use the
increased flexibility this change allows.
Differential Revision: https://reviews.llvm.org/D157149
This splits OptTable's "Flags" field into "Flags" and "Visibility",
updates the places where we instantiate Option tables, and adds
variants of the OptTable APIs that use Visibility mask instead of
Include/Exclude flags.
We need to do this to clean up a bunch of complexity in the clang
driver's option handling - there's a whole slew of flags like
CoreOption, NoDriverOption, and FlangOnlyOption there today to try to
handle all of the permutations of flags that the various drivers need,
but it really doesn't scale well, as can be seen by things like the
somewhat recently introduced CLDXCOption.
Instead, we'll provide an additive model for visibility that's
separate from the other flags. For things like "HelpHidden", which is
used as a "subtractive" modifier for option visibility, we leave that
in "Flags" and handle it as a special case.
Note that we don't actually update the users of the Include/Exclude
APIs here or change the flags that exist in clang at all - that will
come in a follow up that refactors clang's Options.td to use the
increased flexibility this change allows.
Differential Revision: https://reviews.llvm.org/D157149
This reverts commit a5fe6c7f5e2d1d265bd7c312ef55259fee7a68f9.
This change is causing problems with Windows build bots due to a hanging zombie llvm-symbolizer.exe process.
If llvm-symbolizer finds a malformed command, it echoes it to the
standard output. New versions of binutils (starting from 2.39) allow to
specify an address by a symbols. Implementation of this feature in
llvm-symbolizer makes the current reaction on invalid input
inappropriate. Almost any invalid command may be treated as a symbol
name, so the right reaction should be "symbol not found" in such case.
The exception are commands that are recognized but have incorrect
syntax, like "FILE:FILE:". The utility must produce descriptive
diagnostic for such input and route it to the stderr.
This change implements the new reaction on invalid input and is a
prerequisite for implementation of symbol lookup in llvm-symbolizer.
Differential Revision: https://reviews.llvm.org/D157210
Now llvm-symbolizer prints input string if parsing command failed,
whithout explanation that is wrong. As a first step in making the
interface more user-friendly, this change reorganize parsing so that
generation of messages becomes easier.
Differential Revision: https://reviews.llvm.org/D157203
The code that gets binary file name is moved to a separate function.
It makes the code of `parseCommand` cleaner and allows to reuse the
parsing code.
Differential Revision: https://reviews.llvm.org/D156978
All command-line tools using `llvm::opt` create an enum of option IDs and a table of `OptTable::Info` object. Most of the tools use the same ID (`OPT_##ID`), kind (`Option::KIND##Class`), group ID (`OPT_##GROUP`) and alias ID (`OPT_##ALIAS`). This patch extracts that common code into canonical macros. This results in fewer changes when tweaking the `OPTION` macros emitted by the TableGen backend.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D157028
If binary file specified as input with option --obj or -e is absent,
now llvm-addr2line exits immediately. This patch extends this behavior to
llvm-symbolizer. Previously llvm-symbolizer waited addresses from input
stream or command line in this case.
Differential Revision: https://reviews.llvm.org/D153219
GNU addr2line exits immediately if -e (default to a.out) specifies a file that
cannot be open or a directory. llvm-addr2line used to wait for input on if the
input file cannot be open and addresses are not specified in command line.
Replace the D147652 checkFileExists with getOrCreateModuleInfo to avoid
a separate `sys::fs::status` operation.
Reviewed By: sepavloff
Differential Revision: https://reviews.llvm.org/D153595
GNU addr2line exits immediately if it cannot open the file specified as
executable/relocatable. In contrast llvm-addr2line does not exit and, if
addresses are not specified in command line, waits for input on stdin. This
causes the test compiler-rt/test/asan/TestCases/Posix/asan-symbolize-bad-path.cc to block
forever on Gentoo (see https://reviews.llvm.org/rG27c4777f41d2ab204c1cf84ff1cccd5ba41354da#1190273).
To fix this issue the behavior llvm-addr2line now exits if
executable/relocatable file cannot be found.
It fixes https://github.com/llvm/llvm-project/issues/42099 (llvm-addr2line
does not exit when passed a non-existent file).
Differential Revision: https://reviews.llvm.org/D147652
This is a recommit of 75f1f158812d, reverted in 7a443b1c493d, because
it caused compilation error in
compiler-rt/lib/sanitizer_common/symbolizer/sanitizer_symbolize.cpp.
The error was fixed by Kasimir Georgiev in de4c038c7ba2, but this
commit was reverted in de088dd3a0aa, because the initial commit was
reverted.
This commit reverts both the reverting commits, 7a443b1c493d and
de088dd3a0aa.
Original commit message is below.
If llvm-symbolize did not find module, the error looked like:
LLVMSymbolizer: error reading file: No such file or directory
This message does not follow common practice: LLVMSymbolizer is not an
utility name. Also the message did not not contain the name of missed file.
With this change the error message looks differently:
llvm-symbolizer: error: 'abc': No such file or directory
This format is closer to messages produced by other utilities and allow
proper coloring.
Differential Revision: https://reviews.llvm.org/D148032
If llvm-symbolize did not find module, the error looked like:
LLVMSymbolizer: error reading file: No such file or directory
This message does not follow common practice: LLVMSymbolizer is not an
utility name. Also the message did not not contain the name of missed file.
With this change the error message looks differently:
llvm-symbolizer: error: 'abc': No such file or directory
This format is closer to messages produced by other utilities and allow
proper coloring.
Differential Revision: https://reviews.llvm.org/D148032
This makes parsing for build IDs in the markup filter slightly more
permissive, in line with fromHex.
It also removes the distinction between missing build ID and empty build
ID; empty build IDs aren't a useful concept, since their purpose is to
uniquely identify a binary. This removes a layer of indirection wherever
build IDs are obtained.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D147485
All control paths in executeCommand create Request object for use in
calls to 'print' function and do it identically. With this change the
Request object is created in a single point, which simplifies changing
implementation of Request class.
This is a prerequisite patch for implementation of symbol+offset lookup.
Differential Revision: https://reviews.llvm.org/D147115
This avoids recomputing string length that is already known at compile time.
It has a slight impact on preprocessing / compile time, see
https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u
This a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.
The above patchset caused some version of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.
Differential Revision: https://reviews.llvm.org/D139881
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4 (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"
GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d
The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.
This reverts commit caa713559bd38f337d7d35de35686775e8fb5175.
This reverts commit 06b90e2e9c991e211fecc97948e533320a825470.
This reverts commit e953ae5bbc313fd0cc980ce021d487e5b5199ea4.
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
This creates a library for fetching debug info by build ID, whether
locally or remotely via debuginfod. The functionality was refactored
out of existing code in the Symboliize library. Existing utilities
were refactored to use this library.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D132504
This connects the Symbolizer to the markup filter and enables the first
working end-to-end flow using the filter.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D130187
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.
Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.
Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D129519
This adds a --filter option to llvm-symbolizer. This takes log-bearing
symbolizer markup from stdin and writes a human-readable version to
stdout.
For now, this only implements the "symbol" markup tag; all others are
passed through unaltered. This is a proof-of-concept bit of
functionalty; implement the various tags is more-or-less just a matter
of hooking up various parts of the Symbolize library to the architecture
established here.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D126980
This adds a BUILD_ID prefix to the llvm-symbolizer stdin and argument
syntax. The prefix causes the given binary name to be interpreted as a
build ID instead of an object file path. The semantics are analagous to
the behavior of --obj and --build-id.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D119901
This change adds a simple LRU cache to the Symbolize class to put a cap
on llvm-symbolizer memory usage. Previously, the Symbolizer's virtual
memory footprint would grow without bound as additional binaries were
referenced.
I'm putting this out there early for an informal review, since there may be
a dramatically different/better way to go about this. I still need to
figure out a good default constant for the memory cap and benchmark the
implementation against a large symbolization workload. Right now I've
pegged max memory usage at zero for testing purposes, which evicts the whole
cache every time.
Unfortunately, it looks like StringRefs in the returned DI objects can
directly refer to the contents of binaries. Accordingly, the cache
pruning must be explicitly requested by the caller, as the caller must
guarantee that none of the returned objects will be used afterwards.
For llvm-symbolizer this a light burden; symbolization occurs
line-by-line, and the returned objects are discarded after each.
Implementation wise, there are a number of nested caches that depend
on one another. I've implemented a simple Evictor callback system to
allow derived caches to register eviction actions to occur when the
underlying binaries are evicted.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D119784
This change adds a pair of flags controlling whether llvm-symbolizer
attempts debuginfod lookups. Lookups are attempted if --debuginfod is
passed and disabled if --no-debuginfod is passed.
The default behavior is made more nuanced: debuginfod lookups are now
only attempted if an HTTP client is compiled in and at least one backing
debuginfod URL was configured via environment variable. Previously,
debuginfod lookups would always be attempted, even if there were no
chance that they could succeed.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D118665
This adds a --build-id=<hex build ID> flag to llvm-symbolizer. If --obj
is unspecified, this will attempt to look up the provided build ID using
whatever mechanisms are available to the Symbolizer (typically,
debuginfod). The semantics are then as if the found binary were given
using the --obj flag.
Reviewed By: jhenderson, phosek
Differential Revision: https://reviews.llvm.org/D118633
Debuginfod can pull in libcurl as a dependency, which isn't appropriate
for libLLVM. (See
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5732).
This change breaks out debuginfod into a separate non-component library
that can be used directly in llvm-symbolizer. The tool can inject
debuginfod into the Symbolizer library via an abstract DebugInfoFetcher
interface, breaking the dependency of Symbolizer on debuinfod.
See https://github.com/llvm/llvm-project/issues/52731
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D118413
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Erorr::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Updates compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh to include Debuginfod library to fix sanitizer-x86_64-linux breakage.
Reviewed By: jhenderson, vitalybuka
Differential Revision: https://reviews.llvm.org/D113717
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Erorr::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Updates compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh to include Debuginfod library to fix sanitizer-x86_64-linux breakage.
Reviewed By: jhenderson, vitalybuka
Differential Revision: https://reviews.llvm.org/D113717