137 Commits

Author SHA1 Message Date
Serge Pavlov
75f1f15881 [symbolizer] Change error message if module not found
If llvm-symbolizer did not find a module, the error looked like:

    LLVMSymbolizer: error reading file: No such file or directory

This message does not follow common practice: LLVMSymbolizer is not a
utility name. Also, the message did not contain the name of the missing file.

With this change the error message looks different:

    llvm-symbolizer: error: 'abc': No such file or directory

This format is closer to messages produced by other utilities and allows
proper coloring.
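
As an illustration only, the new message shape (tool name, the word "error", the quoted file name, then the reason) could be assembled as in the sketch below. The helper name `formatToolError` is hypothetical and is not taken from the patch.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not from the patch): builds
// "<tool>: error: '<file>': <reason>".
std::string formatToolError(std::string_view Tool, std::string_view File,
                            std::string_view Reason) {
  std::string Msg;
  Msg.append(Tool);
  Msg.append(": error: '");
  Msg.append(File);
  Msg.append("': ");
  Msg.append(Reason);
  return Msg;
}
```

Called with "llvm-symbolizer", "abc" and "No such file or directory", this produces the line quoted above.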

Differential Revision: https://reviews.llvm.org/D148032
2023-04-14 13:03:28 +07:00
Daniel Thornburgh
9812948d22 [Object] Refactor build ID parsing into Object lib.
This makes parsing for build IDs in the markup filter slightly more
permissive, in line with fromHex.

It also removes the distinction between missing build ID and empty build
ID; empty build IDs aren't a useful concept, since their purpose is to
uniquely identify a binary. This removes a layer of indirection wherever
build IDs are obtained.
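
A minimal sketch of the "empty means missing" convention, assuming a build ID is represented as a plain byte vector. The names `BuildID` and `parseBuildID` are illustrative here, and rejecting odd-length input is a simplification of this sketch rather than a statement about how fromHex behaves.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

using BuildID = std::vector<std::uint8_t>;

// Returns the value of a hex digit, or -1 if C is not one.
static int hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// Hypothetical parser: an empty result doubles as "no build ID", so
// callers need no separate optional wrapper around the byte vector.
BuildID parseBuildID(std::string_view Hex) {
  if (Hex.size() % 2 != 0)
    return {}; // simplification of this sketch
  BuildID ID;
  for (std::size_t I = 0; I < Hex.size(); I += 2) {
    int Hi = hexValue(Hex[I]);
    int Lo = hexValue(Hex[I + 1]);
    if (Hi < 0 || Lo < 0)
      return {}; // invalid digit: treated the same as missing
    ID.push_back(static_cast<std::uint8_t>(Hi * 16 + Lo));
  }
  return ID;
}
```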

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D147485
2023-04-05 11:25:26 -07:00
Serge Pavlov
22a3f974d3 [symbolizer] Build 'Request' object in single point. NFC
All control paths in executeCommand create Request object for use in
calls to 'print' function and do it identically. With this change the
Request object is created in a single point, which simplifies changing
implementation of Request class.

This is a prerequisite patch for implementation of symbol+offset lookup.

Differential Revision: https://reviews.llvm.org/D147115
2023-03-30 21:27:38 +07:00
Daniel Thornburgh
a3b0dde4ed Reland: [llvm-cov] Look up object files using debuginfod
Reviewed By: gulfem

Differential Revision: https://reviews.llvm.org/D136702
2023-01-26 12:59:52 -08:00
Douglas Yung
bce910242e Revert "[llvm-cov] Look up object files using debuginfod"
This reverts commit efbc8bb18eda63007216ad0cb5a8de04963eddd5.

This change is causing failures when detecting curl on several build bots:
 - https://lab.llvm.org/buildbot/#/builders/247/builds/884
 - https://lab.llvm.org/buildbot/#/builders/231/builds/7688
 - https://lab.llvm.org/buildbot/#/builders/121/builds/27389
 - https://lab.llvm.org/buildbot/#/builders/230/builds/8464
 - https://lab.llvm.org/buildbot/#/builders/57/builds/24209
 - https://lab.llvm.org/buildbot/#/builders/127/builds/42722
2023-01-25 19:11:08 -08:00
Daniel Thornburgh
efbc8bb18e [llvm-cov] Look up object files using debuginfod
Reviewed By: gulfem

Differential Revision: https://reviews.llvm.org/D136702
2023-01-25 14:00:34 -08:00
serge-sans-paille
07bb29d8ff [OptTable] Precompute OptTable prefixes union table through tablegen
This avoids rediscovering this table when reading each option, providing
a sensible 2% speedup when processing an empty file, and a measurable
speedup on typical workloads, see:

This is optional, the legacy, on-the-fly, approach can still be used
through the GenericOptTable class, while the new one is used through
PrecomputedOptTable.

https://llvm-compile-time-tracker.com/compare.php?from=4da6cb3202817ee2897d6b690e4af950459caea4&to=19a492b704e8f5c1dea120b9c0d3859bd78796be&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D140800
2023-01-12 12:08:06 +01:00
serge-sans-paille
d9ab3e82f3 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of e953ae5bbc313fd0cc980ce021d487e5b5199ea4 and the subsequent fixes caa713559bd38f337d7d35de35686775e8fb5175 and 06b90e2e9c991e211fecc97948e533320a825470.

The above patchset caused some versions of GCC to take eons to compile clang/lib/Basic/Targets/AArch64.cpp, as spotted in aa171833ab0017d9732e82b8682c9848ab25ff9e.
The fix is to make BuiltinInfo tables a compilation unit static variable, instead of a private static variable.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-27 09:55:19 +01:00
Vitaly Buka
aa171833ab Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4 (part 2)"
Revert "Fix lldb option handling since e953ae5bbc313fd0cc980ce021d487e5b5199ea4"

GCC build hangs on this bot https://lab.llvm.org/buildbot/#/builders/37/builds/19104
compiling CMakeFiles/obj.clangBasic.dir/Targets/AArch64.cpp.d

The bot uses GNU 11.3.0, but I can reproduce locally with gcc (Debian 12.2.0-3) 12.2.0.

This reverts commit caa713559bd38f337d7d35de35686775e8fb5175.
This reverts commit 06b90e2e9c991e211fecc97948e533320a825470.
This reverts commit e953ae5bbc313fd0cc980ce021d487e5b5199ea4.
2022-12-25 23:12:47 -08:00
serge-sans-paille
e953ae5bbc [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa841c522d8d452f0685e503538415a53 that takes into
account a GCC issue (probably
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92181) when dealing with
initializer_list and constant expressions.

Work around this by avoiding initializer list, at the expense of a
temporary plain old array.

Differential Revision: https://reviews.llvm.org/D139881
2022-12-24 10:25:06 +01:00
serge-sans-paille
07d9ab9aa5 Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
There are still remaining issues with GCC 12, see for instance

https://lab.llvm.org/buildbot/#/builders/93/builds/12669

This reverts commit 5ce4e92264102de21760c94db9166afe8f71fcf6.
2022-12-23 13:29:21 +01:00
serge-sans-paille
5ce4e92264 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

This is a recommit of 719d98dfa841c522d8d452f0685e503538415a53 with a
change to llvm/utils/TableGen/OptParserEmitter.cpp to cope with GCC bug
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

Differential Revision: https://reviews.llvm.org/D139881
2022-12-23 12:48:17 +01:00
serge-sans-paille
b7065a31b5 Revert "[clang] Use a StringRef instead of a raw char pointer to store builtin and call information"
Failing builds: https://lab.llvm.org/buildbot#builders/9/builds/19030
This is GCC specific and has been reported upstream: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108158

This reverts commit 719d98dfa841c522d8d452f0685e503538415a53.
2022-12-23 11:36:56 +01:00
serge-sans-paille
719d98dfa8 [clang] Use a StringRef instead of a raw char pointer to store builtin and call information
This avoids recomputing string length that is already known at compile
time.

It has a slight impact on preprocessing / compile time, see

https://llvm-compile-time-tracker.com/compare.php?from=3f36d2d579d8b0e8824d9dd99bfa79f456858f88&to=e49640c507ddc6615b5e503144301c8e41f8f434&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D139881
2022-12-23 10:31:47 +01:00
serge-sans-paille
6a35815c73 Store OptTable::Info::Name as a StringRef
This is a recommit of 8ae18303f97d5dcfaecc90b4d87effb2011ed82e,
with a few cleanups.

This avoids implicit conversion to StringRef at several points, which in
turn avoids redundant calls to strlen.
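
To make the strlen point concrete, here is a small sketch that uses `std::string_view` as a stand-in for `llvm::StringRef`. The two structs are hypothetical, reduced versions of an option record and do not mirror the real `OptTable::Info` layout.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical, reduced option records; the real OptTable::Info has
// many more fields.
struct OptInfoCStr { const char *Name; };
struct OptInfoView { std::string_view Name; };

// With a raw pointer, every query rescans the string for its '\0'.
std::size_t nameLength(const OptInfoCStr &I) { return std::strlen(I.Name); }

// With a view, the length is stored next to the pointer.
constexpr std::size_t nameLength(const OptInfoView &I) { return I.Name.size(); }

// For a literal, the length is computed once, at compile time.
constexpr OptInfoView HelpOpt{"help"};
static_assert(nameLength(HelpOpt) == 4, "length known at compile time");
```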

As a side effect, this greatly simplifies the implementation of
StrCmpOptionNameIgnoreCase.

It also eventually gives a consistent, humble speedup in compilation
time (timing updated since original commit).

https://llvm-compile-time-tracker.com/compare.php?from=de4b6a1bc64db33643f001ad45fae7b92b4a4688&to=c23a93d1292052b4be2fbe8c586fa31143d0c7ed&stat=instructions:u

Differential Revision: https://reviews.llvm.org/D139274
2022-12-08 10:28:56 +01:00
Fangrui Song
89fab98e88 [DebugInfo] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-05 00:09:22 +00:00
Kazu Hirata
b4482f7ca0 [tools] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:40 -08:00
Daniel Thornburgh
e61d89efd7 [NFC] [Object] Create library to fetch debug info by build ID.
This creates a library for fetching debug info by build ID, whether
locally or remotely via debuginfod. The functionality was refactored
out of existing code in the Symbolize library. Existing utilities
were refactored to use this library.

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D132504
2022-09-28 13:35:35 -07:00
serge-sans-paille
99c7f83b99 [iwyu] Move <iostream> out of llvm/DebugInfo/Symbolize/Markup.h header
It's only used in the implementation. No functional change intended.
2022-09-28 20:49:00 +02:00
Daniel Thornburgh
22df238d4a [Symbolizer] Implement data symbolizer markup element.
This connects the Symbolizer to the markup filter and enables the first
working end-to-end flow using the filter.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D130187
2022-08-04 10:20:29 -07:00
Benjamin Kramer
fc99f18a20 [Symbolizer] Fix use-after-free
MarkupFilter keeps a reference to the last filtered StringRef. Just keep
it alive a bit longer. Found by asan.
2022-07-22 10:29:04 +02:00
Daniel Thornburgh
17e4c217b6 [Symbolizer] Implement contextual symbolizer markup elements.
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.

Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.

Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D129519
2022-07-21 11:29:19 -07:00
Daniel Thornburgh
eb5af0acf0 [Symbolize] Add log markup --filter to llvm-symbolizer.
This adds a --filter option to llvm-symbolizer. This takes log-bearing
symbolizer markup from stdin and writes a human-readable version to
stdout.

For now, this only implements the "symbol" markup tag; all others are
passed through unaltered. This is a proof-of-concept bit of
functionality; implementing the various tags is more-or-less just a matter
of hooking up various parts of the Symbolize library to the architecture
established here.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D126980
2022-06-27 10:44:15 -07:00
Daniel Thornburgh
565add5a62 [Debuginfod] Add BUILD_ID syntax to llvm-symbolizer.
This adds a BUILD_ID prefix to the llvm-symbolizer stdin and argument
syntax. The prefix causes the given binary name to be interpreted as a
build ID instead of an object file path. The semantics are analogous to
the behavior of --obj and --build-id.
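
A rough sketch of how such a prefix could be told apart from a plain path. The exact spelling `BUILD_ID:` (with a colon) is an assumption of this example, and `parseBinaryName`, `ParsedName` and `NameKind` are hypothetical names, not the tool's own.

```cpp
#include <string>
#include <string_view>

enum class NameKind { File, BuildID };

struct ParsedName {
  NameKind Kind;
  std::string Value; // file path, or hex digits of the build ID
};

// Assumed spelling for this sketch: "BUILD_ID:" directly in front of
// the hex digits. Anything else is treated as an object file path.
ParsedName parseBinaryName(std::string_view Name) {
  constexpr std::string_view Prefix = "BUILD_ID:";
  if (Name.substr(0, Prefix.size()) == Prefix)
    return {NameKind::BuildID, std::string(Name.substr(Prefix.size()))};
  return {NameKind::File, std::string(Name)};
}
```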

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D119901
2022-02-25 00:39:13 +00:00
Daniel Thornburgh
02106ec15c [Symbolize] LRU cache binaries in llvm-symbolizer.
This change adds a simple LRU cache to the Symbolize class to put a cap
on llvm-symbolizer memory usage. Previously, the Symbolizer's virtual
memory footprint would grow without bound as additional binaries were
referenced.

I'm putting this out there early for an informal review, since there may be
a dramatically different/better way to go about this. I still need to
figure out a good default constant for the memory cap and benchmark the
implementation against a large symbolization workload. Right now I've
pegged max memory usage at zero for testing purposes, which evicts the whole
cache every time.

Unfortunately, it looks like StringRefs in the returned DI objects can
directly refer to the contents of binaries. Accordingly, the cache
pruning must be explicitly requested by the caller, as the caller must
guarantee that none of the returned objects will be used afterwards.

For llvm-symbolizer this is a light burden; symbolization occurs
line-by-line, and the returned objects are discarded after each.

Implementation-wise, there are a number of nested caches that depend
on one another. I've implemented a simple Evictor callback system to
allow derived caches to register eviction actions to occur when the
underlying binaries are evicted.
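
The design described above (an LRU order, a byte cap, explicit pruning by the caller, and eviction callbacks for dependent caches) can be sketched roughly as follows. `BinaryCache` and its members are hypothetical names for illustration and do not reproduce the Symbolize implementation.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch of an LRU cache with eviction callbacks.
class BinaryCache {
public:
  using Evictor = std::function<void()>;

  explicit BinaryCache(std::size_t MaxBytes) : MaxBytes(MaxBytes) {}

  // Inserts a binary, or marks an existing one as most recently used.
  void touch(const std::string &Path, std::size_t Bytes) {
    auto It = Index.find(Path);
    if (It != Index.end()) {
      Order.splice(Order.begin(), Order, It->second);
      return;
    }
    Order.push_front(Entry{Path, Bytes, {}});
    Index[Path] = Order.begin();
    UsedBytes += Bytes;
  }

  // Lets a derived cache register cleanup to run when Path is evicted.
  void addEvictor(const std::string &Path, Evictor E) {
    auto It = Index.find(Path);
    if (It != Index.end())
      It->second->Evictors.push_back(std::move(E));
  }

  // Evicts least recently used entries until usage is under the cap.
  // Never called implicitly: the caller invokes it only once no
  // previously returned result is still in use.
  void prune() {
    while (UsedBytes > MaxBytes && !Order.empty()) {
      Entry &Victim = Order.back();
      for (Evictor &E : Victim.Evictors)
        E();
      UsedBytes -= Victim.Bytes;
      Index.erase(Victim.Path);
      Order.pop_back();
    }
  }

  bool contains(const std::string &Path) const {
    return Index.count(Path) != 0;
  }

private:
  struct Entry {
    std::string Path;
    std::size_t Bytes;
    std::vector<Evictor> Evictors;
  };

  std::size_t MaxBytes;
  std::size_t UsedBytes = 0;
  std::list<Entry> Order; // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> Index;
};
```

Keeping `prune()` separate from `touch()` reflects the constraint stated above: results may point into cached binaries, so only the caller knows when eviction is safe.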

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D119784
2022-02-25 00:31:48 +00:00
serge-sans-paille
db29f4374d Cleanup include: DebugInfo/Symbolize
Estimation of the impact on preprocessor output
after: 1067349756
before:1067487786

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120433
2022-02-24 13:25:11 +01:00
Daniel Thornburgh
694f384553 [Debuginfod] Flag-determine debuginfod lookups in llvm-symbolizer.
This change adds a pair of flags controlling whether llvm-symbolizer
attempts debuginfod lookups. Lookups are attempted if --debuginfod is
passed and disabled if --no-debuginfod is passed.

The default behavior is made more nuanced: debuginfod lookups are now
only attempted if an HTTP client is compiled in and at least one backing
debuginfod URL was configured via environment variable. Previously,
debuginfod lookups would always be attempted, even if there were no
chance that they could succeed.
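
The decision described above can be written down as a small function. This is a sketch of the stated rule only; `shouldUseDebuginfod` and its parameters are hypothetical names, not code from the patch.

```cpp
#include <cstddef>
#include <optional>

// Flag is true for --debuginfod, false for --no-debuginfod, and empty
// when neither was passed. An explicit flag always wins; otherwise
// lookups are attempted only if they have a chance of succeeding.
bool shouldUseDebuginfod(std::optional<bool> Flag, bool HaveHTTPClient,
                         std::size_t NumConfiguredURLs) {
  if (Flag.has_value())
    return *Flag;
  return HaveHTTPClient && NumConfiguredURLs > 0;
}
```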

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D118665
2022-02-09 22:20:54 +00:00
Daniel Thornburgh
dcd4950d42 [Symbolizer] Add Build ID flag to llvm-symbolizer.
This adds a --build-id=<hex build ID> flag to llvm-symbolizer. If --obj
is unspecified, this will attempt to look up the provided build ID using
whatever mechanisms are available to the Symbolizer (typically,
debuginfod). The semantics are then as if the found binary were given
using the --obj flag.

Reviewed By: jhenderson, phosek

Differential Revision: https://reviews.llvm.org/D118633
2022-02-08 23:08:18 +00:00
Daniel Thornburgh
4a6553f4c2 [Debuginfod] [Symbolizer] Break debuginfod out of libLLVM.
Debuginfod can pull in libcurl as a dependency, which isn't appropriate
for libLLVM. (See
https://gitlab.freedesktop.org/mesa/mesa/-/issues/5732).

This change breaks out debuginfod into a separate non-component library
that can be used directly in llvm-symbolizer. The tool can inject
debuginfod into the Symbolizer library via an abstract DebugInfoFetcher
interface, breaking the dependency of Symbolizer on debuginfod.

See https://github.com/llvm/llvm-project/issues/52731

Reviewed By: phosek

Differential Revision: https://reviews.llvm.org/D118413
2022-02-08 19:14:18 +00:00
Noah Shutty
34491ca729 [Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Error::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Updates compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh to include Debuginfod library to fix sanitizer-x86_64-linux breakage.

Reviewed By: jhenderson, vitalybuka

Differential Revision: https://reviews.llvm.org/D113717
2021-12-13 23:00:32 +00:00
Nico Weber
30f221bba0 Revert "[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer."
This reverts commit 5bba0fe12b2971a9cbc859f48ee6e6c1356c88b8.
Makes lld depend on libcurl, see comments on https://reviews.llvm.org/D113717
2021-12-10 10:33:05 -05:00
Noah Shutty
5bba0fe12b [Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Error::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Updates compiler-rt/lib/sanitizer_common/symbolizer/scripts/build_symbolizer.sh to include Debuginfod library to fix sanitizer-x86_64-linux breakage.

Reviewed By: jhenderson, vitalybuka

Differential Revision: https://reviews.llvm.org/D113717
2021-12-10 01:32:36 +00:00
Noah Shutty
afa3c14e2f Revert "[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer."
This reverts commit e2ad4f1756027cd27f6c82db620042e9877f900c because it
does not correctly fix the sanitizer buildbot breakage.
2021-12-10 00:59:13 +00:00
Noah Shutty
e2ad4f1756 [Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.
Fixed a cast of Error::success() to Expected<> in debuginfod library.
Added Debuginfod to Symbolize deps in gn.
Adds new symbolizer symbols to `global_symbols.txt`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D113717
2021-12-10 00:23:00 +00:00
Noah Shutty
aaec63d2a7 Revert "[Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer."
This reverts commit 02cc8d698c4941f8f0120ea1a5d7205fb33a312d because it
caused buildbot failures. The issue appears to be simply that we need to
only enable debuginfod when the HTTPClient has been initialized by the
running tool, since InitLLVM does not do the initialization step anymore.
2021-12-08 18:49:12 +00:00
Noah Shutty
02cc8d698c [Symbolizer][Debuginfo] Add debuginfod client to llvm-symbolizer.
Adds a fallback to use the debuginfod client library (386655) in `findDebugBinary`.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D113717
2021-12-08 17:52:40 +00:00
Fangrui Song
8189c4eee7 [tools] Delete redundant 'static' from namespace scope 'static const'. NFC
2021-10-18 22:38:42 -07:00
Fangrui Song
5efffac71a [llvm-symbolizer] Move setGroupedShortOptions and don't ignore case
setGroupedShortOptions in the ctor seems more popular.
2021-07-01 19:43:49 -07:00
Fangrui Song
f1e2d5851b [OptTable] Rename PrintHelp to printHelp
To be consistent with other member functions and match the coding standard.
2021-06-24 14:47:03 -07:00
Alex Orlov
05d1ae4e18 * Add support for JSON output style to llvm-symbolizer
This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96883
2021-05-11 13:10:54 +04:00
Nico Weber
7a9cb801f3 [llvm-symbolizer] remove unused variable
This should've been removed in D83530.

Differential Revision: https://reviews.llvm.org/D100434
2021-04-14 09:24:45 -04:00
Simon Pilgrim
ccb361af6c [llvm-symbolizer] Don't use the same 'OutputStyle' name for the enum type and instance. NFCI.
This was causing some buildbot problems, e.g. http://lab.llvm.org:8011/#/builders/110/builds/2306
2021-04-06 15:21:48 +01:00
Alex Orlov
5f57793c4f * NFC. Refactored DIPrinter for better support of new print styles.
This patch introduces a DIPrinter interface to implement by different output style printer implementations. DIPrinterGNU and DIPrinterLLVM implement the GNU and LLVM output style printing respectively. No functional changes.

This refactoring clarifies and simplifies the code, and makes a new output style addition easier.

Reviewed By: jhenderson, dblaikie

Differential Revision: https://reviews.llvm.org/D98994
2021-04-05 15:40:41 +04:00
Georgii Rymar
d221406875 [llvm-symbolizer] - Fix the crash in GNU output style with --no-inlines and missing input file.
Fixes https://bugs.llvm.org/show_bug.cgi?id=48882.

If the input file does not exist (or has a reading error), the
following code will crash if there are two or more input addresses.

```
auto ResOrErr = Symbolizer.symbolizeInlinedCode(
  ModuleName, {Offset, object::SectionedAddress::UndefSection});
Printer << (error(ResOrErr) ? DILineInfo() : ResOrErr.get().getFrame(0));
```

For the first address, `symbolizeInlinedCode` returns an error.
For the second address, `symbolizeInlinedCode` returns an empty result
(not an error) and `.getFrame(0)` will crash.
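
One way to guard against the empty-result case is to check the frame count before indexing. The sketch below uses hypothetical stand-in types (`LineInfo`, `InliningInfo`) instead of the real `DILineInfo` and `DIInliningInfo`, and it is not necessarily how the patch itself fixes the crash.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for DILineInfo / DIInliningInfo.
struct LineInfo {
  std::string Function = "??";
};
struct InliningInfo {
  std::vector<LineInfo> Frames;
};

// Returns the first frame, or a default-constructed LineInfo when the
// result holds no frames, instead of indexing unconditionally.
LineInfo firstFrameOrDefault(const InliningInfo &Info) {
  if (Info.Frames.empty())
    return LineInfo();
  return Info.Frames.front();
}
```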

Differential revision: https://reviews.llvm.org/D95609
2021-01-30 18:36:38 +03:00
Kazu Hirata
b676f2fee1 [llvm-cov, llvm-symbolizer] Use llvm::erase_if (NFC)
2020-12-26 12:06:27 -08:00
Amy Huang
aa7ae25613 [llvm-symbolizer] Add missing include for config.h
The cmake variable LLVM_ENABLE_DIA_SDK was being used here but
was undefined because config.h wasn't included.

Differential Revision: https://reviews.llvm.org/D93309
2020-12-15 09:20:31 -08:00
Amy Huang
efd1ec0dec Recommit "[llvm-symbolizer] Switch to using native symbolizer by default on Windows"
This reverts commit 1b63177a56e8cd6196778d2b90295f03e96b5800.
2020-11-30 17:36:12 -08:00
Amy Huang
1b63177a56 Revert "[llvm-symbolizer] Switch to using native symbolizer by default on Windows"
Breaks some asan tests on the buildbot.

This reverts commit c74b427cb2a90309ee0c29df21ad1ca26390263c.
2020-11-23 16:29:45 -08:00
Amy Huang
c74b427cb2 [llvm-symbolizer] Switch to using native symbolizer by default on Windows
llvm-symbolizer used to use the DIA SDK for symbolization on
Windows; this patch switches to using native symbolization, which was
implemented recently.

Users can still make the symbolizer use DIA by adding the `-dia` flag
in the LLVM_SYMBOLIZER_OPTS environment variable.

Differential Revision: https://reviews.llvm.org/D91814
2020-11-23 15:57:08 -08:00
David Blaikie
a67d164a82 Revert several changes related to llvm-symbolizer exiting non-zero on failure.
Seems users have enough different uses of the symbolizer where they
might have unknown binaries and offsets such that "best effort" behavior
is all that's expected of llvm-symbolizer - so even erroring on unknown
executables and out of bounds offsets might not be suitable.

This reverts commit 1de0199748ef2a20cd146c100ea1b8e6726c4767.
This reverts commit a7b209a6d40d77b43a38664b1fe64513587f24c6.
This reverts commit 338dd138ea4a70b52ab48e0c8aa38ec152b3569a.
2020-10-21 15:21:44 -07:00