185 Commits

Author SHA1 Message Date
Rahul Joshi
7d96b39c4f
[NFC][LLVM] Adopt ListSeparator/interleaved in more places (#172909)
Adopt `ListSeparator` and `interleaved` in various places instead of
manual code to print separators between loop iterations.
2026-01-12 12:18:01 -08:00
Rahul Joshi
7245e21e89
[NFC][Support] Add llvm::uninitialized_copy (#138174)
Add `llvm::uninitialized_copy` that accepts a range instead of start/end
iterator for the source of the copy.
2025-05-07 17:37:38 -07:00
Zequan Wu
78b21ddba7 Revert "Reland "Symbolize line zero as if no source info is available (#124846)" (#133798)"
This reverts commit 348374028970c956f2e49ab7553b495d7408ccd9 because #128619 doesn't handle the case when we have an empty frame from `getInliningInfoForAddress` because line num is 0 which makes it non-differentiable from missing debug info. So, we end up using the base filename from symtab again. Reverting for now until that issus is solved.
2025-04-09 18:09:31 -07:00
Zequan Wu
3483740289
Reland "Symbolize line zero as if no source info is available (#124846)" (#133798)
This land commits 23aca2f88dd5d2447e69496c89c3ed42a56f9c31 and
1b15a89a23c631a8e2d096dad4afe456970572c0.
https://github.com/llvm/llvm-project/pull/128619 makes symbolizer to
always use debug info when available so we can reland this chagnge.
2025-03-31 19:13:46 -04:00
Zequan Wu
23aca2f88d Revert "Symbolize line zero as if no source info is available (#124846)"
This commit creates an inconsistency on `__sanitizer_symbolize_pc` API. Before this change, this API always uses the filename from debug info when the line number is 0. After this change, the filename becomes invalid when line number is 0. The sanitizer might fall back to use base filename from symbol table. So, this API may return either `??:0` or `{filename from symbol table}:0` depending on if the symbol table has the filename for it. Make sure this inconsistency is resolved before relanding the commit.
2025-02-24 13:00:33 -08:00
David Blaikie
343bbda140 Use a stable sort to handle overlapping/duplicate line sequences
This can occur due to linker ICF and stable sort will ensure the results
are stable.

No explicit/new test coverage, because nondeterminism is non-testable.

It should already be covered by the DWARFDebugLineTest that was failing
some internal testing on an ARM machine which might've been what changed
the sort order. But `llvm::sort` also deliberately randomizes the
contents (under EXPENSIVE_CHECKS) so I'd have expected failures to show
up in any EXPENSIVE_CHECKS Build...
2025-02-07 23:50:41 +00:00
alx32
464f65adac
[DebugInfo][DWARF] Utilize DW_AT_LLVM_stmt_sequence attr in line table lookups (#123391)
**Summary**
Add support for filtering line table entries based on
`DW_AT_LLVM_stmt_sequence` attribute when looking up address ranges.
This ensures that line entries are correctly attributed to their
corresponding functions, even when multiple functions share the same
address range due to optimizations.

**Background**
In https://github.com/llvm/llvm-project/pull/110192 we added support to
clang to generate the `DW_AT_LLVM_stmt_sequence` attribute for
`DW_TAG_subprogram`'s. Corresponding RFC: [New DWARF Attribute for
Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)

The `DW_AT_LLVM_stmt_sequence` attribute allows accurate attribution of
line number information to their corresponding functions, even in
scenarios where functions are merged or share the same address space due
to optimizations like Identical Code Folding (ICF) in the linker.

**Implementation Details**
The patch modifies `DWARFDebugLine::lookupAddressRange` to accept an
optional DWARFDie parameter. When provided, the function checks if the
`DIE` has a `DW_AT_LLVM_stmt_sequence` attribute. This attribute
contains an offset into the line table that marks where the line entries
for this DIE's function begin.

If the attribute is present, the function filters the results to only
include line entries from the sequence that starts at the specified
offset. This ensures that even when multiple functions share the same
address range, we return only the line entries that actually belong to
the function represented by the DIE.

The implementation:
- Adds an optional DWARFDie parameter to lookupAddressRange
- Extracts the `DW_AT_LLVM_stmt_sequence` offset if present
- Modifies the address range lookup logic to filter sequences based on
their offset
- Returns only line entries from the matching sequence
2025-02-06 15:41:29 -08:00
David Blaikie
c32cd5746b
Symbolize line zero as if no source info is available (#124846)
Since line zero means "no line information", when symbolizing a location
(an address or an inline frame associated with the address) that has a
line zero location, we shouldn't include other irrelevant data (like
filename) in the result.
2025-02-06 10:45:43 -08:00
Simon Pilgrim
4859c46761 Fix gcc Wparentheses warning. NFC. 2024-08-07 16:28:06 +01:00
Amit Kumar Pandey
0886440ef0
[Symbolizer] Support for Missing Line Numbers. (#82240)
LLVM Symbolizer attempt to symbolize addresses of optimized binaries
reports missing line numbers for some cases. It maybe due to compiler
which sometimes cannot map an instruction to line number due to
optimizations. Symbolizer should handle those cases gracefully.

Adding an option '--skip-line-zero' to symbolizer so as to report the
nearest non-zero line number.

---------

Co-authored-by: Amit Pandey <amit.pandey@amd.com>
2024-08-05 13:38:34 +05:30
Greg Clayton
a23d4ceb88
[lldb][llvm] Return an error instead of crashing when parsing a line table prologue. (#80769)
We recently ran into some bad DWARF where the `DW_AT_stmt_list` of many
compile units was randomly set to invalid values and was causing LLDB to
crash due to an assertion about address sizes not matching. Instead of
asserting, we should return an appropriate recoverable `llvm::Error`.
2024-02-22 10:25:05 -08:00
Kazu Hirata
b7a66d0fae [llvm] Use SmallString::operator std::string (NFC) 2024-01-19 18:54:11 -08:00
Adrian Prantl
87e22bdd2b Allow for mixing source/no-source DIFiles in one CU
The DWARF proposal that the DW_LNCT_LLVM_source extension is based on
(https://dwarfstd.org/issues/180201.1.html) allows to mix source and
non-source files in the same CU by storing an empty string as a
sentinel value.

This patch implements this feature.

Review in https://github.com/llvm/llvm-project/pull/73877
2023-11-30 15:09:24 -08:00
David Stenberg
fe6cddef20 [DWARF] Allow op-index in line number programs
This extends DWARFDebugLine to properly parse line number programs with
maximum_operations_per_instruction > 1 for VLIW targets.

No functions that use that parsed output to retrieve line information
have been extended to support multiple op-indexes. This means that when
retrieving information for an address with multiple op-indexes, e.g.
when using llvm-addr2line, the penultimate row for that address will be
used, which in most cases is the row for the second largest op-index.
This will be addressed in further changes, but this patch at least
allows us to correctly parse such line number programs, with a warning
saying that the line number information may be incorrect (incomplete).

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D152536
2023-07-12 13:46:29 +02:00
David Stenberg
6aa94c64a5 [DWARF] Add printout for op-index
This is a preparatory patch for extending DWARFDebugLine to properly
parse line number programs with maximum_operations_per_instruction > 1
for VLIW targets.

Add some scaffolding for handling op-index in line number programs, and
add printouts for that in the table. As this affects a lot of tests,
this is done in a separate commit to get a cleaner review for the actual
op-index implementation.

Verbose printouts are not present in many tests, and adding op-index to
those will require a bit more code changes, so that is done in the
actual implementation patch.

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D152535
2023-07-12 12:03:44 +02:00
Benjamin Maxwell
8eb464f543 [DebugInfo] Allow parsing line tables aligned to 4 or 8-byte boundaries
This allows the DWARFDebugLine::SectionParser to try parsing line tables
at 4 or 8-byte boundaries if the unaligned offset appears invalid. If
aligning the offset does not reduce errors the offset is used unchanged.

This is needed for llvm-dwarfdump to be able to extract the line tables
(with --debug-lines) from binaries produced by certain compilers that
like to align each line table in the .debug_line section. Note that this
alignment does not seem to be invalid since the units do point to the
correct line table offsets via the DW_AT_stmt_list attribute.

Differential Revision: https://reviews.llvm.org/D143513
2023-03-22 16:30:01 +00:00
Fangrui Song
2fa744e631 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This commit fixes LLVMAnalysis and its dependencies.
2022-12-16 22:44:08 +00:00
Kazu Hirata
934942c033 [llvm] Don't include Optional.h (NFC)
These source files no longer use Optional<T>, so they do not need to
include Optional.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-06 22:34:50 -08:00
Kazu Hirata
595f1a6aaf [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 19:47:13 -08:00
Fangrui Song
89fab98e88 [DebugInfo] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-05 00:09:22 +00:00
Kazu Hirata
110115993c [DebugInfo] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:39 -08:00
Carlos Alberto Enciso
4f06d46f46 [llvm-debuginfo-analyzer] (08/09) - ELF Reader
llvm-debuginfo-analyzer is a command line tool that processes debug
info contained in a binary file and produces a debug information
format agnostic “Logical View”, which is a high-level semantic
representation of the debug info, independent of the low-level
format.

The code has been divided into the following patches:

1) Interval tree
2) Driver and documentation
3) Logical elements
4) Locations and ranges
5) Select elements
6) Warning and internal options
7) Compare elements
8) ELF Reader
9) CodeView Reader

Full details:
https://discourse.llvm.org/t/llvm-dev-rfc-llvm-dva-debug-information-visual-analyzer/62570

This patch:

This is a high level summary of the changes in this patch.

ELF Reader
- Support for ELF/DWARF.
  LVBinaryReader, LVELFReader

Reviewed By: psamolysov, probinson

Differential Revision: https://reviews.llvm.org/D125783
2022-10-27 05:37:51 +01:00
Fangrui Song
3329cec2f7 [DebugInfo] Don't join DW_AT_comp_dir and directories[0] for DWARF v5 line tables
DWARF v5 6.2.4 The Line Number Program Header says:

> The first entry is the current directory of the compilation. Each additional
> path entry is either a full path name or is relative to the current directory of
> the compilation.

When forming a path, relative DW_AT_comp_dir and directories[0] are not supposed
to be joined together. Fix getFileNameByIndex to special case DWARF v5 DirIdx == 0.

Reviewed By: #debug-info, dblaikie

Differential Revision: https://reviews.llvm.org/D131804
2022-08-12 14:01:52 -07:00
Kazu Hirata
3112987d5c Remove unused forward declarations (NFC) 2022-07-17 15:37:48 -07:00
Kazu Hirata
611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Hyoun Kyu Cho
6c12ae8163 Exposes interface to free up caching data structure in DWARFDebugLine and DWARFUnit for memory management
This is minimum changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of caching data structure. Due to multiple layers of interfaces that users could be using, it was not clear where to put the functionality. While we work out on where to put that functionality, it'll be great to add this minimum interface change so that the user could implement their own memory management. More specifically:

* Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap.
* DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D90006
2022-05-24 03:23:24 +00:00
Argyrios Kyrtzidis
330268ba34 [Support/Hash functions] Change the final() and result() of the hashing functions to return an array of bytes
Returning `std::array<uint8_t, N>` is better ergonomics for the hashing functions usage, instead of a `StringRef`:

* When returning `StringRef`, client code is "jumping through hoops" to do string manipulations instead of dealing with fixed array of bytes directly, which is more natural
* Returning `std::array<uint8_t, N>` avoids the need for the hasher classes to keep a field just for the purpose of wrapping it and returning it as a `StringRef`

As part of this patch also:

* Introduce `TruncatedBLAKE3` which is useful for using BLAKE3 as the hasher type for `HashBuilder` with non-default hash sizes.
* Make `MD5Result` inherit from `std::array<uint8_t, 16>` which improves & simplifies its API.

Differential Revision: https://reviews.llvm.org/D123100
2022-04-05 21:38:06 -07:00
serge-sans-paille
290e482342 Cleanup LLVMDWARFDebugInfo
As usual with that header cleanup series, some implicit dependencies now need to
be explicit:

llvm/DebugInfo/DWARF/DWARFContext.h no longer includes:
- "llvm/DebugInfo/DWARF/DWARFAcceleratorTable.h"
- "llvm/DebugInfo/DWARF/DWARFCompileUnit.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h"
- "llvm/DebugInfo/DWARF/DWARFDebugAranges.h"
- "llvm/DebugInfo/DWARF/DWARFDebugFrame.h"
- "llvm/DebugInfo/DWARF/DWARFDebugLoc.h"
- "llvm/DebugInfo/DWARF/DWARFDebugMacro.h"
- "llvm/DebugInfo/DWARF/DWARFGdbIndex.h"
- "llvm/DebugInfo/DWARF/DWARFSection.h"
- "llvm/DebugInfo/DWARF/DWARFTypeUnit.h"
- "llvm/DebugInfo/DWARF/DWARFUnitIndex.h"

Plus llvm/Support/Errc.h not included by a bunch of llvm/DebugInfo/DWARF/DWARF*.h files

Preprocessed lines to build llvm on my setup:
after: 1065629059
before: 1066621848

Which is a great diff!

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119723
2022-02-15 09:16:03 +01:00
David Blaikie
71e5488a19 DebugInfo: Migrate callers from getAsCString to dwarf::toString
This makes a bunch of these call sites independent of a follow-up change
I'm making to have getAsCString return Expected<const char*> for more
descriptive error messages so that the failures there can be
communicated up to DWARFVerifier (or other callers who want to provide
more verbose diagnostics) so DWARFVerifier doesn't have to re-implement
the string lookup logic and error checking.
2021-12-14 14:50:43 -08:00
David Blaikie
628a319475 llvm-dwarfdump: Print addresses in debug_line to the parsed address size 2020-10-04 16:05:49 -07:00
David Blaikie
8036cf7f54 llvm-dwarfdump: Skip tombstoned address ranges
Make the dumper & API a bit more informative by using the new tombstone
addresses to filter out or otherwise render more explicitly dead code
ranges.
2020-10-04 13:43:29 -07:00
David Blaikie
51a505340d DebugInfo: Simplify line table parsing to take all the units together, rather than CUs and TUs separately 2020-09-18 11:18:23 -07:00
Petr Hosek
9c73e55510 Revert "[DebugInfo] Remove dots from getFilenameByIndex return value"
This is failing on Windows bots due to path separator normalization.

This reverts commit 042c23506869b4ae9a49d2c4bc5ea6e6baeabe78.
2020-09-15 10:06:47 -07:00
Petr Hosek
042c235068 [DebugInfo] Remove dots from getFilenameByIndex return value
When concatenating directory with filename in getFilenameByIndex, we
might end up with a path that contains extra dots. For example, if the
input is /path and ./example, we would return /path/./example. Run
sys::path::remove_dots on the output to eliminate unnecessary dots.

Differential Revision: https://reviews.llvm.org/D87657
2020-09-14 20:19:06 -07:00
Greg Clayton
e1de85f9f4 Add verification for DW_AT_decl_file and DW_AT_call_file.
LTO builds have been creating invalid DWARF and one of the errors was a file index that was out of bounds. "llvm-dwarfdump --verify" will check all file indexes for line tables already, but there are no checks for the validity of file indexes in attributes.

The verification will verify if there is a DW_AT_decl_file/DW_AT_call_file that:
- there is a line table for the compile unit
- the file index is valid
- the encoding is appropriate

Tests are added that test all of the above conditions.

Differential Revision: https://reviews.llvm.org/D84817
2020-08-05 15:30:13 -07:00
James Henderson
9e09a54c69 [DebugInfo] Use Cursor to detect errors in debug line prologue parser
Previously, the debug line parser would keep attempting to read data
even if it had run out of data to read. This meant errors in parsing
would often end up being reported as something else, such as an unknown
version or malformed directory/filename table. This patch fixes the
issues by using the Cursor API to capture errors.

Reviewed by: labath

Differential Revision: https://reviews.llvm.org/D83043
2020-07-03 11:52:06 +01:00
James Henderson
9782c922cb [DebugInfo] Print line table extended opcode bytes if parsing fails
Previously, if there was an error whilst parsing the operands of an
extended opcode, the operands would be treated as zero and printed. This
could potentially be slightly confusing. This patch changes the
behaviour to print the raw bytes instead.

Reviewed by: ikudrin

Differential Revision: https://reviews.llvm.org/D81570
2020-06-23 10:04:02 +01:00
James Henderson
b21794a91c [DebugInfo] Unify Cursor usage for all debug line opcodes
This is a natural extension of the previous changes to use the Cursor
class independently in the standard and extended opcode paths, and in
turn allows delaying error handling until the entire line has been
printed in verbose mode, removing interleaved output in some cases.

Reviewed by: MaskRay, JDevlieghere

Differential Revision: https://reviews.llvm.org/D81562
2020-06-17 09:19:24 +01:00
James Henderson
1a78904752 [DebugInfo] Report errors for truncated debug line standard opcode
Standard opcodes usually have ULEB128 arguments, so it is generally not
possible to recover from such errors. This patch causes the parser to
stop parsing the table in such situations.

Also don't emit the operands or add data to the table if there is an
error reading these opcodes.

Reviewed by: JDevlieghere

Differential Revision: https://reviews.llvm.org/D81470
2020-06-15 11:50:12 +01:00
Pavel Labath
9ed452f370 [llvm/DWARFDebugLine] Remove spurious full stop from warning messages
Other warnings messages don't have a trailing full stop.
2020-06-11 13:14:21 +02:00
Pavel Labath
fccaa89e23 [llvm/DWARFDebugLine] Fix a typo in one warning message 2020-06-11 13:04:52 +02:00
Pavel Labath
6f55b5a101 [DWARFDebugLine] Use truncating data extractors for prologue parsing
Summary:
This makes the code easier to reason about, as it will behave the same
way regardless of whether there is any more data coming after the
presumed end of the prologue.

Reviewers: jhenderson, dblaikie, probinson, ikudrin

Subscribers: hiraditya, MaskRay, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77557
2020-06-10 16:12:53 +02:00
Fangrui Song
81cca98768 [DebugInfo] Drop unneeded format() calls (fix -Wformat-security) after 3b7ec64d59748765990ed99716034ab8d5533673 2020-06-09 09:56:13 -07:00
James Henderson
3b7ec64d59 [DebugInfo] Fix printing of unrecognised standard opcodes
The verbose printing of unrecognised standard opcodes was broken in
multiple ways (additional blank lines, a closing parenthesis without
opening parenthesis and so on). This patch fixes it, and makes the
output more consistent with other opcodes.
2020-06-09 14:32:20 +01:00
James Henderson
e3547ade68 [DebugInfo] Improve new line printing in debug line verbose output
The new line printing for debug line verbose output was inconsistent.
For new rows in the matrix, a blank line followed, whilst the
DW_LNS_copy opcode actually resulted in two blank lines. There was also
potential inconsistency in the blank lines at the end of the table. This
patch mostly resolves these issues - no blank lines appear in the output
except for a single line after the prologue and at table end to separate
it from any subsquent table, plus some instances after error messages.

Also add a unit test for verbose output to test the fine details of new
line placement and other aspects of verbose output.

Reviewed by: dblaikie

Differential Revision: https://reviews.llvm.org/D81102
2020-06-09 14:27:16 +01:00
James Henderson
dbd26fe0b6 [DebugInfo] Print non-verbose output at some point as verbose output
Verbose and non-verbose parsing of .debug_line produced their output at
different points in the program. The most obvious impact of this was
that error messages were produced at different times, but it also
potentially reduced what clients could do by customising the stream or
warning/error handlers.

This change makes the two variants consistent by printing non-verbose
output inline, the same as verbose output.

Testing of the error messages has been modified to check the messages
always appear in the same location to illustrate the behaviour.

Reviewed by: JDevlieghere, dblaikie, MaskRay, labath

Differential Revision: https://reviews.llvm.org/D80989
2020-06-09 14:24:53 +01:00
James Henderson
5777570d24 [DebugInfo] Check for errors when reading data for extended opcode
Previously, if an extended opcode was truncated, it would manifest as an
"unexpected line op length error" which wasn't quite accurate. This
change checks for errors any time data is read whilst parsing an
extended opcode, and reports any errors detected.

Reviewed by: MaskRay, labath, aprantl

Differential Revision: https://reviews.llvm.org/D80797
2020-06-09 09:56:37 +01:00
Fangrui Song
9be3567df2 [llvm-dwarfdump] Add a table header for -debug-line -verbose output
Like non-verbose output, so that it is easy to recognize the `Line,Column,File,ISA,Discriminator` column values.

Reviewed By: JDevlieghere, jhenderson

Differential Revision: https://reviews.llvm.org/D80874
2020-06-04 08:56:17 -07:00