Reland #174731, resolve cyclic dependency issue.
The use of LLVM_Object in LLVM_Util would cause cyclic dependency.
Fix cyclic dependency by reimplement `getFeatureSetFromEFlag()`.
Original description:
---
This PR updates llvm-objdump to detect the specific AVR architecture
from the ELF header flags when no specific CPU is provided.
Fixes: https://github.com/llvm/llvm-project/issues/146451
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Reverts llvm/llvm-project#174731 due to introducing a cyclic dependency
when building LLVM with modules enabled: LLVM_Utils -> LLVM_Object ->
LLVM_Utils
This PR updates llvm-objdump to detect the specific AVR architecture
from the ELF header flags when no specific CPU is provided.
Fixes: #146451
---------
Signed-off-by: RuoyuQiu <cabbaken@outlook.com>
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Co-authored-by: qiuruoyu <qiuruoyu@hygon.cn>
This adds support for `--symbolize-operands`, so that local references
are turned back into labels by objdump, which makes it easier to tell
what is going on with a linked object.
When using `--symbolize-operands`, branch target addresses are not
printed, only the referenced symbol is printed, and the address is
elided:
```
# Without --symbolize-operands
0: 04a05263 blez a0, 0x44 <.text+0x44>
...
40: fd1ff06f j 0x10 <.text+0x10>
44: 00000613 li a2, 0x0
# With --symbolize-operands
0: 04a05263 blez a0, <L3>
...
40: fd1ff06f j <L0>
<L3>:
44: 00000613 li a2, 0x0
```
The --disassembler-color option was being ignored when using the --macho
flag because DisassembleMachO() never called setUseColor() on the
instruction printer.
Use getRISCVVendorRelocationTypeName to resolve RISCV vendor-specific
relocation names (R_RISCV_CUSTOM192-255) when preceded by
R_RISCV_VENDOR.
This improves the output of llvm-readobj and llvm-objdump to show
vendor-specific names like R_RISCV_QC_ABS20_U, R_RISCV_QC_E_BRANCH
(QUALCOMM) and R_RISCV_NDS_BRANCH_10 (ANDES) instead of generic
R_RISCV_CUSTOM* names.
Per RISC-V psABI, R_RISCV_VENDOR must be placed immediately before its
associated vendor-specific relocation, so the vendor symbol is consumed
after one use. Unknown vendors fall back to R_RISCV_CUSTOM*.
Reland #165661 with fix for memory leak.
The call to `DummyTarget->createMCSubtargetInfo` within `mcpuHelp()`
returns a pointer that is not subsequently freed, leading to a memory
leak. Use `std::unique_ptr` to ensure the memory is released
automatically.
Original description:
---
Currently --mcpu=help and --mattr=help only produce help out when
disassembling. This patch specialises these cases to always print the
requested help.
If --triple is specified, the help text will be derived from the
specified target. Otherwise, it will be derived from the target of the
first input file.
Fixes: https://github.com/llvm/llvm-project/issues/150567
Currently `--mcpu=help` and `--mattr=help` only produce help out when
disassembling. This patch specialises these cases to always print the
requested help.
If `--triple` is specified, the help text will be derived from the
specified target. Otherwise, it will be derived from the target of the
first input file.
Fixes: #150567
---------
Signed-off-by: Ruoyu Qiu <cabbaken@outlook.com>
Co-authored-by: James Henderson <James.Henderson@sony.com>
This patch significantly optimizes the LiveElementPrinter
by replacing a slow linear search with efficient hash map
lookups. It refactors the code to use a map-based system
for tracking live element addresses and managing column
assignments, leading to a major performance improvement
for large binaries.
Hexagon packets can visually straddle labels when data (e.g. jump
tables) in the text section does not carry end-of-packet bits. In such
cases the next instruction, even at a new symbol, appears to continue
the previous packet.
This patch resets packet state when encountering a new symbol so that
packets at symbol starts are guaranteed to start in their own packet.
This implements very basic support for RISC-V mapping symbols in
llvm-objdump, sharing the implementation with how Arm/AArch64/CSKY
implement this feature.
This only supports the `$x` (instruction) and `$d` (data) mapping
symbols for RISC-V, and not the version of `$x` which includes an
architecture string suffix.
This change adds boilerplate code to implement the object::ObjectFile
interface for the DXContainer object file and an empty implementation of
the objdump Dumper object.
Adding an ObjectFile implementation for DXContainer is a bit odd because
the DXContainer format doesn't have a symbol table, so there isn't a
reasonable implementation for the SymbolicFile interfaces. That said, it
does have sections, and it will be useful for objdump to be able to
inspect some of the structured data stored in some of the special named
sections.
At this point in the implementation it can't do much other than dump the
part names, offsets, and sizes. Dumping detailed structured section
contents to be extended in subsequent PRs.
Fixes#151433
This patch adds the support for displaying inlined functions into
llvm-objdump.
1) It extends the source variable display
support for inlined functions both for ascii and unicode formats.
2) It also introduces a new format called limits-only that only prints a
line for the start and end of an inlined function without line-drawing
characters.
Hexagon instructions are VLIW "bundles" of up to four instruction words
encoded as a single MCInst with operands for each sub-instruction.
Previously, the disassembler's getInstruction() returned the full
bundle, which made it difficult to work with llvm-objdump.
For example, since all instructions are bundles, and bundles do not
branch, branch targets could not be printed.
This patch modifies the Hexagon disassembler to return individual
sub-instructions instead of entire bundles, enabling correct printing of
branch targets and relocations. It also introduces
`MCDisassembler::getInstructionBundle` for cases where the full bundle
is still needed.
By default, llvm-objdump separates instructions with newlines. However,
this does not work well for Hexagon syntax:
{ inst1
inst2
inst3
inst4 <branch> } :endloop0
Instructions may be followed by a closing brace, a closing brace with
`:endloop`, or a newline. Branches must appear within the braces.
To address this, `PrettyPrinter::getInstructionSeparator()` is added and
overridden for Hexagon.
Similar to the existing implementations for X86 and PPC, support
symbolizing branch targets for AArch64. Do not omit the address for ADRP
as the target is typically not at an intended location.
Pull Request: https://github.com/llvm/llvm-project/pull/145009
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Utilize the new extensions to the LLVM Offloading API to extend to
llvm-objdump to handle dumping fatbin offload bundles generated by HIP.
This extension to llvm-objdump adds the option --offload-fatbin.
Specifying this option will take the input object/executable and extract
all offload fatbin bundle entries into distinct code object files with
names reflecting the source file name combined with the Bundle Entry ID.
Users can also use the --arch-name option to filter offload fatbin
bundle entries by their target triple.
---------
Co-authored-by: dsalinas <dsalinas@MKM-L1-DSALINAS.amd.com>
llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).
Specify an adjust address to fix SystemZInstPrinter output.
The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.
Pull Request: https://github.com/llvm/llvm-project/pull/140471
Utilize the new extensions to the LLVM Offloading API to extend to
llvm-objdump to handle dumping fatbin offload bundles generated by HIP.
This extension to llvm-objdump adds the option --offload-fatbin.
Specifying this option will take the input object/executable and extract
all offload fatbin bundle entries into distinct code object files with
names reflecting the source file name combined with the Bundle Entry ID.
Users can also use the --arch-name option to filter offload fatbin
bundle entries by their target triple.
---------
Co-authored-by: dsalinas <dsalinas@MKM-L1-DSALINAS.amd.com>
--adjust-vma adjusts the current section address. The address printed
for inline relocs is relative to the current section address instead of
the section that the referenced symbol resides in.
Fix https://github.com/llvm/llvm-project/issues/75444
Utilize the new extensions to the LLVM Offloading API to extend to
llvm-objdump to handle dumping fatbin offload bundles generated by HIP.
This extension to llvm-objdump adds the option --offload-fatbin.
Specifying this option will take the input object/executable and extract
all offload fatbin bundle entries into distinct code object files with
names reflecting the source file name combined with the Bundle Entry ID.
Users can also use the --arch-name option to filter offload fatbin
bundle entries by their target triple.
---------
Co-authored-by: dsalinas <dsalinas@MKM-L1-DSALINAS.amd.com>
This implements arm, armeb, thumb, thumbeb PLT entries parsing support
in ELF for llvm-objdump.
Implementation is similar to AArch64MCInstrAnalysis::findPltEntries. PLT
entry signatures are based on LLD code for PLT generation
(ARM::writePlt).
llvm-objdump tests are produced from lld/test/ELF/arm-plt-reloc.s,
lld/test/ELF/armv8-thumb-plt-reloc.s.
DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently
gained C++23-style insert_range. This patch replaces:
Dest.insert(Src.begin(), Src.end());
with:
Dest.insert_range(Src);
This patch does not touch custom begin like succ_begin for now.
and fix crash when vd_aux is invalid (#86611).
vd_version, vd_flags, vd_ndx, and vd_cnt in Elf{32,64}_Verdef are
16-bit. Change VerDef to use uint16_t instead.
vda_name specifies a NUL-terminated string. Update getVersionDefinitions
to remove some `.c_str()`.
Pull Request: https://github.com/llvm/llvm-project/pull/128434
Now that we have a dedicated abstraction for string tables, switch the
option parser library's string table over to it rather than using a raw
`const char*`. Also try to use the `StringTable::Offset` type rather
than a raw `unsigned` where we can to avoid accidental increments or
other issues.
This is based on review feedback for the initial switch of options to a
string table. Happy to tweak or adjust if desired here.
Apologies for the large change, I looked for ways to break this up and
all of the ones I saw added real complexity. This change focuses on the
option's prefixed names and the array of prefixes. These are present in
every option and the dominant source of dynamic relocations for PIE or
PIC users of LLVM and Clang tooling. In some cases, 100s or 1000s of
them for the Clang driver which has a huge number of options.
This PR addresses this by building a string table and a prefixes table
that can be referenced with indices rather than pointers that require
dynamic relocations. This removes almost 7k dynmaic relocations from the
`clang` binary, roughly 8% of the remaining dynmaic relocations outside
of vtables. For busy-boxing use cases where many different option tables
are linked into the same binary, the savings add up a bit more.
The string table is a straightforward mechanism, but the prefixes
required some subtlety. They are encoded in a Pascal-string fashion with
a size followed by a sequence of offsets. This works relatively well for
the small realistic prefixes arrays in use.
Lots of code has to change in order to land this though: both all the
option library code has to be updated to use the string table and
prefixes table, and all the users of the options library have to be
updated to correctly instantiate the objects.
Some follow-up patches in the works to provide an abstraction for this
style of code, and to start using the same technique for some of the
other strings here now that the infrastructure is in place.
Swap `!DisassembleZeroes` and `if (DumpARMELFData)` conditions so that
in the false DisassembleZeroes case (default), `...` will be printed for
long consecutive zeroes, even when a data mapping symbol is active.
This is especially useful for certain lld tests that insert a huge
padding within a code section. Without `...` the output will be huge.
Pull Request: https://github.com/llvm/llvm-project/pull/109553
The section field has been repurposed for some STAB symbol types, and if
we blindly look it up we'll produce an error and terminate. Logic
already existed
Existing stabs test had a section that was in range. Unfortunately I
don't know of an easy way to produce stabs entries in LLVM (I thought
they died in the 90s until this came up) so I just binary-edited it to
cause a failure on existing llvm-objdump.
Enable local labels computation for BPF disassembly when
`--symbolize-operands` option is specified.
This relies on `MCInstrAnalysis::evaluateBranch()` method, which is
already defined in `BPFMCInstrAnalysis::evaluateBranch`.
After this change the assembly code below:
if r1 > 42 goto +1
r1 -= 10
...
Would be printed as:
if r1 > 42 goto +1 <L0>
r1 -= 10
<L0>:
...
(when `--symbolize-operands` option is set).
See https://reviews.llvm.org/D84191 for the main part of the
`--symbolize-operands` implementation logic.
Extract the llvm-readelf decoder to `decodeCrel` (#91280) and reuse it
for llvm-objdump.
Because the section representation of LLVMObject (`SectionRef`) is
64-bit, insufficient to hold all decoder states, `section_rel_begin` is
modified to decode CREL eagerly and hold the decoded relocations inside
ELFObjectFile<ELFT>.
The test is adapted from llvm/test/tools/llvm-readobj/ELF/crel.test.
Pull Request: https://github.com/llvm/llvm-project/pull/97382
These mostly are checking for various reserved bits being set. The diagnostics
for gpu-dependent reserved bits have a bit more context since they seem like the
most likely ones to be observed in practice.
This commit also improves the error handling mechanism for
MCDisassembler::onSymbolStart(). Previously it had a comment stream parameter
that was just being ignored by llvm-objdump, now it returns errors using
Expected<T>.
The color methods in formatted_raw_ostream were forwarding directly to
the underlying stream without considering existing buffered output. This
would cause incorrect colored output for buffered uses of
formatted_raw_ostream.
Fix this issue by applying the color to the formatted_raw_ostream itself
and temporarily disabling scanning of any color related output so as not
to affect the position tracking.
This fix means that workarounds that forced formatted_raw_ostream
buffering to be disabled can be removed. In the case of llvm-objdump,
this can improve disassembly performance when redirecting to a file by
more than an order of magnitude on both Windows and Linux. This
improvement restores the disassembly performance when redirecting to a
file to a level similar to before color support was added.
Primary change is to add a flag `--pretty-pgo-analysis-map` to
llvm-readobj and llvm-objdump that prints block frequencies and branch
probabilities in the same manner as BFI and BPI respectively. This can
be helpful if you are manually inspecting the outputs from the tools.
In order to print, I moved the `printBlockFreqImpl` function from
Analysis to Support and renamed it to `printRelativeBlockFreq`.