174 Commits

Author SHA1 Message Date
yonghong-song
03958680b2
[BPF] Make llvm-objdump disasm default cpu v4 (#102166)
Currently, with the following example,
  $ cat t.c
  void foo(int a, _Atomic int *b)
  {
   *b &= a;
  }
  $ clang --target=bpf -O2 -c -mcpu=v3 t.c
  $ llvm-objdump -d t.o
  t.o:    file format elf64-bpf

  Disassembly of section .text:

  0000000000000000 <foo>:
       0:       c3 12 00 00 51 00 00 00 <unknown>
       1:       95 00 00 00 00 00 00 00 exit

Basically, the default cpu for llvm-objdump is v1 and it won't be able
to decode insn properly.

If we add --mcpu=v3 to llvm-objdump command line, we will have
  $ llvm-objdump -d --mcpu=v3 t.o

  t.o:    file format elf64-bpf

  Disassembly of section .text:

  0000000000000000 <foo>:
0: c3 12 00 00 51 00 00 00 w1 = atomic_fetch_and((u32 *)(r2 + 0x0), w1)
       1:       95 00 00 00 00 00 00 00 exit

The atomic_fetch_and insn can be decoded properly. Using latest cpu
version --mcpu=v4 can also decode properly like the above --mcpu=v3.

To avoid the above '<unknown>' decoding with common 'llvm-objdump -d
t.o', this patch marked the default cpu for llvm-objdump with the
current highest cpu number v4 in ELFObjectFileBase::tryGetCPUName(). The
cpu number in ELFObjectFileBase::tryGetCPUName() will be adjusted in the
future if cpu number is increased e.g. v5 etc. Such an approach also
aligns with gcc-bpf as discussed in [1].

Six bpf unit tests are affected with this change. I changed test output
for three unit tests and added --mcpu=v1 for the other three unit tests,
to demonstrate the default (cpu v4) behavior and explicit --mcpu=v1
behavior.

[1]
https://lore.kernel.org/bpf/6f32c0a1-9de2-4145-92ea-be025362182f@linux.dev/T/#m0f7e63c390bc8f5a5523e7f2f0537becd4205200

Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
2024-08-06 18:23:46 -07:00
Fangrui Song
2f37a22f10
[llvm-objdump] -r: support CREL
Extract the llvm-readelf decoder to `decodeCrel` (#91280) and reuse it
for llvm-objdump.

Because the section representation of LLVMObject (`SectionRef`) is
64-bit, insufficient to hold all decoder states, `section_rel_begin` is
modified to decode CREL eagerly and hold the decoded relocations inside
ELFObjectFile<ELFT>.

The test is adapted from llvm/test/tools/llvm-readobj/ELF/crel.test.

Pull Request: https://github.com/llvm/llvm-project/pull/97382
2024-07-08 09:14:34 -07:00
Shilei Tian
1ca0055f45
[AMDGPU] Add a new target gfx1152 (#94534) 2024-06-06 12:16:11 -04:00
Konstantin Zhuravlyov
775f1cd34d
AMDGPU: Add gfx12-generic target (#93875) 2024-05-31 12:46:44 -04:00
Craig Topper
733a87783c
[RISCV] Split code that tablegen needs out of RISCVISAInfo. (#89684)
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.

This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
2024-04-23 15:12:36 -07:00
quic-areg
31f4b329c8
[Hexagon] ELF attributes for Hexagon (#85359)
Defines a subset of attributes and emits them to a section called
.hexagon.attributes.

The current attributes recorded are the attributes needed by
llvm-objdump to automatically determine target features and eliminate
the need to manually pass features.
2024-03-19 16:22:30 -05:00
Pierre van Houtryve
43c7eb5d7b
[AMDGPU] Replace '.' with '-' in generic target names (#81718)
The dot is too confusing for tools. Output temporaries would have
'10.3-generic' so tools could parse it as an extension, device libs &
the associated clang driver logic are also confused by the dot.

After discussions, we decided it's better to just remove the '.' from
the target name than fix each issue one by one.
2024-02-14 15:19:04 +01:00
Pierre van Houtryve
f93aa5157a
[AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (#76955)
These generic targets include multiple GPUs and will, in the future,
provide a way to build once and run on multiple GPU, at the cost of less
optimization opportunities.

Note that this is just doing the compiler side of things, device libs an
runtimes/loader/etc. don't know about these targets yet, so none of them
actually work in practice right now. This is just the initial commit to
make LLVM aware of them.

This contains the documentation changes for both this change and #76954
as well.
2024-02-12 10:18:20 +01:00
Craig Topper
8c37e3e64b
[RISCV] Only set Zca flag for EF_RISCV_RVC in ELFObjectFileBase::getRISCVFeatures(). (#80928)
This code appears to be a hack to set the features to include compressed
instructions if the ELF EFLAGS flags bit is present, but the ELF
attribute for the ISA string is no present or not accurate.

We can't remove the hack because llvm-mc doesn't create ELF attributes
by default so a lot of tests fail to disassembler properly. Using clang
as the assembler does set the attributes.

This patch changes the hack to only set Zca since that is the minimum
implied by the flag. Setting anything else potentially conflicts with
the ISA string containing Zcmp or Zcmt.

JITLink also needs to be updated to recognize Zca in addition to C.
2024-02-07 08:23:57 -08:00
Alexandre Ganea
43ab40a5ba [llvm] Silence warning when building with Clang ToT
This fixes:
```
[1343/7452] Building CXX object lib\Object\CMakeFiles\LLVMObject.dir\ELFObjectFile.cpp.obj
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(808,27): warning: comparison of integers of different signs: 'unsigned int' and '_Iter_diff_t<const Elf_Shdr_Impl<ELFType<llvm::endianness::little, false>> *>' (aka 'int') [-Wsign-compare]
  808 |     if (*TextSectionIndex != std::distance(Sections.begin(), *TextSecOrErr))
      |         ~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(913,12): note: in instantiation of function template specialization 'readBBAddrMapImpl<llvm::object::ELFType<llvm::endianness::little, false>>' requested here
  913 |     return readBBAddrMapImpl(Obj->getELFFile(), TextSectionIndex, PGOAnalyses);
      |            ^
```
2024-01-25 09:34:19 -05:00
Aiden Grossman
c067524852
[SHT_LLVM_BB_ADDR_MAP] Add assertion and clarify docstring (#77374)
This patch adds an assertion to readBBAddrMapImpl to confirm that
PGOAnalyses and BBAddrMaps are of the same size when PGO information is
requested (part of the API contract). This patch also updates the
docstring for readBBAddrMap to better clarify what is guaranteed.
2024-01-19 11:34:00 -08:00
Luke Lau
db78c30ba7
[RISCV] Deduplicate RISCVISAInfo::toFeatures/toFeatureVector. NFC (#76942)
toFeatures and toFeatureVector both output a list of target feature
flags, just with a slightly different interface. toFeatures keeps any
unsupported extensions, and also provides a way to append negative
extensions (AddAllExtensions=true).

This patch combines them into one function, so that a later patch will
be be able to get a std::vector of features that includes all the
negative extensions, which was previously only possible through the
StrAlloc interface.
2024-01-09 15:33:51 +07:00
Joseph Huber
deab58d127
[ELF] Add CPU name detection for CUDA architectures (#75964)
Summary:
Recently we added support for detecting the CUDA processor with the ELF
flags. This allows us to get a string representation of it in other
code. This will be used by the offloading runtime.
2023-12-19 20:01:15 -06:00
Micah Weston
105adf2cd9
[SHT_LLVM_BB_ADDR_MAP] Implements PGOAnalysisMap in Object and ObjectYAML with tests.
Reviewed in PR (#71750). A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch
and Block
Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).

This PR adds the PGOAnalysisMap data structure and implements encoding and
decoding through Object and ObjectYAML along with associated tests. When
emitted into the bb-addr-map section, each function is followed by the associated
pgo-analysis-map for that function. The emitting of each analysis in the map is
controlled by a bit in the bb-addr-map feature byte. All existing bb-addr-map
code can ignore the pgo-analysis-map if the caller does not request the data.
2023-12-12 10:23:16 -05:00
Jay Foad
cf1e0c0b07
[AMDGPU] Define new targets gfx1200 and gfx1201 (#73133)
Define target names and ELF numbers for new GFX12 targets gfx1200 and
gfx1201. For now they behave identically to GFX11.
2023-11-23 16:44:05 +00:00
Jay Foad
92542f2a40 [AMDGPU] Add targets gfx1150 and gfx1151
This is the target definition only. Currently they are treated the same
as GFX 11.0.x.

Differential Revision: https://reviews.llvm.org/D155429
2023-07-17 13:06:12 +01:00
Fangrui Song
a4d1259e61 [llvm-objdump] Default to --mcpu=future for PPC32
Extend D127824 to the 32-bit Power architecture.
AFAICT GNU objdump -d dumps all instructions for 32-bit as well.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D155114
2023-07-12 18:24:18 -07:00
Job Noorman
8de9f2b558 Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.

Note that I choose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.

This issues came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D150549
2023-06-26 11:20:08 +02:00
Fangrui Song
9e37a7bd1f [llvm-objdump][X86] Add @plt symbols for .plt.got
If a symbol needs both JUMP_SLOT and GLOB_DAT relocations, there is a
minor linker optimization to keep just GLOB_DAT. This optimization
is only implemented by GNU ld's x86 port and mold.
https://maskray.me/blog/2021-08-29-all-about-global-offset-table#combining-.got-and-.got.plt

With the optimizing, the PLT entry is placed in .plt.got and the
associated GOTPLT entry is placed in .got (ld.bfd -z now) or .got.plt (ld.bfd -z lazy).
The relocation is in .rel[a].dyn.

This patch synthesizes `symbol@plt` labels for these .plt.got entries.

Example:
```
cat > a.s <<e
.globl _start; _start:
mov combined0@gotpcrel(%rip), %rax; mov combined1@gotpcrel(%rip), %rax
call combined0@plt; call combined1@plt
call foo0@plt; call foo1@plt
e
cat > b.s <<e
.globl foo0, foo1, combined0, combined1
foo0: foo1: combined0: combined1:
e
gcc -fuse-ld=bfd -shared b.s -o b.so
gcc -fuse-ld=bfd -pie -nostdlib a.s b.so -o a
```

```
Disassembly of section .plt:

0000000000001000 <.plt>:
    1000: ff 35 ea 1f 00 00             pushq   0x1fea(%rip)            # 0x2ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
    1006: ff 25 ec 1f 00 00             jmpq    *0x1fec(%rip)           # 0x2ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
    100c: 0f 1f 40 00                   nopl    (%rax)

0000000000001010 <foo1@plt>:
    1010: ff 25 ea 1f 00 00             jmpq    *0x1fea(%rip)           # 0x3000 <_GLOBAL_OFFSET_TABLE_+0x18>
    1016: 68 00 00 00 00                pushq   $0x0
    101b: e9 e0 ff ff ff                jmp     0x1000 <.plt>

0000000000001020 <foo0@plt>:
    1020: ff 25 e2 1f 00 00             jmpq    *0x1fe2(%rip)           # 0x3008 <_GLOBAL_OFFSET_TABLE_+0x20>
    1026: 68 01 00 00 00                pushq   $0x1
    102b: e9 d0 ff ff ff                jmp     0x1000 <.plt>

Disassembly of section .plt.got:

0000000000001030 <combined0@plt>:
    1030: ff 25 a2 1f 00 00             jmpq    *0x1fa2(%rip)           # 0x2fd8 <foo1+0x2fd8>
    1036: 66 90                         nop

0000000000001038 <combined1@plt>:
    1038: ff 25 a2 1f 00 00             jmpq    *0x1fa2(%rip)           # 0x2fe0 <foo1+0x2fe0>
    103e: 66 90                         nop
```

For x86-32, with -z now, if we remove `foo0` and `foo1`, the absence of regular
PLT will cause GNU ld to omit .got.plt, and our code cannot synthesize @plt
labels. This is an extreme corner case that almost never happens in practice (to
trigger the case, ensure every PLT symbol has been taken address). To fix it, we
can get the `_GLOBAL_OFFSET_TABLE_` symbol value, but the complexity is not
worth it.

Close https://github.com/llvm/llvm-project/issues/62537

Reviewed By: bd1976llvm

Differential Revision: https://reviews.llvm.org/D149817
2023-05-16 09:22:21 -07:00
Konstantin Zhuravlyov
9d05727972 AMDGPU: Add basic gfx942 target
Differential Revision: https://reviews.llvm.org/D149983
2023-05-10 11:51:06 -04:00
Konstantin Zhuravlyov
1fc70210a6 AMDGPU: Add basic gfx941 target
Differential Revision: https://reviews.llvm.org/D149982
2023-05-10 11:51:06 -04:00
Fangrui Song
b05cd680ea MCInstrAnalysis: make GotPltSectionVA x86-32 specific
GotPltSectionVA is specific to x86-32 PIC PLT entries.
Let's remove the argument from the generic interface.

As a side effect of not requiring .got.plt, this simplification
addresses a subset of https://github.com/llvm/llvm-project/issues/62537
by enabling .plt dumping for some ld.bfd -z now linked x86-32/x86-64 images
without .got.plt
2023-05-03 19:21:01 -07:00
Alex Bradbury
91c6174ce3 [RISCV] Allow llvm-objdump to disassemble objects with unrecognised versions of known extensions
This Moves ELFObjectFile to using
RISCVISAInfo::parseNormalizedArchString which is not an NFC, as the test
changes show. D144353 transitioned LLD to using this function, which is
specialised to parsing arch strings in the normalised format specified
in the psABI rather than user-authored strings accepted in `-march`,
which has greater flexibility.

parseNormalizedArchString does not ignore or produce an error for ISA
extensions with a version that isn't recognised/supported by LLVM. As
current GCC is marking its objects with a higher version of the A, F,
and D extensions than LLVM (see [extension versioning
discussion](https://discourse.llvm.org/t/rfc-resolving-issues-related-to-extension-versioning-in-risc-v/68472)
this massively improves the usability of llvm-objdump with such
binaries.

Differential Revision: https://reviews.llvm.org/D146114
2023-03-27 04:38:16 +01:00
Aiden Grossman
175aa049c7 [Propeller] Make decoding BBAddrMaps trace through relocations
Currently when using the LLVM tools (eg llvm-readobj, llvm-objdump) to
find information about basic block locations using the propeller tooling
in relocatable object files function addresses are not mapped properly
which causes problems. In llvm-readobj this means that incorrect
function names will be pulled. In llvm-objdum this means that most BBs
won't show up in the output if --symbolize-operands is used. This patch
changes the behavior of decodeBBAddrMap to trace through relocations
to get correct function addresses if it is going through a relocatable
object file. This fixes the behavior in both tools and also other
consumers of decodeBBAddrMap. Some helper functions have been added
in/refactoring done to aid in grabbing BB address map sections now that
in some cases both relocation and BB address map sections need to be
obtained at the same time.

Regression tests moved around/added.

Differential Revision: https://reviews.llvm.org/D143841
2023-03-13 21:29:48 +00:00
Gregory Alfonso
956c7dca29 [Object][NFC] Remove unneeded llvm_unreachable
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D139452
2023-02-16 13:20:41 -08:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Kazu Hirata
55e2cd1609 Use llvm::count{lr}_{zero,one} (NFC) 2023-01-28 12:41:20 -08:00
Mehdi Amini
5ab0894fd5 Explicitly more Error when returning it (NFC)
This is an attempt to fix a build failure:

llvm/lib/Object/ELFObjectFile.cpp:300:12: error: call to deleted constructor of 'llvm::Error'
    return E;
2023-01-16 15:07:46 +00:00
Elena Lepilkina
537cdf92c4 [llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute
Differential Revision: https://reviews.llvm.org/D139553
2023-01-16 16:57:55 +03:00
Fangrui Song
2fa744e631 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This commit fixes LLVMAnalysis and its dependencies.
2022-12-16 22:44:08 +00:00
Fangrui Song
c302fb5cc3 [Object] llvm::Optional => std::optional 2022-12-04 09:11:11 +00:00
Kazu Hirata
aadaaface2 [llvm] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:44 -08:00
Fangrui Song
6aebb5d177 AttributeParser: Convert Optional to std::optional 2022-12-02 07:43:18 +00:00
WANG Xuerui
28b4838a33 [Object] Add some more LoongArch support
Add ELFObjectFileBase::getLoongArchFeatures, and return the proper ELF
relative reloc type for LoongArch.

Reviewed By: MaskRay, SixWeining

Differential Revision: https://reviews.llvm.org/D138016
2022-12-01 19:16:51 +08:00
Kazu Hirata
96ebf9bc80 [Object] Use std::optional in ELFObjectFile.cpp (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-11-25 12:44:51 -08:00
Fangrui Song
12ebfca621 [Object] Internalize readBBAddrMapImpl 2022-11-23 17:47:00 -08:00
Kazu Hirata
258531b7ac Remove redundant initialization of Optional (NFC) 2022-08-20 21:18:28 -07:00
Fangrui Song
de9d80c1c5 [llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.
2022-08-08 11:24:15 -07:00
Kazu Hirata
611ffcf4e4 [llvm] Use value instead of getValue (NFC) 2022-07-13 23:11:56 -07:00
Fangrui Song
2601b90d83 [llvm-objdump] Default to --mcpu=future for PPC64
GNU objdump disassembles all unknown instructions by default. Match this user
friendly behavior with the cpu value `future`.

Differential Revision: https://reviews.llvm.org/D127824
2022-06-30 11:30:35 -07:00
Rahman Lavaee
0aa6df6575 [Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks.
This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original  section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function.  This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format.

Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address.

This encoding uses smaller values compared to the existing one (offsets relative to function symbol).
Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary).
The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D121346
2022-06-28 07:42:54 -07:00
Kazu Hirata
a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
Kazu Hirata
3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
2022-06-25 11:56:50 -07:00
Kazu Hirata
aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Kazu Hirata
7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Kazu Hirata
e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Rahman Lavaee
5f7ef65245 [llvm-objdump] Let --symbolize-operands symbolize basic block addresses based on the SHT_LLVM_BB_ADDR_MAP section.
`--symbolize-operands` already symbolizes branch targets based on the disassembly. When the object file is created with `-fbasic-block-sections=labels` (ELF-only) it will include a SHT_LLVM_BB_ADDR_MAP section which maps basic blocks to their addresses. In such case `llvm-objdump` can annotate the disassembly based on labels inferred on this section.

In contrast to the current labels, SHT_LLVM_BB_ADDR_MAP-based labels are created for every machine basic block including empty blocks and those which are not branched into (fallthrough blocks).

The old logic is still executed even when the SHT_LLVM_BB_ADDR_MAP section is present to handle functions which have not been received an entry in this section.

Reviewed By: jhenderson, MaskRay

Differential Revision: https://reviews.llvm.org/D124560
2022-05-16 10:11:11 -07:00
Joe Nash
813e521e55 [AMDGPU] Add gfx11 subtarget ELF definition
This is the first patch of a series to upstream support for the new
subtarget.

Contributors:
Jay Foad <jay.foad@amd.com>
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 1/N for upstreaming AMDGPU gfx11 architectures.

Reviewed By: foad, kzhuravl, #amdgpu

Differential Revision: https://reviews.llvm.org/D124536
2022-04-29 12:27:17 -04:00
Ties Stuij
051deb2d9d [ARM] add Armv9 build attribute
The build attribute number can be found in the Arm ABI addenda32 document:
https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#335target-related-attributes

Reviewed By: tmatheson

Differential Revision: https://reviews.llvm.org/D124090
2022-04-28 10:48:26 +01:00
Aakanksha
840695814a [AMDGPU] Add gfx1036 target
Differential Revision: https://reviews.llvm.org/D120846
2022-03-02 23:26:38 +00:00