190 Commits

Author SHA1 Message Date
Artem Belevich
4e596fc285
[ELF] handle new NVIDIA GPU variants. (#151604) 2025-07-31 17:21:40 -07:00
Joseph Huber
b53be5f4b2
[LLVM] Update CUDA ELF flags for their new ABI (#149534)
Summary:
We rely on these flags to do things in the runtime and print the
contents of binaries correctly. CUDA updated their ABI encoding recently
and we didn't handle that. it's a new ABI entirely so we just select on
it when it shows up.

Fixes: https://github.com/llvm/llvm-project/issues/148703
2025-07-21 14:38:03 -05:00
Ming-Yi Lai
9b3064aec8
[llvm-objdump][RISCV] Display `@plt' symbols when disassembling .plt section (#147933)
This patch adds dummy symbols for PLT entries for RISC-V 32-bit and
64-bit targets so llvm-objdump can show the function symbol that
corresponds to each PLT entry.
2025-07-16 11:41:17 +08:00
Rahman Lavaee
6b623a6622
[SHT_LLVM_BB_ADDR_MAP] Remove support for versions 1 and 0 (SHT_LLVM_BB_ADDR_MAP_V0). (#146186)
Version 2 was added more than two years ago
(6015a045d7).
So it should be safe to deprecate older versions.
2025-07-02 10:31:52 -07:00
Stanislav Mekhanoshin
69974658f0
[AMDGPU] Initial support for gfx1250 target. (#144965)
This is just a stub for now.
2025-06-19 22:52:51 -07:00
Kazu Hirata
03f616eb3a
[llvm] Compare std::optional<T> to values directly (NFC) (#143340)
This patch transforms:

  X && *X == Y

to:

  X == Y

where X is of std::optional<T>, and Y is of T or similar.
2025-06-08 22:37:59 -07:00
Vladislav Dzhidzhoev
bcad050106
[llvm-objdump][ARM] Find ELF file PLT entries for arm, thumb (#130764)
This implements arm, armeb, thumb, thumbeb PLT entries parsing support
in ELF for llvm-objdump.

Implementation is similar to AArch64MCInstrAnalysis::findPltEntries. PLT
entry signatures are based on LLD code for PLT generation
(ARM::writePlt).

llvm-objdump tests are produced from lld/test/ELF/arm-plt-reloc.s,
lld/test/ELF/armv8-thumb-plt-reloc.s.
2025-03-26 20:18:23 +01:00
Vladislav Dzhidzhoev
84e44ae6b7
[llvm-objdump] Pass MCSubtargetInfo to findPltEntries (NFC) (#131773)
It allows access to subtarget features, collected in llvm-objdump.cpp,
from findPltEntries, which will be used in
https://github.com/llvm/llvm-project/pull/130764.
2025-03-18 14:00:34 +01:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibilty purses, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
Fabian Ritter
8615f9aaff
[AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (#126763)
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all non-documentation occurrences of gfx940/gfx941 from
the llvm directory, and the remaining occurrences in clang.

Documentation changes will follow.

For SWDEV-512631
2025-02-19 10:20:48 +01:00
quic-areg
61ea63baaf
[Hexagon] Add support for decoding PLT symbols (#123425)
Describes PLT entries for hexagon.
2025-01-29 15:37:23 -06:00
Ikhlas Ajbar
8b37c1c71b
[Hexagon] Add V75 support to compiler and assembler (#120773)
This patch introduces support for the Hexagon V75 architecture. It
includes instruction formats, definitions, encodings, scheduling
classes, and builtins/intrinsics.
2024-12-20 14:01:58 -06:00
Kazu Hirata
e9c8106a90
[Object] Remove unused includes (NFC) (#116750)
Identified with misc-include-cleaner.
2024-11-19 19:42:09 -08:00
Matt Arsenault
a6fc489bb7
AMDGPU: Add gfx950 subtarget definitions (#116307)
Mostly a stub, but adds some baseline tests and
tests for removed instructions.
2024-11-18 10:41:14 -08:00
Shilei Tian
de0fd64bed
[AMDGPU] Introduce a new generic target gfx9-4-generic (#115190)
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
2024-11-12 23:11:05 -05:00
Carl Ritson
076aac59ac
[AMDGPU] Add a new target for gfx1153 (#113138) 2024-10-23 12:56:58 +09:00
yonghong-song
03958680b2
[BPF] Make llvm-objdump disasm default cpu v4 (#102166)
Currently, with the following example,
  $ cat t.c
  void foo(int a, _Atomic int *b)
  {
   *b &= a;
  }
  $ clang --target=bpf -O2 -c -mcpu=v3 t.c
  $ llvm-objdump -d t.o
  t.o:    file format elf64-bpf

  Disassembly of section .text:

  0000000000000000 <foo>:
       0:       c3 12 00 00 51 00 00 00 <unknown>
       1:       95 00 00 00 00 00 00 00 exit

Basically, the default cpu for llvm-objdump is v1 and it won't be able
to decode insn properly.

If we add --mcpu=v3 to llvm-objdump command line, we will have
  $ llvm-objdump -d --mcpu=v3 t.o

  t.o:    file format elf64-bpf

  Disassembly of section .text:

  0000000000000000 <foo>:
0: c3 12 00 00 51 00 00 00 w1 = atomic_fetch_and((u32 *)(r2 + 0x0), w1)
       1:       95 00 00 00 00 00 00 00 exit

The atomic_fetch_and insn can be decoded properly. Using latest cpu
version --mcpu=v4 can also decode properly like the above --mcpu=v3.

To avoid the above '<unknown>' decoding with common 'llvm-objdump -d
t.o', this patch marked the default cpu for llvm-objdump with the
current highest cpu number v4 in ELFObjectFileBase::tryGetCPUName(). The
cpu number in ELFObjectFileBase::tryGetCPUName() will be adjusted in the
future if cpu number is increased e.g. v5 etc. Such an approach also
aligns with gcc-bpf as discussed in [1].

Six bpf unit tests are affected with this change. I changed test output
for three unit tests and added --mcpu=v1 for the other three unit tests,
to demonstrate the default (cpu v4) behavior and explicit --mcpu=v1
behavior.

[1]
https://lore.kernel.org/bpf/6f32c0a1-9de2-4145-92ea-be025362182f@linux.dev/T/#m0f7e63c390bc8f5a5523e7f2f0537becd4205200

Co-authored-by: Yonghong Song <yonghong.song@linux.dev>
2024-08-06 18:23:46 -07:00
Fangrui Song
2f37a22f10
[llvm-objdump] -r: support CREL
Extract the llvm-readelf decoder to `decodeCrel` (#91280) and reuse it
for llvm-objdump.

Because the section representation of LLVMObject (`SectionRef`) is
64-bit, insufficient to hold all decoder states, `section_rel_begin` is
modified to decode CREL eagerly and hold the decoded relocations inside
ELFObjectFile<ELFT>.

The test is adapted from llvm/test/tools/llvm-readobj/ELF/crel.test.

Pull Request: https://github.com/llvm/llvm-project/pull/97382
2024-07-08 09:14:34 -07:00
Shilei Tian
1ca0055f45
[AMDGPU] Add a new target gfx1152 (#94534) 2024-06-06 12:16:11 -04:00
Konstantin Zhuravlyov
775f1cd34d
AMDGPU: Add gfx12-generic target (#93875) 2024-05-31 12:46:44 -04:00
Craig Topper
733a87783c
[RISCV] Split code that tablegen needs out of RISCVISAInfo. (#89684)
This introduces a new file, RISCVISAUtils.cpp and moves the rest of
RISCVISAInfo to the TargetParser library.

This will allow us to generate part of RISCVISAInfo.cpp using tablegen.
2024-04-23 15:12:36 -07:00
quic-areg
31f4b329c8
[Hexagon] ELF attributes for Hexagon (#85359)
Defines a subset of attributes and emits them to a section called
.hexagon.attributes.

The current attributes recorded are the attributes needed by
llvm-objdump to automatically determine target features and eliminate
the need to manually pass features.
2024-03-19 16:22:30 -05:00
Pierre van Houtryve
43c7eb5d7b
[AMDGPU] Replace '.' with '-' in generic target names (#81718)
The dot is too confusing for tools. Output temporaries would have
'10.3-generic' so tools could parse it as an extension, device libs &
the associated clang driver logic are also confused by the dot.

After discussions, we decided it's better to just remove the '.' from
the target name than fix each issue one by one.
2024-02-14 15:19:04 +01:00
Pierre van Houtryve
f93aa5157a
[AMDGPU] Introduce GFX9/10.1/10.3/11 Generic Targets (#76955)
These generic targets include multiple GPUs and will, in the future,
provide a way to build once and run on multiple GPU, at the cost of less
optimization opportunities.

Note that this is just doing the compiler side of things, device libs an
runtimes/loader/etc. don't know about these targets yet, so none of them
actually work in practice right now. This is just the initial commit to
make LLVM aware of them.

This contains the documentation changes for both this change and #76954
as well.
2024-02-12 10:18:20 +01:00
Craig Topper
8c37e3e64b
[RISCV] Only set Zca flag for EF_RISCV_RVC in ELFObjectFileBase::getRISCVFeatures(). (#80928)
This code appears to be a hack to set the features to include compressed
instructions if the ELF EFLAGS flags bit is present, but the ELF
attribute for the ISA string is no present or not accurate.

We can't remove the hack because llvm-mc doesn't create ELF attributes
by default so a lot of tests fail to disassembler properly. Using clang
as the assembler does set the attributes.

This patch changes the hack to only set Zca since that is the minimum
implied by the flag. Setting anything else potentially conflicts with
the ISA string containing Zcmp or Zcmt.

JITLink also needs to be updated to recognize Zca in addition to C.
2024-02-07 08:23:57 -08:00
Alexandre Ganea
43ab40a5ba [llvm] Silence warning when building with Clang ToT
This fixes:
```
[1343/7452] Building CXX object lib\Object\CMakeFiles\LLVMObject.dir\ELFObjectFile.cpp.obj
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(808,27): warning: comparison of integers of different signs: 'unsigned int' and '_Iter_diff_t<const Elf_Shdr_Impl<ELFType<llvm::endianness::little, false>> *>' (aka 'int') [-Wsign-compare]
  808 |     if (*TextSectionIndex != std::distance(Sections.begin(), *TextSecOrErr))
      |         ~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
C:\git\llvm-project\llvm\lib\Object\ELFObjectFile.cpp(913,12): note: in instantiation of function template specialization 'readBBAddrMapImpl<llvm::object::ELFType<llvm::endianness::little, false>>' requested here
  913 |     return readBBAddrMapImpl(Obj->getELFFile(), TextSectionIndex, PGOAnalyses);
      |            ^
```
2024-01-25 09:34:19 -05:00
Aiden Grossman
c067524852
[SHT_LLVM_BB_ADDR_MAP] Add assertion and clarify docstring (#77374)
This patch adds an assertion to readBBAddrMapImpl to confirm that
PGOAnalyses and BBAddrMaps are of the same size when PGO information is
requested (part of the API contract). This patch also updates the
docstring for readBBAddrMap to better clarify what is guaranteed.
2024-01-19 11:34:00 -08:00
Luke Lau
db78c30ba7
[RISCV] Deduplicate RISCVISAInfo::toFeatures/toFeatureVector. NFC (#76942)
toFeatures and toFeatureVector both output a list of target feature
flags, just with a slightly different interface. toFeatures keeps any
unsupported extensions, and also provides a way to append negative
extensions (AddAllExtensions=true).

This patch combines them into one function, so that a later patch will
be be able to get a std::vector of features that includes all the
negative extensions, which was previously only possible through the
StrAlloc interface.
2024-01-09 15:33:51 +07:00
Joseph Huber
deab58d127
[ELF] Add CPU name detection for CUDA architectures (#75964)
Summary:
Recently we added support for detecting the CUDA processor with the ELF
flags. This allows us to get a string representation of it in other
code. This will be used by the offloading runtime.
2023-12-19 20:01:15 -06:00
Micah Weston
105adf2cd9
[SHT_LLVM_BB_ADDR_MAP] Implements PGOAnalysisMap in Object and ObjectYAML with tests.
Reviewed in PR (#71750). A part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch
and Block
Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).

This PR adds the PGOAnalysisMap data structure and implements encoding and
decoding through Object and ObjectYAML along with associated tests. When
emitted into the bb-addr-map section, each function is followed by the associated
pgo-analysis-map for that function. The emitting of each analysis in the map is
controlled by a bit in the bb-addr-map feature byte. All existing bb-addr-map
code can ignore the pgo-analysis-map if the caller does not request the data.
2023-12-12 10:23:16 -05:00
Jay Foad
cf1e0c0b07
[AMDGPU] Define new targets gfx1200 and gfx1201 (#73133)
Define target names and ELF numbers for new GFX12 targets gfx1200 and
gfx1201. For now they behave identically to GFX11.
2023-11-23 16:44:05 +00:00
Jay Foad
92542f2a40 [AMDGPU] Add targets gfx1150 and gfx1151
This is the target definition only. Currently they are treated the same
as GFX 11.0.x.

Differential Revision: https://reviews.llvm.org/D155429
2023-07-17 13:06:12 +01:00
Fangrui Song
a4d1259e61 [llvm-objdump] Default to --mcpu=future for PPC32
Extend D127824 to the 32-bit Power architecture.
AFAICT GNU objdump -d dumps all instructions for 32-bit as well.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D155114
2023-07-12 18:24:18 -07:00
Job Noorman
8de9f2b558 Move SubtargetFeature.h from MC to TargetParser
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.

Note that I choose TargetParser as the destination because that's where
Triple lives and SubtargetFeatures feels related to that.

This issues came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D150549
2023-06-26 11:20:08 +02:00
Fangrui Song
9e37a7bd1f [llvm-objdump][X86] Add @plt symbols for .plt.got
If a symbol needs both JUMP_SLOT and GLOB_DAT relocations, there is a
minor linker optimization to keep just GLOB_DAT. This optimization
is only implemented by GNU ld's x86 port and mold.
https://maskray.me/blog/2021-08-29-all-about-global-offset-table#combining-.got-and-.got.plt

With the optimizing, the PLT entry is placed in .plt.got and the
associated GOTPLT entry is placed in .got (ld.bfd -z now) or .got.plt (ld.bfd -z lazy).
The relocation is in .rel[a].dyn.

This patch synthesizes `symbol@plt` labels for these .plt.got entries.

Example:
```
cat > a.s <<e
.globl _start; _start:
mov combined0@gotpcrel(%rip), %rax; mov combined1@gotpcrel(%rip), %rax
call combined0@plt; call combined1@plt
call foo0@plt; call foo1@plt
e
cat > b.s <<e
.globl foo0, foo1, combined0, combined1
foo0: foo1: combined0: combined1:
e
gcc -fuse-ld=bfd -shared b.s -o b.so
gcc -fuse-ld=bfd -pie -nostdlib a.s b.so -o a
```

```
Disassembly of section .plt:

0000000000001000 <.plt>:
    1000: ff 35 ea 1f 00 00             pushq   0x1fea(%rip)            # 0x2ff0 <_GLOBAL_OFFSET_TABLE_+0x8>
    1006: ff 25 ec 1f 00 00             jmpq    *0x1fec(%rip)           # 0x2ff8 <_GLOBAL_OFFSET_TABLE_+0x10>
    100c: 0f 1f 40 00                   nopl    (%rax)

0000000000001010 <foo1@plt>:
    1010: ff 25 ea 1f 00 00             jmpq    *0x1fea(%rip)           # 0x3000 <_GLOBAL_OFFSET_TABLE_+0x18>
    1016: 68 00 00 00 00                pushq   $0x0
    101b: e9 e0 ff ff ff                jmp     0x1000 <.plt>

0000000000001020 <foo0@plt>:
    1020: ff 25 e2 1f 00 00             jmpq    *0x1fe2(%rip)           # 0x3008 <_GLOBAL_OFFSET_TABLE_+0x20>
    1026: 68 01 00 00 00                pushq   $0x1
    102b: e9 d0 ff ff ff                jmp     0x1000 <.plt>

Disassembly of section .plt.got:

0000000000001030 <combined0@plt>:
    1030: ff 25 a2 1f 00 00             jmpq    *0x1fa2(%rip)           # 0x2fd8 <foo1+0x2fd8>
    1036: 66 90                         nop

0000000000001038 <combined1@plt>:
    1038: ff 25 a2 1f 00 00             jmpq    *0x1fa2(%rip)           # 0x2fe0 <foo1+0x2fe0>
    103e: 66 90                         nop
```

For x86-32, with -z now, if we remove `foo0` and `foo1`, the absence of regular
PLT will cause GNU ld to omit .got.plt, and our code cannot synthesize @plt
labels. This is an extreme corner case that almost never happens in practice (to
trigger the case, ensure every PLT symbol has been taken address). To fix it, we
can get the `_GLOBAL_OFFSET_TABLE_` symbol value, but the complexity is not
worth it.

Close https://github.com/llvm/llvm-project/issues/62537

Reviewed By: bd1976llvm

Differential Revision: https://reviews.llvm.org/D149817
2023-05-16 09:22:21 -07:00
Konstantin Zhuravlyov
9d05727972 AMDGPU: Add basic gfx942 target
Differential Revision: https://reviews.llvm.org/D149983
2023-05-10 11:51:06 -04:00
Konstantin Zhuravlyov
1fc70210a6 AMDGPU: Add basic gfx941 target
Differential Revision: https://reviews.llvm.org/D149982
2023-05-10 11:51:06 -04:00
Fangrui Song
b05cd680ea MCInstrAnalysis: make GotPltSectionVA x86-32 specific
GotPltSectionVA is specific to x86-32 PIC PLT entries.
Let's remove the argument from the generic interface.

As a side effect of not requiring .got.plt, this simplification
addresses a subset of https://github.com/llvm/llvm-project/issues/62537
by enabling .plt dumping for some ld.bfd -z now linked x86-32/x86-64 images
without .got.plt
2023-05-03 19:21:01 -07:00
Alex Bradbury
91c6174ce3 [RISCV] Allow llvm-objdump to disassemble objects with unrecognised versions of known extensions
This Moves ELFObjectFile to using
RISCVISAInfo::parseNormalizedArchString which is not an NFC, as the test
changes show. D144353 transitioned LLD to using this function, which is
specialised to parsing arch strings in the normalised format specified
in the psABI rather than user-authored strings accepted in `-march`,
which has greater flexibility.

parseNormalizedArchString does not ignore or produce an error for ISA
extensions with a version that isn't recognised/supported by LLVM. As
current GCC is marking its objects with a higher version of the A, F,
and D extensions than LLVM (see [extension versioning
discussion](https://discourse.llvm.org/t/rfc-resolving-issues-related-to-extension-versioning-in-risc-v/68472)
this massively improves the usability of llvm-objdump with such
binaries.

Differential Revision: https://reviews.llvm.org/D146114
2023-03-27 04:38:16 +01:00
Aiden Grossman
175aa049c7 [Propeller] Make decoding BBAddrMaps trace through relocations
Currently when using the LLVM tools (eg llvm-readobj, llvm-objdump) to
find information about basic block locations using the propeller tooling
in relocatable object files function addresses are not mapped properly
which causes problems. In llvm-readobj this means that incorrect
function names will be pulled. In llvm-objdum this means that most BBs
won't show up in the output if --symbolize-operands is used. This patch
changes the behavior of decodeBBAddrMap to trace through relocations
to get correct function addresses if it is going through a relocatable
object file. This fixes the behavior in both tools and also other
consumers of decodeBBAddrMap. Some helper functions have been added
in/refactoring done to aid in grabbing BB address map sections now that
in some cases both relocation and BB address map sections need to be
obtained at the same time.

Regression tests moved around/added.

Differential Revision: https://reviews.llvm.org/D143841
2023-03-13 21:29:48 +00:00
Gregory Alfonso
956c7dca29 [Object][NFC] Remove unneeded llvm_unreachable
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D139452
2023-02-16 13:20:41 -08:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Kazu Hirata
55e2cd1609 Use llvm::count{lr}_{zero,one} (NFC) 2023-01-28 12:41:20 -08:00
Mehdi Amini
5ab0894fd5 Explicitly more Error when returning it (NFC)
This is an attempt to fix a build failure:

llvm/lib/Object/ELFObjectFile.cpp:300:12: error: call to deleted constructor of 'llvm::Error'
    return E;
2023-01-16 15:07:46 +00:00
Elena Lepilkina
537cdf92c4 [llvm-objdump][RISCV] Use new common method to parse ARCH RISCV attribute
Differential Revision: https://reviews.llvm.org/D139553
2023-01-16 16:57:55 +03:00
Fangrui Song
2fa744e631 std::optional::value => operator*/operator->
value() has undesired exception checking semantics and calls
__throw_bad_optional_access in libc++. Moreover, the API is unavailable without
_LIBCPP_NO_EXCEPTIONS on older Mach-O platforms (see
_LIBCPP_AVAILABILITY_BAD_OPTIONAL_ACCESS).

This commit fixes LLVMAnalysis and its dependencies.
2022-12-16 22:44:08 +00:00
Fangrui Song
c302fb5cc3 [Object] llvm::Optional => std::optional 2022-12-04 09:11:11 +00:00
Kazu Hirata
aadaaface2 [llvm] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 21:11:44 -08:00
Fangrui Song
6aebb5d177 AttributeParser: Convert Optional to std::optional 2022-12-02 07:43:18 +00:00
WANG Xuerui
28b4838a33 [Object] Add some more LoongArch support
Add ELFObjectFileBase::getLoongArchFeatures, and return the proper ELF
relative reloc type for LoongArch.

Reviewed By: MaskRay, SixWeining

Differential Revision: https://reviews.llvm.org/D138016
2022-12-01 19:16:51 +08:00