5170 Commits

Author SHA1 Message Date
Tim Neumann
f792f14b01
[WebAssembly] Allocate MCSymbolWasm data on MCContext (#85866)
Fixes #85578, a use-after-free caused by some `MCSymbolWasm` data being
freed too early.

Previously, `WebAssemblyAsmParser` owned the data that is moved to
`MCContext` by this PR, which caused problems when handling module ASM,
because the ASM parser was destroyed after parsing the module ASM, but
the symbols persisted.

The added test passes locally with an LLVM build with AddressSanitizer
enabled.

Implementation notes:

* I've called the added method
<code>allocate<b><i>Generic</i></b>String</code> and added the second
paragraph of its documentation to maybe guide people a bit on when to
use this method (based on my (limited) understanding of the `MCContext`
class). We could also just call it `allocateString` and remove that
second paragraph.
* The added `createWasmSignature` method does not support taking the
return and parameter types as arguments: Specifying them afterwards is
barely any longer and prevents them from being accidentally specified in
the wrong order.
* This removes a _"TODO: Do the uniquing of Signatures here instead of
ObjectFileWriter?"_ since the field it's attached to is also removed.
Let me know if you think that TODO should be preserved somewhere.
2024-04-02 10:59:29 -07:00
Zaara Syeda
6582509daa
[AIX] Handle toc-data offset overflowing 16-bits (#80092)
When the toc-data offset overflows the 16-bits, we can truncate the
value to the 16-bit value as the linker will handle overflow through
fixup code.
2024-03-28 13:55:13 -04:00
Simon Tatham
88b10f3e3a
[MC][AArch64] Segregate constant pool caches by size. (#86832)
If you write a 32- and a 64-bit LDR instruction that both refer to the
same constant or symbol using the = syntax:

```
  ldr w0, =something
  ldr x1, =something
```

then the first call to `ConstantPool::addEntry` will insert the constant
into its cache of existing entries, and the second one will find the
cache entry and reuse it. This results in a 64-bit load from a 32-bit
constant, reading nonsense into the other half of the target register.

In this patch I've done the simplest fix: include the size of the
constant pool entry as part of the key used to index the cache. So now
32- and 64-bit constant loads will never share a constant pool entry.

There's scope for doing this better, in principle: you could imagine
merging the two slots with appropriate overlap, so that the 32-bit load
loads the LSW of the 64-bit value. But that's much more complicated: you
have to take endianness into account, and maybe also adjust the size of
an existing entry. This is the simplest fix that restores correctness.
2024-03-28 08:57:27 +00:00
Fangrui Song
a41bfea5c0 [MC] Simplify ELFObjectWriter. NFC
And fix `if (hasRelocationAddend())` to `usesRela` to properly treat
SHT_LLVM_CALL_GRAPH_PROFILE as SHT_REL. The incorrect does not cause a
problem because the synthesized SHT_LLVM_CALL_GRAPH_PROFILE has zero
addends.
2024-03-27 22:10:11 -07:00
Simon Pilgrim
78f0871bee Revert rG58de1e2c5eee548a9b365e3b1554d87317072ad9 "Fix stack layout for frames larger than 2gb (#84114)"
This is failing on some EXPENSIVE_CHECKS buildbots
2024-03-27 16:16:15 +00:00
Wesley Wiser
58de1e2c5e
Fix stack layout for frames larger than 2gb (#84114)
For very large stack frames, the offset from the stack pointer to a local can be more than 2^31 which overflows various `int` offsets in the frame lowering code.

This patch updates the frame lowering code to calculate the offsets as 64-bit values and resolves the overflows, resulting in the correct codegen for very large frames.

Fixes #48911
2024-03-27 15:05:58 +00:00
Cooper Partin
c62c74639a
Add support for PSV EntryFunctionName (#86296)
This change introduces a version 3 of the PSV data that includes support
for the name of the entry function as an offset into StringTable data to
a null-terminated utf-8 string.

Additional tests were added to ensure that the new value was properly
serialized/deserialized from object data.

Fixes #80175

---------

Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
2024-03-25 10:18:53 -07:00
Sergei Barannikov
5e5b656102
[MC] Make MCParsedAsmOperand::getReg() return MCRegister (#86444) 2024-03-25 05:13:48 +03:00
Fangrui Song
3a63f737e2 [MC] Refactor writeRelocations. NFC
MIPS is different and should better off use separate code.
2024-03-23 10:15:47 -07:00
Fangrui Song
87c7f4a12b [MC] Remove unnecessary reversal of relocations. NFC
Commit f44db24e1fd948c75c87aea017646f16553d3361 (2015) enabled this
simplication.
2024-03-23 10:03:09 -07:00
Yingwei Zheng
6c1932ffd8
[LLVM] Pass APInt by const reference. NFC. (#86278)
This patch adjusts argument passing for `APInt` to improve the
compile-time.
Compile-time improvement:
https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=32d6611af69bf4e76373f9bc7d9649650f760e48&stat=instructions:u
2024-03-23 14:57:35 +08:00
Craig Topper
fb329f1844
[Target] Move SubRegIdxRanges from MCSubtargetInfo to TargetInfo. (#86245)
I'm planning to add HwMode support to SubRegIdxRanges for RISC-V GPR
pairs. The MC layer is currently unaware of the HwMode for registers and
I'd like to keep it that way.

This information is not used by the MC layer so I think it is safe to
move it.
2024-03-22 11:15:45 -07:00
Billy Laws
46b853a82c
[MC][COFF][AArch64] Treat ARM64EC/X as ARM64 for relocations (#86019)
Since ARM64EC/X objects use regular ARM64 relocations, any special
handling must be done for them too.
2024-03-22 15:17:06 +01:00
Cooper Partin
1538b82fd3
Revert "Add support for PSV EntryFunctionName (#84409)" (#86211)
This reverts commit cde54df39cab3a1d60a3e1862ab341609bee3cc3.

Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
2024-03-21 15:40:29 -07:00
Cooper Partin
cde54df39c
Add support for PSV EntryFunctionName (#84409)
This change introduces a version 3 of the PSV data that includes support
for the name of the entry function as an offset into StringTable data to
a null-terminated utf-8 string.

Additional tests were added to ensure that the new value was properly
serialized/deserialized from object data.

Fixes #80175

---------

Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
2024-03-21 14:43:15 -07:00
timoh-ba
7650a01927
[DWARF5][COFF] Emit section-relative .debug_line_str relocations (#83773)
Dwarf 5 allows separating filenames from .debug_line into a separate
.debug_line_str section. The strings are referenced relative to the
start of the .debug_line_str section. Previously, on COFF, the
relocation information instead caused offsets to be relocated to the
base address of the COFF-File. This lead  to wrong offsets in linked
COFF (PE) files which caused the debugger to be unable to find the
correct source files.

This patch fixes this problem by making the offsets relative to the
start of the .debug_line_str section instead. There should be no
changes for ELF-Files as everything seems to be working there.

A test is also added to ensure that the correct relocation entries are
emitted.
2024-03-21 17:30:10 +02:00
Neumann Hon
5fb2797f23
[GOFF][z/OS] Change PrivateGlobalPrefix and PrivateLabelPrefix to be L# (#85730)
The current values for PrivateGlobalPrefix and PrivateLabelPrefix (@@
and @ respectively) are, in hindsight, poor choices for multiple
reasons:

First, there exist externally visible routines from the language
environment that begin with @@. These functions are certainly not
local/private by any means and they should not share a prefix with
private globals.

Secondly, both private globals and private labels should be handled the
same way by GOFF, so it doesn't make much sense for them to have
separate prefixes. GOFF remains the only file format where these are
different and there is no reason for that to be the case
2024-03-20 10:30:30 -04:00
Vyacheslav Levytskyy
59f34e8c2b
[SPIRV] Add Lifetime intrinsics/instructions (#85391)
This PR:
* adds Lifetime intrinsics/instructions
* fixes how the binary header is emitted (correct version and better
approximation of Bound)
* add validation into more test cases
2024-03-18 11:42:44 +01:00
Zaara Syeda
37b5eb0a0a
[AIX][TOC] Add -mtocdata/-mno-tocdata options on AIX (#67999)
This patch enables support that the XL compiler had for AIX under
-qdatalocal/-qdataimported.
2024-03-13 10:26:31 -04:00
Fangrui Song
a331937197 [MC] Move CompressDebugSections/RelaxELFRelocations from TargetOptions/MCAsmInfo to MCTargetOptions
The convention is for such MC-specific options to reside in
MCTargetOptions. However, CompressDebugSections/RelaxELFRelocations do
not follow the convention: `CompressDebugSections` is defined in both
TargetOptions and MCAsmInfo and there is forwarding complexity.

Move the option to MCTargetOptions and hereby simplify the code. Rename
the misleading RelaxELFRelocations to X86RelaxRelocations. llvm-mc
-relax-relocations and llc -x86-relax-relocations can now be unified.
2024-03-06 23:19:59 -08:00
Neumann Hon
eccc71783c
[SystemZ] [z/OS] Emit offset to PPA2 in separate MCSection (#84043)
The ppa2list section isn't really part of the ppa2 section. The ppa2list
section contains the offset to the ppa2, and must be created with a
special section name (specifically, C_@@QPPA2). The binder searches for
a section with this name, then uses this value to locate the ppa2.

In GOFF terms, these are entirely separate sections; the PPA2 section
isn't even really a section but rather belongs to the code section. On
the other hand, the ppa2list section is a section in its own right and
resides in a separate TXT record.
2024-03-05 15:29:07 -05:00
Felix (Ting Wang)
5b05870953
[PowerPC] Support local-dynamic TLS relocation on AIX (#66316)
Supports TLS local-dynamic on AIX, generates below sequence of code:

```
.tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier
.tc mh[TC],mh[TC]@ml # Module handle for the caller
lwz 3,mh[TC]\(2\) $$ For 64-bit: ld 3,mh[TC]\(2\)
bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0
#r3 = &TLS for module
lwz 4,foo[TC]\(2\) $$ For 64-bit: ld 4,foo[TC]\(2\)
add 5,3,4 # Compute &foo
.rename mh[TC], "\_$TLSML" # Symbol for the module handle must have the name "_$TLSML"
```

---------

Co-authored-by: tingwang <tingwang@tingwangs-MBP.lan>
Co-authored-by: tingwang <tingwang@tingwangs-MacBook-Pro.local>
2024-03-01 08:09:40 +08:00
Jon Roelofs
5b91647e3f
Allow .alt_entry symbols to pass the .cfi nesting check (#82268)
A symbol with an `N_ALT_ENTRY` attribute may be defined in the middle of
a subsection, so it is reasonable to opt them out of the
`.cfi_{start,end}proc` nesting check.

Fixes: https://github.com/llvm/llvm-project/issues/82261
2024-02-28 13:03:35 -08:00
Fangrui Song
2167881f51 [ARM,MC] Support FDPIC relocations
Linux kernel fs/binfmt_elf_fdpic.c supports FDPIC for MMU-less systems.
GCC/binutils/qemu support FDPIC ABI for ARM
(https://github.com/mickael-guene/fdpic_doc).
_ARM FDPIC Toolchain and ABI_ provides a summary.

This patch implements FDPIC relocations to the integrated assembler.
There are 6 static relocations and 2 dynamic relocations, with
R_ARM_FUNCDESC as both static and dynamic.

gas requires `--fdpic` to assemble data relocations like `.word f(FUNCDESC)`.
This patch adds `MCTargetOptions::FDPIC` and reports an error if FDPIC
is not set.

Pull Request: https://github.com/llvm/llvm-project/pull/82187
2024-02-21 10:13:26 -08:00
Yuta Saito
ba3c1f9ce3
[WebAssembly] Add segment RETAIN flag to support private retained data (#81539)
In WebAssembly, we have `WASM_SYMBOL_NO_STRIP` symbol flag to mark the
referenced content as retained. However, the flag is not enough to
express retained data that is not referenced by any symbol. This patch
adds a new segment flag`WASM_SEG_FLAG_RETAIN` to support "private"
linkage data that is retained by llvm.used.

This kind of data that is not referenced but must be retained is usually
used with encapsulation symbols (__start/__stop). Swift runtime uses
this technique and depends on the fact "all metadata sections in live
objects are retained", which was not guaranteed with `--gc-sections`
before this patch.

This is a revised version of https://reviews.llvm.org/D126950 (has been
reverted) based on @MaskRay's comments
2024-02-21 03:35:36 +09:00
Vinayak Dev
9ea34be3e4
[MC]: Fix typo in MCObjectStreamer.cpp (#80856)
Fixes a typo in llvm/lib/MC/MCObjectStreamer.cpp introduced in #80162
2024-02-06 16:54:17 +01:00
stephenpeckham
b1acb7a315
[XCOFF] Add compiler version to an auxiliary symbol table entry (#80162)
C_FILE symbols. To match the behavior of the assembler and the legacy
compiler, this includes using the generic ".file" name for the C_FILE
symbol and generating the actual file name in an auxiliary entry.
2024-02-06 09:08:18 -06:00
stephenpeckham
032a70ee11
[NFC] Fix typo (#80703) 2024-02-05 12:42:43 -06:00
Kazu Hirata
39fa304866 [llvm] Use StringRef::starts_with (NFC) 2024-01-31 23:54:07 -08:00
Zaara Syeda
a03a6e9964
[AIX] [XCOFF] Add support for common and local common symbols in the TOC (#79530)
This patch adds support for common and local symbols in the TOC for AIX.
Note that we need to update isVirtualSection so as a common symbol in
TOC will have the symbol type XTY_CM and will be initialized when placed
in the TOC so sections with this type are no longer virtual.

---------

Co-authored-by: Zaara Syeda <syzaara@ca.ibm.com>
2024-01-31 16:34:21 -05:00
Kazu Hirata
54149217b5 [MC] Use StringRef::consume_back (NFC) 2024-01-27 09:32:17 -08:00
Derek Schuff
7f409cd82b
[Object][Wasm] Allow parsing of GC types in type and table sections (#79235)
This change allows a WasmObjectFile to be created from a wasm file even 
if it uses typed funcrefs and GC types. It does not significantly change how 
lib/Object models its various internal types (e.g. WasmSignature,
WasmElemSegment), so LLVM does not really "support" or understand such
files, but it is sufficient to parse the type, global and element sections, discarding
types that are not understood. This is useful for low-level binary tools such as
nm and objcopy, which use only limited aspects of the binary (such as function
definitions) or deal with sections as opaque blobs.

This is done by allowing `WasmValType` to have a value of `OTHERREF`
(representing any unmodeled reference type), and adding a field to
`WasmSignature` indicating it's a placeholder for an unmodeled reference 
type (since there is a 1:1 correspondence between WasmSignature objects
and types in the type section).
Then the object file parsers for the type and element sections are expanded
to parse encoded reference types and discard any unmodeled fields.
2024-01-25 09:48:38 -08:00
Emma Pilkington
bc82cfb38d
[AMDGPU] Add an asm directive to track code_object_version (#76267)
Named '.amdhsa_code_object_version'. This directive sets the
e_ident[ABIVERSION] in the ELF header, and should be used as the assumed
COV for the rest of the asm file.

This commit also weakens the --amdhsa-code-object-version CL flag.
Previously, the CL flag took precedence over the IR flag. Now the IR
flag/asm directive take precedence over the CL flag. This is implemented
by merging a few COV-checking functions in AMDGPUBaseInfo.h.
2024-01-21 11:54:47 -05:00
Amir Ayupov
9fec33aadc Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)"
This reverts commit 82bc33ea3f1a539be50ed46919dc53fc6b685da9.

Accidentally pushed unrelated changes.
2024-01-18 19:59:09 -08:00
Amir Ayupov
82bc33ea3f
[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
Fix the bug where merge-fdata unconditionally outputs boltedcollection 
line, regardless of whether input files have it set.

Test Plan:
Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this
fix.
2024-01-18 19:44:16 -08:00
Derek Schuff
103fa3250c
[WebAssembly] Use ValType instead of integer types to model wasm tables (#78012)
LLVM models some features found in the binary format with raw integers
and others with nested or enumerated types. This PR switches modeling of
tables and segments to use wasm::ValType rather than uint32_t. This NFC
change is in preparation for modeling more reference types, but IMO is
also cleaner and closer to the spec.
2024-01-17 11:29:19 -08:00
Cyndy Ishida
735adbf1a8
[llvm] Teach MachO about XROS (#78373)
Add support for XROS to encode in Mach-O file formats.
2024-01-17 10:35:20 -08:00
Cyndy Ishida
25c7c23114 [llvm][MC] silence xros platform warnings, NFC 2024-01-16 16:32:34 -08:00
David Green
7850c94b86 [NFC] sentinal -> sentinel 2024-01-16 17:22:06 +00:00
Fangrui Song
f972e4d343 [MC,ELF] .section: unconditionally print section flag 'G' after 'o'
* Placing 'G' before 'M' (SHF_MERGE) can be misleading as the sh_entsize
  argument goes before the section group name, if a reader doesn't know
  that the order of extra arguments is not affected by the order of flags.
* 'a', 'w', and 'x' indicate basic permission-related flags. Separating
  them with 'G' is kinda ugly.

Simplify code and move 'G' after 'o'. The new output is more similar to
GCC.
2024-01-09 10:48:23 -08:00
Fangrui Song
7620f03ef7
[MC] Parse SHF_LINK_ORDER argument before section group name (#77407)
When both SHF_LINK_ORDER | SHF_GROUP flags are set, GNU assembler from
2.35 onwards (https://sourceware.org/PR25381
https://sourceware.org/binutils/docs/as/Section.html) parses the
SHF_LINK_ORDER argument before section group name, different from us.

This is unfortunate, but does not matter because the `.section` flag `o`
is a niche feature only used by compiler instrumentations, not adopted
by hand-written assembly, and using both flags is extremely rare. Let's
just match GNU assembler. There is another benefit: we now support
zero-flag section group with the SHF_LINK_ORDER flag, while previously
there isn't a syntax.

While here, print 'G' after 'o' to be clear that the 'G' argument is
parsed after the 'o' argument. To make the diff smaller, we don't print
'G' after 'w' in the absence of 'o' for now.
2024-01-09 10:42:34 -08:00
Jinyang He
7b45c54967
[MC][RISCV] Check hasEmitNops before call shouldInsertExtraNopBytesForCodeAlign (#77236)
The shouldInsertExtraNopBytesForCodeAlign() need STI to check whether
relax is enabled or not. It is initialized when call setEmitNops. The
setEmitNops may not be called in a section which has instructions but is
not executable. In this case uninitialized STI will cause problems.
Thus, check hasEmitNops before call it.

Fixes:
https://github.com/llvm/llvm-project/pull/76552#issuecomment-1878952480
2024-01-09 15:21:41 +08:00
Jinyang He
b57159cb19
[LoongArch] Support R_LARCH_{ADD,SUB}_ULEB128 for .uleb128 and force relocs when sym is not in section (#76433)
1, Follow RISCV 1df5ea29 to support generates relocs for .uleb128 which
can not be folded. Unlike RISCV, the located content of LoongArch should
be zero. LoongArch fixup uleb128 value by in-place addition and
subtraction reloc types named R_LARCH_{ADD,SUB}_ULEB128. The located
content can affect the result and R_LARCH_ADD_ULEB128 has enough info to
represent the first symbol value, so it needs to be set to zero.
2, Force relocs if sym is not in section so that it can emit relocs for
external symbol.

Fixes:
https://github.com/llvm/llvm-project/pull/72960#issuecomment-1866844679
2024-01-09 15:14:54 +08:00
Jinyang He
0731567a31
[MC][RISCV][LoongArch] Add AlignFragment size if layout is available and not need insert nops (#76552)
Due to delayed decision for ADD/SUB relocations, RISCV and LoongArch may
go slow fragment walk path with available layout. When RISCV (or
LoongArch in the future) don't need insert nops, that means relax is
disabled. With available layout and not needing insert nops, the size of
AlignFragment should be a constant. So we can add it to Displacement for
folding A-B.
2024-01-03 09:28:25 +08:00
Kazu Hirata
f5f2c313ae [llvm] Use StringRef::consume_front (NFC) 2023-12-25 12:33:00 -08:00
Lucas Duarte Prates
b652674dd0
[AsmWriter] Ensure getMnemonic doesn't return invalid pointers (#75783)
For instructions that don't map to a mnemonic string, the implementation
of MCInstPrinter::getMnemonic would return an invalid pointer due to the
result of the calculation of the instruction's position in the `AsmStrs`
table. This patch fixes the issue by ensuring those cases return a
`nullptr` value instead.

Fixes #74177.
2023-12-20 10:09:29 +00:00
Chenyang Gao
f72b654991
[MC][x86] Allow non-MCTargetExpr RHS when the LHS of a MCBinaryExpr is MCTargetExpr (#75693)
This fixes #73109.
In instruction `addl %eax %rax`, because there is a missing comma in the
middle of two registers, the asm parser will treat it as a binary
expression.
```
%rax  %  rax --> register mod identifier
```
However, In `MCExpr::evaluateAsRelocatableImpl`, it only checks the left
side of the expression. This patch ensures the right side will also be
checked.
2023-12-20 16:43:18 +08:00
Jinyang He
a8081ed8ff
[LoongArch] Allow delayed decision for ADD/SUB relocations (#72960)
Refer to RISCV [1], LoongArch also need delayed decision for ADD/SUB
relocations. In handleAddSubRelocations, just return directly if SecA !=
SecB, handleFixup usually will finish the the rest of creating PCRel
relocations works. Otherwise we emit relocs depends on whether
relaxation is enabled. If not, we return true and avoid record ADD/SUB
relocations.
Now the two symbols separated by alignment directive will return without
folding symbol offset in AttemptToFoldSymbolOffsetDifference, which has
the same effect when relaxation is enabled.

[1] https://reviews.llvm.org/D155357
2023-12-20 10:54:51 +08:00
Yusra Syeda
0768253c20
[SystemZ][z/OS] Add exception handling for XPLINK (#74638)
Adds emitting the exception table and the EH registers for XPLINK.

---------

Co-authored-by: Yusra Syeda <yusra.syeda@ibm.com>
2023-12-19 13:58:33 -05:00
Kazu Hirata
586ecdf205
[llvm] Use StringRef::{starts,ends}_with (NFC) (#74956)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-11 21:01:36 -08:00