Follow-up to 10c894cffd0f4bef21b54a43b5780240532e44cf.
MCAsmLayout, introduced by ac8a95498a99eb16dff9d3d0186616645d200b6e
(2010), provides APIs to compute fragment/symbol/section offsets.
The separate class is cumbersome and passing it around has overhead.
Let's remove it as the underlying implementation is tightly coupled with
MCAssembler anyway.
Some forwarders are added to ease migration.
This was added by 507efbcce03d8c2c5dbea3028bc39f02c88fea79
([MC] Fold A-B when A is a pending label or A/B are separated by a
MCFillFragment) to account for pending labels and is now unneeded after
the removal of pending labels (75006466296ed4b0f845cbbec4bf77c21de43b40).
Fragments are allocated with `operator new` and stored in an ilist with
Prev/Next/Parent pointers. A more efficient representation would be an
array of fragments without the overhead of Prev/Next pointers.
As the first step, replace ilist with singly-linked lists.
* `getPrevNode` uses have been eliminated by previous changes.
* The last use of the `Prev` pointer remains: for each subsection, there is an insertion point and
the current insertion point is stored at `CurInsertionPoint`.
* `HexagonAsmBackend::finishLayout` needs a backward iterator. Save all
fragments within `Frags`. Hexagon programs are usually small, and the
performance does not matter that much.
To eliminate `Prev`, change the subsection representation to
singly-linked lists for subsections and a pointer to the active
singly-linked list. The fragments from all subsections will be chained
together at layout time.
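A minimal sketch of the representation described above, not the actual
MCSection code. `Fragment`, `FragList`, `Section`, `append` and `flatten`
are hypothetical simplified names: each subsection owns a singly-linked
list, the section keeps a pointer to the active list, and the lists are
chained in ascending subsection order at layout time.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for MCFragment: only a Next pointer, no Prev.
struct Fragment {
  Fragment *Next = nullptr;
  int Id = 0;
};

// Head/tail of one subsection's singly-linked fragment list.
struct FragList {
  Fragment *Head = nullptr;
  Fragment *Tail = nullptr;
};

// Hypothetical stand-in for MCSection.
class Section {
  std::map<uint32_t, FragList> Subsections; // ordered by subsection number
  FragList *CurFragList = nullptr;          // the active list

public:
  Section() { switchSubsection(0); }

  // std::map nodes are stable, so the pointer stays valid after later inserts.
  void switchSubsection(uint32_t N) { CurFragList = &Subsections[N]; }

  // Append to the tail of the active list; no Prev pointer is needed.
  void append(Fragment *F) {
    assert(F && "null fragment");
    F->Next = nullptr;
    if (CurFragList->Tail)
      CurFragList->Tail->Next = F;
    else
      CurFragList->Head = F;
    CurFragList->Tail = F;
  }

  // At layout time, chain all subsection lists in ascending order.
  Fragment *flatten() {
    FragList All;
    for (auto &Entry : Subsections) {
      FragList &L = Entry.second;
      if (!L.Head)
        continue;
      if (All.Tail)
        All.Tail->Next = L.Head;
      else
        All.Head = L.Head;
      All.Tail = L.Tail;
    }
    return All.Head;
  }
};
```

Until `flatten` runs, fragments of different subsections are not connected
to each other, which is why a per-fragment subsection number is not needed
in this sketch.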
Since fragment lists are disconnected before layout time, we can remove
`MCFragment::SubsectionNumber` (https://reviews.llvm.org/D69411). The
current implementation of `AttemptToFoldSymbolOffsetDifference` requires
future improvement for robustness.
Pull Request: https://github.com/llvm/llvm-project/pull/95077
Lazy relaxation caused hash table lookups (`getFragmentOffset`) and
complex use/compute interdependencies. Some expressions involving
forward declared symbols (e.g. `subsection-if.s`) cannot be computed.
Recursion detection requires complex `IsBeingLaidOut`
(https://reviews.llvm.org/D79570).
D76114's `invalidateFragmentsFrom` makes lazy relaxation even less
useful.
Switch to eager relaxation to greatly simplify code and resolve these
issues. This change also removes a `getPrevNode` use, which makes it
more feasible to replace the fragment representation, which might yield
a large peak RSS win.
Minor downsides: The number of section relaxations may increase (offset
by avoiding the hash table lookup). For relax-recompute-align.s, the
computed layout is not optimal.
The `FA < FB` check added by https://reviews.llvm.org/D153096 is slow.
Compute an informal layout order to speed up computation when
`AttemptToFoldSymbolOffsetDifference` is repeatedly called for the same
section.
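A rough sketch of the idea, assuming a simplified singly-linked `Fragment`
type (hypothetical, not the real MCFragment): numbering the fragments of a
section once lets repeated "does FA precede FB" queries become an integer
comparison instead of a list walk.

```cpp
#include <cassert>

// Hypothetical simplified fragment with a cached layout order.
struct Fragment {
  Fragment *Next = nullptr;
  unsigned LayoutOrder = 0;
};

// Number the fragments of one section in list order (one linear pass).
inline void assignLayoutOrder(Fragment *Head) {
  unsigned Order = 0;
  for (Fragment *F = Head; F; F = F->Next)
    F->LayoutOrder = Order++;
}

// Slow check: walk the list from A and look for B. O(n) per query.
inline bool precedesByWalk(const Fragment *A, const Fragment *B) {
  assert(A && B);
  for (const Fragment *F = A->Next; F; F = F->Next)
    if (F == B)
      return true;
  return false;
}

// Fast check: compare the cached order. O(1) per query, valid only for
// fragments of the same section after assignLayoutOrder has run.
inline bool precedesByOrder(const Fragment *A, const Fragment *B) {
  assert(A && B);
  return A->LayoutOrder < B->LayoutOrder;
}
```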
Commit 9500a5d02e23f9b43294e5f662ac099f8989c0e4 ("[MC] Make UseAssemblerInfoForParsing mostly true")
exposed this performance pitfall, which was mitigated by
`setUseAssemblerInfoForParsing(false)` workarounds (e.g. commit
245491a9f384e4c53421196533c2a2b693efaf8d). The workaround can be removed
now.
Linux kernel fs/binfmt_elf_fdpic.c supports FDPIC for MMU-less systems.
GCC/binutils/qemu support FDPIC ABI for ARM
(https://github.com/mickael-guene/fdpic_doc).
_ARM FDPIC Toolchain and ABI_ provides a summary.
This patch adds FDPIC relocation support to the integrated assembler.
There are 6 static relocations and 2 dynamic relocations, with
R_ARM_FUNCDESC as both static and dynamic.
gas requires `--fdpic` to assemble data relocations like `.word f(FUNCDESC)`.
This patch adds `MCTargetOptions::FDPIC` and reports an error if FDPIC
is not set.
Pull Request: https://github.com/llvm/llvm-project/pull/82187
shouldInsertExtraNopBytesForCodeAlign() needs STI to check whether
relaxation is enabled. STI is initialized when setEmitNops is called, but
setEmitNops may not be called in a section that has instructions but is
not executable. In that case the uninitialized STI will cause problems.
Thus, check hasEmitNops before calling it.
Fixes:
https://github.com/llvm/llvm-project/pull/76552#issuecomment-1878952480
Due to the delayed decision for ADD/SUB relocations, RISCV and LoongArch may
take the slow fragment walk path even when the layout is available. When RISCV
(or LoongArch in the future) doesn't need to insert nops, relaxation is
disabled. With an available layout and no nops to insert, the size of
an AlignFragment should be a constant, so we can add it to Displacement when
folding A-B.
This fixes #73109.
In instruction `addl %eax %rax`, because there is a missing comma in the
middle of two registers, the asm parser will treat it as a binary
expression.
```
%eax % rax --> register mod identifier
```
However, `MCExpr::evaluateAsRelocatableImpl` only checks the left
side of the expression. This patch ensures the right side is also
checked.
Like RISCV [1], LoongArch also needs the delayed decision for ADD/SUB
relocations. In handleAddSubRelocations, return directly if SecA !=
SecB; handleFixup will usually finish the rest of the work of creating PCRel
relocations. Otherwise, which relocations we emit depends on whether
relaxation is enabled. If it is not, we return true and avoid recording ADD/SUB
relocations.
Now two symbols separated by an alignment directive will return without
folding the symbol offset in AttemptToFoldSymbolOffsetDifference, which has
the same effect as when relaxation is enabled.
[1] https://reviews.llvm.org/D155357
The patch adds parser, MCExpr, and emitter support for the authenticated
pointer auth relocation.
In assembly, this is expressed using:
.quad <symbol>@AUTH(<key>, <discriminator> [, addr])
For example:
.quad _g3@AUTH(ib, 1234, addr)
The optional 'addr' specifier represents whether the generated pointer
authentication code will also include address diversity (by blending the
address of the storage location of the relocated pointer with the
user-specified constant discriminator).
The @AUTH expression lowers to R_AARCH64_AUTH_ABS64 ELF relocation.
The signing schema is encoded in the place of relocation to be applied
as follows:
```
| 63 | 62 | 61:60 | 59:48 | 47:32 | 31:0 |
| ----------------- | -- | ----- | ----- | ------------- | ------ |
| address diversity | 0 | key | 0 | discriminator | addend |
```
See the following for details:
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#static-relocations
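As an illustration of the bit layout in the table above, here is a small
sketch that packs the signing schema into a 64-bit value. The function name
`encodeAuthPlace` and the enum are hypothetical. The key numbering
(ia=0, ib=1, da=2, db=3) is not stated in this message and is assumed here,
and the addend is treated as raw 32 bits.

```cpp
#include <cassert>
#include <cstdint>

// Assumed key numbering; not spelled out in the text above.
enum class PtrAuthKey : uint64_t { IA = 0, IB = 1, DA = 2, DB = 3 };

// Pack the fields following the table: bit 63 = address diversity,
// bits 61:60 = key, bits 47:32 = discriminator, bits 31:0 = addend.
// Bit 62 and bits 59:48 stay zero.
inline uint64_t encodeAuthPlace(PtrAuthKey Key, uint16_t Discriminator,
                                bool AddressDiversity, uint32_t Addend) {
  assert(static_cast<uint64_t>(Key) <= 3 && "key must fit in 2 bits");
  uint64_t V = 0;
  V |= static_cast<uint64_t>(AddressDiversity) << 63;
  V |= static_cast<uint64_t>(Key) << 60;
  V |= static_cast<uint64_t>(Discriminator) << 32;
  V |= static_cast<uint64_t>(Addend);
  return V;
}
```

Under these assumptions, `.quad _g3@AUTH(ib, 1234, addr)` with a zero addend
would correspond to `encodeAuthPlace(PtrAuthKey::IB, 1234, true, 0)`.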
Differential Revision: https://reviews.llvm.org/D156505
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
D52985/D57677 added a .gcc_except_table workaround, but the new behavior
doesn't match GNU assembler.
```
void foo();
int bar() {
foo();
try { throw 1; }
catch (int) { return 1; }
return 0;
}
clang --target=mipsel-linux-gnu -mmicromips -S a.cc
mipsel-linux-gnu-gcc -mmicromips -c a.s -o gnu.o
.uleb128 ($cst_end0)-($cst_begin0) // bit 0 is not forced to 1
.uleb128 ($func_begin0)-($func_begin0) // bit 0 is not forced to 1
```
I have inspected `.gcc_except_table` output by `mipsel-linux-gnu-gcc -mmicromips -c a.cc`.
The `.uleb128` values are not forced to set the least significant bit.
In addition, D57677's adjustment (even->odd) to CodeGen/Mips/micromips-b-range.ll is wrong.
PC-relative `.long func - .` values will differ from GNU assembler as well.
The original intention of D52985 seems unclear to me. I think whatever
goal it wants to achieve should be moved to an upper layer.
This isMicroMips special case has made it difficult to change MCAssembler::relaxLEB to use evaluateAsAbsolute instead of evaluateKnownAbsolute,
which is needed to properly support R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128.
Differential Revision: https://reviews.llvm.org/D157655
This reverts commit 8ee6c0ea0bf30f1f1da6b49ee720b933f9676a30.
The untested special case works around the fact that we don't force emitting
R_ARM_REL32 relocations the way GNU assembler's arm port does (`TC_FORCE_RELOCATION_SUB_SAME`).
We should investigate how to emit R_ARM_REL32.
The special case from 9746286beca2539438e0a6b783e106bc359036ca (2011)
seems irrelevant nowadays. It is actually incorrect because GNU
assembler does not set the least significant bit for `.long thumb - .`.
For a label difference `A-B` in assembly, if A and B are separated by a
linker-relaxable instruction, we should emit a pair of ADD/SUB
relocations (e.g. R_RISCV_ADD32/R_RISCV_SUB32,
R_RISCV_ADD64/R_RISCV_SUB64).
However, the decision is made upfront at parsing time with inadequate
heuristics (`requiresFixups`). As a result, LLVM integrated assembler
incorrectly suppresses R_RISCV_ADD32/R_RISCV_SUB32 for the following
code:
```
// Simplified from a workaround https://android-review.googlesource.com/c/platform/art/+/2619609
// Both end and begin are not defined yet. We decide ADD/SUB relocations upfront and don't know they will be needed.
.4byte end-begin
begin:
call foo
end:
```
To fix the bug, make two primary changes:
* Delete `requiresFixups` and the overridden emitValueImpl (from D103539).
This deletion requires accurate evaluateAsAbsolute (D153097).
* In MCAssembler::evaluateFixup, call handleAddSubRelocations to emit
ADD/SUB relocations.
However, there is a remaining issue in
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference. With MCAsmLayout, we may
incorrectly fold A-B even when A and B are separated by a
linker-relaxable instruction. This deficiency is acknowledged (see
D153097), but was previously bypassed by eagerly emitting ADD/SUB using
`requiresFixups`. To address this, we partially reintroduce `canFold` (from
D61584, removed by D103539).
Some expressions (e.g. .size and .fill) need to take the `MCAsmLayout`
code path in AttemptToFoldSymbolOffsetDifference, avoiding relocations
(weird, but matching GNU assembler and needed to match user
expectation). Switch to evaluateKnownAbsolute to leverage the `InSet`
condition.
As a bonus, this change allows for the removal of some relocations for
the FDE `address_range` field in the .eh_frame section.
riscv64-64b-pcrel.s contains the main test.
Add a linker relaxable instruction to dwarf-riscv-relocs.ll to test what
it intends to test.
Merge fixups-relax-diff.ll into fixups-diff.ll.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D155357
Annotation attributes may be attached to a function to mark it with
custom data that will be contained in the final Wasm file. The
annotation causes a custom section named
"func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will
contain each function's index value that was marked with the annotation.
A new patchable relocation type for function indexes had to be created so
the custom section could be updated during linking.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D150803
`MCExpr::evaluateAsAbsolute` has a longstanding bug. When the MCAssembler is
non-null and the MCAsmLayout is null, it may incorrectly fold A-B even if A and
B are separated by a linker-relaxable instruction. This behavior can suppress
some ADD/SUB relocations and lead to wrong results if the linker performs
relaxation.
To fix the bug, ensure that linker-relaxable instructions only appear at the end
of an MCDataFragment, thereby making them terminate the fragment. When computing
A-B, suppress folding if A and B are separated by a linker-relaxable
instruction.
* `.subsection` now correctly gives errors for non-foldable expressions.
* gen-dwarf.s will pass even if we add back the .debug_line or .eh_frame/.debug_frame code from D150004
* This will fix suppressed relocation when we add R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128.
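The folding rule above can be sketched as follows. This is a simplified
model, not the real `AttemptToFoldSymbolOffsetDifference`: `Fragment`,
`Symbol` and `tryFold` are hypothetical, and B's fragment is assumed to be
at or before A's. Because a linker-relaxable instruction always ends its
fragment, two symbols in the same fragment can never be separated by one,
and a walk over the fragments in between only has to test a per-fragment
flag.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified fragment.
struct Fragment {
  Fragment *Next = nullptr;
  uint64_t Size = 0;
  bool IsData = true;
  bool LinkerRelaxable = false; // fragment ends with a linker-relaxable insn
};

// Hypothetical simplified symbol: owning fragment plus offset inside it.
struct Symbol {
  const Fragment *Frag;
  uint64_t Offset;
};

// Try to compute A - B. Returns false when folding must be suppressed.
inline bool tryFold(const Symbol &A, const Symbol &B, int64_t &Result) {
  assert(A.Frag && B.Frag);
  int64_t Disp =
      static_cast<int64_t>(A.Offset) - static_cast<int64_t>(B.Offset);
  // Walk from B's fragment up to (not including) A's fragment.
  for (const Fragment *F = B.Frag; F != A.Frag; F = F->Next) {
    if (!F)
      return false; // A's fragment is not reachable from B's
    if (!F->IsData || F->LinkerRelaxable)
      return false; // size may change later, so A - B is not a constant
    Disp += static_cast<int64_t>(F->Size);
  }
  Result = Disp;
  return true;
}
```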
In the future, we should investigate the desired behavior for
`MCExpr::evaluateAsAbsolute` when both MCAssembler and MCAsmLayout are non-null.
(Note: MCRelaxableFragment is only for assembler-relaxation. If we ever need
linker-relaxable MCRelaxableFragment, we would need to adjust RISCVMCExpr.cpp
(D58943/D73211).)
Depends on D153096
Differential Revision: https://reviews.llvm.org/D153097
When the MCAssembler is non-null and the MCAsmLayout is null, we can fold A-B
when
* A and B are in the same fragment, or
* A's fragment succeeds B's fragment, and they are not separated by non-data fragments (D69411)
This patch allows folding when A's fragment precedes B's fragment so
that `9997b - . == 0` below can be evaluated as true:
```
nop
.arch_extension sec
9997:nop
// old behavior: error: expected absolute expression
.if 9997b - . == 0
.endif
```
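A hedged sketch of the direction-independent folding, using a hypothetical
model where fragments live in a vector and symbols refer to them by index
(the real code walks MCFragment lists). When A's fragment precedes B's, the
operands are swapped and the result is negated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

enum class Kind { Data, Other };

// Hypothetical simplified fragment and symbol.
struct Frag {
  Kind K;
  uint64_t Size;
};
struct Sym {
  size_t FragIdx;
  uint64_t Offset;
};

// Try to fold A - B regardless of which fragment comes first.
inline bool foldDifference(const std::vector<Frag> &Frags, Sym A, Sym B,
                           int64_t &Out) {
  assert(A.FragIdx < Frags.size() && B.FragIdx < Frags.size());
  bool Reverse = A.FragIdx < B.FragIdx; // A's fragment precedes B's
  if (Reverse)
    std::swap(A, B);
  int64_t Disp =
      static_cast<int64_t>(A.Offset) - static_cast<int64_t>(B.Offset);
  // Add the sizes of the fragments from B's up to (not including) A's.
  for (size_t I = B.FragIdx; I < A.FragIdx; ++I) {
    if (Frags[I].K != Kind::Data)
      return false; // separated by a non-data fragment: do not fold
    Disp += static_cast<int64_t>(Frags[I].Size);
  }
  Out = Reverse ? -Disp : Disp;
  return true;
}
```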
Add a case to llvm/test/MC/ARM/directive-if-subtraction.s.
Note: for MCAsmStreamer, we cannot evaluate `.if . - 9997b == 0` at parse
time due to MCAsmStreamer::getAssemblerPtr returning nullptr (D45164).
Some Darwin tests check that this folding does not work. Add `.p2align 2` to
block some label difference folding or adjust the tests.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D153096
This patch adds support for the TLS local-exec access model on AIX,
allowing generation of the 64-bit (specifically, non-optimized) code sequence.
For this patch in particular, the generated sequence involves a load of the
variable offset, followed by an add of the loaded variable offset to r13 (which
holds the thread pointer). This code sequence looks like the following:
```
ld reg1,var[TC](2)
add reg2, reg1, r13 // r13 contains the thread pointer
```
The TOC (.tc pseudo-op) entries generated in the assembly files are also
changed where we add the @le relocation for the variable offset.
Differential Revision: https://reviews.llvm.org/D149722
When the MCAssembler is non-null and the MCAsmLayout is null, we can fold A-B
in these additional cases:
* when A is a pending label (will be reassigned to a real fragment in flushPendingLabels())
* A and B are separated by a MCFillFragment with a constant size
If FA == FB, we can use SA.getOffset() - SB.getOffset() even if FA is
not a MCDataFragment, as the only case this can be problematic
(different offsets for a variable-size fragment) is invalid/unreachable.
If FA != FB, the `if (FI->getKind() != MCFragment::FT_Data)` check below
can bail out correctly.
This change will help Mach-O fold more expressions. For ELF this is NFC,
unless evaluateFixup has a bug that would evaluate an expression
differently.
This is mostly useful for ARM64EC, which uses such symbols extensively.
One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This handling is
currently restricted to weak_anti_dep symbols, because we depend on the
current behavior of resolving weak symbols in some cases.
Differential Revision: https://reviews.llvm.org/D145208
This reverts commit 10c17c97ebaf81ac26f6830e51a7a57ddcf63cd2. It causes undefined symbol errors in the Chromium Windows build. A small repro was uploaded to the code review.
This is mostly useful for ARM64EC, which uses such symbols extensively.
One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This required a few
changes to the way we handle weak symbols on Windows.
Differential Revision: https://reviews.llvm.org/D145208
ptxas fails to parse such syntax:
mov.u64 %rd1, ($str);
fatal : Parsing error near '$str': syntax error
A new MCAsmInfo option was added because InParens parameter of
MCExpr::print is not sufficient to disable parens
completely. MCExpr::print resets it to false for a recursive call in
case of unary or binary expressions.
Targets that require parens around identifiers that start with '$'
should always pass MCAsmInfo to MCExpr::print.
Therefore 'operator<<(raw_ostream &, MCExpr&)' should be avoided
because it calls MCExpr::print with nullptr MAI.
Differential Revision: https://reviews.llvm.org/D123702
There are a few relevant forward declarations in there that may require downstream
users to add explicit includes:
llvm/MC/MCContext.h no longer includes llvm/BinaryFormat/ELF.h, llvm/MC/MCSubtargetInfo.h, llvm/MC/MCTargetOptions.h
llvm/MC/MCObjectStreamer.h no longer includes llvm/MC/MCAssembler.h
llvm/MC/MCAssembler.h no longer includes llvm/MC/MCFixup.h, llvm/MC/MCFragment.h
Counting preprocessed lines required to rebuild llvm-project on my setup:
before: 1052436830
after: 1049293745
This is significant and supports the change, in addition to the usual benefits of
decreasing coupling between headers and compilation units.
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D119244
For tagged-globals, we only need to disable relaxation for globals that
we actually tag. With this patch, function pointer relocations, which
we do not instrument, can be relaxed.
This patch also makes tagged-globals work properly with LTO, as
-Wa,-mrelax-relocations=no doesn't work with LTO.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D113220
We previously had a limitation that TLS variables could not
be exported (and therefore could also not be imported). This
change removes that limitation.
Differential Revision: https://reviews.llvm.org/D108877
This re-architects the RISCV relocation handling to bring the
implementation closer in line with the implementation in binutils. We
would previously aggressively resolve the relocation. With this
restructuring, we always will emit a paired relocation for any symbolic
difference of the type of S±T[±C] where S and T are labels and C is a
constant.
GAS has a special target hook controlled by `RELOC_EXPANSION_POSSIBLE`
which indicates that a fixup may be expanded into multiple relocations.
This is used by the RISCV backend to always emit a paired relocation -
either ADD[WIDTH] + SUB[WIDTH] for text relocations or SET[WIDTH] +
SUB[WIDTH] for a debug info relocation. Irrespective of whether linker
relaxation support is enabled, symbolic difference is always emitted as
a paired relocation.
This change also sinks the target specific behaviour down into the
target specific area rather than exposing it to the shared relocation
handling. In the process, we also sink the "special" handling for debug
information down into the RISCV target. Although this improves the path
for the other targets, this is not necessarily entirely ideal either.
The changes in the debug info emission could be done through another
type of hook as this functionality would be required by any other target
which wishes to do linker relaxation. However, as there are no other
targets in LLVM which currently do this, this is a reasonable thing to
do until such time as the code needs to be shared.
Improve the handling of the relocation (and add a reduced test case from
the Linux kernel) to ensure that we handle complex expressions for
symbolic difference. This ensures that we correctly relocate symbols with
the addends normalized and associated with the addition portion of the
paired relocation.
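To make the pairing concrete, here is a small sketch for the text
relocation case, where a difference `S - T + C` becomes an ADD/SUB pair with
the constant attached to the ADD. `Reloc` and `emitPairedRelocs` are
hypothetical names and this models only the shape of the output, not the
actual RISCV backend code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical relocation record: type name, target symbol, addend.
struct Reloc {
  std::string Type;
  std::string Symbol;
  int64_t Addend;
};

// For S - T + C in a Width-bit field, emit ADD<Width> against S carrying
// the normalized addend C, followed by SUB<Width> against T.
inline std::vector<Reloc> emitPairedRelocs(const std::string &S,
                                           const std::string &T, int64_t C,
                                           unsigned Width) {
  assert(Width == 8 || Width == 16 || Width == 32 || Width == 64);
  std::string W = std::to_string(Width);
  return {{"R_RISCV_ADD" + W, S, C}, {"R_RISCV_SUB" + W, T, 0}};
}
```

For example, a 32-bit `end - begin + 4` would yield R_RISCV_ADD32 against
`end` with addend 4 and R_RISCV_SUB32 against `begin` with addend 0 in this
model.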
This change also addresses some review comments from Alex Bradbury about
the relocations meant for use in the DWARF CFA being named incorrectly
(using ADD6 instead of SET6) in the original change which introduced the
relocation type.
This resolves the issues with the symbolic difference emission
sufficiently to enable building the Linux kernel with clang+IAS+lld
(without linker relaxation).
Resolves PR50153, PR50156!
Fixes: ClangBuiltLinux/linux#1023, ClangBuiltLinux/linux#1143
Reviewed By: nickdesaulniers, maskray
Differential Revision: https://reviews.llvm.org/D103539
- Add new variantKinds for the symbol's variable offset and region handle
- Print the proper relocation specifier @gd in the asm streamer when emitting
the TC Entry for the variable offset for the symbol
- Fix the switch section failure between the TC Entry of variable offset and
region handle
- Put .__tls_get_addr symbol in the ProgramCodeSects with XTY_ER property
Reviewed by: sfertile
Differential Revision: https://reviews.llvm.org/D100956
References to functions are in program memory and need a `pm()` fixup. This should fix trait objects for Rust on AVR.
Differential Revision: https://reviews.llvm.org/D87631
Patch by Alex Mikhalev.