Span-dependent instructions on RISC-V interact in a complex manner with
linker relaxation. The span-dependent assembler algorithm implemented in
LLVM has to start with the smallest version of an instruction and then
only make it larger, so we compress instructions before emitting them to
the streamer.
When the instruction is streamed, the information that the instruction
(or rather, the fixup on the instruction) is linker relaxable must be
accurate, even though the assembler relaxation process may transform a
not-linker-relaxable instruction/fixup into one that that is linker
relaxable, for instance `c.jal` becoming `qc.e.jal`, or `bne` getting
turned into `beq; jal` (the `jal` is linker relaxable).
In order for this to work, the following things have to happen:
- Any instruction/fixup which might be relaxed to a linker-relaxable
instruction/fixup, gets marked as `RelaxCandidate = true` in
RISCVMCCodeEmitter.
- In RISCVAsmBackend, when emitting the `R_RISCV_RELAX` relocation, we
have to check that the relocation/fixup kind is one that may need a
relax relocation, as well as that it is marked as linker relaxable (the
latter will not be set if relaxation is disabled).
- Linker Relaxable instructions streamed to a Relaxable fragment need to
mark the fragment and its section as linker relaxable.
I also added more debug output for Sections/Fixups which are marked
Linker Relaxable.
This results in more relocations, when these PC-relative fixups cross an
instruction with a fixup that is resolved as not linker-relaxable but
caused the fragment to be marked linker relaxable at streaming time
(i.e. `c.j`).
(cherry picked from commit 9e8f7acd2b3a71dad473565a6a6f3ba51a3e6bca)
Fixes: #150071
To prepare for moving content and fixup member variables from
MCEncodedFragment to MCFragment and removing
MCDataFragment/MCRelaxableFragment classes, replace dyn_cast with
getKind() tests.
Printing an expression is error-prone without a MCAsmInfo argument.
Remove the operator<< overload and replace callers with
MCAsmInfo::printExpr. Some callers are changed to MCExpr::print, with
the goal of eventually making it private.
This dump feature does not pass MCAsmInfo to the printer function.
When we remove MCSpecifierExpr subclasses (and the printImpl overrides),
we will not be able to print target-specific specifier strings.
Just print a textual representation.
Expressions with specifier can only be folded during relocation
generatin. At parse time the `MCAssembler *` argument might be null, and
targets should not rely on the evaluateAsRelocatable result.
Therefore, we can move evaluateAsRelocatableImpl from MCSpecifierExpr to
MCAsmInfo, so that targets do not need to inherit from MCSpecifierExpr.
This function was a workaround used to detect cyclic dependency
(properly resolved by 343428c666f9293ae260bbcf79130562b830b268).
We do not want backends to use it. However, #112251 exposed it to MCExpr
to be reused by AMDGPU. Keep the workaround within AMDGPU to prevent
other backends from accidentally relying on it.
Many targets define MCTargetExpr subclasses just to encode an expression
with a relocation specifier. Create a generic MCSpecifierExpr to be
inherited instead. Migrate M68k and SPARC as examples.
Sym.isInSection() calls findAssociatedFragment, which traverses the
expression tree. This can be replaced by calling evaluateAsRelocatableImpl
first and then inspecting the MCValue result.
GNU Assembler supports symbol reassignment via .set, .equ, or =.
However, LLVM's integrated assembler only allows reassignment for
MCConstantExpr cases, as it struggles with scenarios like:
```
.data
.set x, 0
.long x // reference the first instance
x = .-.data
.long x // reference the second instance
.set x,.-.data
.long x // reference the third instance
```
Between two assignments binds, we cannot ensure that a reference binds
to the earlier assignment. We use MCSymbol::IsUsed and other conditions
to reject potentially unsafe reassignments, but certain MCConstantExpr
uses could be unsafe as well.
This patch enables reassignment by cloning the symbol upon reassignment
and updating the symbol table. Existing references to the original
symbol remain unchanged, and the original symbol is excluded from the
emitted symbol table.
We report cyclic dependency errors for variable symbols and rely on
isSymbolUsedInExpression in parseAssignmentExpression at parse time,
which does not catch all setVariableValue cases (e.g. cyclic .weakref).
Instead, add a bit to MCSymbol and check it when walking the variable
value MCExpr. When a cycle is detected when we have a final layout,
report an error and set the variable to a constant to avoid duplicate
errors.
isSymbolUsedInExpression is considered deprecated, but it is still used
by AMDGPU (#112251).
Currently, the code path is likely only reachable with super edge-case scenario,
but will be more reachable with the upcoming parseAssignmentExpression improvement
to address a pile of hacks.
Use a variable symbol without any specifier instead of VK_WEAKREF.
Add code in ELFObjectWriter::executePostLayoutBinding to check
whether the target should be made an undefined weak symbol.
This change fixes several issues:
* Unreferenced `.weakref alias, target` no longer creates an undefined `target`.
* When `alias` is already defined, report an error instead of crashing.
.weakref is specific to ELF. llvm-ml has reused the VK_WEAKREF name for
a different concept. wasm incorrectly copied the ELF implementation.
Remove it.
For the label difference A-B where A and B are in the same section,
if the section contains no linker-relaxable instruction, we can disable
the framnent walk code path (https://reviews.llvm.org/D155357), removing
redundant relocations. This optimization is available since we now track
per-section linker-relaxable instructions (#140692).
lld/test/ELF/loongarch-reloc-leb128.s , introduced in #81133, has been
updated in 9662a6039c0320eb4473d87b47f0ed891a0f111c to prevent coverage
loss.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
Follow-up to #140494
`shouldForceRelocation` is conservative and produces redundant
relocations.
For example, RISCVAsmBackend::ForceRelocs (introduced to support mixed
relax/norelax code) leads to redundant relocations in the following
example adapted from #77436
```
.option norelax
j label
// For assembly input, RISCVAsmParser::ParseInstruction sets ForceRelocs (https://reviews.llvm.org/D46423).
// For direct object emission, RISCVELFStreamer sets ForceRelocs (#77436)
.option relax
call foo // linker-relaxable
.option norelax
j label // redundant relocation due to ForceRelocs
.option relax
label:
```
Root problem: The `isSymbolRefDifferenceFullyResolvedImpl` condition in
MCAssembler::evaluateFixup does not check whether two locations are
separated by a fragment whose size can be indeterminate due to linker
instruction (e.g. MCDataFragment with relaxation, or MCAlignFragment
due to indeterminate start offst).
This patch
* Updates the fragment walk code in
`attemptToFoldSymbolOffsetDifference` to treat MCRelaxableFragment
(for --riscv-asm-relax-branches) as fixed size after finishLayout.
* Adds a condition in `addReloc` to complement
`isSymbolRefDifferenceFullyResolvedImpl`.
* Removes the no longer needed `shouldForceRelocation`.
This fragment walk code path handles nicely handles
mixed relax/norelax case from
https://discourse.llvm.org/t/possible-problem-related-to-subtarget-usage/75283
and allows us to remove `MCSubtargetInfo` argument (#73721) as a follow-up.
This fragment walk code should be avoided in the absence of
linker-relaxable fragments within the current section.
Adjust two bolt/test/RISCV tests (#141310)
Pull Request: https://github.com/llvm/llvm-project/pull/140692
Mach-O's ARM and X86 writers use MCExpr's `SectionAddrMap *Addrs`
argument to compute label differences, which was a bit of a hack.
The hack has been cleaned up by commit
1b7759de8e6979dda2d949b1ba1c742922e5c366.
Commit 0999cbd0b9ed8aa893cce10d681dec6d54b200ad (2014) introduced
`MCValue::RefKind` for AArch64 ELF as a clean approach to encode the
relocation specifier.
Following numerous migration commits, direct references to getSymA and
getSymB have been eliminated. This allows us to seamlessly update SymA
and SymB, replacing MCSymbolRefExpr with MCSymbol.
Removeing reliance on a MCAssembler::evaluateFixup hack
(`if (Target.SymSpecifier || SA.isUndefined()) {` (previosuly
`if (A->getKind() != MCSymbolRefExpr::VK_None || SA.isUndefined()) {`))
requires 38c3ad36be1facbe6db2dede7e93c0f12fb4e1dc and 4182d2dcb5ecbfc34d41a6cd11810cd36844eddb
Revert the temporary RISCV/LoongArch workaround
(7e62715e0cd433ed97749549c6582c4e1aa689a3) during migration.
MCAssembler::evaluateFixup needs an extra `!Add->isAbsolute()` case
to support `movq abs@GOTPCREL(%rip), %rax; abs = 42` in llvm/test/MC/ELF/relocation-alias.s
(ELFObjectWriter::isSymbolRefDifferenceFullyResolvedImpl asserts if
called on an absolute symbol).
The relocation specifier should be accessed via MCValue::Specifier.
However, some targets encode the relocation specifier within SymA using
MCSymbolRefExpr::SubclassData and access it via getAccessVariant(), though
this method is now deprecated.
This change stores the SymA specifier at Specifier as well, unifying the
two code paths.
* CSKY: GOT- and PLT- relocations now suppress the STT_SECTION
conversion.
* AArch64: https://reviews.llvm.org/D156505 added `getRefkind` check to
prevent folding. This is a hack and is now removed.
MCValue: Unify relocation specifier storage by storing SymA specifier at Specifier
The relocation specifier is accessed via MCValue::Specifier, but some
targets encoded it within SymA using MCSymbolRefExpr::SubclassData and
retrieved it through the now-deprecated getAccessVariant() method. This
commit unifies the two approaches by storing the SymA specifier at
`Specifier` as well.
Additional changes:
- CSKY: GOT- and PLT- relocations now suppress STT_SECTION conversion.
- AArch64: Removed the `getRefkind` check hack (introduced in https://reviews.llvm.org/D156505) that prevented folding.
Removed the assertion from `getRelocType`.
- RISCV: Removed the assertion from `getRelocType`.
Future plans:
- Replace MCSymbolRefExpr members with MCSymbol within MCValue.
- Remove `getSymSpecifier` (added for migration).
Reworked evaluateSymbolicAdd and isSymbolRefDifferenceFullyResolved to
remove their reliance on MCValue::SymB, which previously used the
MCSymbolRefExpr member when folding two symbolic expressions. This
dependency prevented replacing MCValue::SymB with a MCSymbol. By
refactoring, we enable this replacement, which is a more significant
improvement.
Note that this change eliminates the rare "unsupported subtraction of
qualified symbol" diagnostic, resulting in a minor loss of information.
However, the benefit of enabling MCValue::SymB replacement with MCSymbol
outweighs this slight regression.
Some ELF targets don't use @ for relocation specifiers.
We should not report `error: invalid variant` when @ is used.
Attempt to make expr@specifier parsing less hacky.
The existing pretty printer generates excessive parentheses for
MCBinaryExpr expressions. This update removes unnecessary parentheses
of MCBinaryExpr with +/- operators and MCUnaryExpr.
Since relocatable expressions only use + and -, this change improves
readability in most cases.
Examples:
- (SymA - SymB) + C now prints as SymA - SymB + C.
This updates the output of -fexperimental-relative-c++-abi-vtables for
AArch64 and x86 to `.long _ZN1B3fooEv@PLT-_ZTV1B-8`
- expr + (MCTargetExpr) now prints as expr + MCTargetExpr, with this
change primarily affecting AMDGPUMCExpr.
This MIPS behavior from edb9d84dcc4824865e86f963e52d67eb50dde7f5 (2010)
is obsoleted and misleading. This caused confusion in
https://reviews.llvm.org/D123702 ([NVPTX] Disable parens for identifiers
starting with '$')
Note: $tmp was rejected by AsmParser before
https://reviews.llvm.org/D75111 (2020)
In `.equ a, 3; .if a@plt`, a@plt does not evaluate to an absolute value
(MCExpr::evaluateAsRelocatableImpl disables evaluation when the Kind !=
0 at parse time). Similarly, when using MCTargetValue,
evaluateAsAbsolute should return false when MCValue::RefKind==0.
This allows us to remove `if (!Asm)` check from MipsMCExpr.cpp
(%hi(0xdeadbeef) is not evaluated to a constant without RefKind) and
make targets less error-prone.
Commit 752b91bd821ad8a23e004b6cd631ae4f6984ae8b added the argument for
PowerPC to evaluate @l/@ha as constant in 2014. However, this is not
needed and has been cleaned up by
commit 8560da28c69de481f3ad147722577e87b902facb.
Mips also had an inappropriate use, which has been fixed by
79d84a878e83990c235da8710273a98bf835c915
The StringRef overload is often error-prone as users might forget to
register the MCSymbol.
Add comments to MCTargetExpr and MCSymbolRefExpr::VariantKind.
In the distant future the VariantKind parameter might be removed.
They are error-prone as MCParser may parse a variant kind,
which cannot be handled by the target.
The replacement in MCAsmInfo should be used instead.
Follow-up to f244b8eed37a12539fb11b76e19ec7a7eb41dccc
They are error-prone as MCParser may parse a variant kind,
which cannot be handled by the target.
The replacement in MCAsmInfo should be used instead.
Follow-up to f244b8eed37a12539fb11b76e19ec7a7eb41dccc
Follow-up to 14951a5a3120e50084b3c5fb217e2d47992a24d1
* Unify getVariantKindName and getVariantKindForName
* Allow each target to specify the preferred case (albeit ignored in MCParser)
Note: targets that use variant kinds should call MCExpr::print with a
non-null MAI to print variant kinds. operator<< passes a nullptr to
`MCExpr::print`, which should be avoided (e.g. Hexagon; fixed in
commit cf00ac81ac049cddb80aec1d6d88b8fab4f209e8).
All VariantKinds except VK_None/VK_Invalid are target-specific (e.g. a
target may not support "@plt" even if it is widely available).
Move the parsers to lib/Target to ensure that VariantKind from unrelated
targets will not be parsed.