Span-dependent instructions on RISC-V interact in a complex manner with
linker relaxation. The span-dependent assembler algorithm implemented in
LLVM has to start with the smallest version of an instruction and then
only make it larger, so we compress instructions before emitting them to
the streamer.
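
The grow-only approach can be illustrated with a toy model. This is a minimal sketch, not LLVM's implementation: `ToyJump`, `offsetOf`, `relaxToFixedPoint`, the 2-byte/4-byte sizes and the `ShortRange` parameter are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy instruction: a jump that is either 2 bytes (short form)
// or 4 bytes (wide form). Every jump starts in the short form.
struct ToyJump {
  size_t Target; // index of the targeted instruction (may equal the count)
  bool Wide;     // false = 2-byte form, true = 4-byte form
};

// Byte offset of instruction Idx under the current size choices.
static int64_t offsetOf(const std::vector<ToyJump> &Insts, size_t Idx) {
  int64_t Off = 0;
  for (size_t I = 0; I < Idx; ++I)
    Off += Insts[I].Wide ? 4 : 2;
  return Off;
}

// Widen every short jump whose displacement exceeds ShortRange, and repeat
// until nothing changes. A jump is never shrunk again, so each pass either
// widens at least one jump or ends the loop. Returns the number widened.
unsigned relaxToFixedPoint(std::vector<ToyJump> &Insts, int64_t ShortRange) {
  unsigned Widened = 0;
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (size_t I = 0; I < Insts.size(); ++I) {
      if (Insts[I].Wide)
        continue;
      assert(Insts[I].Target <= Insts.size() && "target out of range");
      int64_t Disp = offsetOf(Insts, Insts[I].Target) - offsetOf(Insts, I);
      if (Disp < 0)
        Disp = -Disp;
      if (Disp > ShortRange) {
        Insts[I].Wide = true;
        Changed = true;
        ++Widened;
      }
    }
  }
  return Widened;
}
```

Widening one jump can push another jump's target out of range, which is why the loop runs to a fixed point instead of making a single pass.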
When the instruction is streamed, the information that the instruction
(or rather, the fixup on the instruction) is linker relaxable must be
accurate, even though the assembler relaxation process may transform a
not-linker-relaxable instruction/fixup into one that is linker
relaxable, for instance `c.jal` becoming `qc.e.jal`, or `bne` getting
turned into `beq; jal` (the `jal` is linker relaxable).
In order for this to work, the following things have to happen:
- Any instruction/fixup which might be relaxed to a linker-relaxable
instruction/fixup is marked as `RelaxCandidate = true` in
RISCVMCCodeEmitter.
- In RISCVAsmBackend, when emitting the `R_RISCV_RELAX` relocation, we
have to check that the relocation/fixup kind is one that may need a
relax relocation, as well as that it is marked as linker relaxable (the
latter will not be set if relaxation is disabled).
- Linker Relaxable instructions streamed to a Relaxable fragment need to
mark the fragment and its section as linker relaxable.
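
The last two conditions can be sketched as follows. This is an illustrative model only: the `Toy*` types, `kindMayNeedRelax`, `addFixup` and `shouldEmitRelaxReloc` are hypothetical names, and which fixup kinds qualify is a placeholder, not the backend's actual list.

```cpp
#include <cassert>

// Hypothetical fixup kinds: one that may need a relax relocation, one that never does.
enum class ToyFixupKind { Jump, Data };

struct ToySection { bool LinkerRelaxable; };
struct ToyFragment { ToySection *Parent; bool LinkerRelaxable; };
struct ToyFixup {
  ToyFixupKind Kind;
  bool LinkerRelaxable; // stays false when relaxation is disabled
};

// Placeholder for "the kind is one that may need a relax relocation".
bool kindMayNeedRelax(ToyFixupKind K) { return K == ToyFixupKind::Jump; }

// Streaming a linker-relaxable fixup marks both the fragment and its section.
void addFixup(ToyFragment &F, const ToyFixup &Fx) {
  assert(F.Parent && "fragment must belong to a section");
  if (Fx.LinkerRelaxable) {
    F.LinkerRelaxable = true;
    F.Parent->LinkerRelaxable = true;
  }
}

// Both conditions must hold before the paired relax relocation is emitted.
bool shouldEmitRelaxReloc(const ToyFixup &Fx) {
  return kindMayNeedRelax(Fx.Kind) && Fx.LinkerRelaxable;
}
```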
I also added more debug output for Sections/Fixups which are marked
Linker Relaxable.
This results in more relocations when PC-relative fixups cross an
instruction whose fixup is resolved as not linker-relaxable but which
caused the fragment to be marked linker relaxable at streaming time
(i.e. `c.j`).
(cherry picked from commit 9e8f7acd2b3a71dad473565a6a6f3ba51a3e6bca)
Fixes: #150071
* Rename the vague `Value` to `Fill`.
* FillLen is at most 8, so make the field smaller to facilitate encoding
MCAlignFragment as a MCFragment union member.
* Replace an unreachable report_fatal_error with assert.
To prepare for moving content and fixup member variables from
MCEncodedFragment to MCFragment and removing
MCDataFragment/MCRelaxableFragment classes, replace dyn_cast with
getKind() tests.
Follow-up to #146307
Moved MCInst storage to MCSection, enabling trivial ~MCRelaxableFragment
and eliminating the need for a fragment walk in ~MCSection.
Updated MCRelaxableFragment::getInst to construct an MCInst on demand.
Modified MCAssembler::relaxInstruction's mayNeedRelaxation to accept
opcode and operands instead of an MCInst, avoiding redundant MCInst
creation. Note that MCObjectStreamer::emitInstructionImpl calls
mayNeedRelaxation before determining the target fragment for the MCInst.
Unfortunately, we also have to encode `MCInst::Flags` to support
the EVEX prefix, e.g. `{evex} xorw $foo, %ax`.
There is a small decrease in max-rss (stage1-ReleaseLTO-g (link only))
with negligible instructions:u change.
https://llvm-compile-time-tracker.com/compare.php?from=0b533f2d9f0551aaffb13dcac8e0fd0a952185b5&to=f26b57f33bc7ccae749a57dfc841de7ce2acc2ef&stat=max-rss&linkStats=on
Next: Enable MCFragment to store fixed-size data (was MCDataFragment's job)
and optional Opcode/Operands data (was MCRelaxableFragment's job),
and delete MCDataFragment/MCRelaxableFragment.
This will allow re-encoding of Data+Relax+Data+Relax sequences as
Frag+Frag. The saving should outweigh the downside of larger
MCFragment.
Pull Request: https://github.com/llvm/llvm-project/pull/147229
* Fix the crash for `.equiv b, undef; b:` (.equiv equates a symbol to an expression and reports an error if the symbol was already defined).
* Remove redundant isVariable check from emitFunctionEntryLabel
Pull Request: https://github.com/llvm/llvm-project/pull/145460
Commit bb03cdcb441fd68da9d1ebb7d5f39f73667cd39c caused a Linux kernel
regression (https://github.com/ClangBuiltLinux/linux/issues/2091).
When a section contains linker-relaxable MCAlignmentFragment but no
linker-relaxable instructions, the RISCVAsmBackend::isPCRelFixupResolved
code path should be taken as well. The #76552 condition in the fragment
walk code will make the fixup unresolvable, leading to a relocation.
Use a variable symbol without any specifier instead of VK_WEAKREF.
Add code in ELFObjectWriter::executePostLayoutBinding to check
whether the target should be made an undefined weak symbol.
This change fixes several issues:
* Unreferenced `.weakref alias, target` no longer creates an undefined `target`.
* When `alias` is already defined, report an error instead of crashing.
.weakref is specific to ELF. llvm-ml has reused the VK_WEAKREF name for
a different concept. wasm incorrectly copied the ELF implementation.
Remove it.
Remove FK_PCRel_* kinds from the generic fixup list, as they are not
generic like FK_Data_*. In getRelocType, FK_PCRel_* can be replaced with
FK_Data_* by leveraging the IsPCRel argument. Their inclusion in the
generic kind list caused confusion for PowerPC, RISCV, and VE targets.
The X86/M68k uses can be implemented as target-specific fixups.
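
A minimal sketch of the idea, assuming a toy target: the `Toy*` enums and `getToyRelocType` are hypothetical and do not correspond to any real target's relocation list. One data fixup kind per width is enough, because the PC-relative variant is selected by the `IsPCRel` argument.

```cpp
// Hypothetical generic data fixup kinds (width only, no PC-relative variants).
enum class ToyFixupKind { Data4, Data8 };

// Hypothetical relocation types of a toy target.
enum class ToyReloc { Abs32, PC32, Abs64, PC64 };

// The PC-relative relocation is chosen from IsPCRel, not from a separate
// FK_PCRel-style fixup kind.
ToyReloc getToyRelocType(ToyFixupKind Kind, bool IsPCRel) {
  switch (Kind) {
  case ToyFixupKind::Data4:
    return IsPCRel ? ToyReloc::PC32 : ToyReloc::Abs32;
  case ToyFixupKind::Data8:
    return IsPCRel ? ToyReloc::PC64 : ToyReloc::Abs64;
  }
  return ToyReloc::Abs32; // not reached for valid kinds
}
```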
DWARF linetable entries are usually emitted as a sequence of
MCDwarfLineAddrFragment fragments containing the line-number difference
and an MCExpr describing the instruction-range the linetable entry
covers. These then get relaxed during assembly emission.
However, a large number of these instruction-range expressions are
ranges within a fixed MCDataFragment, i.e. a range over fixed-size
instructions that are not subject to relaxation at a later stage. Thus,
we can compute the address-delta immediately, and not spend time and
memory describing that computation so it can be deferred.
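
The fast path can be sketched roughly as below. This is a simplified model under stated assumptions, not the MC code: `ToyFragment`, `ToyLabel` and `tryImmediateDelta` are hypothetical, and a single `FixedSize` flag stands in for "a data fragment that is not subject to later relaxation".

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fragment: FixedSize is true when its contents cannot change
// size later (no relaxation).
struct ToyFragment { bool FixedSize; };

// Hypothetical label: a fragment plus a byte offset inside it.
struct ToyLabel {
  const ToyFragment *Frag;
  uint64_t Offset;
};

// Returns true and sets Delta when Hi - Lo is known immediately, i.e. both
// labels sit in the same fixed-size fragment. Otherwise returns false and
// the caller has to defer the computation.
bool tryImmediateDelta(const ToyLabel &Lo, const ToyLabel &Hi,
                       uint64_t &Delta) {
  if (!Lo.Frag || Lo.Frag != Hi.Frag)
    return false;
  if (!Lo.Frag->FixedSize)
    return false; // a relaxable fragment may still grow between Lo and Hi
  assert(Hi.Offset >= Lo.Offset && "range end precedes range start");
  Delta = Hi.Offset - Lo.Offset;
  return true;
}
```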
The function is called to test the fast path - when Lo/Hi are within the
same fragment. This is unsafe - Lo/Hi at the beginning and end of a
relaxable fragment should not evaluate to a constant. However, we don't
have tests that exercise the code path.
Nevertheless, make the check safer and remove the now unnecessary
isRISCV check (from https://reviews.llvm.org/D103539).
Reland https://github.com/llvm/llvm-project/pull/106230
The original PR was reverted due to compilation time regression.
This PR fixed that by adding a condition OutStreamer->isVerboseAsm() to
the generation of extra inlined-at debug info, so that it does not
affect normal compilation time.
Currently, MC prints the source location of instructions in assembly
comments when debug info is available. However, it does not include
inlined-at locations when a function is inlined.
For example, function foo is defined in header file a.h and is called
multiple times in b.cpp. If foo is inlined, current assembly will only
show its instructions with their line numbers in a.h. With inlined-at
locations, the assembly will also show where foo is called in b.cpp.
This patch adds inlined-at locations to the comments by using
DebugLoc::print. It makes the printed source location info consistent
with those printed by machine passes.
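
To illustrate the effect for the a.h/b.cpp example, here is a small sketch of printing a location followed by its inlined-at chain. `ToyLoc` and `formatLoc` are hypothetical, and the exact comment text produced by DebugLoc::print may differ from this toy format.

```cpp
#include <string>

// Hypothetical source location with an optional inlined-at parent.
struct ToyLoc {
  std::string File;
  unsigned Line;
  const ToyLoc *InlinedAt; // call site the code was inlined into, or nullptr
};

// Prints "file:line", then recursively appends the inlined-at chain.
std::string formatLoc(const ToyLoc &L) {
  std::string S = L.File + ":" + std::to_string(L.Line);
  if (L.InlinedAt)
    S += " @[ " + formatLoc(*L.InlinedAt) + " ]";
  return S;
}
```

With a call site at b.cpp line 10 and an inlined instruction from a.h line 3, this toy formatter yields `a.h:3 @[ b.cpp:10 ]`, so each inlined copy of foo can be traced back to its caller.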
https://reviews.llvm.org/D23669 inappropriately added MIPS-specific
dtprel/tprel directives to MCStreamer. In addition, llvm-mc -filetype=null
crashes when parsing these directives.
This patch moves these functions to MipsTargetStreamer and fixes
-filetype=null.
gprel32 and gprel64, called by AsmPrinter, are moved to
MCTargetStreamer.
Avoid needless copying of instructions and fixups and directly emit into
the fragment small vectors.
This (optionally, second commit) also removes the single use of the
MCCompactEncodedInstFragment to simplify code.
https://reviews.llvm.org/D70157 (for Intel Jump Conditional Code
Erratum) introduced two virtual function calls in the generic
MCObjectStreamer::emitInstruction, which added some overhead.
This patch removes the virtual function overhead:
* Define `llvm::X86_MC::emitInstruction` that calls `emitInstruction{Begin,End}`.
* Define {X86ELFStreamer,X86WinCOFFStreamer}::emitInstruction to call `llvm::X86_MC::emitInstruction`
Pull Request: https://github.com/llvm/llvm-project/pull/96835
Follow-up to 05ba5c0648ae5e80d5afce270495bf3b1eef9af4. uint32_t is
preferred over const MCExpr * in the section stack uses because it
should only be evaluated once. Change the parameter type to match.
This commit removes the complexity introduced by pending labels in
https://reviews.llvm.org/D5915 by using a simpler approach. D5915 aimed
to ensure padding placement before `.Ltmp0` for the following code, but
at the cost of expensive per-instruction `flushPendingLabels`.
```
// similar to llvm/test/MC/X86/AlignedBundling/labeloffset.s
.bundle_lock align_to_end
calll .L0$pb
.bundle_unlock
.L0$pb:
popl %eax
.Ltmp0: //// padding should be inserted before this label instead of after
addl $_GLOBAL_OFFSET_TABLE_+(.Ltmp0-.L0$pb), %eax
```
(D5915 was adjusted by https://reviews.llvm.org/D8072 and
https://reviews.llvm.org/D71368)
This patch achieves the same goal by setting the offset of the empty
MCDataFragment (`Prev`) in `layoutBundle`. This eliminates the need for
pending labels and simplifies the code.
llvm/test/MC/MachO/pending-labels.s (D71368): relocation symbols are
changed, but the result is still supported by linkers.
When both aligned bundling and RelaxAll are enabled, bundle padding is
directly written into fragments (https://reviews.llvm.org/D8072).
(The original motivation was memory usage, which has been achieved from
different angles with recent assembler improvements.)
The code presents challenges with the work to replace fragment
representation (e.g. #94950, #95077). This patch removes the special
handling. RelaxAll still works but the behavior seems slightly different
as revealed by 2 changed tests. However, most `-mc-relax-all` tests are
unchanged.
RelaxAll used to be the default for clang -O0. This mode has significant
code size drawbacks and newer Clang doesn't use it (#90013).
---
flushPendingLabels: The FOffset parameter can be removed: pending labels
will be assigned to the incoming fragment at offset 0.
Pull Request: https://github.com/llvm/llvm-project/pull/95188
`allocFragment` might be changed to a placement new when the allocation
strategy changes.
`allocInitialFragment` exists to deduplicate the following pattern:
```
auto *F = new MCDataFragment();
Result->addFragment(*F);
F->setParent(Result);
```
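
A rough sketch of such a helper, using stand-in types: `ToySection`, `ToyFragment` and `toyAllocInitialFragment` are hypothetical, and the ownership model (a vector of `unique_ptr`) is an assumption made to keep the example self-contained. It is not how MC allocates fragments.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct ToySection;

// Hypothetical stand-in for a data fragment.
struct ToyFragment {
  ToySection *Parent = nullptr;
};

// Hypothetical stand-in for a section that owns its fragments.
struct ToySection {
  std::vector<std::unique_ptr<ToyFragment>> Frags;
};

// Folds the three-step pattern (allocate, add to section, set parent) into
// one call so that every section-creation site does it the same way.
ToyFragment *toyAllocInitialFragment(ToySection &Sec) {
  assert(Sec.Frags.empty() && "section already has an initial fragment");
  Sec.Frags.push_back(std::unique_ptr<ToyFragment>(new ToyFragment()));
  ToyFragment *F = Sec.Frags.back().get();
  F->Parent = &Sec;
  return F;
}
```

Keeping allocation behind one helper also means a later change of allocation strategy only has to touch a single place.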
Pull Request: https://github.com/llvm/llvm-project/pull/95197
For bolt/test/runtime/X86/exceptions-pic.test, llvm-bolt seems to call
emitLabel twice and the assert will fail. Work around it after
2cc4bc132cbcc76c5552cbc128830943ea596b3e
After 9d0754ada5dbbc0c009bcc2f7824488419cc5530 ("[MC] Relax fragments
eagerly") removed the assert of Offset, it is no longer useful to
initialize the member to -1.
Now the symbol value estimate is more precise, which leads to a slight
behavior change in layout-interdependency.s.
Fragments are allocated with `operator new` and stored in an ilist with
Prev/Next/Parent pointers. A more efficient representation would be an
array of fragments without the overhead of Prev/Next pointers.
As the first step, replace ilist with singly-linked lists.
* `getPrevNode` uses have been eliminated by previous changes.
* The last use of the `Prev` pointer remains: for each subsection, there is an insertion point and
the current insertion point is stored at `CurInsertionPoint`.
* `HexagonAsmBackend::finishLayout` needs a backward iterator. Save all
fragments within `Frags`. Hexagon programs are usually small, and the
performance does not matter that much.
To eliminate `Prev`, change the subsection representation to
singly-linked lists for subsections and a pointer to the active
singly-linked list. The fragments from all subsections will be chained
together at layout time.
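
The representation can be sketched as below. This is a toy model, not the MCSection implementation: `ToyFrag`, `ToyFragList`, `ToySection` and its members are hypothetical, and the `std::deque`/`std::map` containers are assumptions chosen for brevity.

```cpp
#include <deque>
#include <map>
#include <vector>

// Hypothetical fragment with only a Next pointer (no Prev).
struct ToyFrag {
  int Id;
  ToyFrag *Next;
};

// One singly-linked list per subsection.
struct ToyFragList {
  ToyFrag *Head = nullptr;
  ToyFrag *Tail = nullptr;
};

struct ToySection {
  std::deque<ToyFrag> Storage;                 // owns fragments, stable addresses
  std::map<unsigned, ToyFragList> Subsections; // ordered by subsection number
  ToyFragList *Cur = nullptr;                  // the active list

  void switchSubsection(unsigned N) { Cur = &Subsections[N]; }

  // Appending needs only the tail of the active list, never a Prev pointer.
  void append(int Id) {
    if (!Cur)
      switchSubsection(0);
    Storage.push_back(ToyFrag{Id, nullptr});
    ToyFrag *F = &Storage.back();
    if (Cur->Tail)
      Cur->Tail->Next = F;
    else
      Cur->Head = F;
    Cur->Tail = F;
  }

  // Layout time: chain the per-subsection lists in subsection order.
  ToyFrag *chain() {
    ToyFrag *Head = nullptr, *Tail = nullptr;
    for (auto &KV : Subsections) {
      ToyFragList &L = KV.second;
      if (!L.Head)
        continue;
      if (Tail)
        Tail->Next = L.Head;
      else
        Head = L.Head;
      Tail = L.Tail;
    }
    return Head;
  }
};

// Collects fragment ids by walking the chained list.
std::vector<int> order(const ToyFrag *F) {
  std::vector<int> Ids;
  for (; F; F = F->Next)
    Ids.push_back(F->Id);
  return Ids;
}
```

Until `chain()` runs, the lists of different subsections are disconnected, which matches the observation that fragment lists are disconnected before layout time.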
Since fragment lists are disconnected before layout time, we can remove
`MCFragment::SubsectionNumber` (https://reviews.llvm.org/D69411). The
current implementation of `AttemptToFoldSymbolOffsetDifference` requires
future improvement for robustness.
Pull Request: https://github.com/llvm/llvm-project/pull/95077