That is, on all targets except ARM and AArch64.
This field used to be required due to a bug; it was fixed long ago by
23423c0ea8d414e56081cb6a13bd8b2cc91513a9.
According to the instruction manual, when `vr0` is changed, the high 128
bits of `xr0` are undefined.
Using `vinsgr2vr.b/h` to insert an `i8/i16` into the low 128 bits of a
256-bit vector may therefore cause undefined behavior when the high 128
bits are used in later instructions.
In some cases, such as when using `lto` or `llc`, the relax feature is not
available from this `SubtargetInfo` (`LoongArchAsmBackend` is
instantiated too early), causing loss of relocations.
This commit modifies the condition to check whether the section that
contains the two symbols is relaxable. If it is not relaxable, there is
no need to record relocations.
This commit changes all relocations to be emitted against symbols.
Without this commit, errors may occur in some cases, such as when using
`llc/lto+relax`, or when combining relaxed and non-relaxed object files
with `ld -r`.
Some tests are updated.
According to the suggestions in
https://github.com/llvm/llvm-project/pull/150816, this commit refines the
conditions for emitting R_LARCH_ALIGN relocations.
Some existing tests are updated to avoid being affected by this
optimization. New tests are added to verify: removal of redundant ALIGN
relocations, ALIGN emitted after the first linker-relaxable instruction,
and conservatively emitted ALIGN in lower-numbered subsections.
It is common to have ABI requirements for illegal types: for example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from an i128 argument.
The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).
This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet; this just makes
the mechanical changes to thread the new argument through.
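As a hedged illustration (hypothetical function names, assuming a 64-bit
target with Clang/GCC extended types where neither type is legal), the
distinction looks like this in source terms:
```
// Sketch only: on such a target both arguments are legalized into two i64
// argument parts, yet the platform ABI may treat the parts differently.
// Carrying the original IR type (fp128 vs. i128) lets the CC-assignment
// callback tell the two cases apart.
extern "C" void takes_fp128(__float128 x); // parts originate from fp128
extern "C" void takes_i128(__int128 x);    // parts originate from i128
```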
Each section now tracks the index of the first linker-relaxable
fragment, enabling two changes:
* Delete redundant ALIGN relocations before the first linker-relaxable
instruction in a section. The primary example is the offset 0
R_RISCV_ALIGN relocation for a text section aligned by 4.
* For alignments larger than the NOP size after the first
linker-relaxable instruction, ALIGN relocations are now generated, even in
norelax regions. This fixes issue #150159.
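A condensed sketch of the resulting decision (hypothetical names and
simplified conditions, not the actual RISC-V MC code):
```
#include <cstdint>

// Mirrors the two bullets above: ALIGN relocations are needed only at or
// after the section's first linker-relaxable fragment, and in norelax
// regions only when the requested alignment exceeds the NOP size.
bool needsAlignRelocSketch(unsigned FragIndex, unsigned FirstRelaxableIndex,
                           uint64_t Alignment, uint64_t NopSize,
                           bool InRelaxRegion) {
  if (FragIndex < FirstRelaxableIndex)
    return false; // offsets before this point cannot change at link time
  if (InRelaxRegion)
    return true;
  return Alignment > NopSize; // e.g. .balign 4 in a norvc/norelax region
}
```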
The new test llvm/test/MC/RISCV/Relocations/align-after-relax.s
verifies the required ALIGN in a norelax region following
linker-relaxable instructions.
By using a fragment index within the subsection (which is less than or
equal to the section's index), the implementation may generate redundant
ALIGN relocations in lower-numbered subsections before the first
linker-relaxable instruction.
align-option-relax.s demonstrates the ALIGN optimization.
Add an initial `call` to a few tests to prevent the ALIGN optimization.
---
When the alignment exceeds 2, we insert $alignment-2 bytes of NOPs, even
in non-RVC code. This enables non-RVC code following RVC code to handle
a 2-byte adjustment without requiring additional state in MCSection
or AsmParser.
```
.globl _start
_start:
// GNU ld can relax this to 6505 lui a0, 0x1
// LLD hasn't implemented this transformation.
lui a0, %hi(foo)
.option push
.option norelax
.option norvc
// Now we generate R_RISCV_ALIGN with addend 2, even if this is a norvc region.
.balign 4
b0:
.word 0x3a393837
.option pop
foo:
```
Pull Request: https://github.com/llvm/llvm-project/pull/150816
Whether a specific argument is vararg or fixed is currently stored
separately from all the other argument information in ArgFlags. This
means that it is not accessible from CCAssign, and backends have
developed all kinds of workarounds to access it anyway.
Move this information to ArgFlags to make it directly available in all
relevant places.
I've opted to invert this and store it as IsVarArg, as I think this both
makes the meaning more obvious and provides a better default
(IsVarArg=false).
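A minimal sketch of the inverted encoding (hypothetical struct, not the
real ISD::ArgFlagsTy layout):
```
// Storing the property as IsVarArg means the zero-initialized default
// describes a fixed argument, which is the common case; the bit is set
// only for arguments in the variadic portion of a call.
struct ArgFlagsSketch {
  unsigned IsVarArg : 1;
  unsigned IsSExt : 1; // illustrative examples of other per-argument flags
  unsigned IsZExt : 1;
};
```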
The object file format specific derived classes are used in contexts
where the type is statically known. We don't use isa/dyn_cast, and we
want to eliminate MCSymbol::Kind in the base class.
Instead of creating two separate fixups, create a single one. Leverage
LoongArchAsmBackend::addReloc to generate ADD/SUB relocation pairs.
In a future change MCFragment::setVarFixup may be restricted to a single
fixup.
Similar to c7500a2ec3baae1f0d7de0de94407d4bdb2e5d4d
`Data` now references the first byte at the fixup offset within the current fragment.
MCAssembler::layout asserts that the fixup offset is within either the
fixed-size content or the optional variable-size tail, as this is the
most the generic code can validate without knowing the target-specific
fixup size.
Many backends' applyFixup implementations assert
```
assert(Offset + Size <= F.getSize() && "Invalid fixup offset!");
```
This refactoring allows a subsequent change to move the fixed-size
content outside of MCSection::ContentStorage, fixing the
-fsanitize=pointer-overflow issue of #150846.
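For illustration, a generic little-endian patching loop of the kind a
backend's applyFixup may contain once `Data` points at the fixup location
(hypothetical helper, not taken from any in-tree backend):
```
#include <cstdint>

// `Data` points directly at the fixup location (fragment start + fixup
// offset), so the encoded value is OR'd in starting at Data[0]; the asserts
// described above guarantee the patched bytes lie within the fragment.
void patchFixupLE(char *Data, uint64_t Value, unsigned NumBytes) {
  for (unsigned I = 0; I != NumBytes; ++I)
    Data[I] |= char((Value >> (8 * I)) & 0xff);
}
```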
Pull Request: https://github.com/llvm/llvm-project/pull/151724
to facilitate replacing `MutableArrayRef<char> Data` (fragment content)
with the relocated location. This is necessary to fix the
pointer-overflow sanitizer issue and reland #150846
"Encode FT_Align in fragment's variable-size tail" or a neighbor change
caused a regression that was similar to the root cause of
ab0931b6389838cb5d7d11914063a1ddd84102f0
(See the new test (.text3 with a call at the start"))
For an FT_Align fragment, the offset between location A (with offset <=
FixedSize) and B (offset == FixedSize+VarSize) cannot be resolved.
In addition, delete the unneeded condition `F->isLinkerRelaxable()`.
LoongArch linker relaxation is largely under-tested, but update it as well.
If the environment is taken to be the whole triple component, that is,
including the object format, if any, and if that is the intended
behaviour, then the loongarch64 function `computeTargetABI()` should be
changed to not rely on `hasEnvironment()` but rather to check whether a
non-unknown environment is set.
Without this change, using an (ideally valid) target of
loongarch64-unknown-none-elf with a manually specified ABI of lp64s
results in a completely superfluous warning:
```
warning: triple-implied ABI conflicts with provided target-abi 'lp64s', using target-abi
```
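A small standalone sketch of the difference (built against LLVM's
TargetParser library; the exact check used in `computeTargetABI()` may
differ):
```
#include "llvm/TargetParser/Triple.h"
#include <cstdio>

int main() {
  llvm::Triple TT("loongarch64-unknown-none-elf");
  // hasEnvironment() only asks whether a fourth triple component is present;
  // here that component is the object format ("elf"), so it returns true even
  // though no real environment (gnu, musl, ...) was specified.
  std::printf("hasEnvironment: %d\n", TT.hasEnvironment());
  // Checking for a non-unknown parsed environment avoids misreading the bare
  // object-format suffix as an ABI-implying environment.
  std::printf("non-unknown environment: %d\n",
              TT.getEnvironment() != llvm::Triple::UnknownEnvironment);
  return 0;
}
```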
Previously, two MCAsmBackend hooks were used, with
shouldInsertFixupForCodeAlign calling getWriter().recordRelocation
directly, bypassing generic code.
This patch:
* Introduces MCAsmBackend::relaxAlign to replace the two hooks.
* Tracks padding size using VarContentEnd (content is ignored).
* Moves setLinkerRelaxable from MCObjectStreamer::emitCodeAlignment to the backends.
Pull Request: https://github.com/llvm/llvm-project/pull/149465
This patch adds an emergency spill slot for when we run out of registers.
PR #139201 introduced `vstelm` instructions with only an 8-bit immediate
offset, which can leave no spill slot available to store the spilled registers.
Refactor the fragment representation of `push rax; jmp foo; nop; jmp foo`,
previously encoded as
`MCDataFragment(push rax); MCRelaxableFragment(jmp foo); MCDataFragment(nop); MCRelaxableFragment(jmp foo)`,
to
```
MCFragment(fixed: push rax, variable: jmp foo)
MCFragment(fixed: nop, variable: jmp foo)
```
Changes:
* Eliminate MCEncodedFragment, moving content and fixup storage to MCFragment.
* The new MCFragment contains a fixed-size content (similar to previous
MCDataFragment) and an optional variable-size tail.
* The variable-size tail supports FT_Relaxable, FT_LEB, FT_Dwarf, and
FT_DwarfFrame, with plans to extend to other fragment types.
dyn_cast/isa should be avoided for the converted fragment subclasses.
* In `setVarFixups`, source fixup offsets are relative to the variable part's start.
Stored fixup (in `FixupStorage`) offsets are relative to the fixed part's start.
A lot of code does `getFragmentOffset(Frag) + Fixup.getOffset()`,
expecting the fixup offset to be relative to the fixed part's start
(see the sketch after this list).
* HexagonAsmBackend::fixupNeedsRelaxationAdvanced needs to know the
associated instruction for a fixup. We have to add a `const MCFragment &` parameter.
* In MCObjectStreamer, extend `absoluteSymbolDiff` to apply to
FT_Relaxable as otherwise there would be many more FT_DwarfFrame
fragments in -g compilations.
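A minimal sketch of the offset rebasing mentioned in the fixup bullet above
(hypothetical names, not the real MCFragment interface):
```
#include <cstdint>
#include <vector>

struct FixupSketch {
  uint64_t Offset; // callers of setVarFixups: relative to the variable part
};

// Hypothetical stand-in for setVarFixups: rebase incoming offsets so that the
// stored fixups are relative to the fixed part's start, which is what
// getFragmentOffset(Frag) + Fixup.getOffset() expects.
void setVarFixupsSketch(std::vector<FixupSketch> &FixupStorage,
                        uint64_t FixedSize,
                        const std::vector<FixupSketch> &VarFixups) {
  for (FixupSketch F : VarFixups) {
    F.Offset += FixedSize;
    FixupStorage.push_back(F);
  }
}
```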
https://llvm-compile-time-tracker.com/compare.php?from=28e1473e8e523150914e8c7ea50b44fb0d2a8d65&to=778d68ad1d48e7f111ea853dd249912c601bee89&stat=instructions:u
```
stage2-O0-g instructions:u geomean (-0.07%)
stage1-ReleaseLTO-g (link only) max-rss geomean (-0.39%)
```
```
% /t/clang-old -g -c sqlite3.i -w -mllvm -debug-only=mc-dump |& awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 59675
Align 2215
Data 29700
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
% /t/clang-new -g -c sqlite3.i -w -mllvm -debug-only=mc-dump |& awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 32287
Align 2215
Data 2312
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
```
Pull Request: https://github.com/llvm/llvm-project/pull/148544
This patch replaces stack-based accesses with register moves when
converting between 128-bit and 256-bit vectors. A 128-bit subvector
extract from, or insert to, the lower half of a 256-bit vector is now
treated as a subregister copy that needs no instruction.
Fixes #147769
This PR takes the work previously done by @pawan-nirpal-031 on X86 in
#106370, and makes it available in common code. This should enable all
targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data
types.
Canonicalization is implemented here as multiplication by `1.0`, as
suggested in [the
docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
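A brief usage sketch with Clang's canonicalize builtins (whether a given
target benefits depends on it using the generic lowering described above):
```
// Each call lowers to the llvm.canonicalize.* intrinsic; the common-code
// lowering added here expands it as a multiplication by 1.0 when the target
// has no dedicated handling.
extern "C" float canon_f32(float x) { return __builtin_canonicalizef(x); }
extern "C" double canon_f64(double x) { return __builtin_canonicalize(x); }
```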
Avoid reliance on the MCAssembler::evaluateFixup workaround that checks
MCFixupKindInfo::FKF_IsPCRel. Additionally, standardize how fixups are
appended. This helper will facilitate future fixup data structure
optimizations.
MCFixup::Loc has been removed in favor of MCExpr::Loc through
`const MCExpr *Value` (commit 777391a2164b89d2030ca013562151ca3c3676d1).
While here, change Kind to uint16_t from MCFixupKind. Most fixup kinds
are target-specific.
Follow-up to #141333. Relocation generation called both addReloc and
applyFixup, with the default addReloc invoking shouldForceRelocation,
resulting in three virtual calls. This approach was also inflexible, as
targets needing additional data required extending
`shouldForceRelocation` (see #73721, resolved by #141311).
This change integrates relocation handling into applyFixup, eliminating
two virtual calls. The prior default addReloc is renamed to
maybeAddReloc. Targets overriding addReloc now call their customized
addReloc implementation.
Printing an expression is error-prone without an MCAsmInfo argument.
Remove the operator<< overload and replace callers with
MCAsmInfo::printExpr. Some callers are changed to MCExpr::print, with
the goal of eventually making it private.
so that subclasses can provide the appropriate MCAsmInfo to print
MCExpr objects.
At present, llvm/utils/TableGen/AsmMatcherEmitter.cpp constructs a
generic MCAsmInfo.