Follow-up to 14951a5a3120e50084b3c5fb217e2d47992a24d1
* Unify getVariantKindName and getVariantKindForName
* Allow each target to specify the preferred case (albeit ignored in MCParser)
Note: targets that use variant kinds should call MCExpr::print with a
non-null MAI to print variant kinds. operator<< passes a nullptr to
`MCExpr::print`, which should be avoided (e.g. Hexagon; fixed in
commit cf00ac81ac049cddb80aec1d6d88b8fab4f209e8).
All VariantKinds except VK_None/VK_Invalid are target-specific (e.g. a
target may not support "@plt" even if it is widely available).
Move the parsers to lib/Target to ensure that VariantKind from unrelated
targets will not be parsed.
I was wrong last patch. I viewed the `Visited` set purely as a possible
recursion deterrent where functions calling a callee multiple times are
handled elsewhere. This doesn't consider cases where a function is
called multiple times by different callers still part of the same call
graph. New test shows the aforementioned case.
Reapplies #111004, fixes#115562.
This restores 63ec52f867ada8d841dd872acf3d0cb62e2a99e8 and
46f7929879a59ec72dc75679b4201e2d314efba9, NFC changes that were
unnecessarily reverted.
This completes the work that merges MCAsmLayout into MCAssembler.
Pull Request: https://github.com/llvm/llvm-project/pull/97449
and replace MCAssembler::Layout with a bool.
This mostly completes "[MC] Start merging MCAsmLayout into MCAssembler".
Note: BOLT used a dummy `MCAsmLayout` to call `getSymbolOffset`, which
is technically not supported. There is some discussion in
https://reviews.llvm.org/D154604 .
The revert f80a4072ced41b52363c63df28fea9a649f7f89e is incorrect and
actually broke bots.
Follow-up to 10c894cffd0f4bef21b54a43b5780240532e44cf.
MCAsmLayout, introduced by ac8a95498a99eb16dff9d3d0186616645d200b6e
(2010), provides APIs to compute fragment/symbol/section offsets.
The separate class is cumbersome and passing it around has overhead.
Let's remove it as the underlying implementation is tightly coupled with
MCAsmLayout anyway.
Some forwarders are added to ease migration.
This was added by 507efbcce03d8c2c5dbea3028bc39f02c88fea79
([MC] Fold A-B when A is a pending label or A/B are separated by a
MCFillFragment) to account for pending labels and is now unneeded after
the removal of pending labels (75006466296ed4b0f845cbbec4bf77c21de43b40).
Fragments are allocated with `operator new` and stored in an ilist with
Prev/Next/Parent pointers. A more efficient representation would be an
array of fragments without the overhead of Prev/Next pointers.
As the first step, replace ilist with singly-linked lists.
* `getPrevNode` uses have been eliminated by previous changes.
* The last use of the `Prev` pointer remains: for each subsection, there is an insertion point and
the current insertion point is stored at `CurInsertionPoint`.
* `HexagonAsmBackend::finishLayout` needs a backward iterator. Save all
fragments within `Frags`. Hexagon programs are usually small, and the
performance does not matter that much.
To eliminate `Prev`, change the subsection representation to
singly-linked lists for subsections and a pointer to the active
singly-linked list. The fragments from all subsections will be chained
together at layout time.
Since fragment lists are disconnected before layout time, we can remove
`MCFragment::SubsectionNumber` (https://reviews.llvm.org/D69411). The
current implementation of `AttemptToFoldSymbolOffsetDifference` requires
future improvement for robustness.
Pull Request: https://github.com/llvm/llvm-project/pull/95077
Lazy relaxation caused hash table lookups (`getFragmentOffset`) and
complex use/compute interdependencies. Some expressions involding
forward declared symbols (e.g. `subsection-if.s`) cannot be computed.
Recursion detection requires complex `IsBeingLaidOut`
(https://reviews.llvm.org/D79570).
D76114's `invalidateFragmentsFrom` makes lazy relaxation even less
useful.
Switch to eager relaxation to greatly simplify code and resolve these
issues. This change also removes a `getPrevNode` use, which makes it
more feasible to replace the fragment representation, which might yield
a large peak RSS win.
Minor downsides: The number of section relaxations may increase (offset
by avoiding the hash table lookup). For relax-recompute-align.s, the
computed layout is not optimal.
The `FA < FB` check added by https://reviews.llvm.org/D153096 is slow.
Compute an informal layout order to speed up computation when
`AttemptToFoldSymbolOffsetDifference` is repeatedly called for the same
section.
Commit 9500a5d02e23f9b43294e5f662ac099f8989c0e4 ("[MC] Make UseAssemblerInfoForParsing mostly true")
exposed this performance pitfall, which was mitigated by
`setUseAssemblerInfoForParsing(false)` workarounds (e.g. commit
245491a9f384e4c53421196533c2a2b693efaf8d). The workaround can be removed
now.
Linux kernel fs/binfmt_elf_fdpic.c supports FDPIC for MMU-less systems.
GCC/binutils/qemu support FDPIC ABI for ARM
(https://github.com/mickael-guene/fdpic_doc).
_ARM FDPIC Toolchain and ABI_ provides a summary.
This patch implements FDPIC relocations to the integrated assembler.
There are 6 static relocations and 2 dynamic relocations, with
R_ARM_FUNCDESC as both static and dynamic.
gas requires `--fdpic` to assemble data relocations like `.word f(FUNCDESC)`.
This patch adds `MCTargetOptions::FDPIC` and reports an error if FDPIC
is not set.
Pull Request: https://github.com/llvm/llvm-project/pull/82187
The shouldInsertExtraNopBytesForCodeAlign() need STI to check whether
relax is enabled or not. It is initialized when call setEmitNops. The
setEmitNops may not be called in a section which has instructions but is
not executable. In this case uninitialized STI will cause problems.
Thus, check hasEmitNops before call it.
Fixes:
https://github.com/llvm/llvm-project/pull/76552#issuecomment-1878952480
Due to delayed decision for ADD/SUB relocations, RISCV and LoongArch may
go slow fragment walk path with available layout. When RISCV (or
LoongArch in the future) don't need insert nops, that means relax is
disabled. With available layout and not needing insert nops, the size of
AlignFragment should be a constant. So we can add it to Displacement for
folding A-B.
This fixes#73109.
In instruction `addl %eax %rax`, because there is a missing comma in the
middle of two registers, the asm parser will treat it as a binary
expression.
```
%rax % rax --> register mod identifier
```
However, In `MCExpr::evaluateAsRelocatableImpl`, it only checks the left
side of the expression. This patch ensures the right side will also be
checked.
Refer to RISCV [1], LoongArch also need delayed decision for ADD/SUB
relocations. In handleAddSubRelocations, just return directly if SecA !=
SecB, handleFixup usually will finish the the rest of creating PCRel
relocations works. Otherwise we emit relocs depends on whether
relaxation is enabled. If not, we return true and avoid record ADD/SUB
relocations.
Now the two symbols separated by alignment directive will return without
folding symbol offset in AttemptToFoldSymbolOffsetDifference, which has
the same effect when relaxation is enabled.
[1] https://reviews.llvm.org/D155357
The patch adds parser, MCExpr, and emitter support for the authenticated
pointer auth relocation.
In assembly, this is expressed using:
.quad <symbol>@AUTH(<key>, <discriminator> [, addr])
For example:
.quad _g3@AUTH(ib, 1234, addr)
The optional 'addr' specifier represents whether the generated pointer
authentication code will also include address diversity (by blending the
address of the storage location of the relocated pointer with the
user-specified constant discriminator).
The @AUTH expression lowers to R_AARCH64_AUTH_ABS64 ELF relocation.
The signing schema is encoded in the place of relocation to be applied
as follows:
```
| 63 | 62 | 61:60 | 59:48 | 47:32 | 31:0 |
| ----------------- | -- | ----- | ----- | ------------- | ------ |
| address diversity | 0 | key | 0 | discriminator | addend |
```
See the following for details:
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#static-relocations
Differential Revision: https://reviews.llvm.org/D156505
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
D52985/D57677 added a .gcc_except_table workaround, but the new behavior
doesn't match GNU assembler.
```
void foo();
int bar() {
foo();
try { throw 1; }
catch (int) { return 1; }
return 0;
}
clang --target=mipsel-linux-gnu -mmicromips -S a.cc
mipsel-linux-gnu-gcc -mmicromips -c a.s -o gnu.o
.uleb128 ($cst_end0)-($cst_begin0) // bit 0 is not forced to 1
.uleb128 ($func_begin0)-($func_begin0) // bit 0 is not forced to 1
```
I have inspected `.gcc_except_table` output by `mipsel-linux-gnu-gcc -mmicromips -c a.cc`.
The `.uleb128` values are not forced to set the least significant bit.
In addition, D57677's adjustment (even->odd) to CodeGen/Mips/micromips-b-range.ll is wrong.
PC-relative `.long func - .` values will differ from GNU assembler as well.
The original intention of D52985 seems unclear to me. I think whatever
goal it wants to achieve should be moved to an upper layer.
This isMicroMips special case has caused problems to fix MCAssembler::relaxLEB to use evaluateAsAbsolute instead of evaluateKnownAbsolute,
which is needed to proper support R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128.
Differential Revision: https://reviews.llvm.org/D157655
This reverts commit 8ee6c0ea0bf30f1f1da6b49ee720b933f9676a30.
The untested special case is used as a workaround that we don't force emitting
R_ARM_REL32 relocations like GNU assembler's arm port: `TC_FORCE_RELOCATION_SUB_SAME`.
We shall investigate how to emit R_ARM_REL32.
The special case from 9746286beca2539438e0a6b783e106bc359036ca (2011)
seems irrelevant nowadays. It is actually incorrect because GNU
assembler does not set the least significant bit for `.long thumb - .`.
For a label difference `A-B` in assembly, if A and B are separated by a
linker-relaxable instruction, we should emit a pair of ADD/SUB
relocations (e.g. R_RISCV_ADD32/R_RISCV_SUB32,
R_RISCV_ADD64/R_RISCV_SUB64).
However, the decision is made upfront at parsing time with inadequate
heuristics (`requiresFixup`). As a result, LLVM integrated assembler
incorrectly suppresses R_RISCV_ADD32/R_RISCV_SUB32 for the following
code:
```
// Simplified from a workaround https://android-review.googlesource.com/c/platform/art/+/2619609
// Both end and begin are not defined yet. We decide ADD/SUB relocations upfront and don't know they will be needed.
.4byte end-begin
begin:
call foo
end:
```
To fix the bug, make two primary changes:
* Delete `requiresFixups` and the overridden emitValueImpl (from D103539).
This deletion requires accurate evaluateAsAbolute (D153097).
* In MCAssembler::evaluateFixup, call handleAddSubRelocations to emit
ADD/SUB relocations.
However, there is a remaining issue in
MCExpr.cpp:AttemptToFoldSymbolOffsetDifference. With MCAsmLayout, we may
incorrectly fold A-B even when A and B are separated by a
linker-relaxable instruction. This deficiency is acknowledged (see
D153097), but was previously bypassed by eagerly emitting ADD/SUB using
`requiresFixups`. To address this, we partially reintroduce `canFold` (from
D61584, removed by D103539).
Some expressions (e.g. .size and .fill) need to take the `MCAsmLayout`
code path in AttemptToFoldSymbolOffsetDifference, avoiding relocations
(weird, but matching GNU assembler and needed to match user
expectation). Switch to evaluateKnownAbsolute to leverage the `InSet`
condition.
As a bonus, this change allows for the removal of some relocations for
the FDE `address_range` field in the .eh_frame section.
riscv64-64b-pcrel.s contains the main test.
Add a linker relaxable instruction to dwarf-riscv-relocs.ll to test what
it intends to test.
Merge fixups-relax-diff.ll into fixups-diff.ll.
Reviewed By: kito-cheng
Differential Revision: https://reviews.llvm.org/D155357
Annotation attributes may be attached to a function to mark it with
custom data that will be contained in the final Wasm file. The
annotation causes a custom section named
"func_attr.annotate.<name>.<arg0>.<arg1>..." to be created that will
contain each function's index value that was marked with the annotation.
A new patchable relocation type for function indexes had to be created so
the custom section could be updated during linking.
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D150803
`MCExpr::evaluateAsAbsolute` has a longstanding bug. When the MCAssembler is
non-null and the MCAsmLayout is null, it may incorrectly fold A-B even if A and
B are separated by a linker-relaxable instruction. This behavior can suppress
some ADD/SUB relocations and lead to wrong results if the linker performs
relaxation.
To fix the bug, ensure that linker-relaxable instructions only appear at the end
of an MCDataFragment, thereby making them terminate the fragment. When computing
A-B, suppress folding if A and B are separated by a linker-relaxable
instruction.
* `.subsection` now correctly give errors for non-foldable expressions.
* gen-dwarf.s will pass even if we add back the .debug_line or .eh_frame/.debug_frame code from D150004
* This will fix suppressed relocation when we add R_RISCV_SET_ULEB128/R_RISCV_SUB_ULEB128.
In the future, we should investigate the desired behavior for
`MCExpr::evaluateAsAbsolute` when both MCAssembler and MCAsmLayout are non-null.
(Note: MCRelaxableFragment is only for assembler-relaxation. If we ever need
linker-relaxable MCRelaxableFragment, we would need to adjust RISCVMCExpr.cpp
(D58943/D73211).)
Depends on D153096
Differential Revision: https://reviews.llvm.org/D153097
When the MCAssembler is non-null and the MCAsmLayout is null, we can fold A-B
when
* A and B are in the same fragment, or
* A's fragment suceeds B's fragment, and they are not separated by non-data fragments (D69411)
This patch allows folding when A's fragment precedes B's fragment so
that `9997b - . == 0` below can be evaluated as true:
```
nop
.arch_extension sec
9997:nop
// old behavior: error: expected absolute expression
.if 9997b - . == 0
.endif
```
Add a case to llvm/test/MC/ARM/directive-if-subtraction.s.
Note: for MCAsmStreamer, we cannot evaluate `.if . - 9997b == 0` at parse
time due to MCAsmStreamer::getAssemblerPtr returning nullptr (D45164).
Some Darwin tests check that this folding does not work. Add `.p2align 2` to
block some label difference folding or adjust the tests.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D153096
This patch adds support for the TLS local-exec access model on AIX to allow
for the ability to generate the 64-bit (specifically, non-optimized) code sequence.
For this patch in particular, the sequence that is generated involves a load of the
variable offset, followed by an add of the loaded variable offset to r13 (which is
thread pointer, respectively). This code sequence looks like the following:
```
ld reg1,var[TC](2)
add reg2, reg1, r13 // r13 contains the thread pointer
```
The TOC (.tc pseudo-op) entries generated in the assembly files are also
changed where we add the @le relocation for the variable offset.
Differential Revision: https://reviews.llvm.org/D149722
When the MCAssembler is non-null and the MCAsmLayout is null, we can fold A-B
in these additional cases:
* when A is a pending label (will be reassigned to a real fragment in flushPendingLabels())
* A and B are separated by a MCFillFragment with a constant size
If FA == FB, we can use SA.getOffset() - SB.getOffset() even if FA is
not a MCDataFragment, as the only case this can be problematic
(different offsets for a variable-size fragment) is invalid/unreachable.
If FA != FB, the `if (FI->getKind() != MCFragment::FT_Data)` check below
can bail out correctly.
This change will help Mach-O fold more expressions. For ELF this is NFC,
unless evaluateFixup has a bug that would evaluate an expression
differently.
This is mostly useful for ARM64EC, which uses such symbols extensively.
One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This handling is
currently restricted to weak_anti_dep symbols, because we depend on the
current behavior of resolving weak symbols in some cases.
Differential Revision: https://reviews.llvm.org/D145208
This reverts commit 10c17c97ebaf81ac26f6830e51a7a57ddcf63cd2. It causes undefined symbol error on chromium windows build. A small repro was uploaded to the code review.
This is mostly useful for ARM64EC, which uses such symbols extensively.
One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This required a few
changes to the way we handle weak symbols on Windows.
Differential Revision: https://reviews.llvm.org/D145208
This is mostly useful for ARM64EC, which uses such symbols extensively.
One interesting quirk of ARM64EC is that we need to be able to emit weak
symbols that point at each other (so if either symbol is defined
elsewhere, both symbols point at the definition). This required a few
changes to the way we handle weak symbols on Windows.
Differential Revision: https://reviews.llvm.org/D145208