695 Commits

Author SHA1 Message Date
Sergei Barannikov
d6679d5a5f
[Target] Remove SoftFail field on targets that don't use it (NFC) (#154659)
That is, on all targets except ARM and AArch64.
This field used to be required due to a bug, it was fixed long ago
by 23423c0ea8d414e56081cb6a13bd8b2cc91513a9.
2025-08-21 05:21:42 +03:00
tangaac
ccbcebcfd3
[LoongArch] Fix implicit PesudoXVINSGR2VR error (#152432)
According to the instructions manual, when `vr0` is changed, high 128
bit of `xr0` is undefined.
Use `vinsgr2vr.b/h` to insert an `i8/i16` to low 128bit of a 256 vector
may cause undefined behavior when high 128bit is used in later
instructions.
2025-08-19 17:22:00 +08:00
ZhaoQi
be3fd6ae25
[LoongArch] Use section-relaxable check instead of relax feature from STI (#153792)
In some cases, such as using `lto` or `llc`, relax feature is not
available from this `SubtargetInfo` (`LoongArchAsmBackend` is
instantiated too early), causing loss of relocations.

This commit modifiy the condition to check whether the section which
contains the two symbols is relaxable. If not relaxable, no need to
record relocations.
2025-08-19 09:48:51 +08:00
ZhaoQi
8f671a675f
[LoongArch] Always emit symbol-based relocations regardless of relaxation (#153943)
This commit changes all relocations to be relocated with symbols.

Without this commit, errors may occur in some cases, such as when using
`llc/lto+relax`, or combining relaxed and norelaxed object files using
`ld -r`.

Some tests updated.
2025-08-18 20:15:49 +08:00
ZhaoQi
76fb1619f0
[LoongArch] Reduce number of reserved relocations when relax enabled (#153769) 2025-08-18 17:42:43 +08:00
ZhaoQi
6957e44d8e
[LoongArch][MC] Refine conditions for emitting ALIGN relocations (#153365)
According to the suggestions in
https://github.com/llvm/llvm-project/pull/150816, this commit refine the
conditions for emitting R_LARCH_ALIGN relocations.

Some existing tests are updated to avoid being affected by this
optimization. New tests are added to verify: removal of redundant ALIGN
relocations, ALIGN emitted after the first linker-relaxable instruction,
and conservatively emitted ALIGN in lower-numbered subsections.
2025-08-18 14:54:27 +08:00
Nikita Popov
01bc742185
[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817)
This ensures that the required fields are set, and also makes the
construction more convenient.
2025-08-15 18:06:07 +02:00
tangaac
9315d701eb
[LoongArch] Optimize inserting extracted element for v4i64/v8i32 (#152629) 2025-08-14 17:06:50 +08:00
Nikita Popov
d1952baa5d [CodeGen] Remove unnecessary setTypeListBeforeSoften() parameter (NFC)
It does not make sense to set the softening type list without
setting IsSoften=true.
2025-08-14 10:04:56 +02:00
Nikita Popov
e92b7e9641
[CodeGen] Provide original IR type to CC lowering (NFC) (#152709)
It is common to have ABI requirements for illegal types: For example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from a i128 argument.

The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).

This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet, this just does
the mechanical changes to thread through the new argument.
2025-08-11 08:57:53 +02:00
Fangrui Song
3769ce013b
MC: Refine ALIGN relocation conditions
Each section now tracks the index of the first linker-relaxable
fragment, enabling two changes:

* Delete redundant ALIGN relocations before the first linker-relaxable
  instruction in a section. The primary example is the offset 0
  R_RISCV_ALIGN relocation for a text section aligned by 4.
* For alignments larger than the NOP size after the first
  linker-relaxable instruction, ALIGN relocations are now generated, even in
  norelax regions. This fixes the issue #150159.

The new test llvm/test/MC/RISCV/Relocations/align-after-relax.s
verifies the required ALIGN in a norelax region following
linker-relaxable instructions.
By using a fragment index within the subsection (which is less than or
equal to the section's index), the implementation may generate redundant
ALIGN relocations in lower-numbered subsections before the first
linker-relaxable instruction.

align-option-relax.s demonstrates the ALIGN optimization.
Add an initial `call` to a few tests to prevent the ALIGN optimization.

---

When the alignment exceeds 2, we insert $alignment-2 bytes of NOPs, even
in non-RVC code. This enables non-RVC code following RVC code to handle
a 2-byte adjustment without requiring an additional state in MCSection
or AsmParser.

```
.globl _start
_start:
// GNU ld can relax this to  6505          lui     a0, 0x1
// LLD hasn't implemented this transformation.
  lui a0, %hi(foo)

.option push
.option norelax
.option norvc
// Now we generate R_RISCV_ALIGN with addend 2, even if this is a norvc region.
.balign 4
b0:
  .word 0x3a393837
.option pop
foo:
```

Pull Request: https://github.com/llvm/llvm-project/pull/150816
2025-08-07 19:16:58 -07:00
Nikita Popov
406d9b1dd6
[CodeGen] Move IsFixed into ArgFlags (NFCI) (#152319)
The information whether a specific argument is vararg or fixed is
currently stored separately from all the other argument information in
ArgFlags. This means that it is not accessible from CCAssign, and
backends have developed all kinds of workarounds for how they can access
it after all.

Move this information to ArgFlags to make it directly available in all
relevant places.

I've opted to invert this and store it as IsVarArg, as I think that both
makes the meaning more obvious and provides for a better default (which
is IsVarArg=false).
2025-08-07 09:12:40 +02:00
tangaac
b05e26be8a
[LoongArch] Optimize extracting f32/f64 from 256-bit vector by using XVPICKVE. (#151914) 2025-08-06 09:11:34 +08:00
Fangrui Song
5570ce5cef MCSymbolELF: Migrate away from classof
The object file format specific derived classes are used in context
where the type is statically known. We don't use isa/dyn_cast and we
want to eliminate MCSymbol::Kind in the base class.
2025-08-03 15:17:13 -07:00
Fangrui Song
c330585bc7 LoongArchAsmBackend: Simplify relaxDwarfLineAddr and remove getRelocPairForSize
Instead of creating two separate fixups, create a single one. Leverage
LoongArchAsmBackend::addReloc to generate ADD/SUB relocation pairs.
In a future change MCFragment::setVarFixup may be restricted to a single
fixup.

Similar to c7500a2ec3baae1f0d7de0de94407d4bdb2e5d4d
2025-08-02 12:24:46 -07:00
Fangrui Song
d3589edafc MCAsmBackend::applyFixup: Change Data to indicate the relocated location
`Data` now references the first byte of the fixup offset within the current fragment.

MCAssembler::layout asserts that the fixup offset is within either the
fixed-size content or the optional variable-size tail, as this is the
most the generic code can validate without knowing the target-specific
fixup size.

Many backends applyFixup assert
```
assert(Offset + Size <= F.getSize() && "Invalid fixup offset!");
```

This refactoring allows a subsequent change to move the fixed-size
content outside of MCSection::ContentStorage, fixing the
-fsanitize=pointer-overflow issue of #150846

Pull Request: https://github.com/llvm/llvm-project/pull/151724
2025-08-02 09:27:06 -07:00
Fangrui Song
491c7bdd58 MCAsmBackend::applyFixup: Replace Data.getSize() with F.getSize()
to facilitate replacing `MutableArrayRef<char> Data` (fragment content)
with the relocated location. This is necessary to fix the
pointer-overflow sanitizer issue and reland #150846
2025-08-01 00:31:51 -07:00
ZhaoQi
ece7a72aa2
[LoongArch] Optimize insertelement containing variable index using compare+select (#151131) 2025-07-30 18:06:41 +08:00
ZhaoQi
80e0d41677
[LoongArch] Custom legalizing build_vector with same constant elements (#150584) 2025-07-28 09:50:35 +08:00
Fangrui Song
69d0078f16 MC: Generate relocation for a branch crossing linker-relaxable FT_Align fragment
"Encode FT_Align in fragment's variable-size tail" or a neighbor change
caused a regression that was similar to the root cause of
ab0931b6389838cb5d7d11914063a1ddd84102f0
(See the new test (.text3 with a call at the start"))

For a FT_Align fragment, the offset between location A (with offset <=
FixedSize) and B (offset == FixedSize+VarSize) cannot be resolved.

In addition, delete unneeded condition `F->isLinkerRelaxable()`.

LoongArch linker relaxation is largely under-tested, but update it as well.
2025-07-26 21:19:48 -07:00
ZhaoQi
f2a4cc1dd0
[LoongArch] Avoid expanding build_vector containing insertion of undef elements (#150377) 2025-07-26 14:24:39 +08:00
ZhaoQi
ddf34b4c97
[LoongArch] Optimize general fp build_vector lowering (#149486) 2025-07-22 16:16:27 +08:00
ZhaoQi
cae7650558
[LoongArch] Optimize inserting fp element to vector (#149302)
Co-authored-by: tangaac <tangyan01@loongson.cn>
2025-07-22 13:38:46 +08:00
mintsuki
9ed8816dc6
LoongArch: Improve detection of valid TripleABI (#147952)
If the environment is considered to be the triple component as a whole,
so, including the object format, if any, and if that is the intended
behaviour, then the loongarch64 function `computeTargetABI()` should be
changed to not rely on `hasEnvironment()`, but, rather, to check if
there is a non-unknown environment set.

Without this change, using a (ideally valid) target of
loongarch64-unknown-none-elf, with a manually specified ABI of lp64s,
will result in a completely superfluous warning:

```
warning: triple-implied ABI conflicts with provided target-abi 'lp64s', using target-abi
```
2025-07-22 12:13:37 +08:00
hev
8a307ae619
[LoongArch] Fix failure to widen operand for [X]VMSK{LT,GE,NE}Z (#149442)
Reported-by: tangyan <tangyan01@loongson.cn>
2025-07-21 16:36:49 +08:00
Fangrui Song
fd6d6a7c8d
MC: Refactor FT_Align fragments when linker relaxation is enabled
Previously, two MCAsmBackend hooks were used, with
shouldInsertFixupForCodeAlign calling getWriter().recordRelocation
directly, bypassing generic code.

This patch:

* Introduces MCAsmBackend::relaxAlign to replace the two hooks.
* Tracks padding size using VarContentEnd (content is ignored).
* Move setLinkerRelaxable from MCObjectStreamer::emitCodeAlignment to the backends.

Pull Request: https://github.com/llvm/llvm-project/pull/149465
2025-07-20 00:55:54 -07:00
Fangrui Song
2ba5e0ad17
MC: Encode FT_Align in fragment's variable-size tail
Follow-up to #148544

Pull Request: https://github.com/llvm/llvm-project/pull/149030
2025-07-20 00:46:51 -07:00
tangaac
64a0478e08
[LoongArch] Strengthen stack size estimation for LSX/LASX extension (#146455)
This patch adds an emergency spill slot when ran out of registers.
PR #139201 introduces `vstelm` instructions with only 8-bit imm offset, 
it causes no spill slot to store the spill registers.
2025-07-18 16:12:11 +08:00
ZhaoQi
e74082703e
[LoongArch] Optimize inserting bitcasted integer element or bitcasting extracted fp element (#147043) 2025-07-17 19:21:24 +08:00
ZhaoQi
efa5063ba7
[LoongArch] Optimize inserting element to high part of 256bits vector (#146816) 2025-07-17 17:52:12 +08:00
ZhaoQi
d218011159
[LoongArch] Optimize inserting extracted elements (#146018) 2025-07-17 15:44:49 +08:00
Fangrui Song
dc3a4c0fcf
MC: Restructure MCFragment as a fixed part and a variable tail
Refactor the fragment representation of `push rax; jmp foo; nop; jmp foo`,
previously encoded as
`MCDataFragment(nop); MCRelaxableFragment(jmp foo); MCDataFragment(nop); MCRelaxableFragment(jmp foo)`,

to

```
MCFragment(fixed: push rax, variable: jmp foo)
MCFragment(fixed: nop, variable: jmp foo)
```

Changes:

* Eliminate MCEncodedFragment, moving content and fixup storage to MCFragment.
* The new MCFragment contains a fixed-size content (similar to previous
  MCDataFragment) and an optional variable-size tail.
* The variable-size tail supports FT_Relaxable, FT_LEB, FT_Dwarf, and
  FT_DwarfFrame, with plans to extend to other fragment types.
  dyn_cast/isa should be avoided for the converted fragment subclasses.
* In `setVarFixups`, source fixup offsets are relative to the variable part's start.
  Stored fixup (in `FixupStorage`) offsets are relative to the fixed part's start.
  A lot of code does `getFragmentOffset(Frag) + Fixup.getOffset()`,
  expecting the fixup offset to be relative to the fixed part's start.
* HexagonAsmBackend::fixupNeedsRelaxationAdvanced needs to know the
  associated instruction for a fixup. We have to add a `const MCFragment &` parameter.
* In MCObjectStreamer, extend `absoluteSymbolDiff` to apply to
  FT_Relaxable as otherwise there would be many more FT_DwarfFrame
  fragments in -g compilations.

https://llvm-compile-time-tracker.com/compare.php?from=28e1473e8e523150914e8c7ea50b44fb0d2a8d65&to=778d68ad1d48e7f111ea853dd249912c601bee89&stat=instructions:u

```
stage2-O0-g instructins:u geomeon (-0.07%)
stage1-ReleaseLTO-g (link only) max-rss geomean (-0.39%)
```

```
% /t/clang-old -g -c sqlite3.i -w -mllvm -debug-only=mc-dump &| awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 59675
Align 2215
Data 29700
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
% /t/clang-new -g -c sqlite3.i -w -mllvm -debug-only=mc-dump &| awk '/^[0-9]+/{s[$2]++;tot++} END{print "Total",tot; n=asorti(s, si); for(i=1;i<=n;i++) print si[i],s[si[i]]}'
Total 32287
Align 2215
Data 2312
Dwarf 12044
DwarfCallFrame 4216
Fill 92
LEB 12
Relaxable 11396
```

Pull Request: https://github.com/llvm/llvm-project/pull/148544
2025-07-15 21:56:55 -07:00
Fangrui Song
5ba458c559 MCFixup: Replace getTargetKind with getKind 2025-07-15 00:21:07 -07:00
Fangrui Song
0b674f4c52 MCFixup: Replace getTargetKind with getKind
MCFixupKind is now a type alias (fixup kinds are inherently
target-specific). getTargetKind is no longer necessary.
2025-07-15 00:08:45 -07:00
hev
eb0d61af6e
[LoongArch] Optimize 128-to-256-bit vector insertion and 256-to-128-bit subvector extraction (#146300)
This patch replaces stack-based accesses with register moves when
converting between 128-bit and 256-bit vectors. A 128-bit subvector
extract from, or insert to, the lower half of a 256-bit vector is now
treated as a subregister copy that needs no instruction.

Fixes #147769
2025-07-11 14:32:14 +08:00
Fangrui Song
4aa23ccd14 MCAsmInfo: Explicitly set AllowDollarAtStartOfIdentifier to false for some targets
The default AllowDollarAtStartOfIdentifier will be changed to true to
align better with GNU Assembler where $ is a valid initial identifier
char.
2025-07-08 23:02:51 -07:00
Dominik Steenken
acdf1c7526
[DAG] Add generic expansion for ISD::FCANONICALIZE nodes (#142105)
This PR takes the work previously done by @pawan-nirpal-031 on X86 in
#106370, and makes it available in common code. This should enable all
targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data
types.

Canonicalization is implemented here as multiplication by `1.0`, as
suggested in [the
docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).
2025-07-08 16:12:17 +01:00
Matt Arsenault
d8ef156379
DAG: Remove verifyReturnAddressArgumentIsConstant (#147240)
The intrinsic argument is already marked with immarg so non-constant
values are rejected by the IR verifier.
2025-07-07 16:28:47 +09:00
Fangrui Song
eeba57a860 LoongArch: Remove unused relaxInstruction 2025-07-06 16:56:00 -07:00
Fangrui Song
aec88832df MC: Remove unneeded MCFixupKind casts 2025-07-05 14:43:34 -07:00
Fangrui Song
5a40023497 MCAsmBackend: Reduce FK_NONE uses 2025-07-05 13:22:07 -07:00
Fangrui Song
244e053b6c MC: Remove llvm/MC/MCFixupKindInfo.h
The file used to define `MCFixupKindInfo`, a simple structure,
which is now in MCAsmBackend.h.
2025-07-05 11:24:11 -07:00
Fangrui Song
43397e5fe3 LoongArchMCCodeEmitter: Set PCRel at fixup creation
Avoid reliance on the MCAssembler::evaluateFixup workaround that checks
MCFixupKindInfo::FKF_IsPCRel. Additionally, standardize how fixups are
appended. This helper will facilitate future fixup data structure
optimizations.
2025-07-04 16:18:39 -07:00
Fangrui Song
372752c2dd MCFixup: Remove unused Loc argument
MCFixup::Loc has been removed in favor of MCExpr::Loc through
`const MCExpr *Value` (commit 777391a2164b89d2030ca013562151ca3c3676d1).

While here, change Kind to uint16_t from MCFixupKind. Most fixup kinds
are target-specific.
2025-07-04 12:51:39 -07:00
Fangrui Song
20b3ab5683 MCFixup: Remove unused Loc argument
MCFixup::Loc has been removed in favor of MCExpr::Loc through
`const MCExpr *Value` (commit 777391a2164b89d2030ca013562151ca3c3676d1).
2025-07-04 12:23:04 -07:00
Fangrui Song
dd2891535d
MCAsmBackend: Merge addReloc into applyFixup (#146820)
Follow-up to #141333. Relocation generation called both addReloc and
applyFixup, with the default addReloc invoking shouldForceRelocation,
resulting in three virtual calls. This approach was also inflexible, as
targets needing additional data required extending
`shouldForceRelocation` (see #73721, resolved by #141311).

This change integrates relocation handling into applyFixup, eliminating
two virtual calls. The prior default addReloc is renamed to
maybeAddReloc. Targets overriding addReloc now call their customized
addReloc implementation.
2025-07-02 23:14:11 -07:00
Fangrui Song
9beb467d92
MC: Store fragment content and fixups out-of-line
Moved `Contents` and `Fixups` SmallVector storage to MCSection, enabling
trivial destructors for most fragment subclasses and eliminating the need
for MCFragment::destroy in ~MCSection.

For appending content to the current section, use
getContentsForAppending. During assembler relaxation, prefer
setContents/setFixups, which may involve copying and reduce the benefits
of https://reviews.llvm.org/D145791.

Moving only Contents out-of-line caused a slight performance regression
(Alexis Engelke's 2024 prototype). By also moving Fragments out-of-line,
fragment destructors become trivial, resulting in
neglgible instructions:u increase for "stage2-O0-g" and [large max-rss decrease](https://llvm-compile-time-tracker.com/compare.php?from=84e82746c3ff63ec23a8b85e9efd4f7fccf92590&to=555a28c0b2f8250a9cf86fd267a04b0460283e15&stat=max-rss&linkStats=on)
for the "stage1-ReleaseLTO-g (link only)" benchmark.
(
An older version using fewer inline functions: https://llvm-compile-time-tracker.com/compare.php?from=bb982e733cfcda7e4cfb0583544f68af65211ed1&to=f12d55f97c47717d438951ecddecf8ebd28c296b&linkStats=on
)

Now using plain SmallVector in MCSection for storage, with potential for
future allocator optimizations, such as allocating `Contents` as the
trailing object of MCDataFragment. (GNU Assembler uses gnulib's obstack
for fragment management.)

Co-authored-by: Alexis Engelke <engelke@in.tum.de>

Pull Request: https://github.com/llvm/llvm-project/pull/146307
2025-07-01 00:21:12 -07:00
Fangrui Song
2de51345fb MCFragment: Add addFixup to replace getFixups().push_back()
to not expose SmallVector to the callers. We will make fixup storage out
of line.
2025-06-29 16:26:00 -07:00
Fangrui Song
e6b25288eb MCExpr: Migrate away from operator<<
Printing an expression is error-prone without a MCAsmInfo argument.
Remove the operator<< overload and replace callers with
MCAsmInfo::printExpr. Some callers are changed to MCExpr::print, with
the goal of eventually making it private.
2025-06-28 14:41:58 -07:00
Fangrui Song
e878b7e349 MCParsedAsmOperand::print: Add MCAsmInfo parameter
so that subclasses can provide the appropriate MCAsmInfo to print
MCExpr objects.

At present, llvm/utils/TableGen/AsmMatcherEmitter.cpp constucts a
generic MCAsmInfo.
2025-06-28 12:05:33 -07:00