After commit 18c5f3c3 ("[RegisterScavenger][RISCV] Don't search for
FrameSetup instrs if we were searching from Non-FrameSetup instrs"), we
can revert the `sp` adjustment 4e2364a2 ("[LoongArch] Add emergency
spill slot for GPR for large frames") to generate better code, as the
issue with `RegScavenger` has been resolved.
Fixes#88109
The `MSB` must not be greater than `GRLen`. Without this patch, newly
added test cases will crash with LoongArch32, resulting in a 'cannot
select' error.
The SelectionDAG scheduling preference now becomes source order
scheduling (machine scheduler generates better code -- even without
there being a machine model defined for LoongArch yet).
Most of the test changes are trivial instruction reorderings and
differing register allocations, without any obvious performance impact.
This is similar to commit: 3d0fbafd0bce43bb9106230a45d1130f7a40e5ec
This patch aims to solve Firefox issue:
https://bugzilla.mozilla.org/show_bug.cgi?id=1882301
Similar to 616289ed2922. Currently LoongArch uses an ll.[wd]/sc.[wd]
loop for ATOMIC_CMP_XCHG. Because the comparison in the loop is
full-width (i.e. the `bne` instruction), we must sign extend the input
comparsion argument.
Note that LoongArch ISA manual V1.1 has introduced compare-and-swap
instructions. We would change the implementation (return `ANY_EXTEND`)
when we support them.
For instruction xvpermi.q, only [1:0] and [5:4] bits of operands[3] are
used. The unused bits in operands[3] need to be set to 0 to avoid
causing undefined behavior.
This commit updates the pattern matching logic for the `AddLike`
predicate in `LoongArchInstrInfo.td` to use the
`isBaseWithConstantOffset` function provided by `CurDAG`. This
optimization aims to improve the efficiency of pattern matching by
identifying cases where the operation can be represented as a base
address plus a constant offset, which can lead to more efficient code
generation.
When we do not enable vector features, we should return the default
value (`TargetTransformInfoImplBase::getRegisterBitWidth`) instead of
zero.
This should fix the LoongArch [buildbot
breakage](https://lab.llvm.org/staging/#/builders/5/builds/486) from
#78943.
Refer to RISCV, we will fix up the alignment if linker relaxation
changes code size and breaks alignment. Insert enough Nops and emit
R_LARCH_ALIGN relocation type so that linker could satisfy the alignment
by removing Nops.
It does so only in sections with the SHF_EXECINSTR flag.
In LoongArch psABI v2.30, R_LARCH_ALIGN requires symbol index. The
lowest 8 bits of addend represent alignment and the other bits of addend
represent the maximum number of bytes to emit.
With enough codegen complete, we can now correctly report the size of
vector registers for LSX/LASX, allowing auto vectorization (The
`auto-vec` feature needs to be enabled simultaneously).
As described, the `auto-vec` feature is an experimental one. To ensure
that automatic vectorization is not enabled by default, because the
information provided by the current `TTI` cannot yield additional
benefits for automatic vectorization.
LoongArch V1.1 instrucions include floating-point approximate reciprocal
instructions and atomic instrucions. And add testcases for these
instrucions meanwhile.
When linker-relaxation is enabled, part of the label diff in dwarfinfo
cannot be computed before static link. Refer to RISCV, we add the
relaxDwarfLineAddr and relaxDwarfCFA to add relocations for these label
diffs. Calculate whether the label diff is mutable. For immutable label
diff, return false and do the other works by its parent function.
This patch fixes the crash issue in the test:
CodeGen/LoongArch/can-not-realign-stack.ll
Register allocator may spill virtual registers to the stack, which
introduces stack alignment requirements (when the size of spilled
registers exceeds the default alignment size of the stack). If a
function does not have stack alignment requirements before register
allocation, registers used for stack alignment will not be preserved.
Therefore, we should implement `canRealignStack()` to inform the
register allocator whether it is allowed to perform stack realignment
operations.
This follows on from #76708, allowing
`cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just
`N->getAsZextVal();`
Introduced via `git grep -l "cast<ConstantSDNode>\(.*\).*getZExtValue" |
xargs sed -E -i
's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and
then using `git clang-format` on the result.
1, Follow RISCV 1df5ea29 to support generates relocs for .uleb128 which
can not be folded. Unlike RISCV, the located content of LoongArch should
be zero. LoongArch fixup uleb128 value by in-place addition and
subtraction reloc types named R_LARCH_{ADD,SUB}_ULEB128. The located
content can affect the result and R_LARCH_ADD_ULEB128 has enough info to
represent the first symbol value, so it needs to be set to zero.
2, Force relocs if sym is not in section so that it can emit relocs for
external symbol.
Fixes:
https://github.com/llvm/llvm-project/pull/72960#issuecomment-1866844679
This patch gets the code model from global variable attribute if it has,
otherwise the target's will be used.
---------
Signed-off-by: WANG Rui <wangrui@loongson.cn>
According to the description of the psABI v2.30:
https://github.com/loongson/la-abi-specs/releases/tag/v2.30, moved the
expansion of relevant pseudo-instructions from
`LoongArchPreRAExpandPseudo` pass to `LoongArchExpandPseudo` pass, to
ensure that the code sequences of `PseudoLA*_LARGE` instructions and
Medium code model's function call are not scheduled.
According to the description of the psABI v2.20:
https://github.com/loongson/la-abi-specs/releases/tag/v2.20, adjustments
are made to the function call instructions under the medium code model.
At the same time, AsmParser has already supported parsing the call36 and
tail36 macro instructions.
This helper function shortens examples like
`cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to
`Node->getConstantOperandVal(1);`.
Implemented with:
`git grep -l
"cast<ConstantSDNode>\(.*->getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)->getOperand\((.*)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/`
and `git grep -l
"cast<ConstantSDNode>\(.*\.getOperand\(.*\)\)->getZExtValue\(\)" | xargs
sed -E -i
's/cast<ConstantSDNode>\((.*)\.getOperand\((.*)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`.
With a couple of simple manual fixes needed. Result then processed by
`git clang-format`.
Emit relax relocs when expand non-large la.pcrel and non-large la.got on
llvm-mc stage, which like what does on GAS.
1, la.pcrel -> PCALA_HI20 + RELAX + PCALA_LO12 + RELAX
2, la.got -> GOT_PC_HI20 + RELAX + GOT_PC_LO12 + RELAX
Refer to RISCV [1], LoongArch also need delayed decision for ADD/SUB
relocations. In handleAddSubRelocations, just return directly if SecA !=
SecB, handleFixup usually will finish the the rest of creating PCRel
relocations works. Otherwise we emit relocs depends on whether
relaxation is enabled. If not, we return true and avoid record ADD/SUB
relocations.
Now the two symbols separated by alignment directive will return without
folding symbol offset in AttemptToFoldSymbolOffsetDifference, which has
the same effect when relaxation is enabled.
[1] https://reviews.llvm.org/D155357
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
By default, `isShuffleMaskLegal` always returns true, which can result
in the expansion of `BUILD_VECTOR` into a `VECTOR_SHUFFLE` node in
certain situations. Subsequently, the `VECTOR_SHUFFLE` node is expanded
again into a `BUILD_VECTOR`, leading to an infinite loop.
To address this, we always return false, allowing the expansion of
`BUILD_VECTOR` through the stack.