143 Commits

Author SHA1 Message Date
gonglingqin
2c174a53d5 [LoongArch] Move illegal ImmArg tests to llvm/test/Verifier
This patch also fixes incorrect function declarations in test cases
and remove -disable-verify from the test case.

Fix https://github.com/llvm/llvm-project/issues/59839
2023-01-07 12:10:13 +08:00
Xiaodong Liu
9e06d18c80 [LoongArch] Add intrinsics for CACOP instruction
The CACOP instruction is mainly used for cache initialization
and cache-consistency maintenance.

Depends on D140872

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D140527
2023-01-06 11:41:35 +08:00
Xiaodong Liu
8d798eab16 [LoongArch] Add "32bit" target feature
There are a few intrinsics or instructions on LoongArch
that are only appropriate for loongarch32 target. So the
feature "32bit" is added to implement it.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D140872
2023-01-06 09:28:55 +08:00
Xiaodong Liu
63d46869ea [LoongArch] Add intrinsics for MOVFCSR2GR and MOVGR2FCSR instructions
Instruction formats:
`movgr2fcsr fcsr, rj`
`movfcsr2gr rd, fcsr`
MOVGR2FCSR modifies the value of the software writable field
corresponding to the FCSR (floating-point control and status
register) `fcsr` according to the value of the lower 32 bits of
the GR (general purpose register) `rj`.
MOVFCSR2GR sign extends the 32-bit value of the FCSR `fcsr`
and writes it into the GR `rd`.

Add "i32 @llvm.loongarch.movfcsr2gr(i32)" intrinsic for MOVFCSR2GR
instruction. The argument is FCSR register number. The return value
is the value in the FCSR.
Add "void @llvm.loongarch.movgr2fcsr(i32, i32)" intrinsic for MOVGR2FCSR
instruction. The first argument is the FCSR number, the second argument
is the value in GR.

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D140685
2023-01-04 14:11:30 +08:00
WANG Xuerui
77fad4c31e [LoongArch][test] Regenerate checks for the ghc-cc.ll test case
Seems the codegen was stale (the extra `ret` after the tail call should
not be there, and indeed it is not emitted by the current code), plus
the whitespaces are different from the update_llc_test_checks.py style.
Simply regenerate it to fix the test failure.

Differential Revision: https://reviews.llvm.org/D140670
2022-12-26 20:15:56 +08:00
Lin Runze
389079c87c [LoongArch] Add GHC Calling Convention
This is modeled after [[ https://reviews.llvm.org/D89788 | the RISCV GHC calling convention]]
and matches [[ https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9292 | the corresponding GHC change ]].

Reviewed By: xen0n, wangleiat

Differential Revision: https://reviews.llvm.org/D137495
2022-12-26 19:36:41 +08:00
gonglingqin
aeb8f911b1 [Clang][LoongArch] Add intrinsic for asrtle, asrtgt, lddir, ldpte and cpucfg
`__cpucfg` is required by Linux [1].

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/loongarch/include/asm/loongarch.h#n59

Differential Revision: https://reviews.llvm.org/D139915
2022-12-22 18:56:34 +08:00
gonglingqin
9aa5de9746 [LoongArch] Break MUL into SLLI and SUB or ADD
Further, after MUL is decomposed, use ALSL instead of SLLI and ADD

Differential Revision: https://reviews.llvm.org/D140282
2022-12-20 19:39:07 +08:00
Weining Lu
ba9ed24b03 [LoongArch] Add tests showing the optimization pipeline
Other targets like ARM, AArch64, RISCV and X86 have similar tests.

`O1`, `O2` and `O3` appear to be the same for now. But in future, some
passes may be disabled at lower levels (e.g. `O1`). Hoping we can use
FileCheck prefixes for differences to avoid repeating the contents 3
times.

Reviewed By: xen0n, MaskRay

Differential Revision: https://reviews.llvm.org/D139499
2022-12-16 18:09:50 +08:00
wanglei
899226adac [LoongArch] Add custom parser for atomic instructions' memory operand
In order to be compatible with the form of the atomic instruction in
GAS that accepts the fourth operand as 0 (i.e. `am* $rd, $rk, $rj, 0`),
we need to treat `$rj, 0` as one operand, but only print `$rj`.

For this, the number of result operands of inline assembly memory
operand `ZB` constraint is modified to 2 (reg + 0).

Restrictions on register usage in `am*` instructions have also been
adjusted. When `$rd` is equal to `$r0`, the instruction must be
considered legal, because of some special usage like `PseudoUNIMP`.

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D139303
2022-12-13 11:46:53 +08:00
gonglingqin
048612050a [Clang][LoongArch] Add intrinsic for iocsrrd and iocsrwr
These intrinsics are required by Linux [1].

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/loongarch/include/asm/loongarch.h#n240

Differential Revision: https://reviews.llvm.org/D139612
2022-12-10 14:05:19 +08:00
gonglingqin
685bbe65f5 [Clang][LoongArch] Add intrinsic for csrrd, csrwr and csrxchg
These intrinsics are required by Linux [1].

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/loongarch/include/asm/loongarch.h?h=v6.0&id=4fe89d07dcc2804c8b562f6c7896a45643d34b2f#n232

Differential Revision: https://reviews.llvm.org/D139288
2022-12-08 14:11:50 +08:00
Fangrui Song
858266d308 [LoongArch] Fix tests after comment change 2022-12-01 22:08:21 +00:00
gonglingqin
624401612c [LoongArch] Add remaining intrinsics for CRC check instructions
After D137316 implements the intrinsics of the first crc check instruction
and related diagnosis, this patch implements the intrinsics of all remaining
crc check instructions.

Differential Revision: https://reviews.llvm.org/D138418
2022-12-01 09:40:50 +08:00
gonglingqin
5f9b4d8bad [LoongArch] Add codegen support for atomicrmw min/max operation on LA64
This patch is required by OpenMP. After applying this patch, OpenMP regression
test passed. To reduce review difficulty caused by too large patches,
atomicrmw min/max operations on LA32 will be added later.

Differential Revision: https://reviews.llvm.org/D138177
2022-11-30 17:45:18 +08:00
gonglingqin
a2d10bda18 [LoongArch] Add atomic ordering information for binary atomic operations
This patch also implements not emit fence in atomic binary operation
when AtomicOrdering is monotonic and fixes the issue of loading from
non ptr parameters.

The processing of other levels of AtomicOrdering will be added later.

Differential Revision: https://reviews.llvm.org/D138481
2022-11-28 19:51:20 +08:00
gonglingqin
eca62f9204 [LoongArch] Diagnose the behavior of reading and writing registers that do not conform to the hardware register size
When reading or writing a register that does not conform to the size of a
hardware register, an error message is generated instead of a compiler crash.

Differential Revision: https://reviews.llvm.org/D138008
2022-11-24 10:38:44 +08:00
gonglingqin
801c77bbfa [LoongArch] Support when the depth of __builtin_frame_address is greater than zero
As discussed in D137541, it supports processing when the depth of
__builtin_frame_address is greater than 0 instead of reporting an error.
Unsafe calls rely on the '-Wframe-address' option for diagnosis.

Differential Revision: https://reviews.llvm.org/D138084
2022-11-22 15:29:29 +08:00
gonglingqin
f6d411f557 [LoongArch] Make function name in error message consistent with the user input. NFC 2022-11-21 10:54:12 +08:00
gonglingqin
c2ec455f18 [LoongArch] Add intrinsics for ibar, break and syscall
Diagnostics for intrinsic input parameters have also been added.

Differential Revision: https://reviews.llvm.org/D138094
2022-11-21 09:31:26 +08:00
wanglei
6971c1b370 [LoongArch] Add support for tail call optimization
This patch adds tail call support to the LoongArch backend.  When
appropriate, use the `b` or `jr` instruction for tail calls (the
`pcalau12i+jirl` instruction pair when use medium codemodel).

This patch also modifies the inappropriate operand name:
simm26_bl -> simm26_symbol

This has been modeled after RISCV's tail call opt.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137889
2022-11-19 17:36:06 +08:00
wanglei
0e4378c55e [LoongArch] Add emergency spill slot for CFR spill/reload
When all registers have been allocated and CFR needs to be saved on the
stack, an emergency spill slot is required. Because CFR's spill and
reload require a general purpose register to transfer.

The attached test case was bugpoint-reduced down from
`MultiSource/Benchmarks/mafft/Lalignmm.c` in the test-suite.
Without this patch, llc will crash and report the following errors:

```
LLVM ERROR: Error while trying to spill R4 from class GPR: Cannot scavenge register without an emergency spill slot!
```

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D138007
2022-11-19 14:35:31 +08:00
gonglingqin
825547247a [LoongArch] Eliminate extra un-accounted-for successors
Specifically:
```
*** Bad machine code: MBB has unexpected successors which are not branch targets, fallthrough, EHPads, or inlineasm_br targets. ***
- function:    atomicrmw_umax_i8_acquire
- basic block: %bb.3  (0x1b90bd8)

*** Bad machine code: Non-terminator instruction after the first terminator ***
- function:    atomicrmw_umax_i8_acquire
- basic block: %bb.3  (0x1b90bd8)
- instruction: DBAR 1792
```

Differential Revision: https://reviews.llvm.org/D137884
2022-11-17 09:44:59 +08:00
wanglei
7da2d69da6 [LoongArch] Transfer MI flags when expand PseudoCALL
When expanding a PseudoCALL, the corresponding flags (e.g. nomerge)
need to be passed to the new instruction.

This patch also adds test for the nomerge attribute.

The `nomerge` attribute was added during `LowerCall`, but was lost
during expand PseudoCALL. Now add it back.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137888
2022-11-17 09:25:10 +08:00
gonglingqin
ddbb21bdb5 [LoongArch] Add immediate operand validity check for __builtin_loongarch_dbar
Differential Revision: https://reviews.llvm.org/D137809
2022-11-16 14:47:45 +08:00
gonglingqin
b97911c37b [LoongArch] Implement the TargetLowering::getRegisterByName hook
Only reserved registers can be read.

Differential Revision: https://reviews.llvm.org/D137532
2022-11-15 16:10:32 +08:00
Xiaodong Liu
03d07e181d [LoongArch] Handle register spill in BranchRelaxation pass
When the range of the unconditional branch is overflow, the indirect
branch way is used. The case when there is no scavenged register for
indirect branch needs to spill register to stack.

Reviewed By: SixWeining, wangleiat

Differential Revision: https://reviews.llvm.org/D137821
2022-11-15 09:55:40 +08:00
gonglingqin
19ae5391e3 [LoongArch] Expand atomicrmw fadd/fsub/fmin/fmax with CmpXChg
Differential Revision: https://reviews.llvm.org/D137311
2022-11-14 10:11:37 +08:00
wanglei
882ddab4c0 [LoongArch] Generate PCALAU12I + JIRL instruction pair for medium codemodel
In LoongArch, when `CodeModel=Medium`, it just increases the jumping
ability of function calls relative to PC, from 2^28 to 2^32.

Depends on D137393

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137394
2022-11-11 18:26:51 +08:00
wanglei
4edcf8ab9f [LoongArch] Moved expansion of PseudoCALL to LoongArchPreRAExpandPseudo pass
This patch moves the expansion of the `PseudoCALL` insturction to
`LoongArchPreRAExpandPseudo` pass. This helps to expand into different
instruction sequences according to different CodeModels.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137393
2022-11-11 18:04:56 +08:00
gonglingqin
da34aff90d [Clang][LoongArch] Implement __builtin_loongarch_crc_w_d_w builtin and add diagnostics
This patch adds support to prevent __builtin_loongarch_crc_w_d_w from compiling
on loongarch32 in the front end and adds diagnostics accordingly.

Reference: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/loongarch/larchintrin.h#L175-L184

Depends on D136906

Differential Revision: https://reviews.llvm.org/D137316
2022-11-11 09:16:57 +08:00
wanglei
d20c54cbdb [LoongArch] Override TargetFrameLowering::spillCalleeSavedRegisters
When using `llvm.returnaddress` intrinsic, special handling is required
for the spill of the `RA` register. Otherwise it will cause the verifier
fail in some cases (e.g. pr17377.c of the GCC C Torture Suite).

Specifically:
```
*** Bad machine code: Using an undefined physical register ***
- function:    f
- basic block: %bb.0 entry (0xd94d18)
- instruction: ST_D killed $r1, $r22, -40 :: (store (s64) into %stack.2)
- operand 0:   killed $r1
```

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137387
2022-11-10 21:14:27 +08:00
wanglei
0436cf5f52 [LoongArch] Support parsing target specific flags for MIR
These hooks ensure that the LoongArch backend can serialize and parse
MIR correctly.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D137482
2022-11-10 20:53:20 +08:00
wanglei
7d5c8cb023 [LoongArch] Added spill/reload/copy support for CFRs
1, spill/reload
When a function call is made immediately after a floating point
comparison, the result of the comparison needs to be spilled before
function call and reloaded after the function returns.

2, copy
Support `GPR` to `CFR` and `CFR` to `GRP` copys. Therefore, the correct
register class can be used in the pattern template, and the hard-coding
of mutual coping of `CFR` and `GRP` is eliminated, reducing redundant
comparison instructions.

Note: Since the `COPY` instruction between CFRs is not provided in
LoongArch, we only use `$fcc0` in the register allocation.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D137004
2022-11-10 20:12:18 +08:00
gonglingqin
85f08c4197 [Clang][LoongArch] Implement __builtin_loongarch_dbar builtin
Differential Revision: https://reviews.llvm.org/D136906
2022-11-10 17:27:44 +08:00
gonglingqin
71ec751886 [LoongArch] Fix atomic store pointer operand sequence error
Differential Revision: https://reviews.llvm.org/D137687
2022-11-10 17:03:06 +08:00
Xiaodong Liu
57ad3f1dc6 [LoongArch] Add support for the BranchRelaxation pass
When the branch target is out of the range represented by the current
branch instruction's immediate, branch relaxation is required. There
are three types of immediate for branch instructions on LoongArch,
including simm16, simm21 and simm26. And the real branch target
address is PC + sext(simmXX << 2). In addition, the indirect branch
way is implemented to support larger branch target.

BranchRelaxation pass calls `RenumberBlocks` to renumber all of the
machine basic blocks in the function. So the machine basic blocks
number changed in some test cases.

Differential Revision: https://reviews.llvm.org/D137233
2022-11-08 19:26:16 +08:00
wanglei
5adb090795 [LoongArch] Fix codegen for [su]itofp instructions
This patch fixes codegen for `[su]itofp` instructions.

In LoongArch, a legal int-to-float conversion is done in two steps:

1. Move the data from `GPR` to `FPR`. (FRLen >= GRLen)
2. Conversion in `FPR`. (the data in `FPR` is treated as a signed value)

Based on the above features, when the type's BitWidth meets the
requirements, all `SINT_TO_FP` are legal, all `UINT_TO_FP` are expand
and lowered to libcall when appropriate.
The only special case is, LoongArch64 with `+f,-d` features. At this
point, custom processing is required for `[SU]INT_TO_FP`. Of course, we
can also ignore it and use libcall directly.

Differential Revision: https://reviews.llvm.org/D136916
2022-11-03 11:40:50 +08:00
Weining Lu
e415cb1d61 [LoongArch] Support inline asm operand modifier 'z'
Print $zero register if operand is zero, otherwise print it normally.

Clang is highly compatible [1] with GCC inline assembly extensions,
allowing the same set of constraints, modifiers and operands as GCC
inline assembly. This patch tries to make it compatible regarding
LoongArch specific operand modifiers.

GCC supports many modifiers [2], but it seems that only x86 and msp430
are documented [3][4]. I don't know if any other modifiers are being
used except the 'z' in Linux [5].

[1]: https://clang.llvm.org/compatibility.html#inline-asm
[2]: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/loongarch/loongarch.cc#L4884-L4911
[3]: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#x86Operandmodifiers
[4]: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#msp430Operandmodifiers
[5]: https://github.com/torvalds/linux/blob/master/arch/loongarch/include/asm/cmpxchg.h#L17

Differential Revision: https://reviews.llvm.org/D136841
2022-10-31 09:56:41 +08:00
gonglingqin
be8a2b98da [LoongArch] Replace assertion by error message while lowering RETURNADDR and FRAMEADDR
If `__builtin_frame_address` or `__builtin_return_address` is invoked with
non-zero argument, show an error message instead of a crash.

Reference: https://reviews.llvm.org/rG83b88441ad951fe99c30402930ef3cd661f2fd2b

Differential Revision: https://reviews.llvm.org/D136917
2022-10-31 09:19:00 +08:00
Weining Lu
cd0174aacb [Clang][LoongArch] Support inline asm constraint 'J'
'J' is defined in GCC [1] but not documented [2] while Linux [3] has
already used it in LoongArch port.

[1]: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/loongarch/constraints.md#L61
[2]: https://gcc.gnu.org/onlinedocs/gccint/Machine-Constraints.html
[3]: https://github.com/torvalds/linux/blob/master/arch/loongarch/include/asm/cmpxchg.h#L19

Differential Revision: https://reviews.llvm.org/D136835
2022-10-31 09:13:52 +08:00
wanglei
4e2364a285 [LoongArch] Add emergency spill slot for GPR for large frames
An emergency spill slot is created when the stack size cannot be
represented by an 11-bit signed number.

This patch also modifies how the `sp` is adjusted in the prologue.

`RegScavenger` will place the spill instruction before the prologue
if a VReg is created in the prologue. This will pollute the caller's
stack data. Therefore, until there is better way, we just use the
`addi.w/d` instruction for stack adjustment to ensure that VReg will
not be created. (RISCV has the same issue #58286)

Due to the addition of emergency spill slot, some test cases that use
exact stacksize need to be updated.

Differential Revision: https://reviews.llvm.org/D135757
2022-10-28 17:51:53 +08:00
wanglei
f589e5067f [LoongArch] Split SP adjustment
This patch split the SP adjustment to reduce the instructions in
prologue and epilogue. In this way, the offset of the callee saved
register could fit in a single store.

Similar to D68011(RISCV).

Differential Revision: https://reviews.llvm.org/D136222
2022-10-28 16:39:00 +08:00
gonglingqin
bba97c3c03 [LoongArch] Add codegen support for cmpxchg on LA64
Differential Revision: https://reviews.llvm.org/D135948
2022-10-27 21:18:17 +08:00
gonglingqin
3be059377e [LoongArch] Add support for ISD::FRAMEADDR and ISD::RETURNADDR
For now, only support lowering frame/return address for current frame.

Differential Revision: https://reviews.llvm.org/D136215
2022-10-24 15:12:28 +08:00
wanglei
daf067da04 [LoongArch] Stack realignment support
This patch adds support for stack realignment while adding support for
variable sized objects.

Differential Revision: https://reviews.llvm.org/D136074
2022-10-21 17:30:29 +08:00
wanglei
6790292062 [LoongArch] Modify ParserMethod for the simm26_b operand type
Modify the ParserMethod of `simm26_b` operand type to `parseImmediate`.

Before that, for the `simm26_b` operand type, the same ParserMethod
was used as `simm26_bl`. When using the internal assembler to process
the blockaddress with `asm` instruction, the wrong blockaddress symbol
would be generated due to the call to the `getOrCreateSymbol()`
interface.

Differential Revision: https://reviews.llvm.org/D136073
2022-10-21 17:01:46 +08:00
gonglingqin
0f4dc562bc [LoongArch] Fix 32-bit and 64-bit atomicrmw nand operand order errors
Differential Revision: https://reviews.llvm.org/D136220
2022-10-20 17:26:33 +08:00
Weining Lu
771aee91c8 Reland "[LoongArch] Fix codegen of atomicrmw nand"
Fix invalid RISCV-like MI being emitted for performing the `not`
operation: the LoongArch `xori` zero-extends the immediate, hence is
not equivalent to RISCV `xori`. The LoongArch `not` is a `nor` with
zero.

Patch by lrzlin (Lin Runze).

Differential Revision: https://reviews.llvm.org/D136021
2022-10-19 10:05:35 +08:00
Weining Lu
0f374ca5cd Revert "[LoongArch] Fix codegen of atomicrmw nand"
This reverts commit 9572406bbcb497f8c23c28daa762b55ee3219f41.

The author name is wrong.
2022-10-19 07:56:23 +08:00