262 Commits

Author SHA1 Message Date
wanglei
af999c4be9
[LoongArch] Add codegen support for [X]VF{MSUB/NMADD/NMSUB}.{S/D} instructions (#74819)
This is similar to the scalar single- and double-precision floating-point
instructions.
2023-12-11 10:37:22 +08:00
wanglei
cdc3732566 [LoongArch] Mark ISD::FNEG as legal 2023-12-08 15:07:58 +08:00
wanglei
9f70e708a7
[LoongArch] Make ISD::FSQRT a legal operation with lsx/lasx feature (#74795)
And add some patterns:
1. (fdiv 1.0, vector)
2. (fdiv 1.0, (fsqrt vector))
2023-12-08 14:16:26 +08:00
wanglei
9ff7d0ebeb
[LoongArch] Add codegen support for icmp/fcmp with lsx/lasx features (#74700)
Mark the ISD::SETCC node as legal, and add handling for vector-type
condition codes.
2023-12-07 20:11:43 +08:00
wanglei
de21308f78 [LoongArch] Make ISD::VSELECT a legal operation with lsx/lasx 2023-12-06 16:43:38 +08:00
wanglei
e9cd197d15 [LoongArch] Support MULHS/MULHU with lsx/lasx
Mark MULHS/MULHU nodes as legal and add the necessary patterns.
2023-12-04 10:58:05 +08:00
wanglei
a60a5421b6 Reland "[LoongArch] Support CTLZ with lsx/lasx"
This patch simultaneously adds tests for `CTPOP`.

This relands 07cec73dcd095035257eec1f213d273b10988130 with fixed tests.
2023-12-02 17:22:40 +08:00
wanglei
63e6bba0c3 Revert "[LoongArch] Support CTLZ with lsx/lasx"
This reverts commit 07cec73dcd095035257eec1f213d273b10988130.
2023-12-02 17:17:48 +08:00
wanglei
07cec73dcd [LoongArch] Support CTLZ with lsx/lasx
This patch simultaneously adds tests for `CTPOP`.
2023-12-02 17:13:36 +08:00
wanglei
66a3e4fafb [LoongArch] Override TargetLowering::isShuffleMaskLegal
By default, `isShuffleMaskLegal` always returns true, which can result
 in the expansion of `BUILD_VECTOR` into a `VECTOR_SHUFFLE` node in
 certain situations. Subsequently, the `VECTOR_SHUFFLE` node is expanded
 again into a `BUILD_VECTOR`, leading to an infinite loop.
 To address this, we always return false, allowing the expansion of
 `BUILD_VECTOR` through the stack.
2023-12-02 14:25:17 +08:00
leecheechen
dbbc7c31c8
[LoongArch] Add some binary IR instructions testcases for LASX (#74031)
The IR instructions include:
- Binary Operations: add fadd sub fsub mul fmul udiv sdiv fdiv
- Bitwise Binary Operations: shl lshr ashr
2023-12-01 13:14:11 +08:00
wanglei
ca66df3b02 [LoongArch] Add more and/or/xor patterns for vector types 2023-12-01 10:28:41 +08:00
wanglei
add224c0a0 [LoongArch] Custom lowering ISD::BUILD_VECTOR 2023-12-01 09:13:39 +08:00
wanglei
f2cbd1fdf7 [LoongArch] Add codegen support for insertelement 2023-12-01 09:13:39 +08:00
leecheechen
29a0f3ec2b
[LoongArch] Add some binary IR instructions testcases for LSX (#73929)
The IR instructions include:
- Binary Operations: add fadd sub fsub mul fmul udiv sdiv fdiv
- Bitwise Binary Operations: shl lshr ashr
2023-11-30 21:41:18 +08:00
wanglei
b72456120f
[LoongArch] Add codegen support for extractelement (#73759)
Add codegen support for extractelement when the `lsx` or `lasx`
feature is enabled.
2023-11-30 17:29:18 +08:00
wanglei
5e7e0d6032
[LoongArch] Fix pattern for FNMSUB_{S/D} instructions (#73742)
```
when a=c=-0.0, b=0.0:
-(a * b + (-c)) = -0.0
-a * b + c = 0.0
(fneg (fma a, b, (-c))) != (fma (fneg a), b, c)
```

See https://reviews.llvm.org/D90901 for a similar discussion on X86.
2023-11-29 15:21:21 +08:00
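The signed-zero counterexample in this commit can be reproduced on any IEEE 754 host. The following is a minimal sketch, assuming the default round-to-nearest mode and no fast-math flags; the function names are hypothetical and only mirror the two expressions from the commit message.

```cpp
#include <cassert>
#include <cmath>

// First expression from the commit message: -(a * b + (-c)).
double fneg_of_fma(double a, double b, double c) {
    return -std::fma(a, b, -c);
}

// Second expression from the commit message: (-a) * b + c.
double fma_of_fneg(double a, double b, double c) {
    return std::fma(-a, b, c);
}

// Reproduces the case a = c = -0.0, b = 0.0.
void demo_fnmsub_signed_zero() {
    const double a = -0.0, b = 0.0, c = -0.0;
    assert(std::signbit(fneg_of_fma(a, b, c)));   // result is -0.0
    assert(!std::signbit(fma_of_fneg(a, b, c)));  // result is +0.0
}
```

The two results compare equal with `==`, so the difference only shows up through `std::signbit` or operations such as division by the result.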
hev
0d9f557b6c
[LoongArch] Disable mulodi4 and muloti4 libcalls (#73199)
These library functions exist only in compiler-rt, not in libgcc, so
calls to them would fail to link unless compiler-rt is linked in.

Fixes https://github.com/ClangBuiltLinux/linux/issues/1958
2023-11-23 19:34:50 +08:00
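The helpers named in this commit back overflow-checked signed multiplication. As a hedged illustration, the sketch below uses the GCC/Clang builtin `__builtin_mul_overflow`, a common source-level way to request that operation. The wrapper name `mul_overflows` is hypothetical, and whether a particular target expands the operation inline or through a libcall depends on the operand width and the backend.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true when a * b does not fit in int64_t.
// The (possibly wrapped) product is stored in out either way.
bool mul_overflows(std::int64_t a, std::int64_t b, std::int64_t &out) {
    return __builtin_mul_overflow(a, b, &out);
}

void demo_mul_overflow() {
    std::int64_t r = 0;
    assert(!mul_overflows(3, 4, r) && r == 12);
    assert(mul_overflows(std::numeric_limits<std::int64_t>::max(), 2, r));
}
```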
hev
7414c0db96
[LoongArch] Precommit a test for smul with overflow (NFC) (#73212) 2023-11-23 15:15:26 +08:00
ZhaoQi
775d2f3201
[LoongArch][MC] Support to get the FixupKind for BL (#72938)
Previously, BOLT could not get the FixupKind for BL correctly because it
could not get the target-flags for BL. This patch adds support in MCCodeEmitter.

Fixes https://github.com/llvm/llvm-project/pull/72826.
2023-11-21 19:00:29 +08:00
ZhaoQi
2ca028ce7c
[LoongArch][MC] Pre-commit tests for instr bl fixupkind testing (#72826)
This patch tests whether the FixupKind for BL is returned correctly.
When BL has target-flags(loongarch-call), there is no error, but without
this flag an assertion fails. So the test is tagged as "Expectedly
Failed" until the following patch fixes it.
2023-11-21 08:34:52 +08:00
Lu Weining
78abc45c44
[LoongArch] Improve codegen for atomic cmpxchg ops (#69339)
PR #67391 improved atomic codegen by handling memory ordering specified
by the `cmpxchg` instruction. An acquire barrier needs to be generated
when memory ordering includes an acquire operation. This PR improves the
codegen further by only handling the failure ordering.
2023-10-19 09:21:51 +08:00
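At the source level, the failure ordering discussed here is the second ordering argument of a compare-exchange. A minimal sketch with hypothetical wrapper names follows; which barriers a LoongArch build emits for each variant is not shown and is only described by the commit message.

```cpp
#include <atomic>
#include <cassert>

// Failure ordering is acquire: per the commit message, this is the case
// in which an acquire barrier is needed when the exchange fails.
bool cas_acquire_on_failure(std::atomic<int> &v, int &expected, int desired) {
    return v.compare_exchange_strong(expected, desired,
                                     std::memory_order_acq_rel,
                                     std::memory_order_acquire);
}

// Failure ordering is relaxed: no acquire guarantee is requested on failure.
bool cas_relaxed_on_failure(std::atomic<int> &v, int &expected, int desired) {
    return v.compare_exchange_strong(expected, desired,
                                     std::memory_order_acq_rel,
                                     std::memory_order_relaxed);
}

void demo_cmpxchg_orderings() {
    std::atomic<int> v{1};
    int expected = 1;
    assert(cas_acquire_on_failure(v, expected, 2));   // succeeds, v becomes 2
    expected = 1;
    assert(!cas_relaxed_on_failure(v, expected, 3));  // fails, expected updated
    assert(expected == 2 && v.load() == 2);
}
```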
wanglei
271087e3a0
[LoongArch] Implement COPY instruction between CFRs (#69300)
With this patch, all CFRs can be used for register allocation.
2023-10-19 09:20:27 +08:00
Weining Lu
b2773d170c [LoongArch] Precommit a test for atomic cmpxchg optimization 2023-10-17 22:29:51 +08:00
WANG Xuerui
956482de13 [LoongArch] Support finer-grained DBAR hints for LA664+ (#68787)
These are treated as DBAR 0 on older uarchs, so we can start to
unconditionally emit the new hints right away.

Co-authored-by: WANG Rui <wangrui@loongson.cn>
2023-10-12 15:04:51 +08:00
hev
37b93f07cd
[LoongArch] Add some atomic tests (#68766) 2023-10-11 18:28:04 +08:00
hev
203ba238e3
[LoongArch] Improve codegen for atomic ops (#67391)
This PR improves memory barriers generated by atomic operations.

Memory barrier semantics of LL/SC:
```
LL: <memory-barrier> + <load-exclusive>
SC: <store-conditional> + <memory-barrier>
```

Changes:
* Remove unnecessary memory barriers before LL and between LL/SC.
* Fix acquire semantics. (If the SC instruction is not executed, acquire
semantics cannot be guaranteed. Therefore, an acquire barrier needs to
be generated when the memory ordering includes an acquire operation.)
2023-10-11 10:24:18 +08:00
WANG Rui
6417ce4336 [LoongArch] Improve codegen for i8/i16 'atomicrmw xchg a, {0,-1}'
Similar to D156801 for RISCV.

Link: https://github.com/rust-lang/rust/pull/114034
Link: https://github.com/llvm/llvm-project/issues/64090

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D159252
2023-09-26 11:46:07 +08:00
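The commit message only points to D156801, so the following rests on an assumption: that the improvement relies on the identity that exchanging in 0 behaves like an atomic AND with 0, and exchanging in all ones behaves like an atomic OR with all ones. A minimal sketch of that identity on an 8-bit value, with hypothetical helper names:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Same result as v.exchange(0): stores 0 and returns the old value.
std::uint8_t clear_via_and(std::atomic<std::uint8_t> &v) {
    return v.fetch_and(0);
}

// Same result as v.exchange(0xFF): stores all ones and returns the old value.
std::uint8_t set_via_or(std::atomic<std::uint8_t> &v) {
    return v.fetch_or(0xFF);
}

void demo_xchg_identities() {
    std::atomic<std::uint8_t> a{0x5A}, b{0x5A};
    assert(a.exchange(0) == clear_via_and(b));
    assert(a.load() == 0 && b.load() == 0);
    assert(a.exchange(0xFF) == set_via_or(b));
    assert(a.load() == 0xFF && b.load() == 0xFF);
}
```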
WANG Rui
555e2397aa [LoongArch] Add test cases for atomicrmw xchg {0,-1} {i8,i16}
Add test cases for atomicrmw xchg {0,-1} {i8,i16}.

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D159251
2023-09-26 11:46:06 +08:00
Weining Lu
0a692b6b96 [LoongArch] Fix incorrect instruction 'and' in pattern
It should be `andi`, not `and`.

Address buildbot failure:
https://lab.llvm.org/buildbot/#/builders/42/builds/11634
2023-09-15 16:16:06 +08:00
Weining Lu
419f90e93a [LoongArch] Support llvm.is.fpclass for f32 and f64
is_fpclass (fj, mask)
->
sltu (r0, and (movfr2gr.[sd] (fclass.[sd] fj), (to_fclass_mask mask)))

[1]: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html#_fclass_sd

Reviewed By: wangleiat

Differential Revision: https://reviews.llvm.org/D159183
2023-09-14 15:43:58 +08:00
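The lowering above can be modelled in software. This is a sketch under stated assumptions, not the backend's code: `fclass_d` stands in for `fclass.d`, and both bit layouts (the 10-bit `fclass` result and the LLVM `is.fpclass` test mask) are written from memory of the manual in [1] and the LLVM LangRef and should be verified there.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Stand-in for fclass.d. Assumed result bits: 0 SNaN, 1 QNaN, 2 -inf,
// 3 -normal, 4 -subnormal, 5 -0, 6 +inf, 7 +normal, 8 +subnormal, 9 +0.
std::uint32_t fclass_d(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const bool neg = (bits >> 63) != 0;
    switch (std::fpclassify(x)) {
    case FP_NAN:       return ((bits >> 51) & 1) ? 1u << 1 : 1u << 0;
    case FP_INFINITE:  return neg ? 1u << 2 : 1u << 6;
    case FP_NORMAL:    return neg ? 1u << 3 : 1u << 7;
    case FP_SUBNORMAL: return neg ? 1u << 4 : 1u << 8;
    default:           return neg ? 1u << 5 : 1u << 9;  // FP_ZERO
    }
}

// Stand-in for to_fclass_mask. Assumed LLVM test bits, low to high:
// sNaN, qNaN, -inf, -normal, -subnormal, -0, +0, +subnormal, +normal, +inf.
std::uint32_t to_fclass_mask(std::uint32_t llvm_mask) {
    static const int fclass_bit[10] = {0, 1, 2, 3, 4, 5, 9, 8, 7, 6};
    std::uint32_t out = 0;
    for (int i = 0; i < 10; ++i)
        if (llvm_mask & (1u << i)) out |= 1u << fclass_bit[i];
    return out;
}

// sltu(r0, x) yields 1 exactly when x is nonzero.
bool is_fpclass(double x, std::uint32_t llvm_mask) {
    return (fclass_d(x) & to_fclass_mask(llvm_mask)) != 0;
}

void demo_is_fpclass() {
    assert(is_fpclass(0.0, 0x40) && !is_fpclass(-0.0, 0x40));
    assert(is_fpclass(std::numeric_limits<double>::infinity(), 0x200));
}
```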
Weining Lu
26021577d1 [LoongArch] Optimize (a & ~((2^^X - 1) << Y)) to (bstrins a, zero, X+Y-1, Y)
Inspired by D158384.

Differential Revision: https://reviews.llvm.org/D158832
2023-08-28 08:36:54 +08:00
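The fold in the title can be checked with a small software model. This is a sketch, assuming `bstrins` copies the low `msb - lsb + 1` bits of its source into bits `[msb:lsb]` of the destination and that `zero` reads as 0; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Model of a 64-bit bstrins: copy the low (msb - lsb + 1) bits of src
// into dst[msb:lsb], leaving all other bits of dst unchanged.
std::uint64_t bstrins(std::uint64_t dst, std::uint64_t src,
                      unsigned msb, unsigned lsb) {
    assert(lsb <= msb && msb < 64);
    const unsigned width = msb - lsb + 1;
    const std::uint64_t field = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
    return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

// Left-hand side of the fold: a & ~((2^X - 1) << Y), for X > 0 and X + Y <= 64.
std::uint64_t and_not_mask(std::uint64_t a, unsigned x, unsigned y) {
    assert(x > 0 && x + y <= 64);
    const std::uint64_t ones = (x == 64) ? ~0ULL : ((1ULL << x) - 1);
    return a & ~(ones << y);
}
```

For any valid `X` and `Y`, `and_not_mask(a, X, Y)` and `bstrins(a, 0, X + Y - 1, Y)` should agree, since both clear exactly bits `[X+Y-1:Y]` of `a`.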
Weining Lu
73a2eecb21 [LoongArch] Pre-commit test for bstrins optimization
Differential Revision: https://reviews.llvm.org/D158831
2023-08-28 08:36:54 +08:00
wanglei
1bb7766489 [LoongArch] Optimize stack realignment using BSTRINS instruction
Prior to this change, stack realignment was achieved using the SRLI/SLLI
instructions in two steps. With this patch, stack realignment is
optimized using a single `BSTRINS` instruction.

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D158384
2023-08-23 09:21:42 +08:00
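The equivalence behind this change is that shifting right and then left by the same amount clears the low bits, which is also what inserting zeros into that bit range does. A minimal sketch with hypothetical names, where `log2_align` is the base-2 logarithm of the required alignment:

```cpp
#include <cassert>
#include <cstdint>

// Two-step form: shift right, then left, by log2(alignment).
std::uint64_t align_down_shifts(std::uint64_t sp, unsigned log2_align) {
    assert(log2_align < 64);
    return (sp >> log2_align) << log2_align;
}

// One-step form: clear bits [log2_align - 1 : 0] directly, the effect of
// inserting a zero bit string into that range.
std::uint64_t align_down_clear(std::uint64_t sp, unsigned log2_align) {
    assert(log2_align > 0 && log2_align < 64);
    return sp & ~((1ULL << log2_align) - 1);
}
```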
chenli
0c76f46ca6 [LoongArch] Add testcases of LSX intrinsics with immediates
The testcases mainly cover three situations:
- an argument that should be an immediate is not an immediate.
- the immediate exceeds the upper limit of the argument type.
- the immediate is below the lower limit of the argument type.

Depends on D155829

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D157570
2023-08-21 11:04:19 +08:00
chenli
82bbf7003c [LoongArch] Add testcases of LASX intrinsics with immediates
The testcases mainly cover three situations:
- an argument that should be an immediate is not an immediate.
- the immediate exceeds the upper limit of the argument type.
- the immediate is below the lower limit of the argument type.

Depends on D155830

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D157571
2023-08-19 17:14:16 +08:00
chenli
83311b2b5d [LoongArch] Add LASX intrinsic testcases
Depends on D155830

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D155835
2023-08-19 17:12:31 +08:00
chenli
f3aa441631 [LoongArch] Add LSX intrinsic testcases
Depends on D155829

Reviewed By: SixWeining

Differential Revision: https://reviews.llvm.org/D155834
2023-08-19 17:10:46 +08:00
Weining Lu
f62c9252fc [LoongArch] Support -march=native and -mtune=
As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture and defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.

D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.

A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.

`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.

Two preprocessor macros are defined based on user-provided `-march=`
and `-mtune=` options and the defaults.
- __loongarch_arch
- __loongarch_tune
Note that, to work with `-fno-integrated-cc1` we leverage cc1 options
`-target-cpu` and `-tune-cpu` to pass driver options `-march=` and
`-mtune=` respectively because cc1 needs this information to define
macros in `LoongArchTargetInfo::getTargetDefines`.

[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc

Reviewed By: xen0n, wangleiat, steven_wu, MaskRay

Differential Revision: https://reviews.llvm.org/D155824
2023-08-09 10:29:50 +08:00
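User code can read the two macros at compile time. The sketch below assumes they expand to string literals such as `"loongarch64"`, as the toolchain conventions in [1] describe; the function names are hypothetical, and the fallback branch only exists so the snippet also builds on non-LoongArch hosts.

```cpp
#include <cassert>
#include <string>

// Value selected by -march=, or "unknown" when the macro is not defined.
std::string arch_name() {
#ifdef __loongarch_arch
    return __loongarch_arch;
#else
    return "unknown";
#endif
}

// Value selected by -mtune= (or its default), or "unknown" when not defined.
std::string tune_name() {
#ifdef __loongarch_tune
    return __loongarch_tune;
#else
    return "unknown";
#endif
}

void demo_arch_macros() {
    assert(!arch_name().empty());
    assert(!tune_name().empty());
}
```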
Steven Wu
42c9354a92 Revert "Reland "[LoongArch] Support -march=native and -mtune=""
This reverts commit c56514f21b2cf08eaa7ac3a57ba4ce403a9c8956. This
commit adds global state that is shared between clang driver and clang
cc1, which is not correct when clang is used with `-fno-integrated-cc1`
option (no integrated cc1). The -march and -mtune options need to be
properly passed through cc1 command-line and stored in TargetInfo.
2023-07-31 16:57:06 -07:00
Weining Lu
c56514f21b Reland "[LoongArch] Support -march=native and -mtune="
As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture and defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.

D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.

A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.

`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.

And these two preprocessor macros are defined:
- __loongarch_arch
- __loongarch_tune

[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc

Reviewed By: xen0n, wangleiat

Differential Revision: https://reviews.llvm.org/D155824
2023-07-26 10:26:38 +08:00
Weining Lu
212d6aa0da Revert "[LoongArch] Support -march=native and -mtune="
This reverts commit 92c06114b2ea9900a3364fb395988dfb065758f7.
2023-07-25 23:32:15 +08:00
Weining Lu
92c06114b2 [LoongArch] Support -march=native and -mtune=
As described in [1][2], `-mtune=` is used to select the type of target
microarchitecture and defaults to the value of `-march`. The set of
possible values should be a superset of `-march` values. Currently
possible values of `-march=` and `-mtune=` are `native`, `loongarch64`
and `la464`.

D136146 has supported `-march={loongarch64,la464}` and this patch adds
support for `-march=native` and `-mtune=`.

A new ProcessorModel called `loongarch64` is defined in LoongArch.td
to support `-mtune=loongarch64`.

`llvm::sys::getHostCPUName()` returns `generic` on unknown or future
LoongArch CPUs, e.g. the not yet added `la664`, leading to
`llvm::LoongArch::isValidArchName()` failing to parse the arch name.
In this case, use `loongarch64` as the default arch name for 64-bit
CPUs.

And these two preprocessor macros are defined:
- __loongarch_arch
- __loongarch_tune

[1]: https://github.com/loongson/LoongArch-Documentation/blob/2023.04.20/docs/LoongArch-toolchain-conventions-EN.adoc
[2]: https://github.com/loongson/la-softdev-convention/blob/v0.1/la-softdev-convention.adoc

Differential Revision: https://reviews.llvm.org/D155824
2023-07-25 21:01:51 +08:00
WANG Rui
e7c9a99dfe [LoongArch] Implement isSExtCheaperThanZExt
Implement isSExtCheaperThanZExt.

Signed-off-by: WANG Rui <wangrui@loongson.cn>

Differential Revision: https://reviews.llvm.org/D154919
2023-07-25 09:41:32 +08:00
WANG Rui
1a3da0bc1e [LoongArch] Add test case showing suboptimal codegen when zero extending
Add test case showing suboptimal codegen when zero extending.

Signed-off-by: WANG Rui <wangrui@loongson.cn>

Reviewed By: xen0n

Differential Revision: https://reviews.llvm.org/D154918
2023-07-25 09:31:33 +08:00
chenli
d25c79dc70 [LoongArch] Support InlineAsm for LSX and LASX
The author of the following files is licongtian <licongtian@loongson.cn>:
- clang/lib/Basic/Targets/LoongArch.cpp
- llvm/lib/Target/LoongArch/LoongArchAsmPrinter.cpp
- llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp

The files mentioned above implement InlineAsm for LSX and LASX as follows:
- Enable clang to parse LSX/LASX register names, such as $vr0.
- Support 128-bit and 256-bit operand types when the constraint
  is 'f'.
- Support specifying an LSX/LASX register through a constraint,
  such as "={$xr0}".
- Support the operand modifiers 'u' and 'w'.
- Support and legalize the data types and register classes involved in
  LSX/LASX in the lowering process.

Reviewed By: xen0n, SixWeining

Differential Revision: https://reviews.llvm.org/D154931
2023-07-25 09:02:29 +08:00
WANG Rui
9c21f95541 [LoongArch] Implement isZextFree
This returns true for 8-bit and 16-bit loads, allowing ld.bu/ld.hu to be selected and avoiding unnecessary masks.

Signed-off-by: WANG Rui <wangrui@loongson.cn>

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D154819
2023-07-24 17:49:25 +08:00
WANG Rui
90e08c2600 [LoongArch] Add test case showing suboptimal codegen when loading unsigned char/short
Implementing isZextFree will allow ld.bu or ld.hu to be selected rather than ld.b+mask and ld.h+mask.

Signed-off-by: WANG Rui <wangrui@loongson.cn>

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D154818
2023-07-24 17:48:16 +08:00
WANG Rui
899aaffcbc [LoongArch] Implement isLegalICmpImmediate
This causes a trivial improvement in the legalicmpimm.ll test case.

Signed-off-by: WANG Rui <wangrui@loongson.cn>

Reviewed By: SixWeining, xen0n

Differential Revision: https://reviews.llvm.org/D154811
2023-07-24 17:42:11 +08:00
WANG Rui
0cceea90bf [LoongArch][NFC] Add tests for (X & -256) == 256 -> (X >> 8) == 1
Add tests for (X & -256) == 256 -> (X >> 8) == 1.

Signed-off-by: WANG Rui <wangrui@loongson.cn>

Reviewed By: xen0n

Differential Revision: https://reviews.llvm.org/D154810
2023-07-24 17:42:10 +08:00
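The fold named in this test commit can be checked directly on unsigned 32-bit values, where `-256` is the mask `0xFFFFFF00`. A minimal sketch with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Original form: mask off the low 8 bits and compare against 256.
bool masked_compare(std::uint32_t x) {
    return (x & 0xFFFFFF00u) == 256u;
}

// Folded form: drop the low 8 bits by shifting and compare against 1.
bool shifted_compare(std::uint32_t x) {
    return (x >> 8) == 1u;
}

void demo_icmp_fold() {
    for (std::uint32_t x = 0; x < 2048; ++x)
        assert(masked_compare(x) == shifted_compare(x));
}
```

Both forms are true exactly for `x` in the range 256 to 511; the folded form compares against a smaller constant.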