llvm-project

Author	SHA1	Message	Date
hev	0d17e1f0e5	[LoongArch] Revert `sp` adjustment in prologue (#88110 ) After commit 18c5f3c3 ("[RegisterScavenger][RISCV] Don't search for FrameSetup instrs if we were searching from Non-FrameSetup instrs"), we can revert the `sp` adjustment 4e2364a2 ("[LoongArch] Add emergency spill slot for GPR for large frames") to generate better code, as the issue with `RegScavenger` has been resolved. Fixes #88109	2024-04-10 17:13:25 +08:00
Craig Topper	acab142751	[LegalizeDAG] Freeze index when converting insert_elt/insert_subvector to load/store on stack. We try to clamp the index to be within the bounds of the stack object we create, but if we don't freeze it, poison can propagate into the clamp code. This can cause the access to leave the bounds of the stack object. We have other instances of this issue in type legalization and extract_elt/subvector, but posting this patch first for direction check. Fixes #86717	2024-03-27 13:01:23 -07:00
Lu Weining	e4edbae0aa	Revert "[llvm][LoongArch] Improve loongarch_lasx_xvpermi_q instrinsic" (#84708 ) Reverts llvm/llvm-project#82984 See the discussion in https://github.com/llvm/llvm-project/pull/83540.	2024-03-13 11:51:47 +08:00
wanglei	edd4c6c6dc	[LoongArch] Make sure that the LoongArchISD::BSTRINS node uses the correct `MSB` value (#84454 ) The `MSB` must not be greater than `GRLen`. Without this patch, newly added test cases will crash with LoongArch32, resulting in a 'cannot select' error.	2024-03-11 08:59:17 +08:00
wanglei	a5c90e48b6	[LoongArch] Switch to the Machine Scheduler (#83759 ) The SelectionDAG scheduling preference now becomes source order scheduling (machine scheduler generates better code -- even without there being a machine model defined for LoongArch yet). Most of the test changes are trivial instruction reorderings and differing register allocations, without any obvious performance impact. This is similar to commit: 3d0fbafd0bce43bb9106230a45d1130f7a40e5ec	2024-03-05 09:15:44 +08:00
Lu Weining	5f058aa211	[LoongArch] Override LoongArchTargetLowering::getExtendForAtomicCmpSwapArg (#83656 ) This patch aims to solve Firefox issue: https://bugzilla.mozilla.org/show_bug.cgi?id=1882301 Similar to 616289ed2922. Currently LoongArch uses an ll.[wd]/sc.[wd] loop for ATOMIC_CMP_XCHG. Because the comparison in the loop is full-width (i.e. the `bne` instruction), we must sign extend the input comparison argument. Note that LoongArch ISA manual V1.1 has introduced compare-and-swap instructions. We would change the implementation (return `ANY_EXTEND`) when we support them.	2024-03-04 08:38:52 +08:00
leecheechen	d7c80bba69	[llvm][LoongArch] Improve loongarch_lasx_xvpermi_q instrinsic (#82984 ) For instruction xvpermi.q, only [1:0] and [5:4] bits of operands[3] are used. The unused bits in operands[3] need to be set to 0 to avoid causing undefined behavior.	2024-02-27 15:38:11 +08:00
Jack Styles	28233408a2	[CodeGen] [ARM] Make RISC-V Init Undef Pass Target Independent and add support for the ARM Architecture. (#77770 ) When using Greedy Register Allocation, there are times where early-clobber values are ignored, and assigned the same register. This is illegal behaviour for these instructions. To get around this, using Pseudo instructions for early-clobber registers gives them a definition and allows Greedy to assign them to a different register. This then meets the ARM Architecture Reference Manual and matches the defined behaviour. This patch takes the existing RISC-V patch and makes it target independent, then adds support for the ARM Architecture. Doing this will ensure early-clobber constraints are followed when using the ARM Architecture. Making the pass target independent will also open up the possibility of adding support for other architectures in the future.	2024-02-26 12:12:31 +00:00
hev	8be39b3901	[LoongArch] Improve pattern matching for AddLike predicate (#82767 ) This commit updates the pattern matching logic for the `AddLike` predicate in `LoongArchInstrInfo.td` to use the `isBaseWithConstantOffset` function provided by `CurDAG`. This optimization aims to improve the efficiency of pattern matching by identifying cases where the operation can be represented as a base address plus a constant offset, which can lead to more efficient code generation.	2024-02-26 11:13:21 +08:00
hev	c747b24262	[NFC] Precommit a memcpy test for isOrEquivalentToAdd (#82758 )	2024-02-23 21:43:53 +08:00
hev	dd3e0a4643	[LoongArch] Assume no-op addrspacecasts by default (#82332 ) This PR indicates that `addrspacecasts` are always no-ops on LoongArch. Fixes #82330	2024-02-21 21:15:17 +08:00
DianQK	ccb46e8365	Reapply "[RegisterCoalescer] Clear instructions not recorded in `ErasedInstrs` but erased (#79820 )" This reverts commit 8316bf34ac21117f35bc8e6fafa2b3e7da75e1d5.	2024-02-09 15:58:48 +08:00
DianQK	8316bf34ac	Revert "[RegisterCoalescer] Clear instructions not recorded in `ErasedInstrs` but erased (#79820 )" This reverts commit 95b14da678f4670283240ef4cf60f3a39bed97b4.	2024-02-09 15:54:54 +08:00
Quentin Dian	95b14da678	[RegisterCoalescer] Clear instructions not recorded in `ErasedInstrs` but erased (#79820 ) Fixes #79718. Fixes #71178. The same instructions may exist in an iteration. We cannot immediately delete instructions in `ErasedInstrs`.	2024-02-09 15:29:05 +08:00
Nikita Popov	ff9af4c43a	[CodeGen] Convert tests to opaque pointers (NFC)	2024-02-05 14:07:09 +01:00
yjijd	44ba6ebc99	[CodeGen][LoongArch] Set FP_TO_SINT/FP_TO_UINT to legal for vector types (#79107 ) Support the following conversions: v4f32->v4i32, v2f64->v2i64(LSX) v8f32->v8i32, v4f64->v4i64(LASX) v4f32->v4i64, v4f64->v4i32(LASX)	2024-01-23 15:57:06 +08:00
yjijd	f799f93692	[CodeGen][LoongArch] Set SINT_TO_FP/UINT_TO_FP to legal for vector types (#78924 ) Support the following conversions: v4i32->v4f32, v2i64->v2f64(LSX) v8i32->v8f32, v4i64->v4f64(LASX) v4i32->v4f64, v4i64->v4f32(LASX)	2024-01-23 15:16:23 +08:00
Ami-zhang	fcb8342a21	[LoongArch] Add definitions and feature 'frecipe' for FP approximation intrinsics/builtins (#78962 ) This PR adds definitions and the 'frecipe' feature for FP approximation intrinsics/builtins. In addition, it adds and extends the related test cases.	2024-01-23 14:24:58 +08:00
Fangrui Song	7620f03ef7	[MC] Parse SHF_LINK_ORDER argument before section group name (#77407 ) When both SHF_LINK_ORDER \| SHF_GROUP flags are set, GNU assembler from 2.35 onwards (https://sourceware.org/PR25381 https://sourceware.org/binutils/docs/as/Section.html) parses the SHF_LINK_ORDER argument before section group name, different from us. This is unfortunate, but does not matter because the `.section` flag `o` is a niche feature only used by compiler instrumentations, not adopted by hand-written assembly, and using both flags is extremely rare. Let's just match GNU assembler. There is another benefit: we now support zero-flag section group with the SHF_LINK_ORDER flag, while previously there wasn't a syntax for it. While here, print 'G' after 'o' to be clear that the 'G' argument is parsed after the 'o' argument. To make the diff smaller, we don't print 'G' after 'w' in the absence of 'o' for now.	2024-01-09 10:42:34 -08:00
wanglei	98c6aa7229	[LoongArch] Implement LoongArchRegisterInfo::canRealignStack() (#76913 ) This patch fixes the crash issue in the test: CodeGen/LoongArch/can-not-realign-stack.ll Register allocator may spill virtual registers to the stack, which introduces stack alignment requirements (when the size of spilled registers exceeds the default alignment size of the stack). If a function does not have stack alignment requirements before register allocation, registers used for stack alignment will not be preserved. Therefore, we should implement `canRealignStack()` to inform the register allocator whether it is allowed to perform stack realignment operations.	2024-01-09 20:35:49 +08:00
wanglei	f499472de3	[LoongArch] Pre-commit test for #76913 . NFC This test will crash with expensive check. Crash message: ``` * Bad machine code: Using an undefined physical register * - function: main - basic block: %bb.0 entry (0x20fee70) - instruction: $r3 = frame-destroy ADDI_D $r22, -288 - operand 1: $r22 ```	2024-01-09 20:32:20 +08:00
hev	16094cb629	[llvm][LoongArch] Support per-global code model attribute for LoongArch (#72079 ) This patch gets the code model from the global variable's attribute if it has one; otherwise the target's code model is used. --------- Signed-off-by: WANG Rui <wangrui@loongson.cn>	2024-01-06 13:36:09 +08:00
wanglei	c56a5e895a	[LoongArch] Reimplement the expansion of PseudoLA_LARGE instructions (#76555 ) According to the description of the psABI v2.30: https://github.com/loongson/la-abi-specs/releases/tag/v2.30, moved the expansion of relevant pseudo-instructions from `LoongArchPreRAExpandPseudo` pass to `LoongArchExpandPseudo` pass, to ensure that the code sequences of `PseudoLA_LARGE` instructions and Medium code model's function call are not scheduled.	2024-01-05 10:57:53 +08:00
wanglei	3d6fc35b90	[LoongArch] Pre-commit test for #76555 . NFC	2024-01-05 10:57:40 +08:00
wanglei	2cf420d5b8	[LoongArch] Emit function call code sequence as `PCADDU18I+JIRL` in medium code model According to the description of the psABI v2.20: https://github.com/loongson/la-abi-specs/releases/tag/v2.20, adjustments are made to the function call instructions under the medium code model. At the same time, AsmParser has already supported parsing the call36 and tail36 macro instructions.	2024-01-05 10:56:47 +08:00
wanglei	da5378e87e	[LoongArch] Fix incorrect pattern [X]VBITSELI_B instructions Adjusted the operand order of [X]VBITSELI_B to correctly match vselect.	2023-12-29 14:44:29 +08:00
wanglei	c7367f985e	[LoongArch] Fix incorrect pattern XVREPL128VEI_{W/D} instructions Remove the incorrect patterns for `XVREPL128VEI_{W/D}` instructions, and add correct patterns for XVREPLVE0_{W/D} instructions	2023-12-29 14:03:53 +08:00
wanglei	47c88bcd5d	[LoongArch] Fix LASX vector_extract codegen Custom lowering `ISD::EXTRACT_VECTOR_ELT` with lasx.	2023-12-29 13:48:53 +08:00
wanglei	af999c4be9	[LoongArch] Add codegen support for [X]VF{MSUB/NMADD/NMSUB}.{S/D} instructions (#74819 ) This is similar to single and double-precision floating-point instructions.	2023-12-11 10:37:22 +08:00
wanglei	cdc3732566	[LoongArch] Mark ISD::FNEG as legal	2023-12-08 15:07:58 +08:00
wanglei	9f70e708a7	[LoongArch] Make ISD::FSQRT a legal operation with lsx/lasx feature (#74795 ) And add some patterns: 1. (fdiv 1.0, vector) 2. (fdiv 1.0, (fsqrt vector))	2023-12-08 14:16:26 +08:00
wanglei	9ff7d0ebeb	[LoongArch] Add codegen support for icmp/fcmp with lsx/lasx fetaures (#74700 ) Mark ISD::SETCC node as legal, and add handling for the vector types condition codes.	2023-12-07 20:11:43 +08:00
wanglei	de21308f78	[LoongArch] Make ISD::VSELECT a legal operation with lsx/lasx	2023-12-06 16:43:38 +08:00
wanglei	e9cd197d15	[LoongArch] Support MULHS/MULHU with lsx/lasx Mark MULHS/MULHU nodes as legal and add the necessary patterns.	2023-12-04 10:58:05 +08:00
wanglei	a60a5421b6	Reland "[LoongArch] Support CTLZ with lsx/lasx" This patch simultaneously adds tests for `CTPOP`. This relands 07cec73dcd095035257eec1f213d273b10988130 with fixed tests.	2023-12-02 17:22:40 +08:00
wanglei	63e6bba0c3	Revert "[LoongArch] Support CTLZ with lsx/lasx" This reverts commit 07cec73dcd095035257eec1f213d273b10988130.	2023-12-02 17:17:48 +08:00
wanglei	07cec73dcd	[LoongArch] Support CTLZ with lsx/lasx This patch simultaneously adds tests for `CTPOP`.	2023-12-02 17:13:36 +08:00
wanglei	66a3e4fafb	[LoongArch] Override TargetLowering::isShuffleMaskLegal By default, `isShuffleMaskLegal` always returns true, which can result in the expansion of `BUILD_VECTOR` into a `VECTOR_SHUFFLE` node in certain situations. Subsequently, the `VECTOR_SHUFFLE` node is expanded again into a `BUILD_VECTOR`, leading to an infinite loop. To address this, we always return false, allowing the expansion of `BUILD_VECTOR` through the stack.	2023-12-02 14:25:17 +08:00
leecheechen	dbbc7c31c8	[LoongArch] Add some binary IR instructions testcases for LASX (#74031 ) The IR instructions include: - Binary Operations: add fadd sub fsub mul fmul udiv sdiv fdiv - Bitwise Binary Operations: shl lshr ashr	2023-12-01 13:14:11 +08:00
wanglei	ca66df3b02	[LoongArch] Add more and/or/xor patterns for vector types	2023-12-01 10:28:41 +08:00
wanglei	add224c0a0	[LoongArch] Custom lowering `ISD::BUILD_VECTOR`	2023-12-01 09:13:39 +08:00
wanglei	f2cbd1fdf7	[LoongArch] Add codegen support for insertelement	2023-12-01 09:13:39 +08:00
leecheechen	29a0f3ec2b	[LoongArch] Add some binary IR instructions testcases for LSX (#73929 ) The IR instructions include: - Binary Operations: add fadd sub fsub mul fmul udiv sdiv fdiv - Bitwise Binary Operations: shl lshr ashr	2023-11-30 21:41:18 +08:00
wanglei	b72456120f	[LoongArch] Add codegen support for extractelement (#73759 ) Add codegen support for extractelement when the `lsx` or `lasx` feature is enabled.	2023-11-30 17:29:18 +08:00
wanglei	5e7e0d6032	[LoongArch] Fix pattern for FNMSUB_{S/D} instructions (#73742 ) ``` when a=c=-0.0, b=0.0: -(a * b + (-c)) = -0.0 -a * b + c = 0.0 (fneg (fma a, b (-c))) != (fma (fneg a), b ,c) ``` See https://reviews.llvm.org/D90901 for a similar discussion on X86.	2023-11-29 15:21:21 +08:00
hev	0d9f557b6c	[LoongArch] Disable mulodi4 and muloti4 libcalls (#73199 ) This library function only exists in compiler-rt not libgcc. So this would fail to link unless we were linking with compiler-rt. Fixes https://github.com/ClangBuiltLinux/linux/issues/1958	2023-11-23 19:34:50 +08:00
hev	7414c0db96	[LoongArch] Precommit a test for smul with overflow (NFC) (#73212 )	2023-11-23 15:15:26 +08:00
ZhaoQi	775d2f3201	[LoongArch][MC] Support to get the FixupKind for BL (#72938 ) Previously, bolt could not get FixupKind for BL correctly, because bolt cannot get target-flags for BL. Here just add support in MCCodeEmitter. Fixes https://github.com/llvm/llvm-project/pull/72826.	2023-11-21 19:00:29 +08:00
ZhaoQi	2ca028ce7c	[LoongArch][MC] Pre-commit tests for instr bl fixupkind testing (#72826 ) This patch is used to test whether fixupkind for bl can be returned correctly. When BL has target-flags(loongarch-call), there is no error. But without this flag, an assertion error will appear. So the test is just tagged as "Expectedly Failed" for now, until the following patch fixes it.	2023-11-21 08:34:52 +08:00
Lu Weining	78abc45c44	[LoongArch] Improve codegen for atomic cmpxchg ops (#69339 ) PR #67391 improved atomic codegen by handling memory ordering specified by the `cmpxchg` instruction. An acquire barrier needs to be generated when memory ordering includes an acquire operation. This PR improves the codegen further by only handling the failure ordering.	2023-10-19 09:21:51 +08:00
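
The signed-zero case cited in 5e7e0d6032 ([LoongArch] Fix pattern for FNMSUB_{S/D} instructions) can be checked by hand. The sketch below is an illustration only, not code from the patch: it assumes Python floats behave as IEEE 754 doubles, uses a separate multiply and add in place of a fused `fma` (the product is exactly zero for these inputs, so the fused and unfused results should agree), and the helper names are hypothetical.

```python
import math


def fneg_of_fma(a: float, b: float, c: float) -> float:
    """Models (fneg (fma a, b, (-c))), i.e. -(a * b + (-c))."""
    return -(a * b + (-c))


def fma_of_fneg(a: float, b: float, c: float) -> float:
    """Models (fma (fneg a), b, c), i.e. (-a) * b + c."""
    return (-a) * b + c


def sign_of(x: float) -> float:
    """Returns -1.0 or 1.0, distinguishing -0.0 from 0.0."""
    return math.copysign(1.0, x)


# Inputs from the commit message: a = c = -0.0, b = 0.0.
a = c = -0.0
b = 0.0

lhs = fneg_of_fma(a, b, c)
rhs = fma_of_fneg(a, b, c)

# The two values compare equal with ==, so the sign bit has to be inspected.
print("fneg(fma(a, b, -c)) sign:", sign_of(lhs))
print("fma(fneg(a), b, c)  sign:", sign_of(rhs))
```

The two expressions differ only in the sign of the zero result, which is why the two DAG patterns are not interchangeable unless signed zeros can be ignored.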


290 Commits