llvm-project

Author	SHA1	Message	Date
WÁNG Xuěruì	f246b5f547	[LoongArch] Support bswap for LSX/LASX VTs (#114171 ) On top of #114170	2024-11-01 00:38:13 +08:00
hev	f7a96dc664	[LoongArch] Ensure pcaddu18i and jirl adjacency in tail calls for correct relocation (#113932 ) Prior to this patch, both `pcaddu18i` and `jirl` were marked as scheduling boundaries to prevent instruction reordering that would disrupt their adjacency. However, in certain cases, epilogues were still being inserted between these two instructions, breaking the required proximity. This patch ensures that `pcaddu18i` and `jirl` remain adjacent even in the presence of epilogues, maintaining correct relocation behavior for tail calls on LoongArch.	2024-11-01 00:08:15 +08:00
WÁNG Xuěruì	5581e43a2b	[LoongArch][NFC] Pre-commit tests for LSX/LASX bswap codegen (#114170 )	2024-10-31 21:10:26 +08:00
WANG Rui	862074fa57	[LoongArch][NFC] Pre-commit tests for the adjacency of expanded pseudo-insns	2024-10-31 16:59:41 +08:00
Ami-zhang	1897bf61f0	[LoongArch] Enable FeatureExtLSX for generic-la64 processor (#113421 ) This commit makes the `generic` target support FP and LSX, as discussed in #110211. This allows 128-bit vectors to be enabled by default in the loongarch64 backend.	2024-10-31 15:58:15 +08:00
hev	b225b15a3d	[LoongArch] Merge base and offset for large offsets (#113277 ) This PR merges large offsets into the base address loading.	2024-10-23 19:43:23 +08:00
tangaac	5b9c76b6e7	[LoongArch] Support LoongArch-specific amswap[_db].{b/h} and amadd[_db].{b/h} instructions (#113255 ) Two options for clang: -mlam-bh & -mno-lam-bh. Enable or disable amswap[_db].{b/h} and amadd[_db].{b/h} instructions. The default is -mno-lam-bh. Only works on LoongArch64.	2024-10-23 16:03:15 +08:00
WANG Rui	4614b80c49	[LoongArch] Pre-commit tests for merge base with large offset. NFC	2024-10-22 15:44:40 +08:00
tangaac	ba5676cf91	[LoongArch] Minor refinement to monotonic atomic semantics. (#112681 ) Don't use "_db" version AM instructions for LoongArch atomic memory operations with monotonic semantics.	2024-10-21 15:58:35 +08:00
Alex Rønne Petersen	ad4a582fd9	[llvm] Consistently respect `naked` fn attribute in `TargetFrameLowering::hasFP()` (#106014 ) Some targets (e.g. PPC and Hexagon) already did this. I think it's best to do this consistently so that frontend authors don't run into inconsistent results when they emit `naked` functions. For example, in Zig, we had to change our emit code to also set `frame-pointer=none` to get reliable results across targets. Note: I don't have commit access.	2024-10-18 09:35:42 +04:00
tangaac	e9eec14bb3	[LoongArch] [CodeGen] Add options for Clang to generate LoongArch-specific frecipe & frsqrte instructions (#109917 ) Two options: `-mfrecipe` & `-mno-frecipe`. Enable or Disable frecipe.{s/d} and frsqrte.{s/d} instructions. The default is `-mno-frecipe`.	2024-10-18 09:06:29 +08:00
wanglei	4c2c177567	[LoongArch] Add options for annotate tablejump This aligns with GCC. LoongArch kernel developers requested that this option generate some corresponding relations in a section, including the addresses of the jump instruction(jr) and the `MachineJumpTableEntry`. Reviewed By: heiher Pull Request: https://github.com/llvm/llvm-project/pull/102411	2024-10-16 11:58:00 +08:00
WANG Rui	8e3cde04cb	[LoongArch][test] Add float-point atomic load/store tests. NFC	2024-09-25 15:39:22 +08:00
Robert Dazi	8837898b8d	[DAGCombine] Count leading ones: refine post DAG/Type Legalisation if promotion (#102877 ) This PR is related to #99591. In this PR, instead of modifying how the legalisation occurs depending on surrounding instructions, we refine after legalisation. This PR has two parts: * `SDPatternMatch/MatchContext`: Slightly modify the code that matches Operands (used by `m_Node(...)`) and Unary/Binary/Ternary Patterns to make it compatible with `VPMatchContext`, instead of only `m_Opc` being supported. Some tests were added to ensure no regressions. * `DAGCombiner`: Add a `foldSubCtlzNot` which detects and rewrites the patterns using matching context. Remaining Tasks: - [ ] GlobalISel - [ ] Currently the pattern matching will occur even before legalisation. Should I restrict it to specific stages instead? - [ ] Style: Add a visitVP_SUB? Move `foldSubCtlzNot` to another location for style consistency? @topperc --------- Co-authored-by: v01dxyz <v01dxyz@v01d.xyz>	2024-09-15 15:48:36 +04:00
YANG Xudong	13280d99ae	[loongarch][DAG][FREEZE] Fix crash when FREEZE a half(f16) type on loongarch (#107791 ) For zig with LLVM 19.1.0rc4, we are seeing the following error when bootstrapping a `loongarch64-linux-musl` target. https://github.com/ziglang/zig-bootstrap/issues/164#issuecomment-2332357069 It seems that this issue is caused by `PromoteFloatResult` is not handling FREEZE OP on loongarch. Here is the reproduction of the error: https://godbolt.org/z/PPfvWjjG5 ~~This patch adds the FREEZE OP handling with `PromoteFloatRes_UnaryOp` and adds a test case.~~ This patch changes loongarch's way of floating point promotion to soft promotion to avoid this problem. See: loongarch's handling of `half`: - https://github.com/llvm/llvm-project/issues/93894 - https://github.com/llvm/llvm-project/pull/94456 Also see: other float promotion FREEZE handling - `0019c2f194`	2024-09-13 08:49:54 +08:00
Lu Weining	ffcebcdb96	[LoongArch] Implement Statepoint lowering (#108212 ) The functionality has been validated in OpenHarmony's arkcompiler.	2024-09-12 18:05:13 +08:00
hev	0f47e3aebd	[LoongArch] Eliminate the redundant sign extension of division (#107971 ) If all incoming values of `div.d` are sign-extended and all users only use the lower 32 bits, then convert them to W versions. Fixes: #107946	2024-09-10 16:52:21 +08:00
wanglei	1ca411ca45	[LoongArch] Codegen for concat_vectors with LASX Fixes: #107355 Reviewed By: SixWeining Pull Request: https://github.com/llvm/llvm-project/pull/107523	2024-09-10 09:28:15 +08:00
Yingwei Zheng	a111f9119a	[LoongArch][ISel] Check the number of sign bits in `PatGprGpr_32` (#107432 ) After https://github.com/llvm/llvm-project/pull/92205, LoongArch ISel selects `div.w` for `trunc i64 (sdiv i64 3202030857, (sext i32 X to i64)) to i32`. It is incorrect since `3202030857` is not a signed 32-bit constant. It will produce wrong result when `X == 2`: https://alive2.llvm.org/ce/z/pzfGZZ This patch adds additional `sexti32` checks to operands of `PatGprGpr_32`. Alive2 proof: https://alive2.llvm.org/ce/z/AkH5Mp Fix #107414.	2024-09-10 09:19:39 +08:00
anjenner	4af249fe6e	Add usub_cond and usub_sat operations to atomicrmw (#105568 ) These both perform conditional subtraction, returning the minuend and zero respectively, if the difference is negative.	2024-09-06 16:19:20 +01:00
wanglei	df93327c1a	[LoongArch] Legalize ISD::CTPOP for GRLenVT type with LSX Reviewed By: SixWeining Pull Request: https://github.com/llvm/llvm-project/pull/106941	2024-09-06 15:46:43 +08:00
wanglei	4b2c950de5	[test][LoongArch] Pre-commit test for optimize CTPOP. NFC Reviewed By: SixWeining Pull Request: https://github.com/llvm/llvm-project/pull/106940	2024-09-06 15:45:23 +08:00
wanglei	eaf87d3275	[LoongArch] Optimize for immediate value materialization using BSTRINS_D instruction Reviewed By: heiher, SixWeining Pull Request: https://github.com/llvm/llvm-project/pull/106332	2024-08-30 16:38:42 +08:00
wanglei	5b77e254e8	[LoongArch] Pre-commit test for immediate value materialization using BSTRINS_D Reviewed By: SixWeining Pull Request: https://github.com/llvm/llvm-project/pull/106331	2024-08-30 16:37:20 +08:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Weining Lu	63267ca901	[LoongArch] Fix the assertion for atomic store with 'ptr' type	2024-08-19 17:17:36 +08:00
hev	985d64b03a	[LoongArch] Merge base and offset for LSX/LASX memory accesses (#104452 )	2024-08-19 15:23:05 +08:00
WANG Rui	82cf6558e5	[LoongArch] Pre-commit tests for validating the merge base offset in vectors. NFC	2024-08-15 21:06:27 +08:00
YunQiang Su	fb9e685fc4	Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649 ) C23 introduced new functions fminimum_num and fmaximum_num, and they follow the minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only for scalar values. The support of vector (vp, vp.reduce, vector.reduce), experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch can work out of the box with fcanonical and fmax/fmin. Aarch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, while they have no fcanonical support yet. I will add it in future patches. The FMIN/FMAX of RISC-V instructions follows the minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a future patch. Background https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which have different behavior on different platforms for NUM vs sNaN: 1) Fallback to fmin(3)/fmax(3): return qNaN. 2) ARM64/ARM32+Neon: same as libc. 3) MIPSr6/LoongArch/RISC-V: return NUM. The fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.	2024-08-15 14:09:36 +08:00
Peter Rong	74e4694b8c	[LTO] enable `ObjCARCContractPass` only on optimized build (#101114 ) #92331 tried to enable `ObjCARCContractPass` by default, but it caused a regression on O0 builds and was reverted. This patch tries to bring that back by: 1. reverting the revert (`1579e9ca9c`); 2. calling `createObjCARCContractPass` only on optimized builds. Tests are updated to reflect the changes. Specifically, no `O0` test should include `ObjCARCContractPass`. Signed-off-by: Peter Rong <PeterRong@meta.com>	2024-08-09 13:04:25 -07:00
hev	dbae30df24	[LoongArch] Load floating-point immediate using VLDI (#101923 ) This commit uses the VLDI instruction to load some common floating-point constants when the LSX feature is enabled.	2024-08-09 14:08:32 +08:00
hev	b2e69f52bb	[LoongArch] Add machine function pass to merge base + offset (#101139 ) This commit references RISC-V to add a machine function pass to merge the base address and offset.	2024-08-08 23:05:38 +08:00
Alexis Engelke	fa92d51f9e	[VP] Merge ExpandVP pass into PreISelIntrinsicLowering (#101652 ) Similar to #97727; avoid an extra pass over the entire IR by performing the lowering as part of the pre-isel-intrinsic-lowering pass.	2024-08-06 09:27:59 +02:00
WANG Rui	5f7e921fe3	[LoongArch] Pre-commit test for load floating-point immediate using VLDI. NFC	2024-08-05 11:47:27 +08:00
hev	8b26c02caa	[LoongArch] Align stack objects passed to memory intrinsics (#101309 ) Memcpy, and other memory intrinsics, typically try to use wider load/store if the source and destination addresses are aligned. In CodeGenPrepare, look for calls to memory intrinsics and, if the object is on the stack, align it to 4-byte (32-bit) or 8-byte (64-bit) boundaries if it is large enough that we expect memcpy to use wider load/store instructions to copy it. Fixes #101295	2024-08-02 11:28:03 +08:00
Alexis Engelke	b5fc083dc3	[CodeGen] Merge lowerConstantIntrinsics into pre-isel lowering (#97727 ) Currently, the LowerConstantIntrinsics pass does an RPO traversal of every function... only to find that many functions don't have constant intrinsics (is.constant, objectsize). In the CodeGen pipeline, there is already a pre-isel intrinsic lowering pass, which iterates over intrinsic declarations and lowers all users. Call lowerConstantIntrinsics from this pass to avoid the extra iteration over the entire IR and the RPO traversal.	2024-08-01 17:44:32 +02:00
WANG Rui	f51a479520	[LoongArch] Pre-commit test for aligning stack objects passed to memory intrinsics. NFC	2024-08-01 17:17:28 +08:00
Craig Topper	307d1249ea	[LegalizeTypes][RISCV][LoongArch] Optimize promotion of ucmp. (#101366 ) ucmp can be promoted with either sext or zext. RISC-V and LoongArch prefer sext for promoting i32 to i64 unless the inputs are known to be zero extended already. This patch uses the existing SExtOrZExtPromotedOperands function that is used by SETCC promotion to intelligently handle this.	2024-07-31 17:18:27 -07:00
WANG Rui	84ad292f34	[LoongArch] Pre-commit tests for merge base offset. NFC	2024-07-30 14:45:25 +08:00
hev	0e6f64cd5e	[LoongArch] Reimplement to prevent Pseudo{CALL, LA}_LARGE instruction reordering (#100099 ) The Pseudo{CALL, LA}_LARGE instruction patterns specified in psABI v2.30 cannot be reordered. This patch sets scheduling boundaries for these instructions to prevent reordering. The Pseudo{CALL, LA*}_LARGE instruction is moved back to Pre-RA expansion, which will help with subsequent address calculation optimizations.	2024-07-30 14:22:53 +08:00
hev	3e2631c9c6	[LoongArch] Optimize codegen for ISD::ROTL (#100344 ) The LoongArch rotr.{w,d} instruction ignores the high bits of the shift operand, allowing it to generate more efficient code using the constant zero register.	2024-07-30 14:22:24 +08:00
hev	e386aacb74	[LoongArch] Fix codegen for ISD::ROTR (#100292 ) This patch fixes the code generation for IR: sext i32 (trunc i64 (rotr i64 %x, i64 %y) to i32) to i64	2024-07-24 12:08:43 +08:00
WANG Rui	9d1d0cc020	[LoongArch][test] Revert "Pre-commit for fix codegen for ISD::ROTR". NFC This reverts commit bc829b501d0ffa93019d29b0294e998d3dbb3d7a.	2024-07-24 11:00:50 +08:00
WANG Rui	bc829b501d	[LoongArch][test] Pre-commit for fix codegen for ISD::ROTR. NFC	2024-07-24 10:46:58 +08:00
Ami-zhang	fcec298087	[LoongArch] Support la664 (#100068 ) A new ProcessorModel called `la664` is defined in LoongArch.td to support `-march/-mtune=la664`.	2024-07-23 15:14:20 +08:00
Zhaoxin Yang	464ea880cf	[LoongArch][CodeGen] Implement 128-bit and 256-bit vector shuffle. (#100054 ) [LoongArch][CodeGen] Implement 128-bit and 256-bit vector shuffle operations. In LoongArch, shuffle operations can be divided into two types: - Single-vector shuffle: Shuffle using only one vector, with the other vector being `undef` or not selected by mask. This can be expanded to instructions such as `vreplvei` and `vshuf4i`. - Two-vector shuffle: Shuffle using two vectors. This can be expanded to instructions like `vilv[l/h]`, `vpack[ev/od]`, `vpick[ev/od]` and the basic `vshuf`. In the future, more optimizations may be added, such as handling 1-bit vectors and processing single element patterns, etc.	2024-07-23 12:06:59 +08:00
WANG Rui	87c35d7827	[LoongArch][test] Add --relocation-model=pic option to psabi-restricted-scheduling. NFC Add --relocation-model=pic option for generating %gd_pc_hi20 and %ld_pc_hi20.	2024-07-23 11:32:01 +08:00
hev	4c73b1a986	[LoongArch] Recommit "Remove spurious mask operations from andn->icmp on 16 and 8 bit values" (#99798 ) recommit of #99272	2024-07-22 15:10:21 +08:00
WANG Rui	aefe411dae	[LoongArch] Add a test for spurious mask removal. NFC Link: https://github.com/llvm/llvm-project/pull/99272#issuecomment-2241348794	2024-07-21 13:42:26 +08:00
hev	1d5d18924d	Revert "[LoongArch] Remove spurious mask operations from andn->icmp on 16 and 8 bit values" (#99792 ) Reverts llvm/llvm-project#99272	2024-07-21 10:17:20 +08:00


382 Commits
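
Note on 4af249fe6e (usub_cond / usub_sat): the commit message describes the two new `atomicrmw` operations only briefly. The following is a minimal Python sketch of the value each operation would store, assuming an unsigned 32-bit memory cell and assuming the semantics are "subtract if the result would not go below zero, otherwise keep the old value (usub_cond) or store zero (usub_sat)". The function names and the `MASK32` constant are hypothetical helpers for illustration, not LLVM APIs, and the sketch does not model atomicity or the value returned by the instruction.

```python
MASK32 = 0xFFFFFFFF  # assumption: model an i32 memory cell as an unsigned value


def usub_cond_new_value(old: int, val: int) -> int:
    """Value stored by a usub_cond-style update (illustrative model only).

    If old >= val (unsigned), store old - val; otherwise leave old unchanged.
    """
    old &= MASK32
    val &= MASK32
    return old - val if old >= val else old


def usub_sat_new_value(old: int, val: int) -> int:
    """Value stored by a usub_sat-style update (illustrative model only).

    If old >= val (unsigned), store old - val; otherwise saturate to zero.
    """
    old &= MASK32
    val &= MASK32
    return old - val if old >= val else 0


if __name__ == "__main__":
    # Difference is non-negative: both operations subtract.
    print(usub_cond_new_value(10, 3), usub_sat_new_value(10, 3))
    # Difference would be negative: usub_cond keeps the minuend, usub_sat gives zero.
    print(usub_cond_new_value(3, 10), usub_sat_new_value(3, 10))
```

The two functions differ only in the fallback branch, which mirrors the "minuend and zero respectively" wording of the commit message.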