llvm-project

Author	SHA1	Message	Date
Nikita Popov	c10921fa1a	[CGP] Also freeze ctlz/cttz operand when despeculating D125887 changed the ctlz/cttz despeculation transform to insert a freeze for the introduced branch on zero. While this does fix the "branch on poison" issue, we may still get in trouble if we pick a different value for the branch and for the ctz argument (i.e. non-zero for the branch, but zero for the ctz). To avoid this, we should use the same frozen value in both positions. This does cause a regression in RISCV codegen by introducing an additional sext. The DAG looks like this: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %3 t4: i64 = AssertSext t2, ValueType:ch:i32 t23: i64 = freeze t4 t9: ch = CopyToReg t0, Register:i64 %0, t23 t16: ch = CopyToReg t0, Register:i64 %4, Constant:i64<32> t18: ch = TokenFactor t9, t16 t25: i64 = sign_extend_inreg t23, ValueType:ch:i32 t24: i64 = setcc t25, Constant:i64<0>, seteq:ch t28: i64 = and t24, Constant:i64<1> t19: ch = brcond t18, t28, BasicBlock:ch<cond.end 0x8311f68> t21: ch = br t19, BasicBlock:ch<cond.false 0x8311e80> I don't see a really obvious way to improve this, as we can't push the freeze past the AssertSext (which may produce poison). Differential Revision: https://reviews.llvm.org/D126638	2022-06-10 09:46:10 +02:00
Yeting Kuo	f68cad9087	[RISCV] Lower VLEFF/VLSEGFF SDNodes to MachineInstrs with VL outputs. The patch is a replacement of D125199. PseudoReadVL with vtype raises the concern of computing the same vtypes of VLEFF/VLSEGFF in two different places, DAGToDAG and InsertVSETVLI. A VLEFF/VLSEGFF MI with a VL output can still provide the vtype of VLEFF/VLSEGFF to the users of its VL. The patch names the new pseudos after the original VLEFF/VLSEGFF names suffixed with "_VL" and expands them in the RISCVInsertVSETVLI pass. This patch also reverts commit 4537aae0d57e17c217c192d8977012ba475b130c, "[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.". Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126794	2022-06-10 13:57:10 +08:00
Craig Topper	8bbcb98848	[RISCV] Teach RISCVMergeBaseOffset about cases where we use SHXADD to add some immediates. For an addition with simm14 and simm15 immediates with 2 or 3 trailing zero bits, we can use a shXadd instruction and an addi to do the addition. This patch teaches RISCVMergeBaseOffset to see through this pattern. I don't think the sh1add case occurs because we use two addis for that, but I implemented it for completeness. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127376	2022-06-09 16:07:35 -07:00
Kito Cheng	cfa463fdc6	[RISCV][NFC] Update testcase for D126861	2022-06-10 00:18:02 +08:00
Kito Cheng	4b11f90903	[RISCV] Fix missing stack pointer recover In order to make sure the stack pointer is correct throughout the EH region, we also need to restore the stack pointer from the frame pointer if we don't reserve stack space within the prologue/epilogue for outgoing variables. Normally, checking whether a variable-sized object is present is enough, but we also don't reserve that space in the prologue/epilogue when there are vector objects on the stack. Example to show what happened: ``` try { sp adjust for outgoing args. // 1. Sp changed. func_call // 2. Exception raised sp restore // Oh, not restored } catch { // 3. And now we are here. } // 4. Prepare to return!, restore return address from stack, but...sp is wrong. // 5. Screw up! ``` Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D126861	2022-06-09 23:38:50 +08:00
Kito Cheng	8b3426569e	[RISCV] Pre-commit testcase for PR55442 The testcase shows that the stack pointer isn't recovered when we get an exception from `_Z3fooiiiiiiiiiiPi`, and we then crash because the return address is restored from the wrong stack pointer. NOTE: Trigger conditions: 1. Frame pointer is required. 2. Stack has outgoing arguments. 3. Vector extension is enabled. Another runnable testcase: $ clang++ -target riscv64-unknown-linux-gnu -march=rv64gcv test.cpp ``` void __attribute__((noinline)) foo(int, int, int, int, int, int, int, int, int, int, int ){ throw int(0); } int main(int argc, char **argv) { int exception_value = 1; try { foo(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0); } catch (int i) { exception_value = i; } return exception_value; } ``` Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D126860	2022-06-09 23:35:38 +08:00
Lian Wang	91e31fd205	[RISCV][VP] Add fp test of widen and split for vp.setcc Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D127079	2022-06-09 08:14:12 +00:00
Lian Wang	362a02dabe	[RISCV][test] Add widen STEP_VECTOR tests. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127371	2022-06-09 07:47:04 +00:00
Craig Topper	4bcfc41846	[SelectionDAG] Teach computeKnownBits that a nsw self multiply produce a positive value. This matches what we do in IR. For the RISC-V test case, this allows us to use -8 for the AND mask instead of materializing a constant in a register. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D127335	2022-06-08 14:55:58 -07:00
Craig Topper	e4ba24c17d	[RISCV] Support (addi (addi globaladdr, C1), C2) in RISCVMergeBaseOffset. Adds with immediates in the range [-4096, -2049] or [2048, 4094] get converted to two ADDIs. Teach RISCVMergeBaseOffset to recognize this pattern as well. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D126843	2022-06-08 08:20:37 -07:00
Craig Topper	33f4da2455	[RISCV] Support LUI+ADDIW in RISCVMergeBaseOffsetOpt::matchLargeOffset. LUI+ADDIW always produces a simm32. This allows us to always fold it into a global offset. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D126729	2022-06-08 08:19:21 -07:00
Shao-Ce SUN	862f30a428	[RISCV] Add ISD::EH_DWARF_CFA Based on D24038. LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin. Reviewed By: StephenFan Differential Revision: https://reviews.llvm.org/D126181	2022-06-08 22:03:30 +08:00
Kito Cheng	6a6f632b93	Revert "[RISCV] Testcase to show wrong register allocation result of subreg liveness" Reverted due to failures with LLVM_ENABLE_EXPENSIVE_CHECKS. This reverts commit cbe22c794348a1962af8a5d21fbedbb65974d94c.	2022-06-08 21:19:27 +08:00
Kito Cheng	7207373e1e	Revert "[SplitKit] Handle early clobber + tied to def correctly" Reverted due to failures with LLVM_ENABLE_EXPENSIVE_CHECKS. This reverts commit e14d04909df4e52e531f6c2e045c3cf9638dd817.	2022-06-08 13:05:35 +08:00
Kito Cheng	e14d04909d	[SplitKit] Handle early clobber + tied to def correctly The splitter tries to extend a live range into the `r` slot for a use operand. That works in most situations, but it does not work correctly when the operand is tied to a def and the def operand is early clobber. An example to demonstrate the problem: 0 %0 = ... 16 early-clobber %0 = Op %0 (tied-def 0), ... 32 ... = Op %0 Before extend: %0 = [0r, 0d) [16e, 32d) The point we want to extend to is 16e, not 16r, in this case (from 0d); if we use 16r here we extend nothing because it is already contained in [16e, 32d). This patch adds a check to detect such cases and adjusts the extend point. Detailed explanation for testcase: https://reviews.llvm.org/D126047 Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D126048	2022-06-08 11:33:05 +08:00
Kito Cheng	cbe22c7943	[RISCV] Testcase to show wrong register allocation result of subreg liveness This testcase shows that the live range isn't constructed correctly when subreg liveness is enabled. In the testcase `early-clobber-tied-def-subreg-liveness.ll`, the first operand of `vsext.vf2 v8, v16, v0.t` is both a def and a use, and the use comes from the memory location of `.L__const._Z3foov.var_49`. It is loaded and spilled to the stack, and then...v8 is overwritten by another instruction. ``` lui a0, %hi(.L__const._Z3foov.var_49) addi a0, a0, %lo(.L__const._Z3foov.var_49) ... vle16.v v8, (a0) # Load value from var_49 ... addi a0, sp, 16 ... vs2r.v v8, (a0) # Spill ... vl2r.v v8, (a1) # Reload ... lui a0, %hi(.L__const._Z3foov.var_40) addi a0, a0, %lo(.L__const._Z3foov.var_40) vle16.v v8, (a0) # Load value...into v8??? vmsbc.vx v0, v8, a0 # And use that. ... vsext.vf2 v8, v16, v0.t # But v8 is here...which is expected to hold the value from the reload ``` The `early-clobber-tied-def-subreg-liveness.mir` has more detailed information: `%25.sub_vrm2_0` is defined at 64, used at 464, and defined again at 464, and we use an inline asm to clobber all vector registers to trigger the splitter. 
``` 0B bb.0.entry: 16B %0:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_49 32B %1:gpr = ADDI %0:gpr, target-flags(riscv-lo) @__const._Z3foov.var_49 48B dead $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype 64B undef %25.sub_vrm2_0:vrn4m2nov0 = PseudoVLE16_V_M2 %1:gpr, 2, 4, implicit $vl, implicit $vtype 80B %3:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_48 96B %4:gpr = ADDI %3:gpr, target-flags(riscv-lo) @__const._Z3foov.var_48 112B %5:vr = PseudoVLE8_V_M1 %4:gpr, 2, 3, implicit $vl, implicit $vtype 128B %6:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_46 144B %7:gpr = ADDI %6:gpr, target-flags(riscv-lo) @__const._Z3foov.var_46 160B %25.sub_vrm2_1:vrn4m2nov0 = PseudoVLE16_V_M2 %7:gpr, 2, 4, implicit $vl, implicit $vtype 176B %9:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_45 192B %10:gpr = ADDI %9:gpr, target-flags(riscv-lo) @__const._Z3foov.var_45 208B %25.sub_vrm2_2:vrn4m2nov0 = PseudoVLE16_V_M2 %10:gpr, 2, 4, implicit $vl, implicit $vtype 224B INLINEASM &"" [sideeffect] [attdialect], $0:[clobber], ... 
240B %12:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_44 256B %13:gpr = ADDI %12:gpr, target-flags(riscv-lo) @__const._Z3foov.var_44 272B dead $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype 288B %25.sub_vrm2_3:vrn4m2nov0 = PseudoVLE16_V_M2 %13:gpr, 2, 4, implicit $vl, implicit $vtype 304B $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype 320B %16:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_40 336B %17:gpr = ADDI %16:gpr, target-flags(riscv-lo) @__const._Z3foov.var_40 352B %18:vrm2 = PseudoVLE16_V_M2 %17:gpr, 2, 4, implicit $vl, implicit $vtype 368B $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype 384B %20:gpr = LUI 1048572 400B %21:gpr = ADDIW %20:gpr, 928 416B early-clobber %22:vr = PseudoVMSBC_VX_M2 %18:vrm2, %21:gpr, 2, 4, implicit $vl, implicit $vtype 432B $x0 = PseudoVSETIVLI 2, 9, implicit-def $vl, implicit-def $vtype 448B $v0 = COPY %22:vr 464B early-clobber %25.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %25.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, killed $v0, 2, 4, 0, implicit $vl, implicit $vtype 480B %26:gpr = LUI target-flags(riscv-hi) @var_47 496B %27:gpr = ADDI %26:gpr, target-flags(riscv-lo) @var_47 512B PseudoVSSEG4E16_V_M2 %25:vrn4m2nov0, %27:gpr, 2, 4, implicit $vl, implicit $vtype 528B PseudoRET ``` When the splitter tries to split %25: ``` selectOrSplit VRN4M2NoV0:%25 [64r,160r:4)[160r,208r:0)[208r,288r:1)[288r,464e:2)[464e,512r:3) 0@160r 1@208r 2@288r 3@464e 4@64r L0000000000000030 [160r,512r:0) 0@160r L00000000000000C0 [208r,512r:0) 0@208r L0000000000000300 [288r,512r:0) 0@288r L000000000000000C [64r,464e:1)[464e,512r:0) 0@464e 1@64r weight:1.179245e-02 w=1.179245e-02 ``` ``` Best local split range: 64r-208r, 6.999861e-03, 3 instrs enterIntvBefore 64r: not live leaveIntvAfter 208r: valno 1 useIntv [64B;216r): [64B;216r):1 blit [64r,160r:4): [64r;160r)=1(%29)(recalc) blit [160r,208r:0): [160r;208r)=1(%29)(recalc) blit [208r,288r:1): [208r;216r)=1(%29)(recalc) 
[216r;288r)=0(%28)(recalc) blit [288r,464e:2): [288r;464e)=0(%28)(recalc) blit [464e,512r:3): [464e;512r)=0(%28)(recalc) rewr %bb.0 464e:0 early-clobber %28.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %25.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, $v0, 2, 4, 0, implicit $vl, implicit $vtype rewr %bb.0 288r:0 %28.sub_vrm2_3:vrn4m2nov0 = PseudoVLE16_V_M2 %13:gpr, 2, 4, implicit $vl, implicit $vtype rewr %bb.0 208r:1 %29.sub_vrm2_2:vrn4m2nov0 = PseudoVLE16_V_M2 %10:gpr, 2, 4, implicit $vl, implicit $vtype rewr %bb.0 160r:1 %29.sub_vrm2_1:vrn4m2nov0 = PseudoVLE16_V_M2 %7:gpr, 2, 4, implicit $vl, implicit $vtype rewr %bb.0 64r:1 undef %29.sub_vrm2_0:vrn4m2nov0 = PseudoVLE16_V_M2 %1:gpr, 2, 4, implicit $vl, implicit $vtype rewr %bb.0 464B:0 early-clobber %28.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %28.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, $v0, 2, 4, 0, implicit $vl, implicit $vtype rewr %bb.0 512B:0 PseudoVSSEG4E16_V_M2 %28:vrn4m2nov0, %27:gpr, 2, 4, implicit $vl, implicit $vtype rewr %bb.0 216B:1 undef %28.sub_vrm1_0_sub_vrm1_1_sub_vrm1_2_sub_vrm1_3_sub_vrm1_4_sub_vrm1_5:vrn4m2nov0 = COPY %29.sub_vrm1_0_sub_vrm1_1_sub_vrm1_2_sub_vrm1_3_sub_vrm1_4_sub_vrm1_5:vrn4m2nov0 queuing new interval: %28 [216r,288r:0)[288r,464e:1)[464e,512r:2) 0@216r 1@288r 2@464e L000000000000000C [216r,216d:0)[464e,512r:1) 0@216r 1@464e L0000000000000300 [288r,512r:0) 0@288r L00000000000000C0 [216r,512r:0) 0@216r L0000000000000030 [216r,512r:0) 0@216r weight:8.706897e-03 Enqueuing %28 queuing new interval: %29 [64r,160r:0)[160r,208r:1)[208r,216r:2) 0@64r 1@160r 2@208r L000000000000000C [64r,216r:0) 0@64r L00000000000000C0 [208r,216r:0) 0@208r L0000000000000030 [160r,216r:0) 0@160r weight:1.097826e-02 Enqueuing %29 ``` The live range of the first subreg of %25 becomes [216r,216d:0)[464e,512r:1); however, the first live range should extend until 464e rather than ending at [216r,216d:0). The register allocator then produces a wrong allocation according to this live range info. 
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D126047	2022-06-08 11:27:24 +08:00
Craig Topper	0c66deb498	[RISCV] Scalarize gather/scatter on RV64 with Zve32* extension. i64 indices aren't supported on Zve32*. Scalarize gathers to prevent generating illegal instructions. Since InstCombine will aggressively canonicalize GEP indices to pointer size, we're pretty much always going to have an i64 index. Trying to predict when SelectionDAG will find a smaller index from the TTI hook used by the ScalarizeMaskedMemIntrinPass seems fragile. To optimize this we probably need an IR pass to rewrite it earlier. Test RUN lines have also been added to make sure the strided load/store optimization still works. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D127179	2022-06-07 08:07:50 -07:00
Matt Arsenault	56303223ac	llvm-reduce: Don't assert on functions which don't track liveness Use the query that doesn't assert if TracksLiveness isn't set, which needs to always be available. We also need to start printing liveins regardless of TracksLiveness.	2022-06-07 10:00:25 -04:00
Craig Topper	be398100ea	[SelectionDAG] Further improve computeKnownBits for (smax X, C) where C is non-negative. Move the code that was added for D126896 after the normal recursive calls to computeKnownBits. This allows us to calculate trailing zeros. Previously we would break out of the switch before the recursive calls.	2022-06-06 09:59:23 -07:00
Shao-Ce SUN	84bacb18c6	[RISCV] Use check-prefixes to reduce check lines Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D125083	2022-06-06 15:59:15 +08:00
Lian Wang	20cf77f776	[LegalizeTypes][VP] Add widen and split support for vp.fptrunc and vp.fpext Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D126439	2022-06-06 02:28:01 +00:00
LiaoChunyu	f14d18c7a9	[RISCV] Add more patterns for FNMADD D54205 handles fnmadd: -rs1 * rs2 - rs3 This patch adds fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D126852	2022-06-04 12:31:45 +08:00
Craig Topper	cc3bd43533	[RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI. This fixes an inconsistency between RV32 and RV64. Still considering trying to do this peephole during isel, but wanted to fix the inconsistency first. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126986	2022-06-03 18:06:56 -07:00
Craig Topper	8da5d5dbdc	[RISCV] Pre-commit test cases for D126986. NFC	2022-06-03 13:31:45 -07:00
Craig Topper	170c550ca8	[RISCV] Use SelectionDAG::isBaseWithConstantOffset in scalar load/store address matching. Test changes are because isBaseWithConstantOffset uses computeKnownBits and that is able to see that an earlier AND instruction guaranteed alignment so that we can treat an OR as an ADD. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126970	2022-06-03 10:55:28 -07:00
Craig Topper	dbead2388b	[RISCV] Add custom isel for (add X, imm) used by load/stores. If the imm is out of range for an ADDI, we will materialize it in a register using multiple instructions. If the ADD is used by a load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from the constant materialization into the load/store offset. This only works if the ADD has a single use, otherwise the peephole would have to rebuild multiple nodes. This patch instead tries to solve the problem when the add is selected. We check that the add is only used by loads/stores and if it is we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI used as a pointer. As a result we can remove the more complicated peephole from doPeepholeLoadStoreADDI. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126576	2022-06-02 13:45:32 -07:00
Craig Topper	fa20bf1636	[DAGCombiner][RISCV] Improve computeKnownBits for (smax X, C) where C is non-negative. If C is non-negative, the result of the smax must also be non-negative, so all sign bits of the result are 0. This allows DAGCombiner to remove a zext_inreg in the modified test. This zext_inreg started as a sext that became zext before type legalization then was promoted to a zext_inreg. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126896	2022-06-02 12:34:24 -07:00
Craig Topper	01ba470826	[RISCV] Add test case showing unnecessary extend after i32 smax on rv64. NFC One of the operands of the smax is a positive value so computeKnownBits determines the result of the smax must always be positive. This allows DAG combiner to convert the sign extend to zero extend before type legalization. After type legalization the smax is promoted to i64 by sign extending its inputs and the zero extend becomes an AND instruction. We are unable to remove the AND at this point and it becomes a pair of shifts or a zext.w. The result of smax has as many sign bits as the minimum of its inputs. Had we kept the sign extend instead of turning it into a zero extend it would be removed by DAG combiner after type legalization.	2022-06-02 09:58:11 -07:00
Philip Reames	dcdb0bf25b	[RISCV] Fix an inconsistency with compatible load/store handling Once we've computed the incoming predecessor state, we should use the same compatibility check with knowledge of MI as we did in phase 2 in order to be consistent across all phases. Differential Revision: https://reviews.llvm.org/D126574	2022-06-02 08:03:51 -07:00
jacquesguan	5482ae6328	[LegalizeTypes][VP] Add widen and split support for VP FP integer casting op. This patch adds widen and split support for VP_FPTOSI, VP_FPTOUI, VP_SITOFP and VP_UITOFP. Differential Revision: https://reviews.llvm.org/D126847	2022-06-02 09:05:27 +00:00
jacquesguan	058791d8f2	[LegalizeTypes][VP] Add widen and split support for VP_SIGN_EXTEND and VP_ZERO_EXTEND. Differential Revision: https://reviews.llvm.org/D126442	2022-06-02 02:21:22 +00:00
Hendrik Greving	a92ed167f2	[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4. Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4. Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be removed once targets set this explicitly. Adjusts 11 lit tests to reflect slightly different behavior during DAG combine. Differential Revision: https://reviews.llvm.org/D125247	2022-06-02 00:49:11 +00:00
Philip Reames	f15add7d93	[RISCV] Split fixed-vector-strided-load-store.ll so it can be autogened I've gotten tired of updating register allocation changes by hand, let's just autogen this even if we have to duplicate it.	2022-06-01 16:12:35 -07:00
Hendrik Greving	e9d05cc7d8	Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4." This reverts commit 430ac5c3029c52e391e584c6d4447e6e361fae99. Due to failures in Clang tests. Differential Revision: https://reviews.llvm.org/D125247	2022-06-01 13:27:49 -07:00
Hendrik Greving	430ac5c302	[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4. Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4. Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be removed once targets set this explicitly. Adjusts 11 lit tests to reflect slightly different behavior during DAG combine. Differential Revision: https://reviews.llvm.org/D125247	2022-06-01 12:48:01 -07:00
Craig Topper	aeb27f133a	[RISCV] Fix i64<->f64 and i32<->f32 bitcasts with VLS vectors enabled. We enable a custom handler to optimize conversions between scalars and fixed vectors. Unfortunately, the custom handler picks up scalar to scalar conversions as well. If the scalar types are both legal, we wouldn't match any of the fixed vector cases and would return SDValue() causing the LegalizeDAG to expand the bitcast through memory. This patch fixes this by checking if it's a scalar to scalar conversion and returns `Op` if both types are legal. Differential Revision: https://reviews.llvm.org/D126739	2022-06-01 08:13:49 -07:00
wangpc	57203af167	[RISCV] Set target-abi explicitly to reduce codegen results As mentioned in D125947, we can reduce codegen results by adding an explicit hard single-float ABI. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D126640	2022-06-01 13:49:23 +08:00
Philip Reames	33b1be5916	[riscv] add test coverage for fractional lmul w/fixed length vectorization	2022-05-31 10:25:37 -07:00
Craig Topper	1b2de79ff4	[RISCV] Use two ADDIs to do some stack pointer adjustments. If the adjustment doesn't fit in 12 bits, try to break it into two 12 bit values before falling back to movImm+add/sub. This is based on a similar idea from isel. Reviewed By: luismarques, reames Differential Revision: https://reviews.llvm.org/D126392	2022-05-31 10:25:28 -07:00
Craig Topper	80c4cf6369	[RISCV] Fix a few corner case bugs in RISCVMergeBaseOffsetOpt::matchLargeOffset The immediate for LUI is stored as a 20-bit unsigned value. We need to sign extend it after shifting by 12 to match the instruction behavior. If we find an LUI+ADDI on RV64, it means the constant isn't a simm32. If it were, we would have emitted LUI+ADDIW from constant materialization. Make sure the constant is a simm32 before folding. This appears to match gcc. A future patch will add support for LUI+ADDIW on RV64.	2022-05-31 09:50:54 -07:00
Craig Topper	3b5456d5f0	[RISCV] Pre-commit tests for D126635. NFC	2022-05-31 09:49:46 -07:00
eopXD	2cadf84fc8	[RISCV] Pass OptLevel to `RISCVDAGToDAGISel` correctly Originally, `OptLevel` wasn't passed into the `MachineFunctionPass`. This lets the default parameter of `SelectionDAGISel`, which is `CodeGenOpt::Default`, be passed in. OptLevelChanger captures the optimization level from that parameter rather than from the value within `TargetMachine`. This lets the optimization level be unintentionally overwritten if a value other than `CodeGenOpt::Default` is passed. This patch fixes this by passing the optimization level rather than using the default value. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D126641	2022-05-30 17:22:50 -07:00
eopXD	51002bdb5e	[RISCV] Precommit test case to show bug in RISCVISelDagToDag The optimization level should not be restored to O2. This is a pre-commit test case to show the fix in D126641. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D126677	2022-05-30 15:59:20 -07:00
Ping Deng	88af539c0e	[RISCV] Support VP_REDUCE_MUL mask operation Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126520	2022-05-30 03:05:39 +00:00
Ping Deng	083798e270	[LegalizeTypes][VP] Add integer promotion support for vp.fptosi/vp.fptoui Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125760	2022-05-30 03:05:39 +00:00
Craig Topper	6a6cf2e28d	[RISCV] isel (add (and X, 0x1FFFFFFFE), Y) as (SH1ADD_UW (SRLI X, 1), Y) This pattern is what we get after DAG combine for C code like this. short *ptr1, *ptr2, *ptr3; unsigned diff = ptr1 - ptr2; return ptr3[diff]; Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126588	2022-05-29 18:24:07 -07:00
Craig Topper	e642d0ea21	[RISCV] Add test cases showing missed opportunity to use shXadd.uw. NFC The tests here show the codegen for something like this C code. unsigned diff = ptr1 - ptr2; return ptr3[diff]; The pointer difference is truncated to 32-bits before being used again as an index. In SelectionDAG this appears as an AND between a SRL and a SHL. DAGCombiner will remove the shifts leaving only an AND. The Mask now has 1,2, or 3 trailing zeros and 31, 30, or 29 leading zeros. We end up falling back to constant materialization to create this mask. We could instead use srli followed by slli.uw. Or since we have an add, we can use srli followed by shXadd.uw. Differential Revision: https://reviews.llvm.org/D126589	2022-05-29 18:22:55 -07:00
Philip Reames	85b4470035	[RISCV] Allow PRE of vsetvli involving non-1 LMUL This is a follow up to address a review comment from D124869. When deciding whether to PRE a vsetvli, we can allow non-LMUL1 vsetvlis. Differential Revision: https://reviews.llvm.org/D126563	2022-05-27 15:49:41 -07:00
Craig Topper	542a83c362	[RISCV] Correct load/store alignments in sink-splat-operands.ll. NFC These should be aligned to the natural alignment of the element. Probably copy/paste mistake from the i32 tests. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D126567	2022-05-27 14:39:31 -07:00
Philip Reames	d4905a7b20	[RISCV] Add a vsetvli PRE test involving non-1 LMUL	2022-05-27 13:16:05 -07:00
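
The two-ADDI split described in e4ba24c17d (and reused for stack pointer adjustments in 1b2de79ff4) can be sketched in plain C++. This is only an illustration of the arithmetic, not the code from either patch: `isSImm12`, `needsAddiPair`, and `splitAddiPair` are hypothetical names. Since the largest signed 12-bit immediate is 2047, two ADDIs can add at most 4094, and symmetrically at least -4096.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// True if v fits in the signed 12-bit immediate field of ADDI.
constexpr bool isSImm12(int64_t v) { return v >= -2048 && v <= 2047; }

// True if imm is out of range for one ADDI but reachable with two.
constexpr bool needsAddiPair(int64_t imm) {
  return (imm >= -4096 && imm <= -2049) || (imm >= 2048 && imm <= 4094);
}

// Hypothetical helper: split imm into two simm12 parts that sum to imm.
// The first part saturates at the simm12 limit; the second is the remainder.
inline std::pair<int64_t, int64_t> splitAddiPair(int64_t imm) {
  assert(needsAddiPair(imm) && "immediate must need exactly two ADDIs");
  const int64_t first = imm < 0 ? -2048 : 2047;
  const int64_t second = imm - first;
  assert(isSImm12(second) && "remainder must fit in one ADDI");
  return {first, second};
}
```

For example, `splitAddiPair(2048)` yields the pair (2047, 1), which corresponds to the `(addi (addi globaladdr, C1), C2)` shape the pass has to recognize.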

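
The LUI corner case from 80c4cf6369 comes down to sign extension: the 20-bit immediate is stored unsigned, but on RV64 the value LUI produces is sign-extended from bit 31. A minimal sketch of that arithmetic follows, with hypothetical helper names (`luiValue`, `isSImm32`, `luiAddiOffset`) that are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>

// Value an RV64 LUI produces for a 20-bit immediate stored as unsigned:
// shift into bits 31:12, then sign-extend from bit 31.
constexpr int64_t luiValue(uint32_t imm20) {
  const int64_t shifted = static_cast<int64_t>(imm20 & 0xFFFFF) << 12;
  return (shifted & 0x80000000LL) ? shifted - (int64_t{1} << 32) : shifted;
}

// True if v is representable as a signed 32-bit value.
constexpr bool isSImm32(int64_t v) { return v >= INT32_MIN && v <= INT32_MAX; }

// Offset formed by LUI imm20 followed by ADDI lo12 (a full 64-bit add).
inline int64_t luiAddiOffset(uint32_t imm20, int64_t lo12) {
  assert(lo12 >= -2048 && lo12 <= 2047 && "ADDI immediate must be simm12");
  return luiValue(imm20) + lo12;
}
```

With the `LUI 1048572` immediate that appears in the MIR dump above, `luiValue(1048572)` is -16384 rather than 4294950912. An LUI+ADDI pair such as `luiAddiOffset(0x80000, -1)` gives -2147483649, which is outside simm32, so a check like `isSImm32` is needed before folding.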

1709 Commits
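
The selection pattern discussed in 6a6cf2e28d and e642d0ea21 can be checked with ordinary 64-bit arithmetic. The sketch below assumes the Zba semantics `sh1add rd, rs1, rs2 = rs2 + (rs1 << 1)` and `sh1add.uw rd, rs1, rs2 = rs2 + (zext32(rs1) << 1)`; the function names are illustrative only. It shows that the masked add matches the `.uw` form after a right shift by one, while the plain `sh1add` would keep bits above bit 32.

```cpp
#include <cassert>
#include <cstdint>

// The DAG pattern after combine: (add (and X, 0x1FFFFFFFE), Y).
constexpr uint64_t maskedAdd(uint64_t x, uint64_t y) {
  return (x & 0x1FFFFFFFEULL) + y;
}

// sh1add: rs2 + (rs1 << 1), with no zero extension of rs1.
constexpr uint64_t sh1add(uint64_t rs1, uint64_t rs2) {
  return rs2 + (rs1 << 1);
}

// sh1add.uw: rs2 + (zero-extended low 32 bits of rs1 << 1).
constexpr uint64_t sh1addUw(uint64_t rs1, uint64_t rs2) {
  return rs2 + ((rs1 & 0xFFFFFFFFULL) << 1);
}

// Selected form: srli X, 1 followed by sh1add.uw.
constexpr uint64_t selectedForm(uint64_t x, uint64_t y) {
  return sh1addUw(x >> 1, y);
}

// Sanity check for one pair of inputs.
inline void checkEquivalent(uint64_t x, uint64_t y) {
  assert(maskedAdd(x, y) == selectedForm(x, y));
}
```

Calling `checkEquivalent` with an all-ones `x` exercises the case where the upper 31 bits must be discarded, which is exactly where `sh1add(x >> 1, y)` would differ.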