1709 Commits

Author SHA1 Message Date
Nikita Popov
c10921fa1a [CGP] Also freeze ctlz/cttz operand when despeculating
D125887 changed the ctlz/cttz despeculation transform to insert
a freeze for the introduced branch on zero. While this does fix
the "branch on poison" issue, we may still get in trouble if we
pick a different value for the branch and for the ctz argument
(i.e. non-zero for the branch, but zero for the ctz). To avoid
this, we should use the same frozen value in both positions.

This does cause a regression in RISCV codegen by introducing an
additional sext. The DAG looks like this:

    t0: ch = EntryToken
        t2: i64,ch = CopyFromReg t0, Register:i64 %3
      t4: i64 = AssertSext t2, ValueType:ch:i32
    t23: i64 = freeze t4
          t9: ch = CopyToReg t0, Register:i64 %0, t23
          t16: ch = CopyToReg t0, Register:i64 %4, Constant:i64<32>
        t18: ch = TokenFactor t9, t16
            t25: i64 = sign_extend_inreg t23, ValueType:ch:i32
          t24: i64 = setcc t25, Constant:i64<0>, seteq:ch
        t28: i64 = and t24, Constant:i64<1>
      t19: ch = brcond t18, t28, BasicBlock:ch<cond.end 0x8311f68>
    t21: ch = br t19, BasicBlock:ch<cond.false 0x8311e80>

I don't see a really obvious way to improve this, as we can't push
the freeze past the AssertSext (which may produce poison).

Differential Revision: https://reviews.llvm.org/D126638
2022-06-10 09:46:10 +02:00
Yeting Kuo
f68cad9087 [RISCV] Lower VLEFF/VLSEGFF SDNodes to MachineInstrs with VL outputs.
The patch is a replacement of D125199. PseudoReadVL with vtype has worry for
computing same vtypes of VLEFF/VLSEGFF in two different places, DAGToDAG and
InsertVSETVLI. VLEFF/VLSEGFF MI with VL output still could provide the vtype of
VLEFF/VLSEGFF to the users of its VL.

The patch names the new pseudo as original VLEFF/VLSEGFF name suffixed "_VL" and
expand them in RISCVInsertVSETVLI pass.

This patch also reverts commit 4537aae0d57e17c217c192d8977012ba475b130c,
"[RISCV] Make PseudoReadVL have the vtypes of the corresponding VLEFF/VLSEGFF.".

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126794
2022-06-10 13:57:10 +08:00
Craig Topper
8bbcb98848 [RISCV] Teach RISCVMergeBaseOffset about cases where we use SHXADD to add some immediates.
For an addition with simm14 and simm15 immediates with 2 or 3 trailing bits,
we can use a shXadd instruction and an addi to do the addition.

This patch teaches RISCVMergeBaseOffset to see through this pattern.
I don't think the sh1add case occurs because we use two addis for that,
but I implemented it for completeness.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127376
2022-06-09 16:07:35 -07:00
Kito Cheng
cfa463fdc6 [RISCV][NFC] Update testcase for D126861 2022-06-10 00:18:02 +08:00
Kito Cheng
4b11f90903 [RISCV] Fix missing stack pointer recover
In order to make sure the stack point is right through the EH region,
we also need to restore stack pointer from the frame pointer if we
don't preserve stack space within prologue/epilogue for outgoing variables,
normally it's just checking the variable sized object is present or not
is enough, but we also don't preserve that at prologue/epilogue when
have vector objects in stack.

Example to show what happened:
```
try {
  sp adjust for outgoing args. // 1. Sp changed.
  func_call  // 2. Exception raised
  sp restore // Oh, not restored
} catch {
  // 3. And now we are here.
}

// 4. Prepare to return!, restore return address from stack, but...sp is wrong.
// 5. Screw up!
```

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D126861
2022-06-09 23:38:50 +08:00
Kito Cheng
8b3426569e [RISCV] Pre-commit testcase for PR55442
The testcase show the stack pointer isn't recovered when we got
exception from `_Z3fooiiiiiiiiiiPi`, and then we screw up due to
restore return address from wrong stack pointer.

NOTE:
Trigger conditions:
1. Frame pointer is required.
2. Stack has out-going argument
3. Vector extension is enabled.

Another run-able testcase:

$ clang++ -target riscv64-unknown-linux-gnu -march=rv64gcv test.cpp
```
void __attribute__((noinline)) foo(int, int, int, int, int, int, int, int, int, int, int *){
 throw int(0);
}

int main(int argc, char **argv) {
  int exception_value = 1;
  try {
      foo(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0);
  } catch (int i) {
    exception_value = i;
  }
  return exception_value;
}
```

Reviewed By: rogfer01

Differential Revision: https://reviews.llvm.org/D126860
2022-06-09 23:35:38 +08:00
Lian Wang
91e31fd205 [RISCV][VP] Add fp test of widen and split for vp.setcc
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D127079
2022-06-09 08:14:12 +00:00
Lian Wang
362a02dabe [RISCV][test] Add widen STEP_VECTOR tests.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127371
2022-06-09 07:47:04 +00:00
Craig Topper
4bcfc41846 [SelectionDAG] Teach computeKnownBits that a nsw self multiply produce a positive value.
This matches what we do in IR. For the RISC-V test case, this allows
us to use -8 for the AND mask instead of materializing a constant in a register.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D127335
2022-06-08 14:55:58 -07:00
Craig Topper
e4ba24c17d [RISCV] Support (addi (addi globaladdr, C1), C2) in RISCVMergeBaseOffset.
Add with immediates in the range [-4096, -2049] or [2048, 4095] get
convert to two ADDIs. Teach RISCVMergeBaseOffset to recognize this
pattern as well.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D126843
2022-06-08 08:20:37 -07:00
Craig Topper
33f4da2455 [RISCV] Support LUI+ADDIW in RISCVMergeBaseOffsetOpt::matchLargeOffset.
LUI+ADDIW always produces a simm32. This allows us to always
fold it into a global offset.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D126729
2022-06-08 08:19:21 -07:00
Shao-Ce SUN
862f30a428 [RISCV] Add ISD::EH_DWARF_CFA
Based on D24038.
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible __builtin_dwarf_cfa() builtin.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D126181
2022-06-08 22:03:30 +08:00
Kito Cheng
6a6f632b93 Revert "[RISCV] Testcase to show wrong register allocation result of subreg liveness"
Revert due to failed on LLVM_ENABLE_EXPENSIVE_CHECKS.

This reverts commit cbe22c794348a1962af8a5d21fbedbb65974d94c.
2022-06-08 21:19:27 +08:00
Kito Cheng
7207373e1e Revert "[SplitKit] Handle early clobber + tied to def correctly"
Revert due to failed on LLVM_ENABLE_EXPENSIVE_CHECKS.

This reverts commit e14d04909df4e52e531f6c2e045c3cf9638dd817.
2022-06-08 13:05:35 +08:00
Kito Cheng
e14d04909d [SplitKit] Handle early clobber + tied to def correctly
Spliter will try to extend a live range into `r` slot for a use operand,
that's works on most situaion, however that not work correctly when the operand
has tied to def, and the def operand is early clobber.

Give an example to demo what's wrong:
  0  %0 = ...
 16  early-clobber %0 = Op %0 (tied-def 0), ...
 32  ... = Op %0

Before extend:
 %0 = [0r, 0d) [16e, 32d)

The point we want to extend is 0d to 16e not 16r in this case, but if
we use 16r here we will extend nothing because that already contained
in [16e, 32d).

This patch add check for detect such case and adjust the extend point.

Detailed explanation for testcase: https://reviews.llvm.org/D126047

Reviewed By: MatzeB

Differential Revision: https://reviews.llvm.org/D126048
2022-06-08 11:33:05 +08:00
Kito Cheng
cbe22c7943 [RISCV] Testcase to show wrong register allocation result of subreg liveness
This testcase show the live range isn't construct correctly when subreg
liveness is enabled.

In the testcase `early-clobber-tied-def-subreg-liveness.ll`, first operand of
`vsext.vf2 v8, v16, v0.t` is both def and use, and the use is come from
the memory location of `.L__const._Z3foov.var_49`, it's load and spilled
into stack, and then...v8 is overwrite by another instructions.

```
lui a0, %hi(.L__const._Z3foov.var_49)
addi a0, a0, %lo(.L__const._Z3foov.var_49)
...
vle16.v v8, (a0) # Load value from var_49
...
addi a0, sp, 16
...
vs2r.v v8, (a0) # Spill
...
vl2r.v v8, (a1) # Reload
...
lui a0, %hi(.L__const._Z3foov.var_40)
addi a0, a0, %lo(.L__const._Z3foov.var_40)
vle16.v v8, (a0)     # Load value...into v8???
vmsbc.vx v0, v8, a0  # And use that.
...
vsext.vf2 v8, v16, v0.t # But v8 is here...which is expect value from the reload
```

The `early-clobber-tied-def-subreg-liveness.mir` has more detailed
infomation for that, `%25.sub_vrm2_0` is defined in 64, and used in 464,
and defined again in 464, and we has used an inline asm to clobber all
vector register for trigger spliter.

```
0B      bb.0.entry:
16B       %0:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_49
32B       %1:gpr = ADDI %0:gpr, target-flags(riscv-lo) @__const._Z3foov.var_49
48B       dead $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
64B       undef %25.sub_vrm2_0:vrn4m2nov0 = PseudoVLE16_V_M2 %1:gpr, 2, 4, implicit $vl, implicit $vtype
80B       %3:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_48
96B       %4:gpr = ADDI %3:gpr, target-flags(riscv-lo) @__const._Z3foov.var_48
112B      %5:vr = PseudoVLE8_V_M1 %4:gpr, 2, 3, implicit $vl, implicit $vtype
128B      %6:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_46
144B      %7:gpr = ADDI %6:gpr, target-flags(riscv-lo) @__const._Z3foov.var_46
160B      %25.sub_vrm2_1:vrn4m2nov0 = PseudoVLE16_V_M2 %7:gpr, 2, 4, implicit $vl, implicit $vtype
176B      %9:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_45
192B      %10:gpr = ADDI %9:gpr, target-flags(riscv-lo) @__const._Z3foov.var_45
208B      %25.sub_vrm2_2:vrn4m2nov0 = PseudoVLE16_V_M2 %10:gpr, 2, 4, implicit $vl, implicit $vtype
224B      INLINEASM &"" [sideeffect] [attdialect], $0:[clobber], ...
240B      %12:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_44
256B      %13:gpr = ADDI %12:gpr, target-flags(riscv-lo) @__const._Z3foov.var_44
272B      dead $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
288B      %25.sub_vrm2_3:vrn4m2nov0 = PseudoVLE16_V_M2 %13:gpr, 2, 4, implicit $vl, implicit $vtype
304B      $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
320B      %16:gpr = LUI target-flags(riscv-hi) @__const._Z3foov.var_40
336B      %17:gpr = ADDI %16:gpr, target-flags(riscv-lo) @__const._Z3foov.var_40
352B      %18:vrm2 = PseudoVLE16_V_M2 %17:gpr, 2, 4, implicit $vl, implicit $vtype
368B      $x0 = PseudoVSETIVLI 2, 73, implicit-def $vl, implicit-def $vtype
384B      %20:gpr = LUI 1048572
400B      %21:gpr = ADDIW %20:gpr, 928
416B      early-clobber %22:vr = PseudoVMSBC_VX_M2 %18:vrm2, %21:gpr, 2, 4, implicit $vl, implicit $vtype
432B      $x0 = PseudoVSETIVLI 2, 9, implicit-def $vl, implicit-def $vtype
448B      $v0 = COPY %22:vr
464B      early-clobber %25.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %25.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, killed $v0, 2, 4, 0, implicit $vl, implicit $vtype
480B      %26:gpr = LUI target-flags(riscv-hi) @var_47
496B      %27:gpr = ADDI %26:gpr, target-flags(riscv-lo) @var_47
512B      PseudoVSSEG4E16_V_M2 %25:vrn4m2nov0, %27:gpr, 2, 4, implicit $vl, implicit $vtype
528B      PseudoRET
```

When spliter will try to split %25:

```
selectOrSplit VRN4M2NoV0:%25 [64r,160r:4)[160r,208r:0)[208r,288r:1)[288r,464e:2)[464e,512r:3) 0@160r 1@208r 2@288r 3@464e 4@64r  L0000000000000030 [160r,512r:0) 0@160r  L00000000000000C0 [208r,512r:0) 0@208r  L0000000000000300 [288r,512r:0) 0@288r  L000000000000000C [64r,464e:1)[464e,512r:0) 0@464e 1@64r  weight:1.179245e-02 w=1.179245e-02
```

```
Best local split range: 64r-208r, 6.999861e-03, 3 instrs
    enterIntvBefore 64r: not live
    leaveIntvAfter 208r: valno 1
    useIntv [64B;216r): [64B;216r):1
  blit [64r,160r:4): [64r;160r)=1(%29)(recalc)
  blit [160r,208r:0): [160r;208r)=1(%29)(recalc)
  blit [208r,288r:1): [208r;216r)=1(%29)(recalc) [216r;288r)=0(%28)(recalc)
  blit [288r,464e:2): [288r;464e)=0(%28)(recalc)
  blit [464e,512r:3): [464e;512r)=0(%28)(recalc)
  rewr %bb.0    464e:0  early-clobber %28.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %25.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, $v0, 2, 4, 0, implicit $vl, implicit $vtype
  rewr %bb.0    288r:0  %28.sub_vrm2_3:vrn4m2nov0 = PseudoVLE16_V_M2 %13:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    208r:1  %29.sub_vrm2_2:vrn4m2nov0 = PseudoVLE16_V_M2 %10:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    160r:1  %29.sub_vrm2_1:vrn4m2nov0 = PseudoVLE16_V_M2 %7:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    64r:1   undef %29.sub_vrm2_0:vrn4m2nov0 = PseudoVLE16_V_M2 %1:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    464B:0  early-clobber %28.sub_vrm2_0:vrn4m2nov0 = PseudoVSEXT_VF2_M2_MASK %28.sub_vrm2_0:vrn4m2nov0(tied-def 0), %5:vr, $v0, 2, 4, 0, implicit $vl, implicit $vtype
  rewr %bb.0    512B:0  PseudoVSSEG4E16_V_M2 %28:vrn4m2nov0, %27:gpr, 2, 4, implicit $vl, implicit $vtype
  rewr %bb.0    216B:1  undef %28.sub_vrm1_0_sub_vrm1_1_sub_vrm1_2_sub_vrm1_3_sub_vrm1_4_sub_vrm1_5:vrn4m2nov0 = COPY %29.sub_vrm1_0_sub_vrm1_1_sub_vrm1_2_sub_vrm1_3_sub_vrm1_4_sub_vrm1_5:vrn4m2nov0
queuing new interval: %28 [216r,288r:0)[288r,464e:1)[464e,512r:2) 0@216r 1@288r 2@464e  L000000000000000C [216r,216d:0)[464e,512r:1) 0@216r 1@464e  L0000000000000300 [288r,512r:0) 0@288r  L00000000000000C0 [216r,512r:0) 0@216r  L0000000000000030 [216r,512r:0) 0@216r  weight:8.706897e-03
Enqueuing %28
queuing new interval: %29 [64r,160r:0)[160r,208r:1)[208r,216r:2) 0@64r 1@160r 2@208r  L000000000000000C [64r,216r:0) 0@64r  L00000000000000C0 [208r,216r:0) 0@208r  L0000000000000030 [160r,216r:0) 0@160r  weight:1.097826e-02
Enqueuing %29
```

The live range of first part subreg of %25 is become [216r,216d:0)[464e,512r:1),
however first live range should live until 464e rather than just live
and [216r,216d:0).

And then the register allocator allocated wrong result accroding the
live range info.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126047
2022-06-08 11:27:24 +08:00
Craig Topper
0c66deb498 [RISCV] Scalarize gather/scatter on RV64 with Zve32* extension.
i64 indices aren't supported on Zve32*. Scalarize gathers to prevent
generating illegal instructions.

Since InstCombine will aggressively canonicalize GEP indices to
pointer size, we're pretty much always going to have an i64 index.

Trying to predict when SelectionDAG will find a smaller index from
the TTI hook used by the ScalarizeMaskedMemIntrinPass seems fragile.
To optimize this we probably need an IR pass to rewrite it earlier.

Test RUN lines have also been added to make sure the strided load/store
optimization still works.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D127179
2022-06-07 08:07:50 -07:00
Matt Arsenault
56303223ac llvm-reduce: Don't assert on functions which don't track liveness
Use the query that doesn't assert if TracksLiveness isn't set, which
needs to always be available. We also need to start printing liveins
regardless of TracksLiveness.
2022-06-07 10:00:25 -04:00
Craig Topper
be398100ea [SelectionDAG] Further improve computeKnownBits for (smax X, C) where C is non-negative.
Move the code that was added for D126896 after the normal recursive calls
to computeKnownBits. This allows us to calculate trailing zeros.
Previously we would break out of the switch before the recursive calls.
2022-06-06 09:59:23 -07:00
Shao-Ce SUN
84bacb18c6 [RISCV] Use check-prefixes to reduce check lines
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125083
2022-06-06 15:59:15 +08:00
Lian Wang
20cf77f776 [LegalizeTypes][VP] Add widen and split support for vp.fptrunc and vp.fpext
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D126439
2022-06-06 02:28:01 +00:00
LiaoChunyu
f14d18c7a9 [RISCV] Add more patterns for FNMADD
D54205 handles fnmadd: -rs1 * rs2 - rs3
This patch add fnmadd: -(rs1 * rs2 + rs3) (the nsz flag on the FMA)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126852
2022-06-04 12:31:45 +08:00
Craig Topper
cc3bd43533 [RISCV] Support LUI+ADDIW in doPeepholeLoadStoreADDI.
This fixes an inconsistency between RV32 and RV64. Still considering
trying to do this peephole during isel, but wanted to fix the
inconsistency first.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126986
2022-06-03 18:06:56 -07:00
Craig Topper
8da5d5dbdc [RISCV] Pre-commit test cases for D126986. NFC 2022-06-03 13:31:45 -07:00
Craig Topper
170c550ca8 [RISCV] Use SelectionDAG::isBaseWithConstantOffset in scalar load/store address matching.
Test changes are because isBaseWithConstantOffset uses computeKnownBits
and that is able to see that an earlier AND instruction guaranteed
alignment so that we can treat an OR as an ADD.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126970
2022-06-03 10:55:28 -07:00
Craig Topper
dbead2388b [RISCV] Add custom isel for (add X, imm) used by load/stores.
If the imm is out of range for an ADDI, we will materialize it in
a register using multiple instructions. If the ADD is used by a
load/store, doPeepholeLoadStoreADDI can try to pull an ADDI from
the constant materialization into the load/store offset. This only
works if the ADD has a single use, otherwise the peephole would have
to rebuild multiple nodes.

This patch instead tries to solve the problem when the add is selected.
We check that the add is only used by loads/stores and if it is
we will select it to (ADDI (ADD X, Imm-Lo12), Lo12). This will enable
the simple case in doPeepholeLoadStoreADDI that can bypass an ADDI
used as a pointer. As a result we can remove the more complicated
peephole from doPeepholeLoadStoreADDI.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126576
2022-06-02 13:45:32 -07:00
Craig Topper
fa20bf1636 [DAGCombiner][RISCV] Improve computeKnownBits for (smax X, C) where C is non-negative.
If C is non-negative, the result of the smax must also be
non-negative, so all sign bits of the result are 0.

This allows DAGCombiner to remove a zext_inreg in the modified test.
This zext_inreg started as a sext that became zext before type
legalization then was promoted to a zext_inreg.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126896
2022-06-02 12:34:24 -07:00
Craig Topper
01ba470826 [RISCV] Add test case showing unnecessary extend after i32 smax on rv64. NFC
One of the operands of the smax is a positive value so computeKnownBits
determines the result of the smax must always be positive. This allows
DAG combiner to convert the sign extend to zero extend before type
legalization.

After type legalization the smax is promoted to i64 by sign extending
its inputs and the zero extend becomes an AND instruction. We are unable
to remove the AND at this point and it becomes a pair of shifts or a
zext.w.

The result of smax has as many sign bits as the minimum of its inputs.
Had we kept the sign extend instead of turning it into a zero extend
it would be removed by DAG combiner after type legalization.
2022-06-02 09:58:11 -07:00
Philip Reames
dcdb0bf25b [RISCV] Fix an inconsistency with compatible load/store handling
Once we've computed the incoming predecessor state, we should use the same compatibility check with knowledge of MI as we did in phase 2 in order to be consistent across all phases.

Differential Revision: https://reviews.llvm.org/D126574
2022-06-02 08:03:51 -07:00
jacquesguan
5482ae6328 [LegalizeTypes][VP] Add widen and split support for VP FP integer casting op.
This patch adds widen and split support for VP_FPTOSI, VP_FPTOUI, VP_SITOFP and VP_UITOFP.

Differential Revision: https://reviews.llvm.org/D126847
2022-06-02 09:05:27 +00:00
jacquesguan
058791d8f2 [LegalizeTypes][VP] Add widen and split support for VP_SIGN_EXTEND and VP_ZERO_EXTEND.
Differential Revision: https://reviews.llvm.org/D126442
2022-06-02 02:21:22 +00:00
Hendrik Greving
a92ed167f2 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-02 00:49:11 +00:00
Philip Reames
f15add7d93 [RISCV] Split fixed-vector-strided-load-store.ll so it can be autogened
I've gotten tired of updating register allocation changes by hand, let's just autogen this even if we have to duplicate it.
2022-06-01 16:12:35 -07:00
Hendrik Greving
e9d05cc7d8 Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4."
This reverts commit 430ac5c3029c52e391e584c6d4447e6e361fae99.

Due to failures in Clang tests.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 13:27:49 -07:00
Hendrik Greving
430ac5c302 [ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4.
Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4.

Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be
removed once targets set this explicitly.

Adjusts 11 lit tests to reflect slightly different behavior during
DAG combine.

Differential Revision: https://reviews.llvm.org/D125247
2022-06-01 12:48:01 -07:00
Craig Topper
aeb27f133a [RISCV] Fix i64<->f64 and i32<->f32 bitcasts with VLS vectors enabled.
We enable a custom handler to optimize conversions between scalars
and fixed vectors. Unfortunately, the custom handler picks up scalar
to scalar conversions as well. If the scalar types are both legal,
we wouldn't match any of the fixed vector cases and would return SDValue()
causing the LegalizeDAG to expand the bitcast through memory.

This patch fixes this by checking if it's a scalar to scalar conversion
and returns `Op` if both types are legal.

Differential Revision: https://reviews.llvm.org/D126739
2022-06-01 08:13:49 -07:00
wangpc
57203af167 [RISCV] Set target-abi explicitly to reduce codegen results
As mentioned in D125947, we can reduce codegen results by
adding an explicit hard single-float ABI.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D126640
2022-06-01 13:49:23 +08:00
Philip Reames
33b1be5916 [riscv] add test coverage for fractional lmul w/fixed length vectorization 2022-05-31 10:25:37 -07:00
Craig Topper
1b2de79ff4 [RISCV] Use two ADDIs to do some stack pointer adjustments.
If the adjustment doesn't fit in 12 bits, try to break it into
two 12 bit values before falling back to movImm+add/sub.

This is based on a similar idea from isel.

Reviewed By: luismarques, reames

Differential Revision: https://reviews.llvm.org/D126392
2022-05-31 10:25:28 -07:00
Craig Topper
80c4cf6369 [RISCV] Fix a few corner case bugs in RISCVMergeBaseOffsetOpt::matchLargeOffset
The immediate for LUI is stored as 20-bit unsigned value. We need
to sign extend if after shifting by 12 to match the instruction
behavior.

If we find an LUI+ADDI on RV64, it means the constant isn't a
simm32. If it was, we would have emitted LUI+ADDIW from constant
materialization. Make sure the constant is a simm32 before folding.
This appears to match gcc.

A future patch will add support for LUI+ADDIW on RV64.
2022-05-31 09:50:54 -07:00
Craig Topper
3b5456d5f0 [RISCV] Pre-commit tests for D126635. NFC 2022-05-31 09:49:46 -07:00
eopXD
2cadf84fc8 [RISCV] Pass OptLevel to RISCVDAGToDAGISel correctly
Originally, `OptLevel` isn't passed into the `MachineFunctionPass`.
This lets the default parameter of `SelectionDAGISel`, which is
`CodeGenOpt::Default`, be passed in. OptLevelChanger captures the
optimization level with the parameter, and rather not the value
within `TargetMachine`. This lets the optimization be
unintentionally overwriten if other value than `CodeGenOpt::Default`
passed.

This patch fixes this by passing the optimization level rather
than using the default value.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126641
2022-05-30 17:22:50 -07:00
eopXD
51002bdb5e [RISCV] Precommit test case to show bug in RISCVISelDagToDag
The optimization level should not be restored into O2.
This is a pre-commit test case to show fix in D126641.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D126677
2022-05-30 15:59:20 -07:00
Ping Deng
88af539c0e [RISCV] Support VP_REDUCE_MUL mask operation
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126520
2022-05-30 03:05:39 +00:00
Ping Deng
083798e270 [LegalizeTypes][VP] Add integer promotion support for vp.fptosi/vp.fptoui
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D125760
2022-05-30 03:05:39 +00:00
Craig Topper
6a6cf2e28d [RISCV] isel (add (and X, 0x1FFFFFFFE), Y) as (SH1ADD (SRLI X, 1), Y)
This pattern is what we get after DAG combine for C code like this.

short *ptr1, *ptr2, *ptr3;
unsigned diff = ptr1 - ptr2;
return ptr3[diff];

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126588
2022-05-29 18:24:07 -07:00
Craig Topper
e642d0ea21 [RISCV] Add test cases showing missed opportunity to use shXadd.uw. NFC
The tests here show the codegen for something like this C code.

unsigned diff = ptr1 - ptr2;
return ptr3[diff];

The pointer difference is truncated to 32-bits before being used
again as an index. In SelectionDAG this appears as an AND between
a SRL and a SHL. DAGCombiner will remove the shifts leaving only
an AND. The Mask now has 1,2, or 3 trailing zeros and 31, 30, or 29
leading zeros. We end up falling back to constant materialization
to create this mask.

We could instead use srli followed by slli.uw. Or since
we have an add, we can use srli followed by shXadd.uw.

Differential Revision: https://reviews.llvm.org/D126589
2022-05-29 18:22:55 -07:00
Philip Reames
85b4470035 [RISCV] Allow PRE of vsetvli involving non-1 LMUL
This is a follow up to address a review comment from D124869. When deciding whether to PRE a vsetvli, we can allow non-LMUL1 vsetvlis.

Differential Revision: https://reviews.llvm.org/D126563
2022-05-27 15:49:41 -07:00
Craig Topper
542a83c362 [RISCV] Correct load/store alignments in sink-splat-operands.ll. NFC
These should be aligned to the natural alignment of the element.
Probably copy/paste mistake from the i32 tests.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126567
2022-05-27 14:39:31 -07:00
Philip Reames
d4905a7b20 [RISCV] Add a vsetvli PRE test involving non-1 LMUL 2022-05-27 13:16:05 -07:00