170 Commits

Author SHA1 Message Date
Yeting Kuo
d83620d101 [RISCV] Support vector strict_fsetcc/fsetccs.
The patch supports vector strict_fsetcc/fsetccs. Instead of preserving fflags,
the method used to implement scalar quiet compares, the patch implements quiet
compares by masking the signaling compares when either input is NaN [0].

[0]: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-floating-point-compare-instructions

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147998
2023-04-14 09:10:41 +08:00
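The quiet-compare strategy described in d83620d101 can be modeled in a few lines. The Python sketch below is an illustration only, not the LLVM lowering: `signaling_lt`, `quiet_lt`, and the `flags` set are hypothetical names, and a masked-off lane is assumed to produce false and raise no exception.

```python
import math

def signaling_lt(a, b, mask, flags):
    """Masked signaling less-than: an active lane with a NaN input raises NV."""
    out = []
    for x, y, m in zip(a, b, mask):
        if not m:
            out.append(False)   # inactive lane: no compare, no exception
        elif math.isnan(x) or math.isnan(y):
            flags.add("NV")     # signaling compare flags invalid on NaN
            out.append(False)
        else:
            out.append(x < y)
    return out

def quiet_lt(a, b, flags):
    """Quiet less-than: mask off every lane where either input is NaN."""
    ordered = [not math.isnan(x) and not math.isnan(y) for x, y in zip(a, b)]
    return signaling_lt(a, b, ordered, flags)
```

Because the NaN lanes never reach the signaling compare, the flags stay untouched and nothing needs to be saved or restored.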
Yeting Kuo
6858a920b8 [RISCV] Support vector type strict_[su]int_to_fp and strict_fp_to_[su]int.
The patch also loosens the fixed vector constraint in llvm/lib/IR/Verifier.cpp.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147380
2023-04-06 10:09:44 +08:00
Yeting Kuo
84c8c2b4b4 [DAG][RISCV] Allow scalable vector ISD::STRICT_FP_ROUND and support vector ISD::STRICT_FP_ROUND for RISC-V.
The patch custom-lowers vector-type ISD::STRICT_FP_ROUND to RISCVISD::STRICT_FP_ROUND.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147113
2023-03-30 08:20:02 +08:00
Nitin John Raj
50876630b9 [RISCV] Made v(f)(w)red* pseudoinstructions SEW-aware
Differential Revision: https://reviews.llvm.org/D147098
2023-03-29 10:37:56 -07:00
Yeting Kuo
0676c6d91f [RISCV] Support vector type strict_fma.
Like D145900, the patch also supports fixed vector strict_fma nodes in RISC-V by
custom-lowering them to riscv_strict_vfmadd_vl nodes. riscv_strict_vfmadd_vl
is created to keep some riscv_vfmadd_vl optimizations from being applied to the
original strict_fma nodes. The patch also adds combine patterns for
riscv_strict_vfmadd_vl nodes with negated operands.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146939
2023-03-28 09:01:46 +08:00
Yeting Kuo
946d29e7e9 [RISCV] Support vector type strict_fsqrt.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146911
2023-03-27 14:02:22 +08:00
Nitin John Raj
7da272af89 [RISCV][RISCVISelLowering] Add tail agnostic policy operand to VECREDUCE instructions
Differential Revision: https://reviews.llvm.org/D146752
2023-03-25 02:42:13 -07:00
Nitin John Raj
7b39f16fb8 [RISCV] Made fsqrtv pseudoinstruction SEW-aware
2023-03-24 16:33:25 -07:00
Nitin John Raj
3cf7e35180 [RISCV] Made division pseudoinstructions SEW-aware
2023-03-24 16:33:24 -07:00
Nitin John Raj
5ab9ae12b7 [RISCV] Made vrgather.vv and vrgatherei16 pseudoinstructions SEW-aware
2023-03-24 16:33:24 -07:00
Yeting Kuo
9637e950cb [RISCV] Support ISD::STRICT_FADD/FSUB/FMUL/FDIV for vector types.
The patch handles fixed-length vector strict-fp operations with new
RISCVISD::STRICT_-prefixed ISD nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145900
2023-03-15 07:47:16 +08:00
Yeting Kuo
b2c48559c8 [IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND.
The patch mainly does two things. The first is allowing scalable vector
ISD::STRICT_FP_EXTEND. The second is making RISC-V custom-lower
strict_fpextend to riscv_strict_fpextend_vl, the strict version of
riscv_fpextend_vl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145548
2023-03-09 17:37:59 +08:00
Han-Kuan Chen
d02b9869b2 [RISCV] Don't use constantpool for floating-point value if the value can be easily constructed by integer sequence and a floating-point move.
In addition, this commit performs the following combines:

vfmv.v.f + fmv.[dhw].x -> vmv.v.x
vfmv.s.f + fmv.[dhw].x -> vmv.s.x
vfmerge.vfm + fmv.[dhw].x -> vmerge.vxm

Differential Revision: https://reviews.llvm.org/D142953
2023-02-03 22:42:08 -08:00
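To see why some floating-point constants are cheap to build from integers (d02b9869b2), it helps to look at their bit patterns. The Python sketch below is only an illustration: `f32_bits` and `fits_single_lui` are hypothetical helpers, and "a single LUI is enough when the low 12 bits are zero" is a simplifying assumption, not the cost check the commit actually uses.

```python
import struct

def f32_bits(value: float) -> int:
    """Return the IEEE-754 single-precision bit pattern of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def fits_single_lui(value: float) -> bool:
    """True when the low 12 bits are zero, so one LUI can build the pattern."""
    return f32_bits(value) & 0xFFF == 0

print(hex(f32_bits(1.0)), fits_single_lui(1.0))
print(hex(f32_bits(0.1)), fits_single_lui(0.1))
```

A constant such as 1.0 has an empty low mantissa, so an integer materialization followed by a floating-point move (or, after the combines above, an integer splat) avoids a constant pool load.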
Luke Lau
5ccd288731 [RISCV][NFC] Rename narrowing patterns to use W suffix
To match up with the pseudo instruction names

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D142553
2023-01-26 11:17:14 +00:00
Luke Lau
f5a6447196 [RISCV] Combine FP_TO_INT to vfwcvt/vfncvt
Adds new pseudo instructions to make sure that the fcvt instructions
have all rounding mode (RM) and unsigned (XU) variants across
single-width, widening and narrowing conversions.
And likewise, extends the VL patterns to accompany them. We don't add
new VL nodes for the widening/narrowing conversions though, instead we
just add specific patterns for vfcvts on those wider/narrower types.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D142102
2023-01-24 09:44:57 +00:00
Luke Lau
a0d80c2398 [RISCV] Generalize performFP_TO_INTCombine to vectors
Like in the scalar domain, combine calls to (fp_to_int (ftrunc X)) on
scalable and fixed-length vectors into a single vfcvt instruction.
For truncating rounds, the static vfcvt.rtz rounding mode is used.
Otherwise use the VFCVT_RM_ variants to set the rounding mode
dynamically.
Closes https://github.com/llvm/llvm-project/issues/56737

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141599
2023-01-18 10:53:24 +00:00
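The fold in a0d80c2398 relies on the fact that rounding in the floating-point domain followed by a conversion equals one conversion performed under the matching rounding mode. The Python sketch below checks that identity for finite, in-range inputs; `ROUNDERS`, `two_step`, and `fused` are hypothetical names, and overflow, saturation, and NaN handling are deliberately ignored.

```python
import math

# Rounding-mode name -> integer rounding function (assumed mapping).
ROUNDERS = {"rtz": math.trunc, "rdn": math.floor, "rup": math.ceil}

def two_step(x: float, mode: str) -> int:
    """Round to an integral float first, then convert (fp_to_int (fround X))."""
    rounded = float(ROUNDERS[mode](x))
    return int(rounded)

def fused(x: float, mode: str) -> int:
    """Single conversion that applies the rounding mode directly."""
    return ROUNDERS[mode](x)
```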
Yeting Kuo
5280d3e738 [RISCV] Teach lowerCTLZ_CTTZ_ZERO_UNDEF to handle conversion i32/i64 vectors to f32 vectors.
Previously lowerCTLZ_CTTZ_ZERO_UNDEF converted the source to float value by
ISD::UINT_TO_FP. ISD::UINT_TO_FP uses dynamic rounding mode, so the rounding
may make the exponent of the result not as expected when converting i32/i64 to f32.
This is the reason why we constrained lowerCTLZ_CTTZ_ZERO_UNDEF to only handle
an i32 source when the f64 type having the same element count as source is legal.

The patch teaches lowerCTLZ_CTTZ_ZERO_UNDEF to convert i32/i64 vectors to f32
vectors with vfcvt.f.xu.v using the RTZ rounding mode. RTZ guarantees that the
exponent of each result is correct, even though f32 cannot exactly represent
every i32/i64 value.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D140782
2023-01-12 14:42:47 +08:00
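The rounding problem described in 5280d3e738 is easy to reproduce. The Python sketch below models a 32-bit count-leading-zeros computed from the f32 exponent; the helper names are hypothetical, round-toward-zero is emulated by clearing the bits below the 24-bit significand, and the non-RTZ path assumes the platform's default round-to-nearest conversion.

```python
import struct

def f32_exponent(value: float) -> int:
    """Unbiased exponent of value after conversion to single precision."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return ((bits >> 23) & 0xFF) - 127

def to_f32_rtz(x: int) -> float:
    """uint -> f32 with round-toward-zero: drop bits below the significand."""
    excess = max(x.bit_length() - 24, 0)
    return float((x >> excess) << excess)

def ctlz32_via_f32(x: int, rtz: bool = True) -> int:
    """Count leading zeros of a nonzero 32-bit value from its f32 exponent."""
    if not 0 < x < 2**32:
        raise ValueError("x must be a nonzero 32-bit unsigned value")
    as_float = to_f32_rtz(x) if rtz else float(x)
    return 31 - f32_exponent(as_float)
```

With round-to-nearest, 0xFFFFFFFF rounds up to 2^32, the exponent becomes 32, and the computed count is -1. With RTZ the exponent stays at 31 and the count is 0.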
Luke Lau
6f00e11657 [RISCV][NFC] Update V spec section numbers
The section numbers were reshuffled between v0.10 and v1.0 of the spec (https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc). This updates references in comments to use the v1.0 section numbers.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141382
2023-01-11 10:13:23 +00:00
Craig Topper
64fae4d3b7 [RISCV] Add isel patterns to form tail undisturbed vwadd(u).wv from vwadd(u)_vl+vp_merge.
We use special TIED instructions for vwadd(u).wv to avoid an
earlyclobber constraint preventing the first source and the destination
from being the same register.

This prevents our normal post-processing for forming TU instructions.
Add a manual isel pattern instead. This matches what we do for FMA,
for example.
2023-01-09 16:44:11 -08:00
Craig Topper
3b2537be76 [RISCV] Rename SDT_RISCVVecCvtX2FOp_VL->SDT_RISCVVecCvtF2XOp_VL. NFC
The instruction name is x.f with the destination type first. The
template name was intended as "convert F to X". So the F comes first.
2023-01-05 16:37:13 -08:00
jacquesguan
db3f3243bb [RISCV] Use vfirst.m to extract the first element from mask vector.
This patch uses vfirst.m to extract the first bit of mask.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139512
2023-01-03 11:24:18 +08:00
Yeting Kuo
f8a05727b0 [RISCV][NFC] Add policy operand for RISCVISD::VSLIDEUP_VL and RISCVISD::VSLIDEDOWN_VL.
There is room for optimization to use tail agnostic vslideup/vslidedown to lower
some vector operations. D125546 is a revision for this kind of optimization.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D140393
2022-12-21 10:50:04 +08:00
Yeting Kuo
ed9638c44b [VP][RISCV] Add vp.nearbyint and RISC-V support.
nearbyint must execute without raising exceptions.
To avoid modifying fflags, the patch adds a new machine opcode,
PseudoVFROUND_NOEXCEPT_V, that expands to vfcvt.x.f.v and vfcvt.f.x.v between
a pair of frflags and fsflags.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D137685
2022-11-16 14:05:35 +08:00
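The frflags/fsflags bracketing from ed9638c44b can be pictured as a save and restore around the two conversions. The Python sketch below is a model under stated assumptions: `round_to_integral` and `nearbyint_noexcept` are hypothetical names, inputs are finite and small enough to convert, and the dynamic rounding mode is taken to be round-to-nearest-even.

```python
def round_to_integral(values, fflags):
    """Model of vfcvt.x.f.v then vfcvt.f.x.v: sets NX when a lane is inexact."""
    out = []
    for v in values:
        r = float(round(v))   # Python's round() is round-half-to-even
        if r != v:
            fflags.add("NX")  # the conversion raised the inexact flag
        out.append(r)
    return out

def nearbyint_noexcept(values, fflags):
    """Round like nearbyint while leaving the exception flags unchanged."""
    saved = set(fflags)                          # frflags: save
    result = round_to_integral(values, fflags)
    fflags.clear()                               # fsflags: restore
    fflags.update(saved)
    return result
```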
Yeting Kuo
71e4e35581 [VP][RISCV] Add vp.rint and RISC-V support.
FRINT uses the dynamic rounding mode instead of a static rounding mode. The patch
renames VFCVT_X_F_VL to VFCVT_RM_X_F_VL for static rounding mode uses and adds a
new ISD node, VFCVT_X_F_VL, that is selected directly to PseudoVFCVT_X_F_V.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D136662
2022-11-01 14:52:47 +08:00
Yeting Kuo
2749b942e9 [RISCV] Add isel patterns for vmacc, vnmsac.
The patch selects a VSELECT/VP_MERGE_VL that uses an fmadd/fnmsub as its true
operand and the addend of that fmadd/fnmsub as its false operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135330
2022-10-12 09:19:01 +08:00
Craig Topper
3b20765cf7 [RISCV] Use mask agnostic policy for isel patterns where the merge operand is IMPLICIT_DEF.
I tend to think we should ignore the policy bit in vsetvli insertion
if the tied operand is IMPLICIT_DEF. But that raises questions about
what the policy operand on RVV intrinsics means if you also pass
vundefined().

This change at least fixes some cases. I'll post a separate patch
for vsetvli insertion for discussion.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135386
2022-10-06 15:44:39 -07:00
Yeting Kuo
74a130af97 [RISCV] Add isel patterns for vfmacc, vfnmacc, vfmsac and vfnmsac.
The patch selects a VSELECT_VL/VP_MERGE_VL that uses VF(N)M(ACC|SAC) as its
true operand and the addend of the true operand as its false operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135080
2022-10-05 09:56:43 +08:00
Craig Topper
05df15965b [RISCV] Use _TIED form of VFWADD(U)_WV/VFWSUB(U)_WV to avoid early clobber.
One of the sources is the same size as the destination so that source
doesn't have an overlap with the destination register. By using the _TIED
form we avoid an early clobber constraint for that source.

This matches what was already done for intrinsics. ConvertToThreeAddress
will fix it if it can't stay tied.
2022-10-03 21:44:08 -07:00
Craig Topper
5bbc5eb55f [RISCV] Use _TIED form of VWADD(U)_WX/VWSUB(U)_WX to avoid early clobber.
One of the sources is the same size as the destination so that source
doesn't have an overlap with the destination register. By using the _TIED
form we avoid an early clobber constraint for that source.

This matches what was already done for intrinsics. ConvertToThreeAddress
will fix it if it can't stay tied.
2022-10-01 16:34:39 -07:00
Craig Topper
85db4f10e3 [RISCV] Minor tablegen formatting cleanup. NFC
2022-10-01 15:59:25 -07:00
Quentin Colombet
ce35e8b426 [RISCV][ISel] Remove the commutative flag on SUB
I wasn't able to produce a testcase for that because right now VWSUB is
only generated from VWSUB_W and from there to trigger the commutative
bug we would need to grab VWSUB where the splat value is on the LHS,
which is currently not matched.

Differential Revision: https://reviews.llvm.org/D134701
2022-09-27 20:15:01 +00:00
Yeting Kuo
43c5fbdd3a [VP][RISCV] Add vp.sqrt intrinsic and RISC-V support.
The patch is modeled on the vp.fabs patch D132793.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D133690
2022-09-26 10:47:40 +08:00
Craig Topper
1d8a7adca6 [RISCV] Rename RISCVISD::SINT_TO_FP_VL/UINT_TO_FP_VL. NFC
Name them after the instructions VFCVT_RTZ_X(U)_F_VL to make it
clear that the ISD nodes don't have the poison semantics of
ISD::SINT_TO_FP/UINT_TO_FP.

I plan to reuse this node for a FP_TO_SINT_SAT/FP_TO_UINT_SAT
patch and need the instruction semantics.
2022-09-21 15:33:04 -07:00
Craig Topper
8d7e73effe [RISCV] Teach lowerVECTOR_SHUFFLE to recognize some shuffles as vnsrl.
Unary shuffles such as <0,2,4,6,8,10,12,14> or <1,3,5,7,9,11,13,15>
where half the elements are returned, can be lowered using vnsrl.

SelectionDAGBuilder lowers such shuffles as a build_vector of
extract_elements since the mask has fewer elements than the source.
To fix this, I've enabled the extractSubvectorIsCheapHook to allow
DAGCombine to rebuild the shuffle using 2 extract_subvectors preceding
the shuffle.

I've gone very conservative on extractSubvectorIsCheapHook to minimize
test impact and match what we have test coverage for. This can be
improved in the future.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D133736
2022-09-13 11:07:11 -07:00
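The shuffle-to-vnsrl idea from 8d7e73effe can be checked with a small model. The Python sketch below reinterprets adjacent pairs of narrow elements as one wide element (element 0 in the low bits) and applies a narrowing right shift; `vnsrl` and `deinterleave_via_vnsrl` are hypothetical helpers that only mimic the arithmetic, not the instruction selection.

```python
def vnsrl(wide_elems, shift, narrow_bits):
    """Narrowing logical shift right: shift each wide element, keep low bits."""
    mask = (1 << narrow_bits) - 1
    return [(w >> shift) & mask for w in wide_elems]

def deinterleave_via_vnsrl(elems, odd, bits=8):
    """Select the even (or odd) elements by viewing pairs as wide elements."""
    # Pair (e0, e1) becomes one 2*bits-wide element with e0 in the low half.
    wide = [elems[i] | (elems[i + 1] << bits) for i in range(0, len(elems), 2)]
    return vnsrl(wide, bits if odd else 0, bits)

data = list(range(16))
print(deinterleave_via_vnsrl(data, odd=False))  # mask <0,2,4,...,14>
print(deinterleave_via_vnsrl(data, odd=True))   # mask <1,3,5,...,15>
```

A shift of 0 keeps the low half of each pair (the even elements), and a shift by the element width keeps the high half (the odd elements).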
Craig Topper
f0332d12ae [RISCV] Improve vector fceil/ffloor lowering by changing FRM.
This adds new VFCVT pseudoinstructions that take a rounding mode operand. A
custom inserter is used to insert additional instructions to change FRM around
the VFCVT.

Some of this is borrowed from D122860, but takes a somewhat different
direction. We may migrate to that patch, but for now I was trying to keep this
as independent from RVV intrinsics as I could.

A followup patch will use this approach for FROUND too.

Still need to fix the cost model.

Reviewed By: arcbbb

Differential Revision: https://reviews.llvm.org/D133238
2022-09-05 19:03:44 -07:00
Craig Topper
11881a8f3f [RISCV] Rename some V extension multiclasses for consistency. NFC
Using "SDNode" in the name is the convention for the VLMax patterns
in RISCVInstrInfoVSDPatterns.td. This file uses "VL".
2022-09-01 22:17:08 -07:00
Craig Topper
2f811a6c7f [VP][RISCV] Add vp.fabs intrinsic and RISC-V support.
Mostly just modeled after vp.fneg except there is a
"functional instruction" for fneg while fabs is always an
intrinsic.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D132793
2022-08-29 09:32:06 -07:00
Craig Topper
961838cc13 [RISCV] Add passthru operand to RISCVISD::SETCC_VL.
Use it to fix a bug in the fceil/ffloor lowerings. We were
setting the passthru to IMPLICIT_DEF before and using a mask
agnostic policy. This means where the incoming bits in
the mask were 0 they could be anything in the outgoing mask. We
want those bits in the outgoing mask to be 0. This means we need to
pass the input mask as the passthru.

This generates worse code because we are unable to allocate the
v0 register to the output due to an earlyclobber constraint. We
probably need a special TIED pseudoinstruction and probably custom
isel since you can't use V0 twice in the input pattern.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132058
2022-08-19 08:53:44 -07:00
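The passthru fix in 961838cc13 comes down to what inactive lanes of a masked compare contain. The Python sketch below is a toy model: `setcc_vl` is a hypothetical function in which inactive lanes simply copy the passthru, and an arbitrary passthru stands in for the "could be anything" result of IMPLICIT_DEF with a mask agnostic policy.

```python
def setcc_vl(a, b, mask, passthru):
    """Masked less-than: active lanes compare, inactive lanes take passthru."""
    return [(x < y) if m else p for x, y, m, p in zip(a, b, mask, passthru)]

a = [1, 1, 5, 5]
b = [2, 2, 3, 3]
mask = [True, False, True, False]

# Arbitrary passthru: inactive lanes may come out set.
print(setcc_vl(a, b, mask, [True] * 4))
# Input mask as passthru: inactive lanes are guaranteed to be 0.
print(setcc_vl(a, b, mask, mask))
```

Since the passthru equals the mask, every lane where the mask is 0 receives 0, which is exactly the property the lowering needs.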
Craig Topper
c9a41fe60a [RISCV] Prefer vnsrl.wi v8, v8, 0 over vnsrl.wx v8, v8, x0.
I have a couple data points that some microarchitectures prefer
the immediate 0 over x0. Does anyone know of microarchitectures
where the opposite is true?

Unfortunately, this is different than the vncvt.x.x.w alias
from the spec. Perhaps the alias was poorly chosen if x0 isn't
as optimal as immediate 0 on all microarchitectures.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D132041
2022-08-19 08:40:17 -07:00
Craig Topper
a23f07fb1d [RISCV] Add merge operands to more RISCVISD::*_VL opcodes.
This adds a merge operand to all of the binary _VL nodes, including
integer and widening. They all share multiclasses in tablegen
so doing them all at once was easiest.

I plan to use FADD_VL in an upcoming patch. The rest are just for
consistency to keep tablegen working.

This does reduce the isel table size by about 25k so that's nice.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D130816
2022-07-30 10:26:38 -07:00
Craig Topper
9bf305fe2b [RISCV] Swap the merge and mask operand order for VRGATHER*_VL and FCOPYSIGN_VL nodes.
Based on review feedback from D130816.
2022-07-30 09:57:05 -07:00
Craig Topper
98647330bf [RISCV] Add merge operand to RISCVISD::FCOPYSIGN_VL.
Similar to what was done for VRGATHER*_VL recently.

This will be used in D130659.
2022-07-27 15:25:34 -07:00
Craig Topper
92f1794d41 [RISCV] Mark fminnum_vl and fmaxnum_vl as commutable.
2022-07-08 10:19:09 -07:00
Philip Reames
264018d764 [RISCV] Mark vsadd(u)_vl as commutable
This allows fixed length vectors involving splats on the LHS to commute into the _vx form of the instruction. Oddly, the generic canonicalization rules appear to catch the scalable vector cases. I haven't fully dug in to understand why, but I suspect it's because of a difference in how we represent splats (splat_vector vs build_vector).

Differential Revision: https://reviews.llvm.org/D129302
2022-07-08 10:18:21 -07:00
Craig Topper
a246eb6814 [RISCV] Mark (s/u)min_vl and (s/u)max_vl as commutable.
2022-07-08 09:59:42 -07:00
Craig Topper
c579ab53bd [RISCV] Move vfma_vl+fneg_vl matching to DAG combine.
This patch adds 3 new _VL RISCVISD opcodes to represent VFMA_VL with
different portions negated. It also adds a DAG combine to peek
through FNEG_VL to create these new opcodes.

This is modeled after similar code from X86.

This makes the isel patterns more regular and reduces the size of
the isel table by ~37K.

The test changes look like regressions, but they point to a bug that
was already there. We aren't able to commute a masked FMA instruction
to improve register allocation because we always use a mask undisturbed
policy. Prior to this patch we matched two multiply operands in a
different order and hid this issue for these test cases, but a different
test still could have encountered it.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D128310
2022-06-24 00:00:37 -07:00
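The negation folding described in c579ab53bd can be expressed as a pair of opcode tables. The Python sketch below is an assumption-laden illustration: the opcode names are shortened stand-ins for the _VL nodes, and `NEG_MUL`, `NEG_ACC`, and `fold_negations` are hypothetical, not the combine's real helpers.

```python
# Scalar semantics of the four FMA flavors.
SEMANTICS = {
    "VFMADD":  lambda a, b, c:  (a * b) + c,
    "VFMSUB":  lambda a, b, c:  (a * b) - c,
    "VFNMADD": lambda a, b, c: -(a * b) - c,
    "VFNMSUB": lambda a, b, c: -(a * b) + c,
}

# Negating one multiplicand flips the sign of the product.
NEG_MUL = {"VFMADD": "VFNMSUB", "VFNMSUB": "VFMADD",
           "VFMSUB": "VFNMADD", "VFNMADD": "VFMSUB"}
# Negating the addend flips the sign of the addend.
NEG_ACC = {"VFMADD": "VFMSUB", "VFMSUB": "VFMADD",
           "VFNMADD": "VFNMSUB", "VFNMSUB": "VFNMADD"}

def fold_negations(opcode, neg_a=False, neg_b=False, neg_c=False):
    """Return the opcode that absorbs fneg on operands a, b, and/or c."""
    if neg_a:
        opcode = NEG_MUL[opcode]
    if neg_b:
        opcode = NEG_MUL[opcode]
    if neg_c:
        opcode = NEG_ACC[opcode]
    return opcode
```

Folding the negations into the opcode means isel only needs one regular pattern per opcode instead of a pattern for every placement of fneg.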
Craig Topper
0efbf5bfbb [RISCV] Move the passthru operand for RISCVISD::VRGATHER*_VL nodes. NFC
Put it before the VL instead of as the first operand. I want to add
passthru to more operands, but the commutable ones like VADD_VL
require the commutable operands to be operand 0 and 1. So we can't
have the passthru as operand 0 for those.
2022-06-21 14:01:02 -07:00
Craig Topper
0af19ef9ff [RISCV] Remove true_mask patterns for VRGATHERE16.
After adding it to the table so the post-isel peephole can handle it.
2022-06-21 11:59:37 -07:00
Craig Topper
e50b141a13 [RISCV] Remove true_mask patterns for VRGATHER.
These can be handled by the post-isel peephole.
2022-06-21 11:59:37 -07:00
Craig Topper
16d3a82de5 [RISCV] Add merge operand to RISCVISD::VRGATHER*_VL nodes.
Use it in place of VSELECT_VL+VRGATHER*_VL.

This simplifies the isel patterns.

Overall, I think trying to match select+op to create masked instructions
in isel doesn't scale. We either need to do it in DAG combine, pre-isel
peephole, or post-isel peephole. I don't yet know which is the right
answer, but for this case it seemed best to be able to request the
masked form directly from lowering.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D128023
2022-06-20 18:58:24 -07:00