297 Commits

Author SHA1 Message Date
Craig Topper
a7425f900f [RISCV] Use (i64 GPR:$rs1) instead of i64:$rs1 in isel patterns. 2025-08-01 09:44:09 -07:00
Craig Topper
4d0c25f4a6
[RISCV] Select disjoint_or+not as xnor. (#147636)
A disjoint OR can be converted to XOR, and an XOR+NOT is XNOR. Idea
taken from #147279.
    
I changed the existing xnor pattern to have the not on the outside
instead of the inside. These are equivalent for xor since xor is
associative. Tablegen was already generating multiple variants
of the isel pattern using associativity.
    
There are some issues here. The disjoint flag isn't preserved
through type legalization. I was hoping we could recover it
manually for the masked merge cases, but that doesn't work either.
2025-07-08 21:50:23 -07:00
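The two identities this selection relies on can be checked exhaustively at a small bit width. The sketch below is a plain Python model of 8-bit values, not LLVM code; `BITS`, `xnor`, `not_disjoint_or`, and `check_all` are illustrative names that do not appear in the patch.

```python
# Model 8-bit registers; all names here are illustrative, not LLVM APIs.
BITS = 8
MASK = (1 << BITS) - 1

def xnor(a: int, b: int) -> int:
    """Bitwise XNOR: NOT of (a XOR b), truncated to BITS bits."""
    return ~(a ^ b) & MASK

def not_disjoint_or(a: int, b: int) -> int:
    """NOT of (a OR b); equals XNOR only when a and b share no set bits."""
    return ~(a | b) & MASK

def check_all() -> bool:
    """Exhaustively verify both identities over all 8-bit operand pairs."""
    for a in range(MASK + 1):
        for b in range(MASK + 1):
            # NOT on the outside of the XOR equals NOT on one of its inputs.
            if xnor(a, b) != ((~a & MASK) ^ b):
                return False
            # With no common set bits, OR and XOR agree, so NOT(OR) is XNOR.
            if a & b == 0 and not_disjoint_or(a, b) != xnor(a, b):
                return False
    return True
```

Calling `check_all()` walks every pair of 8-bit operands; the disjoint case is only tested when `a & b == 0`, which is the condition the `disjoint` flag asserts.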
Jim Lin
cb80651091 [RISCV] Merge AllBFloatVectors into AllFloatVectors. NFC. 2025-07-01 16:21:41 +08:00
Jim Lin
d64ee2cd4f
[RISCV] Add GetVTypeMinimalPredicates for the operation supported by zvfhmin. NFC. (#143847)
This patch adds a new `GetVTypeMinimalPredicates` for the `f16` operations
supported by `Zvfhmin`, splitting the type predicates for minimal support
from those for full compute support. It is a refactoring patch in preparation
for vector compute support for bf16 (Zvfbfa), so that `GetVTypePredicates`
can check whether a `bf16` type has the `Zvfbfa` extension.
2025-06-16 10:12:51 +08:00
Piotr Fusik
39a7664fc1
[RISCV] Select (add/or C, x) -> (add.uw C|0xffffffff00000000, x) (#143375)
Emits fewer instructions for certain constants.
2025-06-10 13:28:49 +02:00
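A small model may help show why the rewrite is value-preserving. It assumes the Zba definition of `add.uw` (the second source plus the zero-extended low 32 bits of the first) and a constant `C` whose upper 32 bits are zero. The function names and the sample constant are my own choices, not taken from the patch.

```python
MASK64 = (1 << 64) - 1
HIGH32 = 0xFFFFFFFF00000000

def add_uw(rs1: int, rs2: int) -> int:
    """Model of Zba add.uw: rs2 plus the zero-extended low 32 bits of rs1."""
    return (rs2 + (rs1 & 0xFFFFFFFF)) & MASK64

def plain_add(c: int, x: int) -> int:
    """Ordinary 64-bit add of a constant and a register value."""
    return (c + x) & MASK64

def rewritten_add(c: int, x: int) -> int:
    """Same sum, but the constant operand has its upper 32 bits set.

    add.uw ignores those upper bits, so the result is unchanged as long
    as the original constant had zeros there.
    """
    return add_uw(c | HIGH32, x)
```

For example, `0xFFFFFFFE` becomes `0xFFFFFFFFFFFFFFFE` after the OR, which is -2 as a signed 64-bit value and so fits a single small-immediate load, while `rewritten_add` still returns the same sum as `plain_add`.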
Piotr Fusik
3cfdf2ccdf
[RISCV] Handle more (add x, C) -> (sub x, -C) cases (#138705)
This is a follow-up to #137309, adding:
- multi-use of the constant with different adds
- vectors (vadd.vx -> vsub.vx)
2025-05-13 09:12:24 +02:00
Sam Elliott
c60db55568
[RISCV] TableGen-erate RISC-V SDNodes (#138381)
This commit moves RISC-V to auto-generate its target-specific SDNode
types. The biggest change is that SDNodes can now be validated against
their expected type profiles, and that we don't need to edit several
different files when declaring a new one.

This takes Sergei's work in #119709 and "finishes" it - by moving the
final five RISCVISD opcodes into tablegen (including defining their
types), and by ensuring the tablegen has expected closing scope
comments.

Co-authored-by: Sergei Barannikov <barannikov88@gmail.com>
2025-05-09 12:36:59 -07:00
Craig Topper
af45da1d32 [RISCV] Remove unused tablegen multiclass. NFC 2025-04-30 15:07:45 -07:00
Luke Lau
a2f00e1f8f
[RISCV] Add fixed-length patterns for disjoint or patterns for vwadd[u].v{v,x} (#136824)
This is the fixed-length equivalent of #136716.

The pattern we need to match is ({s,z}ext_vl (or_vl disjoint a, b)).
This only allows or_vls with an undef passthru, which allows us to
ignore its mask and vl and just take it from the {s,z}ext_vl.

A riscv_or_vl_is_add_oneuse PatFrag is added to mirror or_is_add in
RISCVInstrInfo.td.
2025-04-24 16:36:15 +08:00
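The reason a disjoint `or` under an extend can be treated as a widening add can be sketched on 8-bit elements widened to 16 bits. This is a Python model for illustration only; `zext8`, `sext8`, and `widen_or_matches_add` are hypothetical helper names.

```python
def zext8(v: int) -> int:
    """Zero-extend an 8-bit element."""
    return v & 0xFF

def sext8(v: int) -> int:
    """Sign-extend an 8-bit element to a Python int."""
    v &= 0xFF
    return v - 256 if v & 0x80 else v

def widen_or_matches_add() -> bool:
    """Check ext(a | b) == ext(a) + ext(b) for every disjoint 8-bit pair."""
    for a in range(256):
        for b in range(256):
            if a & b:
                continue  # only disjoint operands qualify
            ored = a | b
            # Unsigned widening add (vwaddu-like), result kept to 16 bits.
            if zext8(ored) != (zext8(a) + zext8(b)) & 0xFFFF:
                return False
            # Signed widening add (vwadd-like), compared modulo 2**16.
            if sext8(ored) & 0xFFFF != (sext8(a) + sext8(b)) & 0xFFFF:
                return False
    return True
```

Because the operands share no set bits, the addition produces no carries, and at most one operand can have the sign bit set, so both the zero-extended and sign-extended forms agree with the widened sum.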
Craig Topper
749535ba28
[RISCV] Use tablegen HasOneUse. NFC (#133974) 2025-04-01 15:51:39 -07:00
Philip Reames
fa315eceb7
[RISCV] Convert vsub.vx to vadd.vi if possible (#130669)
We'd already had this transform for the intrinsics, but hadn't added it
for either fixed length or scalable vectors coming from normal IR.

For the record, the fact we have three different sets of patterns here
really is quite ugly.
2025-03-10 16:20:14 -07:00
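The transform amounts to negating the splatted constant and checking that the result fits the 5-bit signed immediate of `vadd.vi`. A minimal sketch, assuming the usual simm5 range of -16 to 15; `fits_simm5` and `select_sub_splat` are hypothetical names, not functions from the backend.

```python
from typing import Tuple

def fits_simm5(value: int) -> bool:
    """True if value is representable as a 5-bit signed immediate."""
    return -16 <= value <= 15

def select_sub_splat(c: int) -> Tuple[str, int]:
    """Pick an instruction for (x - splat(c)).

    Subtracting c is the same as adding -c, so when -c fits simm5 the
    constant no longer needs to be materialized in a scalar register.
    """
    if fits_simm5(-c):
        return ("vadd.vi", -c)
    return ("vsub.vx", c)
```

Note the asymmetry: subtracting 16 qualifies (it becomes an add of -16), while subtracting -16 does not, since +16 is outside the immediate range.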
Craig Topper
9516f44f6b
[RISCV] Add policy operand to masked vector compare pseudos. Remove ForceTailAgnostic. NFC (#127575)
Add a policy operand to set the tail agnostic policy instead of using
ForceTailAgnostic. The masked to unmasked transforms had to be updated
to drop the policy operand when converting to unmasked.
2025-02-18 07:05:05 -08:00
Craig Topper
85f7ec12b8
[RISCV] Remove unneeded unmasked patterns for vcpop_v and riscv_vfirst_vl. (#127435)
The pseudos had RISCVMaskedPseudo added in #115162, so we are able to convert the
masked form to unmasked form automatically.
2025-02-17 09:50:01 -08:00
Luke Lau
cc7e83601d
[RISCV] Select mask operands as virtual registers and eliminate uses of vmv0 (#125026)
This is another attempt at #88496 to keep mask operands in SSA after
instruction selection.

Previously we selected the mask operands into vmv0, a singleton register
class with exactly one register, V0.

But the register allocator doesn't really support singleton register
classes and we ran into errors like "ran out of registers during
register allocation in function".

This avoids this by introducing a pass just before register allocation
that converts any use of vmv0 to a copy to $v0, i.e. what isel currently
does today.

That way the register allocator doesn't need to deal with the singleton
register class, but we get the benefits of having the mask registers in
SSA throughout the backend:

- This allows RISCVVLOptimizer to reduce the VLs of instructions that
define mask registers
- It enables CSE and code sinking in more places
- It removes the need to peek through mask copies in RISCVISelDAGToDAG
and keep track of V0 defs in RISCVVectorPeephole

This patch initially eliminates uses of vmv0s after RISCVVectorPeephole
to keep the diff to a minimum, and a follow up patch will move it past
the other MachineInstr SSA passes.

Note that it doesn't try to remove any defs of vmv0 as we shouldn't have
any instructions that have any vmv0 outputs.

As a further follow up, we can move the elimination pass to after phi
elimination and outside of SSA, which would unblock the pre-RA scheduler
around masked pseudos. This might also help the issue that
RISCVVectorMaskDAGMutation tries to solve.
2025-02-12 12:06:55 +08:00
Luke Lau
e42fdcb41f
[RISCV] Match widening fp instructions with same fpext used in multiple operands (#125803)
Because the fpext has a single use constraint on it we can't match cases
where it's used for both operands.

Introduce a new PatFrag that allows multiple uses on a single user and
use it for the binary patterns, and some ternary patterns.

(For some of the ternary patterns there is a fneg that counts as a
separate user; we still need to handle these.)
2025-02-11 01:11:44 +08:00
Luke Lau
51b0517a5e
[RISCV] Don't check extop VL in vfwred{u,o}sum patterns (#125799)
Because riscv_fpextend_vl doesn't have a passthru operand the tail
elements are undef, so we can treat them as if they were active.

Relaxing this allows us to match widening reductions where the fpextend
isn't a VP intrinsic.

This same reasoning is already used for riscv_fpextend_vl in
RISCVInstrInfoVSDPatterns.td
2025-02-05 13:01:01 +08:00
Luke Lau
0815b0e7ce
[RISCV] Don't custom lower direct fp_extends where possible (#125644)
This avoids lowering scalable fp_extends that don't need multiple
extends (i.e. f16->f32, f32->f64) to _vl nodes, but converts them back
during DAG preprocessing so we don't need to add any more patterns.

Keeping the nodes in their generic SDNode form matches more splat
patterns
2025-02-05 12:29:26 +08:00
Craig Topper
5cba1f123f
[RISCV] Simplify usage of SplatPat_simm5_plus1. NFC (#125340)
Make SplatPat_simm5_plus1 responsible for decrementing the immediate
instead of requiring DecImm SDNodeXForm to be used after. This allows
better sharing of tablegen classes.
2025-02-01 09:58:54 -08:00
Craig Topper
0c94915d34
[RISCV] Use _B* suffix for vector mask logic pseudo instructions. (#119787)
Replace LMUL suffixes with _B1, _B2, etc. This matches what we do
for other mask only instructions like VCPOP_M, VFIRST_M, VMSBF_M,
VLM, VSM, etc.

Now all pseudoinstructions that use Log2SEW=0 will be consistently
named.
2024-12-12 21:11:01 -08:00
Philip Reames
56cb5cbfcd
[RISCV] Remove RISCVISD::VNSRL_VL and adjust deinterleave lowering to match (#118391)
Instead of directly lowering to vnsrl_vl and having custom pattern
matching for that case, we can just lower to a (legal) shift and
truncate, and let generic pattern matching produce the vnsrl.

The major motivation for this is that I'm going to reuse this logic to
handle e.g. deinterleave4 w/ i8 result.

The test changes aren't particularly interesting. They're minor code
improvements - I think because we do slightly better with the
insert_subvector patterns, but that's mostly irrelevant.
2024-12-02 13:39:12 -08:00
Luke Lau
a7d1d381d2
[RISCV] Use integer VTypeInfo predicate for vmv_v_v_vl pattern (#114915)
When lowering fixed length f16 insert_subvector nodes at index 0 we
crashed with zvfhmin because we couldn't select vmv_v_v_vl.
This was due to the predicates requiring full zvfh, even though we only
need zve32x. Use the integer VTypeInfo instead similarly to
VPatSlideVL_VX_VI.

The extract_subvector tests aren't related but were just added for
consistency with the insert_subvector tests.
2024-11-05 13:01:36 +08:00
Craig Topper
55dbacbf07
[RISCV] Remove RISCVISD::VFCVT_X(U)_F_VL by using VFCVT_RM_X(U)_F_VL with DYN rounding mode. NFC (#114306) 2024-10-30 19:16:23 -07:00
Craig Topper
56dcfbef45
[RISCV] Remove duplicate vector conversion pseudos. (#114287)
These pseudos used to be handled by CustomInserter to insert the rounding
mode change for vector ceil, floor, etc. At some point they were changed
to use the InsertReadWriteCSR pass instead of the custom inserter. I believe
that makes them redundant with the pseudos used by the RVV intrinsics
with rounding mode operand.
2024-10-30 14:47:29 -07:00
Luke Lau
30f58ab17f
[RISCV] Lower vector_reverse for zvfhmin/zvfbfmin (#110218)
Previously we crashed because we had no lowering for f16/bf16 scalable
vectors.
Because the lowering uses vrgather_vv_vl, we need to add bf16 patterns
for it.
2024-10-02 14:25:15 +08:00
Craig Topper
8c17ed1512
[RISCV] Generalize RISCVDAGToDAGISel::selectFPImm to handle bitcasts from int to FP. (#108284)
selectFPImm previously handled cases where an FPImm could be
materialized in an integer register.

We can generalize this to cases where a value was in an integer register
and then copied to a scalar FP register to be used by a vector
instruction.

In the affected test, the call lowering code used up all of the FP
argument registers and started using GPRs. Now we use integer vector
instructions to consume those GPRs instead of moving them to scalar FP
first.
2024-09-11 21:13:26 -07:00
Luke Lau
480f07ff6c
[RISCV] Add fixed length vector patterns for vfwmaccbf16.vv (#108204)
This adds VL patterns for vfwmaccbf16.vv so that we can handle fixed
length vectors.

It does this by teaching combineOp_VLToVWOp_VL to emit
RISCVISD::VFWMADD_VL for bf16. The change in getOrCreateExtendedOp is
needed because getNarrowType is based off of the bitwidth so returns
f16. We need to explicitly check for bf16.

Note that the .vf patterns don't work yet, since the build_vector splat
gets lowered to a (vmv_v_x_vl (fmv_x_anyexth x)) instead of a vfmv.v.f,
which SplatFP doesn't pick up, see #106637.
2024-09-12 08:41:50 +08:00
Luke Lau
fdce0bfb7f
[RISCV] Add back missing vmv_v_x_vl pattern predicates (#101455)
Looks like these got left behind in
17e2d07ad15e02c9c757fdd4a532c43747ed8bf3
2024-08-01 15:25:36 +08:00
Luke Lau
b1542afd0b
[RISCV] Rename merge operand -> passthru. NFC (#100330)
We sometimes call the first tied dest operand in vector pseudos the
merge operand, and other times the passthru.

Passthru seems to be more common, and it's what the C intrinsics call
it[^1], so this renames all usages of merge to passthru to be
consistent. It also helps prevent confusion with vmerge.vvm in some of
the peephole optimisations.

[^1]:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#the-passthrough-vd-argument-in-the-intrinsics
2024-07-30 17:47:00 +08:00
Craig Topper
43de4e03a3
[RISCV] Rename hasVInstructionsBF16 to hasVInstructionsBF16Minimal. NFC (#101080)
This makes it more consistent with Zvfhmin since it is not a complete
bf16 implementation.
2024-07-29 21:55:42 -07:00
Luke Lau
e1065370aa
[RISCV] Remove vfmv.s.f and vfmv.f.s lmul pseudo variants (#100970)
In #71501 we removed the LMUL variants for vmv.s.x and vmv.x.s because
they ignore register groups, so this patch does the same for their
floating point equivalents.

We don't need to add any extra patterns for extractelt in
RISCVInstrInfoVSDPatterns.td because in lowerEXTRACT_VECTOR_ELT we make
sure that the node is narrowed down to LMUL 1.
2024-07-29 22:01:17 +08:00
Craig Topper
caaba2a883
[RISCV] Replace VNCLIP RISCVISD opcodes with TRUNCATE_VECTOR_VL_SSAT/USAT opcodes (#100173)
These new opcodes drop the shift amount, rounding mode, and passthru.
Making them exactly like TRUNCATE_VECTOR_VL. The shift amount, rounding
mode, and passthru are added in isel patterns similar to how we
translate TRUNCATE_VECTOR_VL to vnsrl with a shift of 0.

This should simplify #99418 a little.
2024-07-23 14:57:31 -07:00
Craig Topper
c67653fbc3
[RISCV] Support vXf16 vector_shuffle with Zvfhmin. (#97491)
We can shuffle vXf16 vectors just like vXi16 vectors. We don't need any
FP instructions. Update the predicates for vrgather and vslides patterns
to only check the predicates based on the equivalent integer type. If we
use the FP type it will check Zvfh and block Zvfhmin.

These are probably not the only patterns that need to be fixed, but the
test from the bug report no longer crashes.

Fixes #97477
2024-07-03 23:56:17 -07:00
Craig Topper
2ed2975e8b [RISCV] Add isel patterns for bf16 riscv_vfmv_v_f_vl of FP constant.
We try not to let bf16 splats through to isel, but constant folding
allows FP constants to get through. Thankfully we can handle those
using vmv.v.i or vmv.v.x.
2024-06-14 09:33:12 -07:00
Craig Topper
f83d5d293d [RISCV] Remove vfmerge.vf patterns with bf16 types.
These patterns are no longer used because we don't generate bf16
to vector splats except for constants that can be handled with
vmerge.vi.
2024-06-13 22:17:18 -07:00
Craig Topper
b95446286b [RISCV] Remove partially duplicate riscv_vfmv_v_f_vl patterns.
We had specific patterns for riscv_vfmv_v_f_vl in both RISCVInstrInfoVVLPatterns.td
and RISCVInstrInfoVSDPatterns.td.

The RISCVInstrInfoVSDPatterns.td patterns could only match if the
RISCVInstrInfoVVLPatterns.td failed. As far as I can tell this
would only happen if the predicate didn't match. Tweak the predicate
so the RISCVInstrInfoVVLPatterns.td can match in more cases.
2024-06-13 22:09:23 -07:00
Craig Topper
ae4677c81a
[RISCV] Remove support for vfmv.v.f with bf16 type. (#95352)
This isn't used by clang and isn't in the rvv-intrinsic-doc.

The instruction requires Zvfh.

If the F register passed to the instruction isn't nan-boxed correctly,
the instruction will generate the wrong nan. So the instruction isn't a
generic move FPR16 to vector register instruction.
2024-06-13 08:34:38 -07:00
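The NaN-boxing concern can be illustrated with a rough model. It assumes a 64-bit FP register, the RISC-V rule that a narrower value is valid only when all upper bits are 1, and `0x7E00` as the canonical half-precision NaN that an f16 instruction substitutes otherwise. The helper names are hypothetical, and the model ignores everything else the instruction does.

```python
CANONICAL_F16_NAN = 0x7E00  # canonical NaN in IEEE half precision

def unbox_f16(reg64: int) -> int:
    """Read a 16-bit value from a 64-bit FP register as an f16 op would.

    If the upper 48 bits are not all ones, the value is not properly
    NaN-boxed and is treated as the canonical f16 NaN.
    """
    if (reg64 >> 16) == (1 << 48) - 1:
        return reg64 & 0xFFFF
    return CANONICAL_F16_NAN

def bf16_is_nan(bits: int) -> bool:
    """bf16 layout: 1 sign bit, 8 exponent bits, 7 mantissa bits."""
    exponent = (bits >> 7) & 0xFF
    mantissa = bits & 0x7F
    return exponent == 0xFF and mantissa != 0
```

Under these assumptions, an improperly boxed register yields `0x7E00`, and `bf16_is_nan(0x7E00)` is false, so the substituted pattern does not even read as a NaN when reinterpreted as bf16.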
Jianjian GUAN
be18daad06 Reland "[RISCV] Support select/merge like ops for bf16 vectors when have Zvfbfmin" (#94565) 2024-06-07 15:55:16 +08:00
Mehdi Amini
8c452d0cc5
Revert "[RISCV] Support select/merge like ops for bf16 vectors when have Zvfbfmin" (#94565)
Reverts llvm/llvm-project#91936

Premerge bots are broken.
2024-06-05 21:13:52 -07:00
Jianjian Guan
d5ab38f69c
[RISCV] Support select/merge like ops for bf16 vectors when have Zvfbfmin (#91936) 2024-06-06 10:33:54 +08:00
Craig Topper
8a8cd8a766
[RISCV] Move vnclip patterns into DAGCombiner. (#93728)
Similar to #93596, this moves the signed vnclip patterns into DAG
combine.
    
This will allow us to support more than 1 level of truncate in a
future patch.
2024-05-29 16:46:36 -07:00
Craig Topper
ec8fe598a9
[RISCV] Move vnclipu patterns into DAGCombiner. (#93596)
I plan to add support for multiple layers of vnclipu. For example,
i32->i8 using 2 vnclipu instructions. First clipping to 65535, then
clipping to 255. Similar for signed vnclip.
    
This scales poorly if we need to add patterns with 2 or 3 truncates.
Instead, move the code to DAGCombiner with new ISD opcodes to represent
VNCLIP(U).
    
This patch just moves the existing patterns into DAG combine. Support
for multiple truncates will come as a follow up. A similar patch series will
be made for the signed vnclip.
2024-05-29 13:00:15 -07:00
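The two-step i32->i8 example can be modelled as successive unsigned saturations. This sketch treats each `vnclipu` as a clamp with a shift amount of 0 and ignores rounding modes; `vnclipu_step` and `usat_i32_to_i8` are illustrative names only.

```python
def vnclipu_step(x: int, out_bits: int) -> int:
    """Unsigned saturating narrow with shift 0: clamp x to out_bits bits."""
    return min(x, (1 << out_bits) - 1)

def usat_i32_to_i8(x: int) -> int:
    """Narrow an unsigned 32-bit value to 8 bits in two halving steps.

    The first step clips to 65535 (i32 -> i16), the second to 255
    (i16 -> i8). The composition equals a single clamp to 255.
    """
    return vnclipu_step(vnclipu_step(x, 16), 8)
```

Since clamping to 65535 never raises a value and never lowers one that was already at most 255, the chained result matches `min(x, 255)` for every unsigned input.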
Craig Topper
f9278d61ba [RISCV] Fix tablegen indentation. NFC 2024-05-26 10:40:01 -07:00
Craig Topper
6246b495ad
[RISCV] Select ISD::AVGCEILS/AVGFLOORS as vaadd. (#92839)
I think the behaviors are the same, assuming the following describes them correctly.

AVGFLOORS sign extends the inputs by 1 bit, adds them, then does an
arithmetic shift right by 1 before truncating to the original bit width.
This is vaadd with rdn rounding mode.

AVGCEILS sign extends the inputs by 1 bit, adds them, then does an
arithmetic shift right by 1. If the bit shifted out is 1, it adds 1 to
the shifted value. Then truncates to the original bit width. This is vaadd
with rnu rounding mode.

I think this wasn't implemented previously because there was some
confusion about what average means. Some may expect average to round
towards zero, but there is no way to do that in RISC-V or with the
SelectionDAG nodes. Related issue
https://github.com/riscv/riscv-v-spec/issues/935
2024-05-20 23:24:22 -07:00
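The claimed equivalence can be checked on 8-bit elements. The sketch below defines the averages mathematically (floor and ceiling of the exact sum over two) and compares them with a model of `vaadd` that shifts the widened sum right by one and adds a rounding increment. It assumes the RVV fixed-point rounding rules in which `rnu` adds the bit shifted out and `rdn` adds nothing; all function names are my own.

```python
def avgfloors(a: int, b: int) -> int:
    """Signed average rounded toward negative infinity."""
    return (a + b) // 2

def avgceils(a: int, b: int) -> int:
    """Signed average rounded toward positive infinity."""
    return -((-(a + b)) // 2)

def vaadd(a: int, b: int, vxrm: str) -> int:
    """Model of vaadd on sign-extended inputs with a 1-bit right shift."""
    total = a + b                 # exact sum, as if widened by one bit
    if vxrm == "rnu":
        increment = total & 1     # add the bit that is shifted out
    elif vxrm == "rdn":
        increment = 0             # truncate: plain arithmetic shift
    else:
        raise ValueError("only rnu and rdn are modelled here")
    return (total >> 1) + increment

def check_int8() -> bool:
    """Compare both averages against the vaadd model for all int8 pairs."""
    values = range(-128, 128)
    return all(
        vaadd(a, b, "rdn") == avgfloors(a, b)
        and vaadd(a, b, "rnu") == avgceils(a, b)
        for a in values
        for b in values
    )
```

Python's `>>` on negative integers is an arithmetic shift, which is what makes `rdn` round toward negative infinity rather than toward zero, matching the point in the message about what "average" means here.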
Jianjian Guan
37fcb323f6
[RISCV] Add codegen support for Zvfbfmin (#87911)
This patch adds basic codegen support for Zvfbfmin extension.
2024-05-07 10:25:06 +08:00
Chia
0afc884e87
[RISCV] Use vnclip for scalable vector saturating truncation. (#88648)
Similar to #75145, but for scalable vectors.

Specifically, this patch works for the below optimization case:

## Source Code
```
define void @trunc_sat_i8i16_maxmin(ptr %x, ptr %y) {
  %1 = load <vscale x 4 x i16>, ptr %x, align 16
  %2 = tail call <vscale x 4 x i16> @llvm.smax.v4i16(<vscale x 4 x i16> %1, <vscale x 4 x i16> splat (i16 -128))
  %3 = tail call <vscale x 4 x i16> @llvm.smin.v4i16(<vscale x 4 x i16> %2, <vscale x 4 x i16> splat (i16 127))
  %4 = trunc <vscale x 4 x i16> %3 to <vscale x 4 x i8>
  store <vscale x 4 x i8> %4, ptr %y, align 8
  ret void
}
```
## Before this patch
[Compiler Explorer](https://godbolt.org/z/EKc9eGvo8)
```
trunc_sat_i8i16_maxmin:
        vl1re16.v       v8, (a0)
        li      a0, -128
        vsetvli a2, zero, e16, m1, ta, ma
        vmax.vx v8, v8, a0
        li      a0, 127
        vmin.vx v8, v8, a0
        vsetvli zero, zero, e8, mf2, ta, ma
        vnsrl.wi        v8, v8, 0
        vse8.v  v8, (a1)
        ret
```
## After this patch
```
trunc_sat_i8i16_maxmin:
        vsetivli zero, 4, e8, mf4, ta, ma
        vle16.v v8, (a0)
        vnclip.wi v8, v8, 0
        vse8.v v8, (a1)
        ret
```
2024-04-18 13:45:51 +09:00
Michael Maitland
1b310c45e9 [RISCV] Split PseudoVFMIN, PseudoVFMAX, PseudoVFSGNJ, PseudoVFSGNJN, and PseudoVFSGNJX by SEW 2024-04-15 06:09:14 -07:00
Michael Maitland
469493f556 [RISCV] Split narrowing convert to FP pseudos by SEW 2024-04-15 06:08:56 -07:00
Michael Maitland
60a1158f31 [RISCV] Split Widening convert to FP pseudos by SEW 2024-04-15 06:08:52 -07:00
Michael Maitland
2e0e3b0f10 [RISCV] Split single width convert to FP pseudos by SEW 2024-04-15 06:08:49 -07:00
Michael Maitland
43248ffea7 [RISCV] Split widening floating point fused multiply-add pseudo instructions by SEW
Co-authored-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
2024-04-12 07:06:40 -07:00