2702 Commits

Author SHA1 Message Date
Craig Topper
a64b3e92c7 [RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types.
Previously we returned i32 on RV32 and i64 on RV64. The instructions
only consume 32 bits and only produce 32 bits. For RV64, the result
is sign extended to 64 bits like *W instructions.
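
As a rough illustration of the paragraph above (not part of the patch), the sketch below models `sha256sig0` in plain C and shows what an RV64 register would hold after sign extension. The function names `sha256sig0_model` and `rv64_register_view` are made up for this example; only the sigma0 formula itself comes from SHA-256.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (0 < n < 32). */
static uint32_t ror32(uint32_t x, unsigned n) {
    assert(n > 0 && n < 32);
    return (x >> n) | (x << (32u - n));
}

/* Hypothetical software model of sha256sig0 (SHA-256 sigma0).
   It consumes 32 bits and produces 32 bits, whatever XLEN is. */
uint32_t sha256sig0_model(uint32_t x) {
    return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3);
}

/* Hypothetical view of the RV64 destination register: the 32-bit
   result sign extended to 64 bits, as *W instructions do. */
int64_t rv64_register_view(uint32_t result) {
    return (int64_t)(int32_t)result;
}
```

With an i32 interface the caller only ever sees the 32-bit value; the sign extension stays an implementation detail of the RV64 register.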

This patch removes this detail from the interface to improve
portability and consistency. This matches the proposal for scalar
intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44

I've included IR autoupgrade support as well.

I'll be doing this for other builtins/intrinsics that currently use
'long' in other patches.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D154647
2023-07-17 08:58:29 -07:00
Craig Topper
fda45d9198 [RISCV] Add FP compare test to condops.ll to show a missed opportunity to remove an xori. NFC
This is a case that D155288 won't get.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D155327
2023-07-17 08:47:42 -07:00
Luke Lau
b5bcd4f60b [RISCV] Add VL nodes and VP patterns for unary zvbb instructions
This follows the pattern of lowering VP nodes to equivalent
RISCVISD::*_VL nodes. The nodes are modelled after the VP ISD nodes rather
than the actual zvbb instructions, and I've included a merge operand to be
consistent with the underlying pseudos that were recently refactored.

I've defined the nodes in RISCVInstrInfoVVLpatterns.td as the nodes aren't Zvk
specific, but the patterns are in RISCVInstrInfoZvk.td.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155229
2023-07-17 09:17:58 +01:00
Piyou Chen
7ce4e933ea [RISCV] Implement prefetch locality by NTLH
We add the MemOperand so that the backend generates NTLH automatically.

```
__builtin_prefetch(ptr,  0 /* rw==read */, 0 /* locality */); => ntl.all + prefetch.r (ptr)
__builtin_prefetch(ptr,  0 /* rw==read */, 1 /* locality */); => ntl.pall + prefetch.r (ptr)
__builtin_prefetch(ptr,  0 /* rw==read */, 2 /* locality */); => ntl.p1 + prefetch.r (ptr)
__builtin_prefetch(ptr,  0 /* rw==read */, 3 /* locality */); => prefetch.r (ptr)
```
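
The mapping above can be restated as a small lookup. This is only a sketch of the table in this message, not the backend code; `ntl_hint_for_locality` and the enum names are hypothetical.

```c
#include <assert.h>

/* NTL hint emitted before prefetch.r, following the table above.
   NTL_NONE means no hint is emitted (locality 3). */
enum ntl_hint { NTL_ALL, NTL_PALL, NTL_P1, NTL_NONE };

/* Hypothetical helper: map the __builtin_prefetch locality argument
   (0..3) to the NTL hint listed in the commit message. */
enum ntl_hint ntl_hint_for_locality(int locality) {
    assert(locality >= 0 && locality <= 3);
    switch (locality) {
    case 0:  return NTL_ALL;   /* ntl.all  + prefetch.r */
    case 1:  return NTL_PALL;  /* ntl.pall + prefetch.r */
    case 2:  return NTL_P1;    /* ntl.p1   + prefetch.r */
    default: return NTL_NONE;  /* prefetch.r only */
    }
}
```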

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154691
2023-07-16 20:32:46 -07:00
Jim Lin
348c67e254 [RISCV] Merge rv32/rv64 vector single-width shift intrinsic tests that have the same content. NFC.
2023-07-16 13:09:10 +08:00
Nitin John Raj
6a35ceaacf [RISCV][GlobalISel] Legalize add, sub and binary logical instructions for narrow types
For rv32, we test the legalization of i8, i16 and i32. For rv64, we additionally test the legalization of i64.

This is the first of a series of commits aiming to legalize arithmetic instructions for RISCV.

Reviewed By: craig.topper, arsenm

Differential Revision: https://reviews.llvm.org/D154978
2023-07-14 18:22:53 -07:00
Mikhail Gudim
c158ddd99e Reapply [RISCV] Fold binary op into select if profitable.
This fixes some bugs in the original commit:
  (1) Operands are passed in correct order when creating new constant
  and the binary operator. New tests were added to cover these cases.
  (2) Check was added to see if it is safe to commute the select and the binary operator.

Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D152147
2023-07-14 15:30:54 -04:00
Craig Topper
3a0a25f9b6 [RISCV] Support i32 clmul* intrinsics on RV64.
We can use an i64 clmul to emulate i32 clmul.
For clmulh and clmulr we need to zero extend the 32 bit input
to 64 bits then extract either bits [63:32] or [62:31].
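
A minimal C sketch of the emulation described above, assuming a software model of the 64-bit carry-less multiply in place of the real instruction. The names `clmul64_model`, `bits32` and the `*_via64` functions are invented for this example. It relies on the fact that the carry-less product of two zero-extended 32-bit values has degree at most 62, so it fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of a 64-bit carry-less multiply (low 64 bits). */
static uint64_t clmul64_model(uint64_t a, uint64_t b) {
    uint64_t r = 0;
    for (int i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a << i;
    return r;
}

/* Extract the 32-bit field [hi:lo] of p. */
static uint32_t bits32(uint64_t p, unsigned hi, unsigned lo) {
    assert(hi < 64 && hi - lo == 31);
    return (uint32_t)(p >> lo);
}

/* i32 clmul: low 32 bits of the product. */
uint32_t clmul32_via64(uint32_t a, uint32_t b) {
    return bits32(clmul64_model((uint64_t)a, (uint64_t)b), 31, 0);
}

/* i32 clmulh: zero extend the inputs, then take bits [63:32]. */
uint32_t clmulh32_via64(uint32_t a, uint32_t b) {
    return bits32(clmul64_model((uint64_t)a, (uint64_t)b), 63, 32);
}

/* i32 clmulr: zero extend the inputs, then take bits [62:31]. */
uint32_t clmulr32_via64(uint32_t a, uint32_t b) {
    return bits32(clmul64_model((uint64_t)a, (uint64_t)b), 62, 31);
}
```

The `(uint64_t)` casts stand in for the zero extends mentioned in the next paragraph.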

Unfortunately, without Zba we need to use 2 shifts for the
zero extends. These can be optimized out later if the producing
instruction already zeroed the upper bits or if we can use lwu.

There are alternative sequences we can use for clmulh/clmulr
when the zero extend isn't free, but those are best handled by
a DAG combine to give the best opportunity for removing the extend.

This allows us to implement i32 clmul C intrinsics proposed in
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154729
2023-07-14 11:20:03 -07:00
Simon Cook
4083ecfd7f [RISCV] Cleanups in CORE-V (xcv) extensions
This is a mostly NFC change cleaning up and clarifying components of the
in-tree CORE-V (xcv*) extensions following discussions on the remaining
extensions.

This makes the following changes to the xcvbitmanip and xcvmac support:

1. Add missing extensions from RISCVISAInfo, such that they can be
   supported in clang's -march option.
2. Clarify the extension version number is 1.0.0 in documentation.
3. Clarify the extensions are by OpenHW Group, and the capitalization
   of the CORE-V extension family.
4. Add CORE-V to extension name in RISCVFeatures, both to be consistent
   with other vendors, and also better distinguish e.g. CORE-V bit
   manipulation vs RISC-V's standard Zb extensions.

Differential Revision: https://reviews.llvm.org/D155283
2023-07-14 18:21:08 +01:00
Alex Bradbury
95075d3d2c [RISCV][test] Add RV32I and RV64I RUN lines to condops.ll test
Some of these test cases will be changed by upcoming combines, even in
the non-zicond case.
2023-07-14 13:29:40 +01:00
Alex Bradbury
5c5a1a2927 [RISCV] Introduce RISCVISD::CZERO_{EQZ,NEZ} nodes and produce them in lowerSELECT when zicond is present
This patch is a step towards altering how we handle the emission of
condops. Marking ISD::SELECT as legal is a major change in the codegen
path, and gives few options for maintaining the old codegen path when
it is believed to be better (e.g. a better branchless sequence is
possible using non-zicond instructions, or the branch-based sequence is
preferable).

This removes the existing SelectionDAG patterns and moves the logic into
lowerSELECT. Along with some small codegen changes you'll note a few minor
regressions in the generated code quality - these are due to the fact
that by lowering the SELECT node early we miss out on combines that
would kick in later when setcc condcodes that aren't natively supported
have been expanded (thus exposing opportunities for optimisation by
performing logical negation and swapping truev/falsev). I've opted to
split out work that addresses these into follow-on patches (especially
as zicond is still 'experimental').

matchSetCC is a straight-forward translation from the version in
RISCVISelDAGToDAG. Ideally, in the future it can be converted to a
helper shared between both files.

Differential Revision: https://reviews.llvm.org/D155083
2023-07-14 11:31:27 +01:00
Yeting Kuo
2ac99205ee [RISCV] Narrow types of index operand matched pattern (shl (zext), C).
(shl (zext to iXLenVec), C) is a possible pattern in auto-vectorized code for
indexed loads/stores. But extending to iXLen might be too aggressive, since RVV
indexed load/store instructions zero extend their index operand to XLEN.
This patch tries to narrow the type of the zero extension, which helps
reduce register pressure.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154687
2023-07-14 15:45:44 +08:00
Craig Topper
0aecddcee9 [RISCV] Add Zce extension.
According to the spec, Zce is an alias for Zca, Zcb, Zcmp, and Zcmt.
If F is enabled on RV32 it also includes Zcf.
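
The implication rule described above can be sketched as follows. This is an illustration only, not the RISCVISAInfo code; the bit flags and `expand_zce` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit flags for the extensions named in this message. */
enum {
    EXT_ZCA  = 1u << 0,
    EXT_ZCB  = 1u << 1,
    EXT_ZCMP = 1u << 2,
    EXT_ZCMT = 1u << 3,
    EXT_ZCF  = 1u << 4
};

/* Expand Zce into the extensions it stands for. Zcf is added only
   when F is enabled on RV32, which is the custom case. */
unsigned expand_zce(unsigned xlen, bool has_f) {
    assert(xlen == 32 || xlen == 64);
    unsigned exts = EXT_ZCA | EXT_ZCB | EXT_ZCMP | EXT_ZCMT;
    if (xlen == 32 && has_f)
        exts |= EXT_ZCF;
    return exts;
}
```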

This patch adds the Zce and the implication rule which unfortunately
requires custom handling for adding Zcf.

I've also made all the Zc* extensions imply Zca.

I've also added an error for Zcf without RV32.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D153742
2023-07-13 12:22:06 -07:00
Luke Lau
55e2772e9f [RISCV] Add initial SDNode patterns for unary zvbb instructions
This patch adds pseudos and SDNode patterns for vbrev.v, vrev8.v, vclz.v,
vctz.v and vcpop.v.
I've only added them for integer element types so far since we're lacking tests
for floats.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155216
2023-07-13 19:39:04 +01:00
Luke Lau
ed15e9119b [RISCV] Don't fold vmerge into ops if fp exception can be raised
We are already checking for fp exceptions if VL changes, but I believe we
should also be checking for them if the mask changes as well, since that also
affects the set of active elements. From the spec:
> A vector floating-point exception at any active floating-point element sets
> the standard FP exception flags in the fflags register. Inactive elements do
> not set FP exception flags.

Note that we don't change the mask if IsMasked is true, i.e. True is masked
already, since in that case we keep the existing mask.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154980
2023-07-13 11:42:23 +01:00
Luke Lau
becfb4612a [RISCV] Add test for vmerge combine that should be prevented
The fadd in these test cases is constrained and may set fflags differently
depending on the active elements (the nofpexcept flag isn't set on the node).
Therefore to preserve semantics we shouldn't change its mask.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154979
2023-07-13 11:42:20 +01:00
eopXD
2c38d63323 [8/8][RISCV] Add rounding mode control variant for vfredosum, vfredusum, vfwredosum, vfwredusum
Depends on D154635

For the cover letter of the patch-set, please check out D154628.

This is the 8th patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154636
2023-07-13 00:55:10 -07:00
eopXD
5d18d43f26 [7/8][RISCV] Add rounding mode control variant for conversion intrinsics between floating-point and integer
Depends on D154634

For the cover letter of the patch-set, please check out D154628.

This is the 7th patch of the patch-set. This patch includes changes to
vfcvt_x_f, vfcvt_xu_f, vfwcvt_x_f, vfwcvt_xu_f, vfncvt_x_f, vfncvt_xu_f,
vfcvt_f_x, vfcvt_f_xu, vfncvt_f_x, vfncvt_f_xu, vfncvt_f_f.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154635
2023-07-13 00:54:07 -07:00
eopXD
51b9e33661 [6/8][RISCV] Add rounding mode control variant for vfsqrt, vfrec7
Depends on D154633

For the cover letter of the patch-set, please check out D154628.

This is the 6th patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154634
2023-07-13 00:51:51 -07:00
eopXD
4085b23609 [5/8][RISCV] Add rounding mode control variant for vfwmacc, vfwnmacc, vfwmsac, vfwnmsac
Depends on D154632

For the cover letter of the patch-set, please check out D154628.

This is the 5th patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154633
2023-07-13 00:49:59 -07:00
eopXD
e1f224a647 [4/8][RISCV] Add rounding mode control variant for vfmacc, vfnmacc, vfmsac, vfnmsac, vfmadd, vfnmadd, vfmsub, vfnmsub
Depends on D154631

For the cover letter of the patch-set, please check out D154628.

This is the 4th patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154632
2023-07-13 00:47:27 -07:00
eopXD
1a905e8238 [3/8][RISCV] Add rounding mode control variant for vfmul, vfdiv, vfrdiv, vfwmul
Depends on D154629

For the cover letter of the patch-set, please check out D154628.

This is the 3rd patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154631
2023-07-13 00:43:54 -07:00
eopXD
00093667b1 [2/8][RISCV] Add rounding mode control variant for vfwadd, vfwsub
Depends on D154628

For the cover letter of the patch-set, please check out D154628.

This is the 2nd patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154629
2023-07-13 00:42:00 -07:00
eopXD
474e37c113 [1/8][RISCV] Add rounding mode control variant for vfsub, vfrsub
Depends on D152996.

This patch-set aims to add a variant for the RVV floating-point
intrinsics that controls the rounding mode (`frm`). The rounding mode
variant appends `_rm` before the policy suffix to distinguish from
those without them.

Specification PR: riscv-non-isa/rvv-intrinsic-doc#226

This is the 1st patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154628
2023-07-13 00:35:36 -07:00
eopXD
76482078cd [RISCV][POC] Model frm control for vfadd
Depends on D152879.

Specification PR: riscv-non-isa/rvv-intrinsic-doc#226

This patch adds a variant of `vfadd` that models the rounding mode control.
The added variant has the suffix `_rm` appended to differentiate it from the
existing ones, which do not alter `frm` and use whatever value it holds.

The value `7` is used to indicate no rounding mode change, reusing the
semantics of the rounding mode encoding for scalar floating-point
instructions.
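
A small sketch of that convention, using the scalar rounding mode encodings from the RISC-V F extension. The helper `effective_frm` and the enum names are hypothetical and only model which rounding mode ends up in effect; they are not the pseudo instruction handling from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar floating-point rounding mode encodings. */
enum frm_value {
    FRM_RNE = 0,  /* round to nearest, ties to even */
    FRM_RTZ = 1,  /* round towards zero */
    FRM_RDN = 2,  /* round down */
    FRM_RUP = 3,  /* round up */
    FRM_RMM = 4,  /* round to nearest, ties to max magnitude */
    FRM_DYN = 7   /* dynamic: use whatever frm currently holds */
};

/* Hypothetical helper: rounding mode in effect for an `_rm` variant.
   An operand of 7 leaves the current frm untouched. */
uint8_t effective_frm(uint8_t current_frm, uint8_t rm_operand) {
    assert(rm_operand <= FRM_RMM || rm_operand == FRM_DYN);
    return rm_operand == FRM_DYN ? current_frm : rm_operand;
}
```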

An additional data member `HasFRMRoundModeOp` is added so we can append
the `_rm` suffix for the fadd variants that model rounding mode control.

An additional data member `IsRVVFixedPoint` is added so we can define
pseudo instructions with a rounding mode operand and distinguish
between fixed-point and floating-point instructions.

Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D152996
2023-07-13 00:34:00 -07:00
Philip Reames
b5cbd9628e [RISCV] Remove legacy TA/TU pseudo distinction of vmerge and carry-in arithmetic operations [NFC]
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.

This is analogous to other patches in the series, but with one key difference - the resulting pseudo does *not* have a policy operand. We could add one for vmerge, but some of the multiclasses are sufficiently entwined with the mask producing arithmetic instructions that the change delta becomes unmanageable. Note that these instructions are *not* in the RISCVMaskedPseudo table, and thus the difference doesn't complicate other code. The main value of working incrementally here is that we get to eagerly clean up the IsTA logic flowing through the post-ISEL combines.

Differential Revision: https://reviews.llvm.org/D154645
2023-07-12 15:31:02 -07:00
Noah Goldstein
74f0ec5e24 [DAGCombiner] Make it so that udiv can be folded with (select c, NonZero, 1)
This is done by allowing speculation of `udiv` if we can prove the
denominator is non-zero.
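
One way to read the fold, sketched in C under the assumption that the divisor arm is known to be non-zero. Both function names are hypothetical and this is not the DAGCombiner code; the point is only that the division can run unconditionally because it can never trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Before the fold: udiv x, (select c, NonZero, 1). */
uint32_t udiv_by_select(uint32_t x, bool c, uint32_t nonzero) {
    assert(nonzero != 0);
    uint32_t d = c ? nonzero : 1u;
    return x / d;
}

/* After the fold: select c, (udiv x, NonZero), x.
   The udiv is speculated, i.e. it executes even when c is false,
   which is only safe because the denominator is known non-zero. */
uint32_t select_of_udiv(uint32_t x, bool c, uint32_t nonzero) {
    assert(nonzero != 0);
    uint32_t q = x / nonzero;
    return c ? q : x;  /* x / 1 == x */
}
```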

https://alive2.llvm.org/ce/z/VNCt_q

Differential Revision: https://reviews.llvm.org/D149198
2023-07-12 17:17:53 -05:00
Mikhail Gudim
17e2df6695 [RISCV] Removed the requirement of XLenVT for performSELECTCombine.
Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D153044
2023-07-12 16:29:09 -04:00
Craig Topper
1aecb0e000 [RISCV] Clear kill flags when forming FMA instructions in MachineCombiner.
If the operands to the mul have other uses we may be extending their
live range past a kill flag.

Reviewed By: asb, asi-sc

Differential Revision: https://reviews.llvm.org/D155046
2023-07-12 08:03:45 -07:00
Craig Topper
45b172c838 [LegalizeDAG] Prevent LegalizeLoadOps from creating extloads that mix int and fp types.
For RISC-V, getRegisterType for fp16 returns i16. i16->fp64 extload
is considered legal because the LoadExtActions defaults to Legal
for all entries. Only fp/fp and int/int entries are changed to
Expand for RISC-V.

This patch detects when the FP-ness has changed and, in that case, does
not call isLoadExtLegal.

Alternatively, we could add Expand for int/fp and fp/int, but that
seemed a little silly.

Fixes #63816

Reviewed By: asb, wangpc

Differential Revision: https://reviews.llvm.org/D155040
2023-07-12 08:03:35 -07:00
Philip Reames
5cd41dc62d [RISCV] Remove legacy TA/TU pseudo distinction for binary instructions
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.

This change handles most of the binary pseudos. I excluded pseudos which have _TIED variants, and those that produce mask results. Both are a bit different in functionality, and deserve their own change and review. As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand.

As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.

Differential Revision: https://reviews.llvm.org/D154245
2023-07-11 10:21:42 -07:00
Jim Lin
b515133088 [RISCV] Merge rv32/rv64 vector reduction intrinsic tests that have the same content. NFC.
2023-07-11 19:04:56 +08:00
Jim Lin
0fd212c91d [RISCV] Merge rv32/rv64 vector widening intrinsic tests that have the same content. NFC.
2023-07-11 16:25:12 +08:00
Zi Xuan Wu (Zeson)
2ccb2dbc8d [RISCV] Don't fold RISCVISD::VMV_V_X_VL series node and scalar load to vector load when scalar load is update load
We try to fold RISCVISD::VMV_V_X_VL series node + scalar load -> vector load.
But if the scalar load is an indexed load (load update form), the fold is not profitable because the load update node can't be removed afterwards.

Differential Revision: https://reviews.llvm.org/D152222
2023-07-11 15:56:31 +08:00
wangpc
99809f4377 [RISCV] Simplify the definitions of interrupt CSRs
For `CSR_Interrupt`, we can generate the register list via a single
`sequence`.

For `CSR_XLEN_F32_Interrupt` and `CSR_XLEN_F64_Interrupt`, I don't
see the reason why we need to keep the order the same as how we used
to allocate registers (and we have changed the order in D146488), so
I fold them into one `sequence`.

There are some *.ll changes because of the order change.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154837
2023-07-11 11:20:24 +08:00
Piyou Chen
299b2c2d93 [RISCV] precommit for prefetch locality support
Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154690
2023-07-10 20:07:18 -07:00
Craig Topper
dbd47c4489 [RISCV] Don't allow X0 to be used for 'r' constraint in inline assembly
Some instructions treat x0 as a special encoding rather than as a
value of 0. Since we don't parse the inline assembly to know what
the instruction is, choose the safest option of never using x0.

Fixes #63747.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154744
2023-07-10 13:25:17 -07:00
Alex Bradbury
29f630a1dd [RISCV][MC] MC layer support for the experimental zacas extension
This implements the v1.0-rc1 draft extension.

amocas.d on RV32 and amocas.q have the restriction that rd and rs2 must
be even registers. I've opted to implement this restriction in
RISCVAsmParser::validateInstruction even though for codegen we'll need a
new register class and can then remove this validation. This also
sidesteps, for now, the issue of amocas.d being different on rv32 vs
rv64.

See <https://github.com/riscv-non-isa/riscv-c-api-doc/issues/37> for the
issue of needing an agreed asm register constraint for register pairs.

Differential Revision: https://reviews.llvm.org/D149248
2023-07-10 08:26:31 +01:00
LiaoChunyu
1575063db2 [RISCV] Match shl_vl (ext_vl v, splat 1) to vwadd_vl
Similar to D153112, match shl (v, splat 1) to vwadd.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154726
2023-07-08 08:03:15 +08:00
Luke Lau
02bb33c3ce [RISCV] Check for alignment when lowering interleaved/deinterleaved loads/stores
As noted by @reames, we should be checking that the memory access is aligned to
the element size (or the unaligned vector memory access feature is enabled)
before lowering vlseg/vsseg intrinsics via the interleaved access pass.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154536
2023-07-07 15:34:24 +01:00
Luke Lau
18013bea46 [RISCV] Add tests for unaligned segmented loads and stores
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154535
2023-07-07 15:34:22 +01:00
WuXinlong
c0221e006d [RISCV] Add a pass to combine cm.pop and ret insts
`RISCVPushPopOptimizer.cpp` combines `cm.pop` and `ret` to generate `cm.popretz` or `cm.popret`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D150416
2023-07-07 14:04:11 +08:00
Jim Lin
43927542d8 [RISCV] Rename prefix fixed-vector to fixed-vectors to match other test cases. NFC.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154679
2023-07-07 13:04:00 +08:00
Craig Topper
a403124998 [RISCV] Don't sink i1 vectors in shouldSinkOperands.
These can't create .vx instructions so there's no reason to sink them.
2023-07-06 20:36:55 -07:00
WuXinlong
6269ed24cf [RISCV] Readjusting the framestack for Zcmp
This patch readjusts the frame stack for the push and pop instructions

co-author: @Lukacma

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134599
2023-07-07 11:24:21 +08:00
Craig Topper
be253cb987 [RISCV] Support i32 brev8 intrinsic on RV64.
Similar to what we do for orc.b. Another patch will expose this
as a builtin in clang.
2023-07-06 17:24:53 -07:00
Alex Bradbury
619c6c0e38 [RISCV][test] Add RV32I and RV64I RUN lines to llvm.frexp.ll
Thanks to D154555, these intrinsics no longer crash when used with a
soft float ABI.
2023-07-06 13:36:03 +01:00
Jianjian GUAN
a813a633d5 [RISCV][NFC] Use common prefix to simplify test.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154487
2023-07-06 11:52:51 +08:00
Craig Topper
ee34fa0032 [RISCV] Add DAG combine for (fmv_w_x_rv64 (fmv_x_anyextw_rv64 X))
This pattern started showing up more after D151284
2023-07-05 19:35:13 -07:00
Matt Arsenault
e8ed6e35bd DAG: Implement soften float for ffrexp
Fixes #63661

https://reviews.llvm.org/D154555
2023-07-05 21:42:27 -04:00