Previously we returned i32 on RV32 and i64 on RV64. The instructions
only consume 32 bits and only produce 32 bits. For RV64, the result
is sign extended to 64 bits like *W instructions.
This patch removes this detail from the interface to improve
portability and consistency. This matches the proposal for scalar
intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44
I've included IR autoupgrade support as well.
I'll be doing this for other builtins/intrinsics that currently use
'long' in other patches.
Reviewed By: VincentWu
Differential Revision: https://reviews.llvm.org/D154647
This follows the pattern of lowering VP nodes to equivalent
RISCVISD::*_VL nodes. The nodes are modelled after the VP ISD nodes rather
than the actual zvbb instructions, and I've included a merge operand to be
consistent with the underlying pseudos that were recently refactored.
I've defined the nodes in RISCVInstrInfoVVLpatterns.td as the nodes aren't Zvk
specific, but the patterns are in RISCVInstrInfoZvk.td.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155229
For rv32, we test the legalization of i8, i16 and i32. For rv64, we additionally test the legalization of i64.
This is the first of a series of commits aiming to legalize arithmetic instructions for RISCV.
Reviewed By: craig.topper, arsenm
Differential Revision: https://reviews.llvm.org/D154978
This fixes some bugs in the original commit:
(1) Operands are passed in correct order when creating new constant
and the binary operator. New tests were added to cover these cases.
(2) Check was added to see if it is safe to commute the select and the binary operator.
Reviewed By: Craig Topper
Differential Revision: https://reviews.llvm.org/D152147
We can use an i64 clmul to emulate i32 clmul.
For clmulh and clmulr we need to zero extend the 32 bit input
to 64 bits then extract either bits [63:32] or [62:31].
Unfortunately, without Zba we need to use 2 shifts for the
zero extends. These can be optimized out later if the producing
instruction already zeroed the upper bits or if we can use lwu.
There are alternative sequences we can use for clmulh/clmulr
when the zero extend isn't free, but those are best handled by
a DAG combine to give the best opportunity for removing the extend.
This allows us to implement i32 clmul C intrinsics proposed in
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D154729
This is a mostly NFC change cleaning up and clarifying components of the
in-tree CORE-V (xcv*) extensions following discussions on the remaining
extensions.
This makes the following changes to the xcbitmanip and xcvmac support:
1. Add missing extensions from RISCVISAInfo, such that they can be
supported in clang's -march option.
2. Clarify the extension version number is 1.0.0 in documentation.
3. Clarify the extensions are by OpenHW Group, and the capitilization
of the CORE-V extension family.
4. Add CORE-V to extension name in RISCVFeatures, both to be consistent
with other vendors, and also better distinguish e.g. CORE-V bit
manipulation vs RISC-V's standard Zb extensions.
Differential Revision: https://reviews.llvm.org/D155283
This patch is a step towards altering how we handle the emission of
condops. Marking ISD::SELECT as legal is a major change in the codegen
path, and gives few options for maintaining the old codegen path when
it is believed to be better (e.g. a better branchless sequence is
possible using non-zicond instructions, or the branch-based sequence is
preferable).
This removes the existing SelectionDAG patterns and moves the logic into
lowerSELECT. Along some small codegen changes you'll note a few minor
regressions in the generated code quality - this are due to the fact
that by lowering the SELECT node early we miss out on combines that
would kick in later when setcc condcodes that aren't natively supported
have been expanded (thus exposing opportunities for optimisation by
performing logical negation and swapping truev/falsev). I've opted to
split out work that addresses these into follow-on patches (especially
as zicond is still 'experimental').
matchSetCC is a straight-forward translation from the version in
RISCVISelDAGToDAG. Ideally, in the future it can be converted to a
helper shared between both files.
Differential Revision: https://reviews.llvm.org/D155083
(shl (zext to iXLenVec), C) is a possible pattern in auto-vectorized code for
indexed loads/stores. But extending to iXLen might be too aggressive, RVV
indexed load/store instructions zero extend their indexed operand to XLEN.
The patch tries to narrow the type of the zero extension. It's benefit to
decrease register pressure.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154687
According to the spec, Zce is an alias for Zca, Zcb, Zcmp, and Zcmt.
If F is enabled on RV32 it also includes Zcf.
This patch adds the Zce and the implication rule which unfortunately
requires custom handling for adding Zcf.
I've also made all the Zc* extensions imply Zca.
I've also added an error for Zcf without RV32.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D153742
This patch adds pseudos and SDNode patterns for vbrev.v, vrev8.v, vclz.v,
vctz.v and vcpop.v.
I've only added them for integer element types so far since we're lacking tests
for floats.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155216
We are already checking for fp exceptions if VL changes, but I believe we
should also be checking for them if the mask changes as well, since that also
affects the set of active elements. From the spec:
> A vector floating-point exception at any active floating-point element sets
> the standard FP exception flags in the fflags register. Inactive elements do
> not set FP exception flags.
Note that we don't change the mask if IsMasked is true, i.e. True is masked
already, since in that case we keep the existing mask.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D154980
The fadd in these test cases is constrained and may set fflags differently
depending on the active elements (the nofpexcept flag isn't set on the node).
Therefore to preserve semantics we shouldn't change its mask.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D154979
Depends on D154635
For the cover letter of the patch-set, please checkout D154628.
This is the 8th patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154636
Depends on D154634
For the cover letter of the patch-set, please checkout D154628.
This is the 7th patch of the patch-set. This patch includes change to
vfcvt_x_f, vfcvt_xu_f, vfwcvt_x_f, vfwcvt_xu_f, vfncvt_x_f, vfncvt_xu_f
vfcvt_f_x, vfcvt_f_xu, vfncvt_f_x vfncvt_f_xu, vfncvt_f_f
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154635
Depends on D154633
For the cover letter of the patch-set, please checkout D154628.
This is the 6th patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154634
Depends on D154632
For the cover letter of the patch-set, please checkout D154628.
This is the 5th patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154633
Depends on D154631
For the cover letter of the patch-set, please checkout D154628.
This is the 4th patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154632
Depends on D154629
For the cover letter of the patch-set, please checkout D154628.
This is the 3rd patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154631
Depends on D154628
For the cover letter of the patch-set, please checkout D154628.
This is the 2nd patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154629
Depends on D152996.
This patch-set aims to add a variant for the RVV floating-point
intrinsics that controls the rounding mode (`frm`). The rounding mode
variant appends `_rm` before the policy suffix to distinguish from
those without them.
Specification PR: riscv-non-isa/rvv-intrinsic-doc#226
This is the 1st patch of the patch-set.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154628
Depends on D152879.
Specification PR: riscv-non-isa/rvv-intrinsic-doc#226
This patch adds variant of `vfadd` that models the rounding mode control.
The added variant has suffix `_rm` appended to differentiate from the
existing ones that does not alternate `frm` and uses whatever is inside.
The value `7` is used to indicate no rounding mode change. Reusing the
semantic from the rounding mode encoding for scalar floating-point
instructions.
Additional data member `HasFRMRoundModeOp` is added so we can append
`_rm` suffix for the fadd variants that models rounding mode control.
Additional data member `IsRVVFixedPoint` is added so we can define
pseudo instructions with rounding mode operand and distinguish the
instructions between fixed-point and floating-point.
Reviewed By: craig.topper, kito-cheng
Differential Revision: https://reviews.llvm.org/D152996
his change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.
This is analogous to other patches in the series, but with one key difference - the resulting pseudo does *not* have a policy operand. We could add one for vmerge, but the some of the multiclasses are sufficiently entwined with the mask producing arithmetic instructions that the change delta becomes unmanageable. Note that these instructions are *not* in the RISCVMaskedPseudo table, and thus the difference doesn't complicate other code. The main value of working incrementally here is that we get to eagerly cleanup the IsTA logic flowing through the post-ISEL combines.
Differential Revision: https://reviews.llvm.org/D154645
If the operands to the mul have other uses we may be extending their
live range past a kill flag.
Reviewed By: asb, asi-sc
Differential Revision: https://reviews.llvm.org/D155046
For RISC-V, getRegisterType for fp16 returns i16. i16->fp64 extload
is considered legal because the LoadExtActions defaults to Legal
for all entries. Only fp/fp and int/int entries are changed to
Expand fore RISC-V.
This patch detects the FP-ness has changed and won't try to call
isLoadExtLegal.
Alternatively, we could add Expand for int/fp and fp/int, but that
seemed a little silly.
Fixes#63816
Reviewed By: asb, wangpc
Differential Revision: https://reviews.llvm.org/D155040
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.
This change handles most of the binary pseudos. I excluded pseudos which _TIED variants, and those that produce mask results. Both a bit different in functionality, and deserve their own change and review. As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand.
As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.
Differential Revision: https://reviews.llvm.org/D154245
We try to fold RISCVISD::VMV_V_X_VL series node + scalar load -> vector load.
But if scalar load is indexed load (load update form), it's not profitable to fold because load update node can't be removed after fold.
Differential Revision: https://reviews.llvm.org/D152222
For `CSR_Interrupt`, we can generate the register list via a single
`sequence`.
For `CSR_XLEN_F32_Interrupt` and `CSR_XLEN_F64_Interrupt`, I don't
see the reason why we need to keep the order the same as how we used
to allocate registers (and we have changed the order in D146488), so
I fold them into one `sequence`.
There are some *.ll changes because of the order change.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D154837
Some instructions treat x0 as a special encoding rather than as a
value of 0. Since we don't parse the inline assembly to know what
the instruction is, chooser the safest option of never using x0.
Fixes#63747.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D154744
This implements the v1.0-rc1 draft extension.
amocas.d on RV32 and amocas.q have the restriction that rd and rs2 must
be even registers. I've opted to implement this restriction in
RISCVAsmParser::validateInstruction even though for codegen we'll need a
new register class and can then remove this validation. This also
sidesteps, for now, the issue of amocas.d being different on rv32 vs
rv64.
See <https://github.com/riscv-non-isa/riscv-c-api-doc/issues/37> for the
issue of needing an agreed asm register constraint for register pairs.
Differential Revision: https://reviews.llvm.org/D149248
As noted by @reames, we should be checking that the memory access is aligned to
the element size (or the unaligned vector memory access feature is enabled)
before lowering vlseg/vsseg intrinsics via the interleaved access pass.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D154536
This patch readjusts the frame stack for the push and pop instructions
co-author: @Lukacma
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D134599