1162 Commits

Author SHA1 Message Date
Craig Topper
a64b3e92c7 [RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types.
Previously we returned i32 on RV32 and i64 on RV64. The instructions
only consume 32 bits and only produce 32 bits. For RV64, the result
is sign extended to 64 bits like *W instructions.
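
As a rough illustration of the 32-bit interface described above, the sketch below models `sha256sig0` on a 32-bit value and, separately, the sign-extended value an RV64 register would hold. The function names are hypothetical and this is plain C++, not the Clang builtin or the LLVM intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value right by n bits (0 < n < 32).
static uint32_t ror32(uint32_t x, unsigned n) {
  assert(n > 0 && n < 32);
  return (x >> n) | (x << (32 - n));
}

// SHA-256 sigma0: consumes 32 bits and produces 32 bits.
uint32_t sha256sig0(uint32_t x) {
  return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3);
}

// Hypothetical model of the RV64 register contents: the 32-bit result
// sign-extended to 64 bits, as for *W instructions.
int64_t sha256sig0Rv64Reg(uint32_t x) {
  return static_cast<int64_t>(static_cast<int32_t>(sha256sig0(x)));
}
```

With an i32 interface, callers only ever see the `uint32_t` result; the sign extension stays a detail of the RV64 register.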

This patch removes this detail from the interface to improve
portability and consistency. This matches the proposal for scalar
intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44

I've included IR autoupgrade support as well.

I'll be doing this for other builtins/intrinsics that currently use
'long' in other patches.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D154647
2023-07-17 08:58:29 -07:00
Luke Lau
b5bcd4f60b [RISCV] Add VL nodes and VP patterns for unary zvbb instructions
This follows the pattern of lowering VP nodes to equivalent
RISCVISD::*_VL nodes. The nodes are modelled after the VP ISD nodes rather
than the actual zvbb instructions, and I've included a merge operand to be
consistent with the underlying pseudos that were recently refactored.

I've defined the nodes in RISCVInstrInfoVVLpatterns.td as the nodes aren't Zvk
specific, but the patterns are in RISCVInstrInfoZvk.td.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155229
2023-07-17 09:17:58 +01:00
Craig Topper
ce70578303 [RISCV] Move comments before 'if' instead of after. NFC
This allows us to remove some curly braces around the if body.
The code wasn't consistent about it anyway. Comments before the
'if' are already used in other places in this file.

Reviewed By: wangpc, MaskRay

Differential Revision: https://reviews.llvm.org/D155390
2023-07-16 12:57:49 -07:00
Craig Topper
2a33f47912 [RISCV] Make selectSETCC return SDValue instead of bool. NFC
We can use a null SDValue for the 'false' case. This avoids the
need for an output parameter. This is consistent with other
SelectionDAG code.

Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D155388
2023-07-16 12:56:32 -07:00
Craig Topper
48ee319378 Revert "[RISCV] Move comments before 'if' instead of after. NFC"
This reverts commit ef1ccc493e6167488ac10da2842fa7cac2746565.

Committed by mistake.
2023-07-15 22:54:06 -07:00
Craig Topper
d09109aa1e [RISCV] Use isScalarInteger instead of isInteger. NFC
The type should only be scalar here and the isScalarInteger
should be a simpler check.
2023-07-15 22:52:43 -07:00
Craig Topper
ef1ccc493e [RISCV] Move comments before 'if' instead of after. NFC
This allows us to remove some curly braces around the if body.
The code wasn't consistent about it anyway. Comments before the
'if' are already used in other places in this file.

Differential Revision: https://reviews.llvm.org/D155390
2023-07-15 22:47:52 -07:00
Craig Topper
2b0b85c05e [RISCV] Move vector handling earlier in lowerSELECT. NFC
This keeps all the scalar code together.
2023-07-15 22:34:19 -07:00
Craig Topper
12c669a869 [RISCV] Remove 'else' after 'return'. NFC
2023-07-15 22:25:33 -07:00
Mikhail Gudim
c158ddd99e Reapply [RISCV] Fold binary op into select if profitable.
This fixes some bugs in the original commit:
  (1) Operands are passed in correct order when creating new constant
  and the binary operator. New tests were added to cover these cases.
  (2) Check was added to see if it is safe to commute the select and the binary operator.

Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D152147
2023-07-14 15:30:54 -04:00
Craig Topper
3a0a25f9b6 [RISCV] Support i32 clmul* intrinsics on RV64.
We can use an i64 clmul to emulate i32 clmul.
For clmulh and clmulr we need to zero extend the 32 bit input
to 64 bits then extract either bits [63:32] or [62:31].
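
A minimal software sketch of this emulation follows. `clmul64` is a bit-by-bit stand-in for the 64-bit instruction, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Software model of a 64-bit clmul: low 64 bits of the carry-less product.
uint64_t clmul64(uint64_t a, uint64_t b) {
  uint64_t r = 0;
  for (unsigned i = 0; i < 64; ++i)
    if ((b >> i) & 1)
      r ^= a << i;
  return r;
}

// With both 32-bit inputs zero-extended, the full product needs at most
// 63 bits, so nothing is lost in the 64-bit result.
uint32_t clmul32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(clmul64(a, b));        // bits [31:0]
}
uint32_t clmulh32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(clmul64(a, b) >> 32);  // bits [63:32]
}
uint32_t clmulr32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(clmul64(a, b) >> 31);  // bits [62:31]
}
```

The implicit `uint32_t` to `uint64_t` conversions play the role of the zero extends; sign-extended inputs would pollute the upper bits that `clmulh32` and `clmulr32` read.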

Unfortunately, without Zba we need to use 2 shifts for the
zero extends. These can be optimized out later if the producing
instruction already zeroed the upper bits or if we can use lwu.

There are alternative sequences we can use for clmulh/clmulr
when the zero extend isn't free, but those are best handled by
a DAG combine to give the best opportunity for removing the extend.

This allows us to implement i32 clmul C intrinsics proposed in
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154729
2023-07-14 11:20:03 -07:00
Alex Bradbury
5c5a1a2927 [RISCV] Introduce RISCVISD::CZERO_{EQZ,NEZ} nodes and produce them when zicond is present in lowerSELECT
This patch is a step towards altering how we handle the emission of
condops. Marking ISD::SELECT as legal is a major change in the codegen
path, and gives few options for maintaining the old codegen path when
it is believed to be better (e.g. a better branchless sequence is
possible using non-zicond instructions, or the branch-based sequence is
preferable).

This removes the existing SelectionDAG patterns and moves the logic into
lowerSELECT. Along with some small codegen changes you'll note a few minor
regressions in the generated code quality - these are due to the fact
that by lowering the SELECT node early we miss out on combines that
would kick in later when setcc condcodes that aren't natively supported
have been expanded (thus exposing opportunities for optimisation by
performing logical negation and swapping truev/falsev). I've opted to
split out work that addresses these into follow-on patches (especially
as zicond is still 'experimental').
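
For readers unfamiliar with zicond, the sketch below models the two conditional-zero operations and the usual way a select can be built from them. It is a plain C++ model with hypothetical function names, not the SelectionDAG code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// czero.eqz: result is 0 if cond == 0, otherwise val.
uint64_t czeroEqz(uint64_t val, uint64_t cond) { return cond == 0 ? 0 : val; }

// czero.nez: result is 0 if cond != 0, otherwise val.
uint64_t czeroNez(uint64_t val, uint64_t cond) { return cond != 0 ? 0 : val; }

// select(cond, t, f) as an OR of the two conditional-zero results:
// exactly one of the two operands survives, the other becomes 0.
uint64_t selectViaZicond(uint64_t cond, uint64_t t, uint64_t f) {
  return czeroEqz(t, cond) | czeroNez(f, cond);
}
```

When one arm of the select is already zero, a single conditional-zero operation is enough and the OR disappears.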

matchSetCC is a straight-forward translation from the version in
RISCVISelDAGToDAG. Ideally, in the future it can be converted to a
helper shared between both files.

Differential Revision: https://reviews.llvm.org/D155083
2023-07-14 11:31:27 +01:00
Yeting Kuo
2ac99205ee [RISCV] Narrow types of index operand matched pattern (shl (zext), C).
(shl (zext to iXLenVec), C) is a possible pattern in auto-vectorized code for
indexed loads/stores. But extending to iXLen might be too aggressive, since RVV
indexed load/store instructions zero extend their index operand to XLEN.
The patch tries to narrow the type of the zero extension, which helps
decrease register pressure.
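
The sketch below illustrates why a narrower extension can be enough, using an 8-bit index as an assumed example: as long as the index bits plus the shift amount fit in the narrower type, shifting there and zero extending afterwards gives the same value as shifting at full width. Function names and the choice of i8/i16 are illustrative only.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: zero extend the i8 index to 64 bits, then shift.
uint64_t wideIndex(uint8_t idx, unsigned shamt) {
  return static_cast<uint64_t>(idx) << shamt;
}

// Narrow form: zero extend to 16 bits only and shift there. The final
// conversion to uint64_t stands in for the zero extension to XLEN that
// the indexed load/store performs on its index operand.
uint64_t narrowIndex(uint8_t idx, unsigned shamt) {
  assert(shamt <= 8 && "8 index bits plus the shift must fit in 16 bits");
  uint16_t narrow = static_cast<uint16_t>(static_cast<uint16_t>(idx) << shamt);
  return narrow;
}

// Exhaustively compare both forms over all i8 indices.
bool formsAgree(unsigned shamt) {
  for (unsigned i = 0; i < 256; ++i) {
    uint8_t idx = static_cast<uint8_t>(i);
    if (wideIndex(idx, shamt) != narrowIndex(idx, shamt))
      return false;
  }
  return true;
}
```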

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154687
2023-07-14 15:45:44 +08:00
Luke Lau
55e2772e9f [RISCV] Add initial SDNode patterns for unary zvbb instructions
This patch adds pseudos and SDNode patterns for vbrev.v, vrev8.v, vclz.v,
vctz.v and vcpop.v.
I've only added them for integer element types so far since we're lacking tests
for floats.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155216
2023-07-13 19:39:04 +01:00
eopXD
5d18d43f26 [7/8][RISCV] Add rounding mode control variant for conversion intrinsics between floating-point and integer
Depends on D154634

For the cover letter of the patch-set, please check out D154628.

This is the 7th patch of the patch-set. This patch includes change to
vfcvt_x_f, vfcvt_xu_f, vfwcvt_x_f, vfwcvt_xu_f, vfncvt_x_f, vfncvt_xu_f
vfcvt_f_x, vfcvt_f_xu, vfncvt_f_x vfncvt_f_xu, vfncvt_f_f

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154635
2023-07-13 00:54:07 -07:00
eopXD
76482078cd [RISCV][POC] Model frm control for vfadd
Depends on D152879.

Specification PR: riscv-non-isa/rvv-intrinsic-doc#226

This patch adds variant of `vfadd` that models the rounding mode control.
The added variant has suffix `_rm` appended to differentiate from the
existing ones, which do not alter `frm` and use whatever value it already holds.

The value `7` is used to indicate no rounding mode change. Reusing the
semantic from the rounding mode encoding for scalar floating-point
instructions.
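
The encoding being reused can be sketched as follows. The enumerator values are the scalar floating-point rounding mode encodings from the RISC-V specification; the enum and helper names are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Scalar floating-point rounding mode encodings (RISC-V F extension).
enum class RoundingMode : uint8_t {
  RNE = 0, // round to nearest, ties to even
  RTZ = 1, // round towards zero
  RDN = 2, // round down
  RUP = 3, // round up
  RMM = 4, // round to nearest, ties to max magnitude
  DYN = 7, // dynamic: use whatever frm currently holds
};

// Hypothetical helper: does an `_rm` operand ask for frm to be changed?
bool needsFrmChange(RoundingMode rm) { return rm != RoundingMode::DYN; }
```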

Additional data member `HasFRMRoundModeOp` is added so we can append
`_rm` suffix for the fadd variants that models rounding mode control.

Additional data member `IsRVVFixedPoint` is added so we can define
pseudo instructions with rounding mode operand and distinguish the
instructions between fixed-point and floating-point.

Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D152996
2023-07-13 00:34:00 -07:00
Mikhail Gudim
17e2df6695 [RISCV] Removed the requirement of XLenVT for performSELECTCombine.
Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D153044
2023-07-12 16:29:09 -04:00
Craig Topper
dbd47c4489 [RISCV] Don't allow X0 to be used for 'r' constraint in inline assembly
Some instructions treat x0 as a special encoding rather than as a
value of 0. Since we don't parse the inline assembly to know what
the instruction is, choose the safest option of never using x0.

Fixes #63747.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154744
2023-07-10 13:25:17 -07:00
Craig Topper
6f90808074 [RISCV] Add a guard condition to orc_b/brev8 handling in ReplaceNodeResults.
The orc_b and brev8 intrinsics are type overloaded, but only
i32 and XLen are supported types. The type legalization code in
ReplaceNodeResults only handles the i32 case on RV64. Add some
checks so we will fail type legalization for other types.
2023-07-07 08:51:46 -07:00
Craig Topper
427278d11a [RISCV] Remove pseudos for vwcvt.f.x(u) with rounding mode.
vwcvt.f.x doesn't use rounding mode. The integer value fits in
the mantissa of a 2x larger FP type so no rounding is required.
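
The exactness argument can be checked on scalars. The sketch below uses i32 to f64 as an assumed instance of a widening conversion and contrasts it with same-width i32 to f32; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// i32 -> f64 is a widening conversion: double has a 53-bit significand,
// so every 32-bit integer is represented exactly and no rounding occurs.
bool wideningIsExact(int32_t x) {
  double d = static_cast<double>(x);
  return static_cast<int64_t>(d) == x;
}

// i32 -> f32 is a same-width conversion: float has a 24-bit significand,
// so some values must be rounded and the rounding mode matters.
bool sameWidthIsExact(int32_t x) {
  float f = static_cast<float>(x);
  return static_cast<int64_t>(f) == x;
}
```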

I've removed the Uses = [FRM] that is also not needed.

I deleted the isel patterns. Alternatively, we could keep them and
drop the rounding mode immediate. The patterns are currently untested
so I chose to delete them. If they become needed in the future, we
can decide then if we should have the patterns or teach the node
creation to use the non-RM form for widening.

This reverts part of D142102.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D154653
2023-07-07 08:38:20 -07:00
Luke Lau
02bb33c3ce [RISCV] Check for alignment when lowering interleaved/deinterleaved loads/stores
As noted by @reames, we should be checking that the memory access is aligned to
the element size (or the unaligned vector memory access feature is enabled)
before lowering vlseg/vsseg intrinsics via the interleaved access pass.
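
The condition described above amounts to a small predicate. The sketch below is purely illustrative: the function name and parameters are hypothetical and are not LLVM's actual interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate mirroring the check described above. Alignment
// and element size are assumed to be powers of two, in bytes.
bool canLowerToSegmentAccess(uint64_t alignBytes, uint64_t elemSizeBytes,
                             bool hasUnalignedVectorMem) {
  assert(elemSizeBytes != 0);
  // Either the target tolerates unaligned vector accesses, or the access
  // is aligned to at least the element size.
  return hasUnalignedVectorMem || alignBytes >= elemSizeBytes;
}
```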

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154536
2023-07-07 15:34:24 +01:00
Yeting Kuo
74eac85dae [RISCV] Add riscv_vsoxei_mask/riscv_vsuxei_mask to getTgtMemIntrinsic.
This constructs a proper memory operand for riscv_vsoxei_mask and riscv_vsuxei_mask.
I think they are missed in D147119.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D154694
2023-07-07 17:52:11 +08:00
Craig Topper
a403124998 [RISCV] Don't sink i1 vectors in shouldSinkOperands.
These can't create .vx instructions so there's no reason to sink them.
2023-07-06 20:36:55 -07:00
Craig Topper
be253cb987 [RISCV] Support i32 brev8 intrinsic on RV64.
Similar to what we do for orc.b. Another patch will expose this
as a builtin in clang.
2023-07-06 17:24:53 -07:00
Craig Topper
ee34fa0032 [RISCV] Add DAG combine for (fmv_w_x_rv64 (fmv_x_anyextw_rv64 X))
This pattern started showing up more after D151284
2023-07-05 19:35:13 -07:00
Luke Lau
ea62fc79e7 [RISCV] Lower deinterleave2 intrinsics to vlseg2
Following from D153864, this patch implements the lowerDeinterleaveIntrinsic
hook to lower deinterleaves of loads into vlseg2 intrinsics.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153876
2023-07-05 19:24:15 +01:00
Luke Lau
70093fcf6c [RISCV] Lower interleave2 intrinsics to vsseg2
This patch teaches the RISCV TargetLowering class to lower interleave
intrinsics to vsseg2, so it can lower interleaved stores for scalable vectors.
Previously, we could only lower stores of interleaves for fixed length vectors
with vector shuffles.
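
As a reminder of the data movement involved, the sketch below models a factor-2 interleave and its inverse on `std::vector`. It is a scalar model with hypothetical function names, not the intrinsics or the lowering itself.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// interleave2: {a0, a1, ...} and {b0, b1, ...} -> {a0, b0, a1, b1, ...}.
// A two-field segment store writes memory in this order.
std::vector<int> interleave2(const std::vector<int> &a,
                             const std::vector<int> &b) {
  assert(a.size() == b.size());
  std::vector<int> out;
  out.reserve(a.size() * 2);
  for (std::size_t i = 0; i < a.size(); ++i) {
    out.push_back(a[i]);
    out.push_back(b[i]);
  }
  return out;
}

// deinterleave2: the inverse, splitting even and odd elements.
std::pair<std::vector<int>, std::vector<int>>
deinterleave2(const std::vector<int> &v) {
  assert(v.size() % 2 == 0);
  std::pair<std::vector<int>, std::vector<int>> out;
  for (std::size_t i = 0; i < v.size(); i += 2) {
    out.first.push_back(v[i]);
    out.second.push_back(v[i + 1]);
  }
  return out;
}
```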

This uses the lowerInterleaveIntrinsic interface for the interleaved
access pass that was added in D146218, and subsumes the DAG combine
approach taken in D144175

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D153864
2023-07-05 19:24:05 +01:00
Luke Lau
60be17a685 [RISCV] Add VFCVT pseudos with no mask
When emitting a vfcvt with a rounding mode, we end up generating an unnecessary
vmset because the only rounding mode pseudos have a mask operand. This patch
adds a pseudo without a mask, and marks the masked variant with the
MaskedPseudo class so the doPeepholeMergeVMV optimisation knows to remove the
redundant vmset.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154266
2023-07-05 17:28:43 +01:00
Jianjian GUAN
1d8db2fab3 [RISCV][NFC] Refactor lowerToScalableOp.
Refactor lowerToScalableOp to combine switch case code.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D153948
2023-07-04 14:09:24 +08:00
Luke Lau
49899cd4ce [RISCV] Refactor vfcvt_rm pseudo insertion case statements. NFC
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154258
2023-07-03 23:52:35 +01:00
Luke Lau
e8e0f32958 [RISCV] Fix vfwcvt/vfncvt pseudos w/ rounding mode lowering
Some signed opcodes were being lowered to their unsigned counterparts and
vice-versa.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154234
2023-06-30 21:43:19 +01:00
Alex Bradbury
5ba40c7be3 [RISCV] Custom lower FP_TO_FP16 and FP16_TO_FP to correct ABI of libcall
As introduced in D99148, RISC-V uses the softPromoteHalf legalisation
for fp16 values without zfh, with logic ensuring that f16 values are
passed in lower bits of FPRs (see D98670) when F or D support is
present. This legalisation produces ISD::FP_TO_FP16 and ISD::FP16_TO_FP
nodes which (as described in ISDOpcodes.h) provide a "semi-softened
interface for dealing with f16 (as an i16)". i.e. the return type of the
FP_TO_FP16 is an integer rather than a float (and the arg of FP16_TO_FP
is an integer). The remainder of the description focuses primarily on
FP_TO_FP16 for ease of explanation.

FP_TO_FP16 is lowered to a libcall to `__truncsfhf2 (float)` or
`__truncdfhf2 (double)`. As of D92241, `_Float16` is used as the return
type of these libcalls if the host compiler accepts `_Float16` in a test
input (i.e. dst_t is set to `_Float16`). `_Float16` is enabled for the
RISC-V target as of D105001 and so the return value should be passed in
an FPR on hard float ABIs.

This patch fixes the ABI issue in what appears to be a minimally
invasive way - leaving the softPromoteHalf logic undisturbed, and
lowering FP_TO_FP16 to an f32-returning libcall, converting its result
to an XLen integer value.

As can be seen in the test changes, the custom lowering for FP16_TO_FP
means the libcall is no longer tail-callable.

Although this patch fixes the issue, there are two open items:
* Redundant fmv.x.w and fmv.w.x pairs are now sometimes produced during
  lowering (not a correctness issue).
* No coverage for STRICT variants of FP16 conversion opcodes.

Differential Revision: https://reviews.llvm.org/D151284
2023-06-30 16:41:49 +01:00
Craig Topper
1c676e08d0 [RISCV] Do a more complete job of disabling extending loads and truncating stores for fixed vector types.
We weren't marking some combinations as Expand if one of the
types wasn't legal.

Fixes #63596.
2023-06-29 00:23:16 -07:00
Jianjian GUAN
a09a19be58 [RISCV] Update computeKnownBitsForTargetNode for FPCLASS.
The fclass instruction only sets one of the low 10 bits.
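
To make the known-bits claim concrete, the sketch below classifies raw binary32 bits the way `fclass.s` is specified to, so the result always has exactly one of bits 0..9 set and every higher bit is zero. The function and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Classify raw IEEE-754 binary32 bits following the fclass.s result
// layout: exactly one of result bits 0..9 is set.
uint32_t fclassS(uint32_t bits) {
  bool neg = (bits >> 31) != 0;
  uint32_t exp = (bits >> 23) & 0xFF;
  uint32_t frac = bits & 0x7FFFFF;
  if (exp == 0xFF) {
    if (frac == 0)
      return neg ? (1u << 0) : (1u << 7);             // -inf / +inf
    return (frac & 0x400000) ? (1u << 9) : (1u << 8); // quiet / signaling NaN
  }
  if (exp == 0) {
    if (frac == 0)
      return neg ? (1u << 3) : (1u << 4);             // -0 / +0
    return neg ? (1u << 2) : (1u << 5);               // -/+ subnormal
  }
  return neg ? (1u << 1) : (1u << 6);                 // -/+ normal
}

// Hypothetical known-zero mask for the result: all bits above bit 9.
constexpr uint32_t kFclassKnownZero = ~0x3FFu;
```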

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154040
2023-06-29 14:13:01 +08:00
Alex Bradbury
929124993a Recommit "[RISCV] Implement support for bf16 truncate/extend on hard FP targets"
Without the changes from D153598.

Original commit message:

For the same reasons as D151284, this requires custom lowering of the
truncate libcall on hard float ABIs (the normal libcall code path is
used on soft ABIs).

The extend operation is implemented by a shift just as in the standard
legalisation, but needs to be custom lowered because i32 isn't a legal
type on RV64.
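
The shift-based extend can be sketched in scalar C++ as below. The function name is hypothetical and the bf16 value is held as raw 16-bit storage.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Extend bf16 (raw 16-bit storage) to f32. bf16 is the upper half of the
// f32 encoding, so the extend is a 16-bit left shift of the bit pattern.
float bf16ToFloat(uint16_t bits) {
  uint32_t wide = static_cast<uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &wide, sizeof f); // reinterpret the bits as a float
  return f;
}
```

The shift itself is a 32-bit integer operation, which is why the legality of i32 matters for how it gets lowered.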

This patch aims to make the minimal changes that result in correct
codegen for the bfloat.ll tests.

Differential Revision: https://reviews.llvm.org/D151663
2023-06-23 17:23:12 -07:00
Craig Topper
076759f068 Revert "[RISCV] Implement support for bf16 truncate/extend on hard FP targets"
This was committed with D153598 merged into it. Reverting to recommit as separate patches.

This reverts commit 690b1c847f0b188202a86dc25a0a76fd8c4618f4.
2023-06-23 17:23:12 -07:00
Sami Tolvanen
83835e22c7 [RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.

This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.

`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information to the trap
instruction itself.
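
A toy model of the check may help: it treats code memory as a byte buffer and compares the expected hash with the word stored just before the callee's entry point. The 4-byte little-endian hash layout and all names here are assumptions made for illustration; the real check is an instruction sequence emitted by the backend.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a little-endian 32-bit word from a byte buffer.
static uint32_t loadLE32(const std::vector<uint8_t> &mem, std::size_t off) {
  assert(off + 4 <= mem.size());
  return static_cast<uint32_t>(mem[off]) |
         (static_cast<uint32_t>(mem[off + 1]) << 8) |
         (static_cast<uint32_t>(mem[off + 2]) << 16) |
         (static_cast<uint32_t>(mem[off + 3]) << 24);
}

// Hypothetical model: the type hash is assumed to occupy the 4 bytes
// immediately before the target address. Returns true when the call may
// proceed and false where the real sequence would reach `ebreak`.
bool kcfiCheckPasses(const std::vector<uint8_t> &mem, std::size_t target,
                     uint32_t expectedHash) {
  assert(target >= 4);
  return loadLE32(mem, target - 4) == expectedHash;
}
```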

Relands commit 62fa708ceb027713b386c7e0efda994f8bdc27e2 with fixed
tests.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148385
2023-06-23 22:57:56 +00:00
Sami Tolvanen
e809ebeb6c Revert "[RISCV] Implement KCFI operand bundle lowering"
This reverts commit 62fa708ceb027713b386c7e0efda994f8bdc27e2.

Reverting to investigate -verify-machineinstrs errors in MIR tests.
2023-06-23 21:42:57 +00:00
Sami Tolvanen
62fa708ceb [RISCV] Implement KCFI operand bundle lowering
With `-fsanitize=kcfi` (Kernel Control-Flow Integrity), Clang emits
"kcfi" operand bundles to indirect call instructions. Similarly to
the target-specific lowering added in D119296, implement KCFI operand
bundle lowering for RISC-V.

This patch disables the generic KCFI pass for RISC-V in Clang, and
adds the KCFI machine function pass in `RISCVPassConfig::addPreSched`
to emit target-specific `KCFI_CHECK` pseudo instructions before calls
that have KCFI operand bundles. The machine function pass also bundles
the instructions to ensure we emit the checks immediately before the
calls, which is not possible with the generic pass.

`KCFI_CHECK` instructions are lowered in `RISCVAsmPrinter` to a
contiguous code sequence that traps if the expected hash in the
operand bundle doesn't match the hash before the target function
address. This patch emits an `ebreak` instruction for error handling
to match the Linux kernel's `BUG()` implementation. Just like for X86,
we also emit trap locations to a `.kcfi_traps` section to support
error handling, as we cannot embed additional information to the trap
instruction itself.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D148385
2023-06-23 18:25:24 +00:00
Alex Bradbury
690b1c847f [RISCV] Implement support for bf16 truncate/extend on hard FP targets
For the same reasons as D151284, this requires custom lowering of the
truncate libcall on hard float ABIs (the normal libcall code path is
used on soft ABIs).

The extend operation is implemented by a shift just as in the standard
legalisation, but needs to be custom lowered because i32 isn't a legal
type on RV64.

This patch aims to make the minimal changes that result in correct
codegen for the bfloat.ll tests.

Differential Revision: https://reviews.llvm.org/D151663
2023-06-23 14:18:59 +01:00
Craig Topper
9d1bcb70ec [RISCV] Use GPR register class for RV64 ZDInx. Remove GPRF64 register class.
The GPRF64 has the same spill size as GPR and is only used for RV64.
There's no real reason to have it as a separate class other than
for type inference for isel patterns in tablegen.

This patch adds f64 to the GPR register class when XLen=64. I use
f32 when XLen=32 even though we don't make use of it just to avoid
the oddity.

isel patterns have been updated to fix the lack of type inference.

I might do similar for GPRF16 and GPRF32 or I might change them to
use an optimized spill size instead of always using XLen.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D153110
2023-06-22 09:38:46 -07:00
Paul Kirth
3ea8f25265 [RISCV] Strengthen atomic ordering for sequentially consistent stores
This is a similar change to one proposed for GCC:
https://inbox.sourceware.org/gcc-patches/20230414170942.1695672-1-patrick@rivosinc.com/

The changes in this patch are based on the proposal by Hans Boehm to more
closely match the intended semantics for sequentially consistent stores
and to allow some platforms to avoid an ABI break when switching to more
performant atomic instructions. Platforms that have already compiled
code using the existing mappings will also have more time to gradually
replace that code in preparation of the switch.

Further details can be found in the psABI proposal:
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/378.

This patch implements a mapping that is stronger than the one outlined in table
A.6 of the RISC-V unprivileged spec to be future compatible with table A.7 of
the same document. The related discussion can be found at
https://lists.riscv.org/g/tech-unprivileged/topic/risc_v_memory_model_topics/92916241

The major change to RISC-V code generation is that we will now emit a trailing
fence for sequentially consistent stores.

The new code sequence should have the following form:
```
fence rw,w; s{b|h|w|d}; fence rw,rw;
```
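
At the source level, the kind of store affected looks like the sketch below. The variable and function names are hypothetical; the instruction sequence in the comment simply restates the mapping given above and is not something this snippet verifies.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> flag{0};

// A sequentially consistent store. Under the mapping described above, a
// RISC-V build would be expected to emit: fence rw,w; sw; fence rw,rw.
void publish(int v) { flag.store(v, std::memory_order_seq_cst); }

// The matching sequentially consistent load.
int observe() { return flag.load(std::memory_order_seq_cst); }
```

Stores with weaker orderings, such as `std::memory_order_release`, are not sequentially consistent and are outside what the description above covers.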

Other changes and optimizations like using amoswap will be handled separately.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D149486
2023-06-22 15:42:17 +00:00
Luke Lau
485d25007a [RISCV] Custom lower fixed vector undef to scalable undef
This prevents undefs from being expanded to a build vector of zeroes,
as noted by @craig.topper in D153399.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153411
2023-06-21 17:14:57 +01:00
Craig Topper
aae155c50b [RISCV] Use a build_vector instead of a chain of insert_vector_elts for vXi1 build_vector lowering.
A build_vector is the canonical representation rather than multiple
insert_vector_elts.

Unfortunately, this regresses quite a few tests now primarily due to not
having a vmv.s.x special case, but I hope we can improve this with future
patches.

Stress testing in our downstream found an infinite loop in DAG combine.
This patch breaks the infinite loop.

The insert_vector_element chain starts with a fixed vector undef.
Fixed vector undef is currently expanded to a build_vector of 0s
which gets lowered to a vmv.v.i. The insert chain overwrites all
elements so SimplifyDemandedVectorElts turns the vmv.v.i back into
undef and the cycle repeats.

We probably should custom lower fixed vector undef to scalable
vector undef. I think that would also fix the infinite loop, but
I didn't test that.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D153399
2023-06-21 08:57:46 -07:00
Craig Topper
ddf3f1b3b2 [RISCV] Stop isInterleaveShuffle from producing illegal extract_subvectors.
The definition for ISD::EXTRACT_SUBVECTOR says the index must be
aligned to the known minimum elements of the extracted type. We mostly
got away with this but it turns out there are places that depend on this.

For example, this code in getNode for ISD::EXTRACT_SUBVECTOR

```
    // EXTRACT_SUBVECTOR of CONCAT_VECTOR can be simplified if the pieces of
    // the concat have the same type as the extract.
    if (N1.getOpcode() == ISD::CONCAT_VECTORS && N1.getNumOperands() > 0 &&
        VT == N1.getOperand(0).getValueType()) {
      unsigned Factor = VT.getVectorMinNumElements();
      return N1.getOperand(N2C->getZExtValue() / Factor);
    }
```

This depends on N2C->getZExtValue() being evenly divisible by Factor.
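
The divisibility requirement can be demonstrated with a small model in which a concat is a list of equally sized pieces. The names below are hypothetical and only mimic the shape of the simplification quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Reference semantics: take `factor` elements of the concatenation,
// starting at element `idx`.
Vec extractSubvector(const std::vector<Vec> &pieces, std::size_t idx,
                     std::size_t factor) {
  Vec flat;
  for (const Vec &p : pieces)
    flat.insert(flat.end(), p.begin(), p.end());
  assert(idx + factor <= flat.size());
  return Vec(flat.begin() + idx, flat.begin() + idx + factor);
}

// The shortcut from the snippet: pick the concat operand at idx / factor.
// This only matches the reference when idx is a multiple of factor.
Vec extractViaOperand(const std::vector<Vec> &pieces, std::size_t idx,
                      std::size_t factor) {
  return pieces[idx / factor];
}
```

For pieces `{0, 1}` and `{2, 3}` with a factor of 2, index 2 gives `{2, 3}` either way, while index 1 gives `{1, 2}` by the reference and `{0, 1}` by the shortcut.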

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D153380
2023-06-21 08:52:28 -07:00
Craig Topper
832eb93251 [RISCV] Reduce some duplicate code in lowerBUILD_VECTOR. NFC
The code at the beginning of the loop body and after the loop are
identical. Move it to the end of the loop body by making a few
adjustments.
2023-06-20 21:52:14 -07:00
Craig Topper
8680c28add [RISCV] Remove mask from vrgatherei16 in lowerVECTOR_INTERLEAVE.
Unless I'm missing something we need to update the whole vector
not just where OddMask is true.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D153087
2023-06-20 09:36:38 -07:00
LiaoChunyu
12fee611ca [RISCV] Fold special case (xor (setcc constant, y, setlt), 1) -> (setcc y, constant + 1, setlt)
Improve D151719.
(xor (setcc constant, y, setlt), 1) -> (setcc y, constant + 1, setlt)
https://alive2.llvm.org/ce/z/BZNEia
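
The fold can be sanity-checked exhaustively on a small type. The sketch below uses 8-bit signed integers and excludes the largest constant, since `constant + 1` would wrap there; that exclusion is an assumption stated here, not a description of the guard the patch uses. Function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Original form: (xor (setcc C, y, setlt), 1), i.e. the negation of C < y.
bool original(int8_t c, int8_t y) { return (((c < y) ? 1 : 0) ^ 1) != 0; }

// Folded form: (setcc y, C + 1, setlt). Only valid when C + 1 does not
// wrap, which is assumed here as a precondition.
bool folded(int8_t c, int8_t y) {
  assert(c != INT8_MAX);
  return y < static_cast<int8_t>(c + 1);
}

// Compare both forms for every constant except INT8_MAX and every y.
bool foldHoldsForAllI8() {
  for (int c = INT8_MIN; c < INT8_MAX; ++c)
    for (int y = INT8_MIN; y <= INT8_MAX; ++y)
      if (original(static_cast<int8_t>(c), static_cast<int8_t>(y)) !=
          folded(static_cast<int8_t>(c), static_cast<int8_t>(y)))
        return false;
  return true;
}
```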

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D152128
2023-06-17 10:10:20 +08:00
Nitin John Raj
a986998bad [RISCV] Introduce RISCVISD::VWMACC(U/SU)_VL opcode
Differential Revision: https://reviews.llvm.org/D153057
2023-06-16 16:11:35 -07:00
Craig Topper
f9d0bf0631 Revert "[RISCV] Fold binary op into select if profitable."
This reverts commit d0189584631e587279ee5f0af5feb94d8045bb31.

Build failures have been reported in the Linux kernel.
2023-06-13 18:17:36 -07:00