7012 Commits

Author SHA1 Message Date
Pengcheng Wang
17a98f85c2
[RISCV] Optimize the spill/reload of segment registers (#153184)
The simplest way is:

1. Save `vtype` to a scalar register.
2. Insert a `vsetvli`.
3. Use segment load/store.
4. Restore `vtype` via `vsetvl`.

But `vsetvl` is usually slow, so this PR is not in this way.

Instead, we use wider whole load/store instructions if the register
encoding is aligned. We have done the same optimization for COPY in
https://github.com/llvm/llvm-project/pull/84455.

We found this suboptimal implementation when porting some video codec
kernels via RVV intrinsics.
2025-08-21 16:38:53 +08:00
Luke Lau
a9692391f6
[RISCV] Move volatile check to isCandidate in VL optimizer. NFC (#154685)
This keeps it closer to the other legality checks like the FP exceptions
check.
It also means that isSupportedInstr only needs to check the opcode,
which allows it to be replaced with a TSFlags based check in a later
patch.
2025-08-21 16:37:10 +08:00
Jim Lin
3a715107c2 [RISCV] Fold argstr into class for XSMTVDot instructions. NFC.
All of them use the same argstr "$vd, $vs1, $vs2".
2025-08-21 13:12:46 +08:00
Craig Topper
2d3d8df0e0 [RISCV] Use RVPTernary_rrr for a few more instructions.
This doesn't really affect the assembler, but will
be important when we eventually do codegen.
2025-08-20 21:13:40 -07:00
Sergei Barannikov
d6679d5a5f
[Target] Remove SoftFail field on targets that don't use it (NFC) (#154659)
That is, on all targets except ARM and AArch64.
This field used to be required due to a bug, it was fixed long ago
by 23423c0ea8d414e56081cb6a13bd8b2cc91513a9.
2025-08-21 05:21:42 +03:00
Craig Topper
2cb7c46bf0 [RISCV] Add missing 'OrP' to comment in RISCVInstrInfoZb.td. NFC 2025-08-20 15:27:03 -07:00
Craig Topper
562e021103
[RISCV] Minor refactor of RISCVMoveMerge::mergePairedInsns. (#154467)
Fold the ARegInFirstPair into the later if/else with the same condition.
Use std::swap so we don't need to repeat operands in the opposite order.
2025-08-20 09:07:58 -07:00
Qihan Cai
5f0515debd
[RISCV] Support Remaining P Extension Instructions for RV32/64 (#150379)
This patch implements pages 15-17 from
jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf

Documentation:
jhauser.us/RISCV/ext-P/RVP-baseInstrs-014.pdf
jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
2025-08-20 22:54:07 +10:00
Link
46e77ebf71
[RISCV][NFC] Ensure files end with newline. (#154457)
Add trailing newlines to the following files to comply with POSIX
standards:
- llvm/lib/Target/RISCV/RISCVInstrInfoXSpacemiT.td
- llvm/test/MC/RISCV/xsmtvdot-invalid.s
- llvm/test/MC/RISCV/xsmtvdot-valid.s

Closes #151706
2025-08-20 17:18:16 +08:00
Craig Topper
a872c4c785
[RISCV] Add SMT_ prefix to XSMTVDot instructions. NFC (#154475)
This helps avoid future name conflicts.
2025-08-19 22:54:55 -07:00
Craig Topper
d145dc10b6
[RISCV] Reduce code duplication in RISCVMoveMerge::findMatchingInst. NFCI (#154451) 2025-08-19 21:08:57 -07:00
Min-Yih Hsu
f82054eaa2
[RISCV] Handle more cases when combining (vfmv.s.f (extract_subvector X, 0)) (#154175)
Previously, we fold `(vfmv.s.f (extract_subvector X, 0))` into X when
X's type is the same as `vfmv.s.f`'s result type. This patch generalizes
it by folding it into insert_subvector when X is narrower and
extract_subvector when X is wider.

Co-authored-by: Craig Topper <craig.topper@sifive.com>
2025-08-19 09:43:55 -07:00
David Sherwood
13d8ba7dea
[LV][TTI] Calculate cost of extracting last index in a scalable vector (#144086)
There are a couple of places in the loop vectoriser where we
want to calculate the cost of extracting the last lane in a
vector. However, we wrongly assume that asking for the cost
of extracting lane (VF.getKnownMinValue() - 1) is an accurate
representation of the cost of extracting the last lane. For
SVE at least, this is non-trivial as it requires the use of
whilelo and lastb instructions.

To solve this problem I have added a new
getReverseVectorInstrCost interface where the index is used
in reverse from the end of the vector. Suppose a vector has
a given ElementCount EC, the extracted/inserted lane would be
EC - 1 - Index. For scalable vectors this index is unknown at
compile time. I've added a AArch64 hook that better represents
the cost, and also a RISCV hook that maintains compatibility
with the behaviour prior to this PR.

I've also taken the liberty of adding support in vplan for
calculating the cost of VPInstruction::ExtractLastElement.
2025-08-19 09:31:37 +01:00
Nikita Popov
86ac834df5
[RISCV] Use OrigTy from InputArg/OutputArg (NFCI) (#154095)
The InputArg/OutputArg now contains the OrigTy, so directly use that
instead of trying to recover it.

CC_RISCV is now *nearly* a normal CC assignment function. However, it
still differs by having an IsRet flag.
2025-08-19 09:28:24 +02:00
Sudharsan Veeravalli
c00b04a7e0
[RISCV] Generate QC_INSB/QC_INSBI instructions from OR of AND Imm (#154023)
Generate QC_INSB/QC_INSBI from `or (and X, MaskImm), OrImm` iff the
value being inserted only sets known zero bits. This is based on a
similar DAG to DAG transform done in `AArch64`.
2025-08-19 11:14:14 +05:30
Craig Topper
da19383ae7
[RISCV] Fold (X & -4096) == 0 -> (X >> 12) == 0 (#154233)
This is a more general form of the recently added isel pattern
(seteq (i64 (and GPR:$rs1, 0x8000000000000000)), 0)
  -> (XORI (i64 (SRLI GPR:$rs1, 63)), 1)

We can use a shift right for any AND mask that is a negated power
of 2. But for every other constant we need to use seqz instead of
xori. I don't think there is a benefit to xori over seqz as neither
are compressible.

We already do this transform from target independent code when the setcc
constant is a non-zero subset of the AND mask that is not a legal icmp
immediate.

I don't believe any of these patterns comparing MSBs to 0 are
canonical according to InstCombine. The canonical form is (X < 4096).
I'm curious if these appear during SelectionDAG and if so, how.

My goal here was just to remove the special case isel patterns.
2025-08-18 21:24:35 -07:00
Craig Topper
2817873082
[RISCV] Fold (sext_inreg (setcc), i1) -> (sub 0, (setcc). (#154206)
This helps the 3 vendor extensions that make sext_inreg i1 legal.

I'm delaying this until after LegalizeDAG since we normally have
sext_inreg i1 up until LegalizeDAG turns it into and+neg.

I also delayed the recently added (sext_inreg (xor (setcc), -1), i1)
combine. Though the xor isn't likely to appear before LegalizeDAG anyway.
2025-08-18 21:24:03 -07:00
Sudharsan Veeravalli
8495018a85
[RISCV] Use sd_match in trySignedBitfieldInsertInMask (#154152)
This keeps everything in APInt and makes it easier to understand and
maintain.
2025-08-19 08:22:06 +05:30
Craig Topper
6c0518a88f
[RISCV] Prioritize zext.h/zext.w over XTheadBb th.extu. (#154186)
Fixes #154125.
2025-08-18 16:56:57 -07:00
Shaoce SUN
7e8ff2afa9
[RISCV][GISel] Optimize +0.0 to use fcvt.d.w for s64 on rv32 (#153978)
Resolve the TODO: on RV32, when constructing the double-precision
constant `+0.0` for `s64`, `BuildPairF64Pseudo` can be optimized to use
the `fcvt.d.w` instruction to generate the result directly.
2025-08-18 17:52:24 +00:00
Craig Topper
60aa0d4bfc
[RISCV] Add P-ext MC support for pli.dh, pli.db, and plui.dh. (#153972)
Refactor the pli.b/h/w and plui.h/w tablegen classes.
2025-08-18 08:23:14 -07:00
Craig Topper
98e8f01d18
[RISCV] Rename MIPS_PREFETCH->MIPS_PREF. NFC (#154062)
This matches the instruction's assembler mnemonic.
2025-08-18 07:38:10 -07:00
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
2025-08-18 07:01:29 -07:00
林克
6842cc5562
[RISCV] Add SpacemiT XSMTVDot (SpacemiT Vector Dot Product) extension. (#151706)
The full spec can be found at spacemit-x60 processor support scope:
Section 2.1.2.2 (Features):

https://developer.spacemit.com/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb#2.1

This patch only supports assembler.
2025-08-18 18:03:17 +08:00
Jim Lin
127ba533bd
[RISCV] Remove ST->hasVInstructions() from getIntrinsicInstrCost for cttz/ctlz/ctpop. NFC. (#154064)
That isn't necessary if we've checked ST->hasStdExtZvbb().
2025-08-18 15:24:25 +08:00
Kazu Hirata
cbf5af9668
[llvm] Remove unused includes (NFC) (#154051)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-08-17 23:46:35 -07:00
Kazu Hirata
400dde6ca8
[RISCV] Remove an unnecessary cast (NFC) (#154049)
&UncompressedMI is already of MCInst *.
2025-08-17 23:46:27 -07:00
Craig Topper
4a3b69920b
[RISCV] Accept [-128,255] instead of [0, 255] for pli.b (#153913)
pli.h and pli.w both accept signed immediates, so pli.b should too. But
unlike those instructions, pli.b doesn't do any extension so its ok to
accept an unsigned immediate as well.
2025-08-17 21:39:08 -07:00
Brandon Wu
98f4b7797e
[RISCV][llvm] Support fixed-length vector inline assembly constraints (#150724) 2025-08-18 03:36:12 +00:00
Craig Topper
e67ec12640
[RISCV] Remove experimental from Smctr and Ssctr. (#153903)
These extensions were ratified in November 2024.
2025-08-15 17:18:09 -07:00
Sergei Barannikov
b7d6f484c8
[RISCV] Remove non-existent operand of nds.vfwcvt/nds.vfncvt instructions (#153865)
Mask operand is likely a copy-past error, they don't have one.
2025-08-16 00:46:19 +03:00
Craig Topper
c84a43ff3b
[RISCV] Fold (sext_inreg (xor (setcc), -1), i1) -> (add (setcc), -1). (#153855)
This improves all 3 vendor extensions that make sext_inreg i1 legal

Fixes #153781.
2025-08-15 12:55:18 -07:00
Nikita Popov
01bc742185
[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817)
This ensures that the required fields are set, and also makes the
construction more convenient.
2025-08-15 18:06:07 +02:00
Craig Topper
e2eaea412a
[RISCV] Add MC support for more P extension instructions. (#153629)
This implements pages 10-14 from
https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf

Test cases copied from #123271 with a couple mistakes fixed.

Co-authored-by: realqhc <caiqihan021@hotmail.com>
2025-08-14 23:23:28 -07:00
Luke Lau
e261f2895f
[RISCV] Add TSFlag for reading past VL behaviour. NFCI (#149704)
Currently we have a switch statement that checks if a vector instruction
may read elements past VL. However it currently doesn't account for
instructions in vendor extensions.

Handling all possible vendor instructions will result in quite a lot of
opcodes being added, so I've created a new TSFlag that we can declare in
TableGen, and added it to the existing instruction definitions.

I've tried to be conservative as possible here: All SiFive vendor vector
instructions should be covered by the flag, as well as all of
XRivosVizip, and ri.vextract from XRivosVisni.

For now this should be NFC because coincidentally, these instructions
aren't handled in getOperandInfo, so RISCVVLOptimizer should currently
avoid touching them despite them being liberally handled in
getMinimumVLForUser.

However in an upcoming patch we'll need to also bail in
getMinimumVLForUser, so this prepares for it.
2025-08-15 01:19:03 +00:00
Craig Topper
9465916a61
[RISCV] Stop passing the merge opcode around in RISCVMoveMerger. NFC (#153687)
What most code wants to know is the direction and we have to decode the
opcode to figure that out. Instead pass the direction around as a bool
and convert to opcode when we create the merge instruction.
2025-08-14 18:08:23 -07:00
Craig Topper
defbbf0129
[RISCV][MoveMerge] Don't copy kill flag when moving past an instruction that reads the register. (#153644)
If we're moving the second copy before another instruction that reads
the copied register, we need to clear the kill flag on the combined
move.

Fixes #153598.
2025-08-14 14:52:54 -07:00
Craig Topper
cba5f1b6c1
[RISCV] Add MC support for P extensions with scalar second operands. (#153502)
These are the instructions from page 8 and the second half of
page 9 here in
https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf

Co-authored-by: realqhc <caiqihan021@hotmail.com>
2025-08-14 07:03:36 -07:00
Nikita Popov
d1952baa5d [CodeGen] Remove unnecessary setTypeListBeforeSoften() parameter (NFC)
It does not make sense to set the softening type list without
setting IsSoften=true.
2025-08-14 10:04:56 +02:00
Piotr Fusik
18782db4c9
[RISCV] Improve instruction selection for most significant bit extraction (#151687)
(seteq (and X, 1<<XLEN-1), 0) -> (xori (srli X, XLEN-1), 1)
    (seteq (and X, 1<<31), 0) -> (xori (srliw X, 31), 1) // RV64
    (setlt X, 0) -> (srli X, XLEN-1) // SRLI is compressible
    (setlt (sext X), 0) -> (srliw X, 31) // RV64
2025-08-14 09:59:43 +02:00
quic_hchandel
71b066e3a2
[RISCV] Add CodeGen support for qc.insbi and qc.insb insert instructions (#152447)
This patch adds CodeGen support for qc.insbi and qc.insb instructions
defined in the Qualcomm uC Xqcibm extension. qc.insbi and qc.insb
inserts bits into destination register from immediate and register
operand respectively.
A sequence of `xor`, `and` & `xor` depending on appropriate conditions
are converted to `qc.insbi` or `qc.insb` which depends on the
immediate's value.
2025-08-14 12:08:28 +05:30
Craig Topper
ace08d5ccf
[RISCV] Add MC support for more P extension instructions. (#153458)
These instructions are the shift by immediate and saturate by immediate
instructions from the top half of page 9 of
https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf

I've also improved the CHECK lines in the invalid tests to check line
and column number from the diagnostic.

Co-authored-by: realqhc <caiqihan021@hotmail.com>
2025-08-13 22:07:03 -07:00
Craig Topper
059e49ceaa [RISCV] Fix typo in comment Interger->Integer. NFC 2025-08-13 11:19:49 -07:00
Craig Topper
57fb38a536
[RISCV] Indent body of let scopes in RISCVInstrInfoP.td. NFC (#153349)
I think this makes the code a little more readable.
2025-08-13 08:41:33 -07:00
Mikhail R. Gadelha
489a41d474
[RISCV][VLOPT] Added support for the zvbc and the remaining zvbb instructions (#153234)
Follow-up PR to #153071, adding the remaining zvbb instructions
(VBREV8_V and VREV8_V), plus the zvbc instruction (VCLMUL_VV, VCLMUL_VX,
VCLMULH_VV, VCLMULH_VX).
2025-08-13 14:43:25 +00:00
Sergey Kachkov
bdddff2488
[RISCV][RVV] Prohibit conversion of scalar store to single-element vse if vmv.x.s has multiple uses (#152112)
Godbolt example: https://godbolt.org/z/ThdfP475a

In the example single-element vse is used to store reduction result
instead of scalar store ([this optimization was introduced by this
patch](https://reviews.llvm.org/D109482)). However, vmv.x.s can't be
eliminated here because it has other uses (e.g. CopyToReg), so it seems
more profitable to use scalar store (we already have store value in a
scalar register, and can save one vsetvli which is likely to be required
for single-element vse). The proposed solution is to this transform only
if vmv.x.s has one use (in store instruction)
2025-08-13 13:10:27 +03:00
Sam Elliott
7317e3c9dd [NFC][RISCV] Correct signed/unsigned in Comment 2025-08-12 16:17:22 -07:00
Min-Yih Hsu
ca05058b49
[IA][RISCV] Recognize deinterleaved loads that could lower to strided segmented loads (#151612)
Turn the following deinterleaved load patterns
```
%l = masked.load(%ptr, /*mask=*/110110110110, /*passthru=*/poison)
%f0 = shufflevector %l, [0, 3, 6, 9]
%f1 = shufflevector %l, [1, 4, 7, 10]
%f2 = shufflevector %l, [2, 5, 8, 11]
```
into
```
%s = riscv.vlsseg2(/*passthru=*/poison, %ptr, /*mask=*/1111)
%f0 = extractvalue %s, 0
%f1 = extractvalue %s, 1
%f2 = poison
```
The mask `110110110110` is regarded as 'gap mask' since it effectively skips the entire third field / component.

Similarly,  turning the following snippet
```
%l = masked.load(%ptr, /*mask=*/110000110000, /*passthru=*/poison)
%f0 = shufflevector %l, [0, 3, 6, 9]
%f1 = shufflevector %l, [1, 4, 7, 10]
```
into
```
%s = riscv.vlsseg2(/*passthru=*/poison, %ptr, /*mask=*/1010)
%f0 = extractvalue %s, 0
%f1 = extractvalue %s, 1
```

Right now this patch only tries to detect gap mask from a constant mask supplied to a masked.load/vp.load.
2025-08-12 14:08:18 -07:00
Sam Elliott
9b93ccbcbe
[RISCV] Fix Immediate Check for Xqcibi UGT (#153141)
The check should be about unsigned 16-bit immediates, not signed ones.

This is not a bug per-se, as the old codegen was correct for the
uint16_max case, it just didn't end up using `qc.e.bgeui`, which we
would prefer it did.
2025-08-12 11:06:00 -07:00
Mikhail R. Gadelha
d455d45654
[RISCV][VLOPT] Added support for several vector crypto instructions (#153071)
This PR adds support for the following instructions to the RISC-V
VLOptimizer: vandn.vx, vandn.vv, vbrev.v, vclz.v, vcpop.v, vctz.v,
vror.vi, vror.vx, vror.vv, vrol.vx, vrol.vv.
2025-08-12 12:05:03 -03:00