2046 Commits

Author SHA1 Message Date
Craig Topper
ae36702807
[RISCV] Guard against out of bound shifts in expandMul. (#150464)
Spotted while reviewing #150211. If we're multiplying by -3 in i32,
MulAmt contains 4,294,967,293 since we zero extend to uint64_t. Adding 3
to this gives 0x100000000 which is a power of 2 and the log2 of that is
32, but we can't shift left by 32 in an i32.

Detect this case and skip the transform. We could use 0, but we don't
handle the case for i64 so this seemed more consistent.

Normally we don't hit this case because decomposeMulByConstant handles
it, but that's disabled by Xqciac. And after #150211 the code in
expandMul is now unreachable for this case.
2025-07-24 17:27:48 -07:00
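The arithmetic behind the expandMul guard (#150464) can be reproduced without any LLVM headers. The sketch below is only an illustration: `shiftForMulAmtPlus3` and `zextI32` are hypothetical helper names, not functions from RISCVISelLowering.cpp, and the real code is structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Zero-extends a 32-bit multiplier to uint64_t, as the commit describes.
uint64_t zextI32(int32_t V) { return static_cast<uint32_t>(V); }

// Hypothetical helper (not the LLVM code): returns the shift amount for the
// "MulAmt + 3 is a power of two" case, or std::nullopt when the transform
// has to be skipped because the shift would not fit in the type.
std::optional<unsigned> shiftForMulAmtPlus3(uint64_t MulAmt, unsigned BitWidth) {
  assert(BitWidth == 32 || BitWidth == 64);
  uint64_t Sum = MulAmt + 3; // unsigned arithmetic wraps modulo 2^64
  bool IsPow2 = Sum != 0 && (Sum & (Sum - 1)) == 0;
  if (!IsPow2)
    return std::nullopt;
  unsigned Log2 = 0;
  while ((Sum >> Log2) != 1)
    ++Log2;
  // The guard: a shift by BitWidth or more is out of range for the type.
  if (Log2 >= BitWidth)
    return std::nullopt;
  return Log2;
}
```

With these definitions, `zextI32(-3)` is 4,294,967,293, adding 3 gives 0x100000000, and the computed log2 of 32 is rejected for a 32-bit type. For a 64-bit multiplier of -3 the sum wraps to 0, which is not a power of 2, so that case is skipped as well.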
Sudharsan Veeravalli
d3937e2d12
[RISCV] Pass sign-extended value to isInt check in expandMul (#150211)
In the `isInt` check that was added in #147661 we were passing the
zero-extended `uint64_t` value instead of the sign-extended one.
2025-07-25 05:47:09 +05:30
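The difference between the two extensions in #150211 is easy to see with a hand-written range check. In this sketch `fitsSignedBits` is a stand-in written for illustration (it does not use LLVM's `isInt`), and the 12-bit width is an assumption chosen for the example; the commit message does not state which width the real check uses.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a signed N-bit range check, written out so the sketch does
// not depend on LLVM headers.
template <unsigned N> bool fitsSignedBits(int64_t V) {
  static_assert(N > 0 && N < 64, "width must be in (0, 64)");
  return V >= -(INT64_C(1) << (N - 1)) && V < (INT64_C(1) << (N - 1));
}

// The same 32-bit immediate, widened in the two possible ways.
int64_t zeroExtended(int32_t Imm) {
  return static_cast<int64_t>(static_cast<uint32_t>(Imm));
}
int64_t signExtended(int32_t Imm) { return static_cast<int64_t>(Imm); }
```

For an immediate of -3, `fitsSignedBits<12>(signExtended(-3))` holds, while `fitsSignedBits<12>(zeroExtended(-3))` does not, because the zero-extended value is 4,294,967,293.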
Craig Topper
a69cddef43
[RISCV] Add TUPLE_INSERT and TUPLE_EXTRACT to verifyTargetNode. (#150148)
Verify that the index is an i32 target constant which is what we get
from intrinsic lowering. All other inserts and extracts should be the
same.
2025-07-22 18:28:11 -07:00
Alex Bradbury
33df888217
[RISCV] Teach RISCVTargetLowering::isFPImmLegal about fli+fneg (#149075)
There was a mismatch between isFPImmLegal and the cases that are handled
by lowerConstantFP. isFPImmLegal didn't check for the case where we
support `fli` of a negated constant (and so can lower to fli+fneg). This
has very minimal impact (42 insertions, 47 deletions across an
rv22u64_zfa llvm-test-suite build including SPEC CPU 2017) but is added
here for completeness.

See the PR thread https://github.com/llvm/llvm-project/pull/149075 for further discussion about the degree to which isFPImmLegal and lowerConstantFP are consistent. We ultimately agreed it makes sense to add fli+fneg, but there may be other future cases where it doesn't make sense to match.
2025-07-22 14:22:26 +01:00
Serge Pavlov
372e99938f
Remove unused variable (#149115) 2025-07-16 11:28:57 -04:00
Serge Pavlov
c71b92d09f
[RISCV][FPE] Remove unused variable (#149054)
It was added by me in 905bb5bddb690765cab5416d55ab017d7c832eb3, which
committed PR https://github.com/llvm/llvm-project/pull/148569.
2025-07-16 19:56:31 +07:00
Serge Pavlov
905bb5bddb
[RISCV][FPEnv] Lowering of fpmode intrinsics (#148569)
The change implements custom lowering of `get_fpmode`, `set_fpmode` and
`reset_fpmode` for RISCV target. The implementation is aligned with the
functions `fegetmode` and `fesetmode` in GLIBC.
2025-07-16 16:02:15 +07:00
Jim Lin
3e4153c97b
[RISCV] Implement Builtins for XAndesBFHCvt extension. (#148804)
XAndesBFHCvt provides two builtin functions for converting between
float and bf16. Users can use them to convert bf16 values loaded from
memory to float, perform arithmetic operations, then convert them back
to bf16 and store them to memory.

The load/store and move operations for bf16 will be handled in a later
patch.
2025-07-16 16:13:31 +08:00
Philip Reames
c7d1eae4fc
[RISCV] Use masked segment LD/ST intrinsics in (de)interleaveN lowering [nfc] (#148966)
Follow up on the work from e5bc7e7d, and extend it to the lowering used
for interleave and deinterleave when we can't combine with a nearby
memory operation.
2025-07-15 17:12:08 -07:00
Kazu Hirata
7c83d66719
[llvm] Remove unused includes (NFC) (#148768)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-14 22:19:14 -07:00
Sudharsan Veeravalli
0ae1506847
[RISCV] Add ISel patterns for Xqciac QC_SHLADD instruction (#148256)
Add a couple of patterns to generate the Xqciac QC_SHLADD shift left and
add immediate instruction.
2025-07-14 16:43:41 +05:30
Craig Topper
5a95ec6dc1
[RISCV] Add riscv_vlm/vsm to RISCVTargetLowering::getTgtMemIntrinsic. (#148265) 2025-07-11 16:59:47 -07:00
Min-Yih Hsu
bf94c8ddb3
[RISCV][NFC] Split InterleavedAccess related TLI hooks into a separate file (#148040)
There have been discussions on splitting RISCVISelLowering.cpp. I think
InterleavedAccess related TLI hooks would be some of the low hanging
fruit as it's relatively isolated and also because X86 is already doing
it.

NFC.
2025-07-11 11:04:41 -07:00
Sudharsan Veeravalli
9de657abaf
[RISCV] Add ISel patterns for Xqciac QC.MULIADD instruction (#147661)
Add basic isel patterns for the multiply accumulate QC.MULIADD
instruction.

While most cases work with just the TD file pattern, there are a few cases
which need to be handled in ISelLowering depending on the immediate we
are multiplying with:

- imm + 1 , imm - 1, 1 - imm, -1 - imm are a power of 2 --> these become
slli and add/sub
- immediate is 2^n - 2^m --> this becomes (add/sub (shl X, C1), (shl X,
C2))
- imm - 2, imm - 4, imm - 6 is a power of 2 --> these use shxadd when
zba is enabled

The patch does not decompose mul if Xqciac is present, for the above
conditions. There could be cases where this may not be beneficial, which I
plan to address in follow-up patches.
2025-07-11 12:16:11 +05:30
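The first two immediate conditions listed in #147661 can be expressed as a small classifier. This is a sketch for illustration only: `MulShape` and `classifyMulImm` are hypothetical names, the shxadd condition is left out, only positive values of the 2^n - 2^m form are recognised, and the real lowering code makes further checks.

```cpp
#include <cassert>
#include <cstdint>

enum class MulShape { ShiftAddSub, ShiftPair, Other };

static bool isPow2(int64_t V) { return V > 0 && (V & (V - 1)) == 0; }

// Hypothetical classifier: takes a 32-bit immediate and widens it so the
// +1/-1 adjustments cannot overflow.
MulShape classifyMulImm(int32_t Imm32) {
  int64_t Imm = Imm32;
  // imm + 1, imm - 1, 1 - imm or -1 - imm is a power of 2.
  if (isPow2(Imm + 1) || isPow2(Imm - 1) || isPow2(1 - Imm) || isPow2(-1 - Imm))
    return MulShape::ShiftAddSub;
  // 2^n - 2^m with n > m equals 2^m * (2^(n-m) - 1): strip the trailing
  // zero bits and test whether the odd part is one less than a power of 2.
  if (Imm > 0) {
    int64_t Odd = Imm;
    while ((Odd & 1) == 0)
      Odd >>= 1;
    if (isPow2(Odd + 1))
      return MulShape::ShiftPair;
  }
  return MulShape::Other;
}
```

For example, 7, 9, -7 and -9 fall into the first group, 12 (16 - 4) falls into the second, and 11 matches neither.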
quic_hchandel
66969c9494
[RISCV] Add ISel patterns for Qualcomm uC Xqcics extension (#146675)
Add CodeGen support for conditional select instructions in this
extension.
2025-07-11 10:27:13 +05:30
Ramkumar Ramachandra
19c2fb2325
[ISel/RISCV] Custom-lower vector [l]lround (#147713)
Lower it just like the vector [l]lrint, using vfcvt, with the right
rounding mode. Updating costs to account for this custom-lowering is
left to a companion patch.
2025-07-10 10:33:46 +01:00
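The phrase "with the right rounding mode" in #147713 matters because the C library defines the two families differently: `lround` always rounds halfway cases away from zero, while `lrint` follows the current floating-point rounding mode. The scalar sketch below shows where they disagree; `roundingDiffers` is a hypothetical helper, and nothing here models the vector lowering itself.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>

// Returns true when lround and lrint give different results for V under the
// default round-to-nearest-even mode.
bool roundingDiffers(double V) {
  std::fesetround(FE_TONEAREST); // make the lrint behaviour explicit
  return std::lround(V) != std::lrint(V);
}
```

Under round-to-nearest-even, 2.5 goes to 3 with `lround` and to 2 with `lrint`, so the two only differ on exact halfway values.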
Boyao Wang
697beb3f17
[TargetLowering] Change getOptimalMemOpType and findOptimalMemOpLowering to take LLVM Context (#147664)
Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering. So
that we can use EVT::getVectorVT to generate EVT type in
getOptimalMemOpType.

Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
2025-07-10 11:11:09 +08:00
Philip Reames
7bf439d260 [IA] Partially revert interface change from 4a66ba
As noted in post commit review, the API change here was not required.
I'd apparently confused myself when teasing apart patches from my
development branch.
2025-07-09 12:02:52 -07:00
Philip Reames
4a66ba2a4d
[IA] Support deinterleave intrinsics w/ fewer than N extracts (#147572)
For the fixed vector cases, we already support this, but the
deinterleave intrinsic cases (primarily used by scalable vectors) didn't.
Supporting it requires plumbing through the Factor separately from the
extracts, as there can now be fewer extracts than the Factor. Note that
the fixed vector path handles this slightly differently - it uses the
shuffle and indices scheme to achieve the same thing.
2025-07-09 09:41:07 -07:00
Ryan Buchner
8905b1c38f
[RISCV] Efficiently lower (select %cond, andn (f, x), f) using zicond (#147369)
The following case is now optimized:
(select c, (and f, ~x), f) -> (andn f, (czero_eqz x, c))
2025-07-09 09:32:54 -04:00
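The rewrite in #147369 can be checked with a scalar model. The sketch assumes the Zicond semantics that czero.eqz produces 0 when its condition operand is zero and passes the value through otherwise; the function names are illustrative and are not taken from LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Assumed Zicond semantics: zero the value when the condition is zero.
uint64_t czeroEqz(uint64_t Val, uint64_t Cond) { return Cond == 0 ? 0 : Val; }

// Zbb-style andn: A & ~B.
uint64_t andn(uint64_t A, uint64_t B) { return A & ~B; }

// (select c, (and f, ~x), f)
uint64_t selectForm(bool C, uint64_t F, uint64_t X) { return C ? (F & ~X) : F; }

// (andn f, (czero_eqz x, c))
uint64_t zicondForm(bool C, uint64_t F, uint64_t X) {
  return andn(F, czeroEqz(X, C));
}
```

When the condition is false, the mask becomes 0 and `andn` returns `f` unchanged, which is why no branch or separate select is needed.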
Ramkumar Ramachandra
9c97b38d44
[ISel/RISCV] Custom-promote [b]f16 in [l]lrint (#146507)
Extend lowerVectorXRINT to also do a FP_EXTEND_VL when the source
element type is [b]f16, and wire up this custom-promote. Updating the
cost-model to not give these an invalid cost is left to a companion
patch.
2025-07-09 10:24:38 +01:00
Luke Lau
7c812ea01a
[RISCV] Avoid vl toggles when lowering vector_splice/experimental_vp_splice and add +vl-dependent-latency tuning feature (#146746)
When vectorizing a loop with a fixed-order recurrence we use a splice,
which gets lowered to a vslidedown and vslideup pair.

However with the way we lower it today we end up with extra vl toggles
in the loop, especially with EVL tail folding, e.g:

    .LBB0_5:                                # %vector.body
                                            # =>This Inner Loop Header: Depth=1
    	sub	a5, a2, a3
    	sh2add	a6, a3, a1
    	zext.w	a7, a4
    	vsetvli	a4, a5, e8, mf2, ta, ma
    	vle32.v	v10, (a6)
    	addi	a7, a7, -1
    	vsetivli	zero, 1, e32, m2, ta, ma
    	vslidedown.vx	v8, v8, a7
    	sh2add	a6, a3, a0
    	vsetvli	zero, a5, e32, m2, ta, ma
    	vslideup.vi	v8, v10, 1
    	vadd.vv	v8, v10, v8
    	add	a3, a3, a4
    	vse32.v	v8, (a6)
    	vmv2r.v	v8, v10
    	bne	a3, a2, .LBB0_5

Because the vslideup overwrites all but UpOffset elements from the
vslidedown, we currently set the vslidedown's AVL to said offset.

But in the vslideup we use either VLMAX or the EVL which causes a
toggle.

This increases the AVL of the vslidedown so it matches vslideup, even if
the extra elements are overridden, to avoid the toggle.

A new tuning feature +vl-dependent-latency has been added which keeps
the old behaviour for microarchitectures that dynamically dispatch uops
based on vl, e.g. sifive-x280.

+vl-dependent-latency can be reused for the recently proposed Ovlt
optimization directive if/when it's ratified:
https://lists.riscv.org/g/tech-privileged/message/2487

If we wanted to aggressively optimise for vl at the expense of
introducing more toggles we could probably look at doing this in
RISCVVLOptimizer.
2025-07-09 11:09:13 +08:00
Craig Topper
be19a27cc5
[RISCV] Correct stride for strided load/store of vectors of pointers in lowerInterleavedLoad/lowerInterleavedStore. (#147598)
We need to use DataLayout to get the size if the element type
is a pointer.
2025-07-08 18:24:50 -07:00
Philip Reames
bdf7812855 [RISCV] Consolidate intrinsic ID tables [nfc] 2025-07-07 13:27:53 -07:00
Ramkumar Ramachandra
499e656cac
[ISel/RISCV] Modernize loops (NFC) (#147281) 2025-07-07 17:03:08 +01:00
Matt Arsenault
d8ef156379
DAG: Remove verifyReturnAddressArgumentIsConstant (#147240)
The intrinsic argument is already marked with immarg so non-constant
values are rejected by the IR verifier.
2025-07-07 16:28:47 +09:00
Jim Lin
61529d9e36
[RISCV] Remove implied extension Zvfhmin for XAndesVPackFPH (#146861)
XAndesVPackFPH can actually be used independently without requiring
Zvfhmin. Therefore, we remove the implicitly required Zvfhmin extension
from XAndesVPackFPH and imply that the f extension is sufficient.
2025-07-04 10:16:20 +08:00
Craig Topper
e35cf02e54 [RISCV] Pass RISCVSubtarget to translateSetCCForBranch. NFC 2025-07-03 13:34:46 -07:00
Ryan Buchner
be762b7b7d
[RISCV] Efficiently lower (select cond, u, rot[r/l](u, rot.amt)) using zicond extension (#143768)
The following lowerings now occur:
(select cond, u, rotr(u, rot.amt)) -> (rotr u, (czero_nez rot.amt,
cond))
(select cond, rotr(u, rot.amt), u) -> (rotr u, (czero_eqz rot.amt,
cond))
(select cond, u, rotl(u, rot.amt)) -> (rotl u, (czero_nez rot.amt,
cond))
(select cond, rotl(u, rot.amt), u) -> (rotl u, (czero_eqz rot.amt,
cond))
2025-07-03 15:27:09 -04:00
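The four rotate lowerings in #143768 rely on a rotate by 0 being the identity. A scalar model makes that explicit; it assumes the Zicond semantics that czero.eqz zeroes its value when the condition is zero and czero.nez zeroes it when the condition is nonzero, and all names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Assumed Zicond semantics.
uint64_t czeroEqz(uint64_t Val, uint64_t Cond) { return Cond == 0 ? 0 : Val; }
uint64_t czeroNez(uint64_t Val, uint64_t Cond) { return Cond != 0 ? 0 : Val; }

// 64-bit rotates that only use the low 6 bits of the amount.
uint64_t rotr64(uint64_t V, uint64_t Amt) {
  Amt &= 63;
  return Amt == 0 ? V : (V >> Amt) | (V << (64 - Amt));
}
uint64_t rotl64(uint64_t V, uint64_t Amt) {
  Amt &= 63;
  return Amt == 0 ? V : (V << Amt) | (V >> (64 - Amt));
}

// (select cond, u, rotr(u, amt)) -> (rotr u, (czero_nez amt, cond))
uint64_t selectURotr(bool Cond, uint64_t U, uint64_t Amt) {
  return rotr64(U, czeroNez(Amt, Cond));
}

// (select cond, rotl(u, amt), u) -> (rotl u, (czero_eqz amt, cond))
uint64_t selectRotlU(bool Cond, uint64_t U, uint64_t Amt) {
  return rotl64(U, czeroEqz(Amt, Cond));
}
```

The other two patterns follow by swapping `rotr64` and `rotl64`. In each case the conditional zeroing is applied to the rotate amount rather than to the rotated value.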
UmeshKalappa
032966ff56
[RISCV] Added the MIPS prefetch extensions for MIPS RV64 P8700. (#145647)
The extension is enabled with xmipscbop.

Please refer to the "MIPS RV64 P8700/P8700-F Multiprocessing System
Programmer’s Guide" for more info on the extension at
https://mips.com/wp-content/uploads/2025/06/P8700_Programmers_Reference_Manual_Rev1.84_5-31-2025.pdf
2025-07-03 10:59:10 +02:00
Jim Lin
283f53ac6f
[RISCV] Add isel patterns for generating XAndesPerf branch immediate instructions (#145147)
Similar to #139872. This patch adds isel patterns to match
`riscv_brcc` and `riscv_selectcc_frag` to XAndesPerf branch
instructions.
2025-07-03 12:47:53 +08:00
Simon Pilgrim
38200e94f1
[DAG] visitFREEZE - always allow freezing multiple operands (#145939)
Always try to fold freeze(op(....)) -> op(freeze(),freeze(),freeze(),...).

This patch proposes we drop the opt-in limit for opcodes that are allowed to push a freeze through the op to freeze all its operands, through the tree towards the roots.

I'm struggling to find a strong reason for this limit apart from the DAG freeze handling being immature for so long - as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison it looks like the regressions are not as severe.

Hopefully this will help some of the regression issues in #143102 etc.
2025-07-02 11:28:37 +01:00
Ramkumar Ramachandra
652630b3c9
[ISel/RISCV] Fix fixed-vector [l]lrint lowering (#145898)
Make the fixed-vector lowering of ISD::[L]LRINT use the custom-lowering
routine, lowerVectorXRINT, and fix issues in lowerVectorXRINT related to
this new functionality.
2025-06-30 13:44:34 +01:00
Ramkumar Ramachandra
7ff9669a2e
[ISel/RISCV] Refactor isPromotedOpNeedingSplit (NFC) (#146059) 2025-06-28 11:41:26 +01:00
Ramkumar Ramachandra
2282d4faa0
[ISel/RISCV] Improve code in lowerFCOPYSIGN (NFC) (#146061) 2025-06-27 17:02:38 +01:00
Craig Topper
375af75efb
[RISCV] Simplify the check for when to call EmitLoweredCascadedSelect. NFC (#145930)
Based on the comments and tests, we only want to call
EmitLoweredCascadedSelect on selects of FP registers.

Every time we add a new branch with immediate opcode, we've been
excluding it here.

This patch switches to checking that the comparison operands are both
registers so branch on immediate is automatically excluded.
2025-06-27 08:56:49 -07:00
quic_hchandel
950d281eb2
[RISCV] Add ISel patterns for Qualcomm uC Xqcicm extension (#145643)
Add codegen patterns for the conditional move instructions in this
extension.
2025-06-27 12:25:48 +05:30
Craig Topper
c8243251cb
[RISCV] Remove separate immediate condition codes from RISCVCC. NFC (#145762)
This wasn't scalable and made the RISCVCC enum effectively just
a different way of spelling the branch opcodes.
    
This patch reduces RISCVCC back down to 6 enum values. The primary user
is select pseudoinstructions which now share the same encoding across
all vendor extensions. The select opcode and condition code are used to
determine the branch opcode when expanding the pseudo.
    
The Cond SmallVector returned by analyzeBranch now returns the opcode
instead of the RISCVCC. reverseBranchCondition now works directly on
opcodes. getOppositeBranchCondition is also retained.

Stacked on #145622
2025-06-25 23:09:24 -07:00
Craig Topper
6fd182a3bb
[RISCV] Support fixed vector vp.reverse/splice with Zvfhmin/Zvfbfmin. (#145596)
Fix the names of some tests I accidentally misspelled.
2025-06-25 13:47:00 -07:00
Ming Yan
10edc3df99
[RISCV] Try to optimize vp.splice to vslide1up. (#144871)
Fold (vp.splice (insert_elt poison, scalar, 0), vec, 0, mask, 1, vl)
to (vslide1up vec, scalar, mask, vl).

Fold (vp.splice (splat_vector scalar), vec, 0, mask, 1, vl)
to (vslide1up vec, scalar, mask, vl).
2025-06-25 23:03:20 +08:00
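The first fold in #144871 can be sanity-checked with a simplified element-wise model. The sketch assumes an all-true mask, represents vectors as `std::vector<int>`, and uses 0 in place of poison elements; `vpSplice` and `vslide1up` are illustrative models written for this example, not LLVM or intrinsic APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified vp.splice model (all-true mask): take V1[Offset, Evl1), then
// continue from the front of V2 until Evl2 elements have been produced.
std::vector<int> vpSplice(const std::vector<int> &V1, const std::vector<int> &V2,
                          std::size_t Offset, std::size_t Evl1, std::size_t Evl2) {
  assert(Offset <= Evl1 && Evl1 <= V1.size());
  std::vector<int> Out;
  for (std::size_t I = Offset; I < Evl1 && Out.size() < Evl2; ++I)
    Out.push_back(V1[I]);
  for (std::size_t I = 0; Out.size() < Evl2; ++I) {
    assert(I < V2.size());
    Out.push_back(V2[I]);
  }
  return Out;
}

// Simplified vslide1up model: element 0 is the scalar, element i is Vec[i-1].
std::vector<int> vslide1up(const std::vector<int> &Vec, int Scalar, std::size_t VL) {
  std::vector<int> Out;
  if (VL == 0)
    return Out;
  Out.push_back(Scalar);
  for (std::size_t I = 1; I < VL; ++I) {
    assert(I - 1 < Vec.size());
    Out.push_back(Vec[I - 1]);
  }
  return Out;
}
```

With an offset of 0 and a first EVL of 1, only element 0 of the first operand survives, so it does not matter whether the remaining elements are poison or copies of the scalar.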
Craig Topper
9702d37062
[RISCV] Support scalable vector vp.reverse/splice with Zvfhmin/Zvfbfmin. (#145588) 2025-06-24 15:40:24 -07:00
Craig Topper
7150b2c76a
[RISCV] Optimize vp.splice with 0 offset. (#145533)
We can skip the slidedown if the offset is 0.
2025-06-24 10:02:28 -07:00
Jim Lin
f6ab1f02ec
[RISCV] Support LLVM IR intrinsics for XAndesVBFHCvt (#145321)
This patch adds LLVM IR intrinsic support for XAndesVBFHCvt.

The document for the intrinsics can be found at:
https://github.com/andestech/andes-vector-intrinsic-doc/blob/ast-v5_4_0-release-v5/auto-generated/andes-v5/intrinsic_funcs.adoc#vector-widening-convert-intrinsicsxandesvbfhcvt
https://github.com/andestech/andes-vector-intrinsic-doc/blob/ast-v5_4_0-release-v5/auto-generated/andes-v5/intrinsic_funcs.adoc#vector-narrowing-convert-intrinsicsxandesvbfhcvt

Vector bf16 load/store intrinsics are also enabled when +xandesvbfhcvt is
specified. The corresponding LLVM IR intrinsic test cases will be added in
follow-up patches.

The clang part will be added in a later patch.

Co-authored-by: Tony Chuan-Yue Yuan <yuan593@andestech.com>
2025-06-24 10:19:04 +08:00
Sudharsan Veeravalli
88b98d3367
[RISCV] Add ISel pattern for generating QC_BREV32 (#145288)
The `QC_BREV32` instruction reverses the bit order of `rs1` and writes
the result to `rd`.
2025-06-24 07:11:46 +05:30
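As a reference for what "reverses the bit order" means in #145288, here is a plain C++ model of a 32-bit bit reversal. `bitReverse32` is a name chosen for this sketch; it describes the operation only, not how the ISel pattern is written.

```cpp
#include <cassert>
#include <cstdint>

// Moves bit i of V to bit (31 - i) of the result.
uint32_t bitReverse32(uint32_t V) {
  uint32_t R = 0;
  for (int I = 0; I < 32; ++I) {
    R = (R << 1) | (V & 1); // append the current lowest bit of V to R
    V >>= 1;
  }
  return R;
}
```

Applying the function twice returns the original value, which is a convenient property to test against.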
Sam Elliott
a6eb5eee38
[RISCV][NFC] Remove hasStdExtCOrZca (#145139)
As of 20b5728b7b1ccc4509a316efb270d46cc9526d69, C always enables Zca, so
the check `C || Zca` is equivalent to just checking for `Zca`.

This replaces any uses of `HasStdExtCOrZca` with a new `HasStdExtZca`
(with the same assembler description, to avoid changes in error
messages), and simplifies everywhere where C++ needed to check for
either C or Zca.

The Subtarget function is just deprecated for the moment.
2025-06-23 10:49:47 -07:00
Matt Arsenault
48155f93dd
CodeGen: Emit error if getRegisterByName fails (#145194)
This avoids using report_fatal_error and standardizes the error
message in a subset of the error conditions.
2025-06-23 16:33:35 +09:00
Craig Topper
0c47628515 Re-commit "[RISCV] Properly support RISCVISD::LLA in getTargetConstantFromLoad. (#145112)"
With proper co-author.

Original message:

We need to pass the operand of LLA to GetSupportedConstantPool.

This replaces #142292 with test from there added as a pre-commit
for both medlow and pic.

Co-authored-by: Carl Nettelblad <carl.nettelblad@rapidity-space.com>
2025-06-21 10:18:49 -07:00
Craig Topper
fc36e47a49 Revert "[RISCV] Properly support RISCVISD::LLA in getTargetConstantFromLoad. (#145112)"
I missed the Co-authored-by that I tried to add.

This reverts commit 1da864b574f699d5c9be68dca9b3969ad50f4803.
2025-06-21 10:18:34 -07:00
Craig Topper
1da864b574
[RISCV] Properly support RISCVISD::LLA in getTargetConstantFromLoad. (#145112)
We need to pass the operand of LLA to GetSupportedConstantPool.
    
This replaces #142292 with test from there added as a pre-commit
for both medlow and pic.
2025-06-21 10:17:30 -07:00
Philip Reames
5886f0a183
[RISCV] Allow larger offset when matching build_vector as vid sequence (#144756)
I happened to notice that when legalizing get.active.lane.mask with
large vectors we were materializing via constant pool instead of just
shifting by a constant.

We should probably be doing a full cost comparison for the different
lowering strategies as opposed to our current ad hoc heuristics, but the
few cases this regresses seem pretty minor. (Given the reduction in vset
toggles, they might not be regressions at all.)

---------

Co-authored-by: Craig Topper <craig.topper@sifive.com>
2025-06-20 14:20:17 -07:00