4285 Commits

Author SHA1 Message Date
Luke Lau
faa385a9f4 [RISCV] Add tests for length changing shuffles
Tests taken from Luke's 88147 with minimal changes by me (preames).

The main case of interest here is when mask length is less than source
length (i.e. length is decreasing).  We often scalarize these, which
on RISCV can be quite painful.
2024-11-01 09:51:39 -07:00
Philip Reames
58f525a23c [RISCV] Add tests for deinterleave shuffles w/o vnsrl.vv
With SEW=64, the vnsrl trick we primarily rely on does not work.  This
is handled correctly today, but we have fairly minimal testing of the
resulting shuffles, which makes it hard to demonstrate the value of an
upcoming change.
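
As a rough illustration of the vnsrl trick mentioned above, here is a small Python model (not LLVM code; `vnsrl_deinterleave` is a hypothetical name). It treats each pair of adjacent SEW-bit lanes as one 2*SEW-bit lane and applies a narrowing shift right by 0 or by SEW. Because the trick needs lanes twice as wide as SEW, this is presumably why it does not apply at SEW=64.

```python
def vnsrl_deinterleave(v, sew, shift):
    """Model a narrowing shift right over pairs of SEW-bit unsigned lanes."""
    assert len(v) % 2 == 0
    mask = (1 << sew) - 1
    out = []
    for i in range(0, len(v), 2):
        # Reinterpret two adjacent SEW-bit lanes as one 2*SEW-bit lane,
        # with the lower-indexed lane in the low half.
        wide = (v[i] & mask) | ((v[i + 1] & mask) << sew)
        # Narrowing logical shift right: shift, then truncate to SEW bits.
        out.append((wide >> shift) & mask)
    return out


v = [0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17]
even = vnsrl_deinterleave(v, sew=8, shift=0)  # even-indexed lanes
odd = vnsrl_deinterleave(v, sew=8, shift=8)   # odd-indexed lanes
```

A shift of 0 keeps the low half of each pair (the even lanes), and a shift of SEW keeps the high half (the odd lanes).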
2024-11-01 08:02:04 -07:00
Simon Pilgrim
9fb4bc5bf4
[DAG] SimplifyMultipleUseDemandedBits - ignore SRL node if we're just demanding known sign bits (#114389)
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.

We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.
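
The condition above can be checked exhaustively on a small bit width. The following Python sketch is a simplified model, not the SelectionDAG code: it assumes a constant shift amount (the patch reasons about the maximum shift amount), and `num_sign_bits` and `can_skip_srl` are hypothetical helper names.

```python
WIDTH = 6  # small width so the check can be exhaustive


def num_sign_bits(x):
    """Number of leading bits equal to the sign bit (always at least 1)."""
    sign = (x >> (WIDTH - 1)) & 1
    n = 0
    for i in range(WIDTH - 1, -1, -1):
        if (x >> i) & 1 != sign:
            break
        n += 1
    return n


def can_skip_srl(x, shift, demanded):
    """True if the demanded bits of (x >> shift) can be read from x directly."""
    if demanded == 0:
        return True
    # None of the zero bits shifted in at the top may be demanded.
    if demanded >> (WIDTH - shift) != 0:
        return False
    # The lowest demanded bit must already be a sign bit of x.
    lowest = (demanded & -demanded).bit_length() - 1
    return lowest >= WIDTH - num_sign_bits(x)


def check_all():
    """Brute-force check: whenever the condition holds, the SRL is a no-op."""
    for x in range(1 << WIDTH):
        for shift in range(WIDTH):
            for demanded in range(1 << WIDTH):
                if can_skip_srl(x, shift, demanded):
                    assert (x >> shift) & demanded == x & demanded
    return True
```

Whenever both checks pass, every demanded bit of the shifted value is a copy of the sign bit, and so is the same bit of the unshifted source.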
2024-10-31 16:40:29 +00:00
Pengcheng Wang
18f0f70934
[RISCV] Support llvm.masked.expandload intrinsic (#101954)
We can use `viota`+`vrgather` to synthesize `vdecompress` and lower
expanding load to `vcpop`+`load`+`vdecompress`.

And if `%mask` is all ones, we can lower expanding load to a normal
unmasked load.
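
The lowering described above can be modelled in a few lines of Python. This is only a reference sketch of the semantics, with lists standing in for vector registers and memory; `viota` and `expandload` are illustrative names, not LLVM APIs.

```python
def viota(mask):
    """Exclusive prefix count of set mask bits, in the spirit of viota.m."""
    out, count = [], 0
    for m in mask:
        out.append(count)
        count += 1 if m else 0
    return out


def expandload(memory, mask, passthru):
    """Reference model of a masked expanding load."""
    # vcpop: the number of active lanes is the number of elements to load.
    n = sum(1 for m in mask if m)
    loaded = memory[:n]  # ordinary contiguous load of n elements
    idx = viota(mask)    # for each active lane, its index into `loaded`
    # Masked gather: active lanes pick loaded[idx], others keep passthru.
    return [loaded[idx[i]] if mask[i] else passthru[i]
            for i in range(len(mask))]
```

With an all-ones mask, `idx` is `0, 1, 2, ...` and the result is just the first `len(mask)` elements of memory, which matches the unmasked-load special case.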

Fixes #101914.
2024-10-31 20:03:58 +08:00
Luke Lau
6da5968f5e
[RISCV] Lower scalar_to_vector for supported FP types (#114340)
In https://reviews.llvm.org/D147608 we added custom lowering for
integers, but inadvertently also marked it as custom for scalable FP
vectors despite not handling it.

This adds handling for floats and marks it as custom lowered for
fixed-length FP vectors too.

Note that this doesn't handle bf16 or f16 vectors that would need
promotion, but these scalar_to_vector nodes seem to be emitted when
expanding them.
2024-10-31 13:15:17 +08:00
Luke Lau
1cb599835c [RISCV] Remove redundant +zfh from +zvfh[min] tests. NFC
In the vast majority of f16 tests we don't end up emitting any scalar
code that needs +zfh, so remove it.
2024-10-31 06:51:39 +08:00
Craig Topper
408c84f35b
[RISCV] Add hasPostISelHook to sf.vfnrclip pseudo instructions. (#114274)
Add Uses = [FRM] to the underlying MC instructions.
    
Tweak a couple test cases so the MachineVerifier would have caught this.
2024-10-30 11:52:49 -07:00
Craig Topper
c3724ba866
[RISCV] Add OperandType for vector rounding mode operands. (#114179)
Use TSFlags to distinguish which type of rounding mode it is. We sometimes use the same tablegen base classes for vxrm and frm, so it's hard to have different types for different instructions.
2024-10-30 11:46:15 -07:00
Harald van Dijk
950ee75909
[RISC-V] Fix check of minimum vlen. (#114055)
If we have a minimum vlen, we were adjusting StackSize to change the
unit from vscale to bytes, and then calculating the required padding
size for alignment in bytes. However, we then used that padding size as
an offset in vscale units, resulting in misplaced stack objects.

While it would be possible to adjust the object offsets by dividing
AlignmentPadding by ST.getRealMinVLen() / RISCV::RVVBitsPerBlock, we can
simplify the calculation a bit if instead we adjust the alignment to be
in vscale units.
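
The unit mismatch can be illustrated with plain arithmetic. The Python sketch below is not the frame lowering code: the function names are made up, and it assumes the alignment in bytes is a multiple of `min_vlen / 64`. It shows that converting the byte padding back to vscale units and aligning directly in vscale units give the same answer.

```python
RVV_BITS_PER_BLOCK = 64  # mirrors RISCV::RVVBitsPerBlock


def align_to(value, align):
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def padding_via_bytes(size_vscale, align_bytes, min_vlen):
    """Compute the padding in bytes, then convert it back to vscale units."""
    k = min_vlen // RVV_BITS_PER_BLOCK  # bytes per vscale unit at minimum VLEN
    size_bytes = size_vscale * k
    pad_bytes = align_to(size_bytes, align_bytes) - size_bytes
    return pad_bytes // k  # using pad_bytes here directly is the unit mix-up


def padding_via_vscale(size_vscale, align_bytes, min_vlen):
    """Convert the alignment to vscale units first, then align."""
    k = min_vlen // RVV_BITS_PER_BLOCK
    align_vscale = align_bytes // k  # assumes align_bytes is a multiple of k
    return align_to(size_vscale, align_vscale) - size_vscale
```

For example, with a size of 24 vscale units, a 32-byte alignment and a minimum VLEN of 128, the padding is 16 bytes but only 8 vscale units.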

@topperc This fixes a bug I am seeing after #110312, but I am not 100%
certain I am understanding the code correctly, could you please see if
this makes sense to you?
2024-10-29 17:30:30 +00:00
Alex Bradbury
7544d3af0e
[RISCV] Mark RVB23U64 and RVB23S64 as non-experimental (#113918)
The specification was recently ratified

<https://github.com/riscv/riscv-profiles/blob/main/src/rvb23-profile.adoc>.
2024-10-29 07:57:34 +00:00
Alex Bradbury
ba7555e640
[RISCV] Mark the RVA23S64 and RVA23U64 profiles as non-experimental (#113826)
All of the extensions used by these profiles are themselves
non-experimental, and RVA23 was just ratified

<https://riscv.org/announcements/2024/10/risc-v-announces-ratification-of-the-rva23-profile-standard/>.

<https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc>

We lack a way of expressing `Ss1p13` (supervisor architecture 1.13), but
this is a problem we have for RVA22 (Ss1p12) and RVA20 (Ss1p11) so I
don't feel it's a blocker.
2024-10-28 12:56:47 +00:00
dong-miao
75c75fc16e
[RISCV] Add svvptc extension (#113882) 2024-10-28 22:54:51 +11:00
Luke Lau
0cbccb13d6
[RISCV] Remove support for pre-RA vsetvli insertion (#110796)
Now that LLVM 19.1.0 has been out for a while with post-vector-RA
vsetvli insertion enabled by default, this proposes to remove the flag
that restores the old pre-RA behaviour so we only have one configuration
going forward.

That flag was mainly meant as a fallback in case users ran into issues,
but I haven't seen anything reported so far.
2024-10-28 11:31:18 +00:00
Luke Lau
96f5c68350
[RISCV] Lower @llvm.experimental.vector.compress for zvfhmin/zvfbfmin (#113770)
This is a follow up to #113291 and handles f16/bf16 with zvfhmin and
zvfbfmin.
2024-10-28 09:37:06 +00:00
Alex Bradbury
43a5719d9f
[RISCV] Use Sha extension in RVA23S64 profile (#113823)
In the ratified version of the RVA23S64 definition, the Sha extension is
now used to group together the set of hypervisor related extensions.

<https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc>
2024-10-28 09:22:09 +00:00
Alex Bradbury
35f6cc6af0
[RISCV] Add the Sha extension (#113820)
This was introduced in the now-ratified RVA23 profile (and also added to
the RVA22 text) as a simple way of referring to H plus the set of
supervisor extensions required by RVA23.
https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc

This patch simply defines the extension. The next patch will adjust the
RVA23 profile to use it, and at that point I think we will be ready to
mark RVA23 as non-experimental.

Note that I haven't made it so if you enable all extensions that
constitute Sha, Sha is implied. Per #76893 (adding 'B'), the concern is
making this implication might break older external assemblers. Perhaps
this is less of a concern given the relative frequency of
`-march=${foo}_zba_zbb_zbs` vs the collection of H extensions. If we did
want to add that implication, we'd probably want to add it in a separate
patch so it can be easily reverted if found to cause problems.
2024-10-28 07:42:33 +00:00
Serge Pavlov
819abe412d
[Test] Fix usage of constrained intrinsics (#113523)
Some tests contain errors in constrained intrinsic usage, such as missing
or extra type parameters, type parameters in the wrong order, and others.

---------

Co-authored-by: Andy Kaylor <andy_kaylor@yahoo.com>
2024-10-28 14:07:32 +07:00
Philip Reames
5ad500ca4a [RISCV] Coverage for a few missed vector idioms 2024-10-25 16:28:36 -07:00
Alex Bradbury
cbdfb18794
[RISCV] Add Supm extension to RVA23 profiles (#113619)
This is mandatory for both RVA23U64 and RVA23S64 in the ratified version
of the specification

<https://github.com/riscv/riscv-profiles/blob/main/src/rva23-profile.adoc>.
2024-10-25 15:39:07 +01:00
Alex Bradbury
2c0b34852a
[RISCV] Mark pointer masking extensions as non-experimental (#113618)
These extensions were ratified very recently.

<https://lf-riscv.atlassian.net/wiki/spaces/HOME/pages/16154732/Ratified+Extensions>

I've ensured we have definitions for all extensions in the document
<https://drive.google.com/file/d/159QffOTbi3EEbdkKndYRZ2c46D25ZLmO/view?usp=drive_link>.
There are no additional CSRs.
2024-10-25 12:24:50 +01:00
dong-miao
ed6ddffb58
[RISCV] Add Smrnmi extension (#111668)
This commit completes the Extension for Resumable Non-Maskable
Interrupts, adding four CSRs and one trap-return instruction.
Specification link: ["Smrnmi"
Extension](https://github.com/riscv/riscv-isa-manual/blob/main/src/rnmi.adoc)

---------

Co-authored-by: Sam Elliott <sam@lenary.co.uk>
2024-10-25 18:41:21 +11:00
Pengcheng Wang
b799cc3418
[RISCV] Add lowering for @llvm.experimental.vector.compress (#113291)
This intrinsic was introduced by #92289 and currently we just expand
it for RISC-V.

This patch adds custom lowering for this intrinsic and simply maps
it to the `vcompress` instruction.
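
For reference, the intrinsic's semantics as I read the LangRef can be modelled in Python as below. This is a sketch on plain lists, and `vector_compress` is an illustrative name rather than an LLVM API.

```python
def vector_compress(vec, mask, passthru):
    """Pack the selected lanes to the front; fill the rest from passthru."""
    selected = [v for v, m in zip(vec, mask) if m]
    # Remaining lanes take the value of the same lane position in passthru.
    return selected + passthru[len(selected):]
```

The selected elements end up contiguous in the low lanes, which is the behaviour `vcompress` provides when the destination register initially holds the passthru value.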

Fixes #113242.
2024-10-23 14:22:32 +08:00
Michael Maitland
f2302ed3d0 [RISCV][GISEL] Fix operand on RISCV::G_VMV_V_V_VL
6bac41496eb24c80aa659008d08220355a617c49 added this opcode with the wrong
number of operands. It didn't fail on check-llvm for me or on pre-commit CI,
but once committed we got buildbot failures. This patch fixes the definition
of the instruction and fixes the failing test.
2024-10-21 06:58:12 -07:00
Michael Maitland
6bac41496e
[RISCV][GISEL] Legalize G_INSERT_SUBVECTOR (#108859)
This code is heavily based on the SelectionDAG lowerINSERT_SUBVECTOR
code.
2024-10-21 08:49:13 -04:00
Craig Topper
1bc1a79a65
[RISCV] Support inline assembly 'f' constraint for Zfinx. (#112986)
This would allow some inline assembly code to work with either F or Zfinx.
This appears to match gcc behavior.
2024-10-18 18:17:23 -07:00
Sam Elliott
228f88fdc8
[RISCV] Inline Assembly: RVC constraint and N modifier (#112561)
This change implements support for the `cr` and `cf` register
constraints (which allocate a RVC GPR or RVC FPR respectively), and the
`N` modifier (which prints the raw encoding of a register rather than
the name).

The intention behind these additions is to make it easier to use inline
assembly when assembling raw instructions that are not supported by the
compiler, for instance when experimenting with new instructions or when
supporting proprietary extensions outside the toolchain.

These implement part of my proposal in riscv-non-isa/riscv-c-api-doc#92

As part of the implementation, I felt there was not enough coverage of
inline assembly and the "in X" floating-point extensions, so I have
added more regression tests around these configurations.
2024-10-18 10:40:38 +01:00
Roger Ferrer Ibáñez
9d469b5988
[RISCV] Implement trampolines for rv64 (#96309)
This implementation is based on what the X86 target does but
emitting the instructions that GCC emits for rv64.

---------

Co-authored-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
2024-10-18 08:06:47 +02:00
Alex Rønne Petersen
ad4a582fd9
[llvm] Consistently respect naked fn attribute in TargetFrameLowering::hasFP() (#106014)
Some targets (e.g. PPC and Hexagon) already did this. I think it's best
to do this consistently so that frontend authors don't run into
inconsistent results when they emit `naked` functions. For example, in
Zig, we had to change our emit code to also set `frame-pointer=none` to
get reliable results across targets.

Note: I don't have commit access.
2024-10-18 09:35:42 +04:00
Craig Topper
feedb35e41 [RISCV][GISel] Correct RORIW patterns.
We had two rotl patterns and no rotr pattern. The order was such
that the incorrect rotl pattern was being used.
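
To see why confusing the two rotate directions matters, here is a small Python model of 32-bit rotates (a sketch with hypothetical helper names, not the isel patterns themselves). A rotate left by `c` only matches a rotate right when the amount is complemented to `32 - c`.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, c):
    """Rotate a 32-bit value left by c bits."""
    c %= 32
    return ((x << c) | (x >> (32 - c))) & MASK32


def rotr32(x, c):
    """Rotate a 32-bit value right by c bits."""
    c %= 32
    return ((x >> c) | (x << (32 - c))) & MASK32


def rotl_as_rotr(x, c):
    """Express rotl in terms of rotr by complementing the amount."""
    return rotr32(x, (32 - c) % 32)
```

Using the rotl amount unchanged in a rotate-right instruction gives a different result for every amount other than 0 and 16.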
2024-10-17 09:32:45 -07:00
Luke Lau
e88bcc1204
[RISCV] Lower vector_splice on zvfhmin/zvfbfmin (#112579)
Similar to other permutation ops, we can just reuse the existing
lowering.
2024-10-16 21:40:18 +01:00
Michael Maitland
ae68d532f8
[RISCV][VLOPT] Allow propagation even when VL isn't VLMAX (#112228)
The original goal of this pass was to focus on vector operations with
VLMAX. However, users often utilize only part of the result, and such
usage may come from the vectorizer.

We found that relaxing this constraint can capture more optimization
opportunities, such as non-power-of-2 code generation and vector
operation sequences with different VLs.

---------

Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
2024-10-16 14:58:00 -04:00
Michael Maitland
658ff0b84c
[RISCV][VLOPT] Add support for integer widening multiply instructions (#112204)
This adds support for these instructions and also tests getOperandInfo
for these instructions as well.
2024-10-16 09:37:27 -04:00
Luke Lau
f6c23222a4
[RISCV] Promote fixed-length bf16 arith vector ops with zvfbfmin (#112393)
The aim is to have the same set of promotions on fixed-length bf16
vectors as on fixed-length f16 vectors, and then deduplicate them
similarly to what was done for scalable vectors.

It looks like fneg/fabs/fcopysign end up getting expanded because fsub
is now legal, and the default operation action must be expand.
2024-10-15 22:49:05 +01:00
Luke Lau
043f066a64
[RISCV][VLOPT] Fix operand check in isVectorOpUsedAsScalarOp (#112253)
A reduction instruction always has a passthru operand, so the scalar
operand should always be vs1 which is at index 3.

Even though the destination operand is also scalar, I think the passthru
will need to preserve all elements so I haven't included it.
2024-10-15 15:10:26 +01:00
c8ef
854ded9b24
Reapply "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112203)
This patch adds icmp+select patterns for integer min/max matchers in
SDPatternMatch, similar to those in IR PatternMatch.
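
The equivalence these matchers rely on is that a compare followed by a select on the same two operands is a min or max. A minimal Python sketch on 8-bit values follows; the helper names are hypothetical and this is not the SDPatternMatch API.

```python
def to_signed(x, bits=8):
    """Interpret an unsigned bit pattern as a two's complement value."""
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def smin_via_select(a, b):
    """select (icmp slt a, b), a, b"""
    return a if to_signed(a) < to_signed(b) else b


def umin_via_select(a, b):
    """select (icmp ult a, b), a, b"""
    return a if a < b else b
```

The signedness of the compare decides which min is matched: for the bit patterns `0xFF` and `0x01`, the signed form picks `0xFF` (that is, -1) while the unsigned form picks `0x01`.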

Reapply #111774.

Closes #108218.
2024-10-15 21:07:06 +08:00
Brandon Wu
ae7751f405
[RISCV][VCIX] Add a tied constraint between rd and rs3 in sf.v.xvv and sf.v.xvw instructions (#111630)
The instruction has the constraint, but the pseudo instruction is
missing.
2024-10-14 21:10:55 -07:00
Yingwei Zheng
637e81f8ad
Reland [CodeGenPrepare] Convert ctpop(X) ==/!= 1 into ctpop(X) u</u> 2/1 (#111284) (#111998)
Relands #111284. Test failure with stage2 build has been fixed by
https://github.com/llvm/llvm-project/pull/111946.


Some targets have better codegen for `ctpop(X) u< 2` than `ctpop(X) ==
1`. After https://github.com/llvm/llvm-project/pull/100899, we set the
range of ctpop's return value to indicate the argument/result is
non-zero.

This patch converts `ctpop(X) ==/!= 1` into `ctpop(X) u</u> 2/1` in CGP
to fix https://github.com/llvm/llvm-project/issues/95255.
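
The rewrite is only valid because the argument is known to be non-zero. A brute-force Python check over 8-bit values (an illustration, not the CodeGenPrepare code) is sketched below. It also shows the familiar `x & (x - 1)` form of `ctpop(x) u< 2`, which is likely one reason the unsigned compare can be cheaper on some targets.

```python
def ctpop(x):
    """Population count of a non-negative integer."""
    return bin(x).count("1")


def check_nonzero_equivalence(width=8):
    """For non-zero x, ==1 matches u<2 and !=1 matches u>1."""
    for x in range(1, 1 << width):  # x is known to be non-zero
        assert (ctpop(x) == 1) == (ctpop(x) < 2)
        assert (ctpop(x) != 1) == (ctpop(x) > 1)
    return True


def at_most_one_bit(x):
    """ctpop(x) u< 2 without a popcount: clear the lowest set bit."""
    return x & (x - 1) == 0
```

For `x == 0` the two forms differ (`ctpop(0) == 1` is false but `ctpop(0) u< 2` is true), hence the dependence on the range information.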
2024-10-15 08:17:50 +08:00
Luke Lau
db57fc4edc
[RISCV][VLOPT] Fix passthru check in getOperandInfo (#112244)
If a pseudo has a passthru, I believe the first source operand will have
operand no 2, not 1.
2024-10-14 20:54:17 +01:00
Michael Maitland
c2c4db8d8f
[RISCV][VLOPT] Add support for 11.11 div instructions (#112201)
This adds support for these instructions and also tests getOperandInfo
for these instructions as well.
2024-10-14 14:44:33 -04:00
Michael Maitland
82e89c0271
[RISCV][VLOPT] Add support for 11.9 min/max instructions (#112198)
This adds support for these instructions and also tests getOperandInfo
for these instructions as well.
2024-10-14 14:43:56 -04:00
Michael Maitland
2f077ece2f [RISCV][VLOPT] Enable VLOptimizer for vl-opt.ll test file 2024-10-14 10:36:06 -07:00
Michael Maitland
a31e834ba8 [RISCV][VLOPT] Update test cases to use riscv-enable-vl-optimizer and better formatting 2024-10-14 08:44:16 -07:00
c8ef
a3b0c31ebc
Revert "[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes." (#112200)
Reverts llvm/llvm-project#111774

This appears to be causing some tests to fail.
2024-10-14 21:43:49 +08:00
c8ef
11f625cb87
[DAG] Enhance SDPatternMatch to match integer minimum and maximum patterns in addition to the existing ISD nodes. (#111774)
Closes #108218.

This patch adds icmp+select patterns for integer min/max matchers in
SDPatternMatch, similar to those in IR PatternMatch.
2024-10-14 21:19:34 +08:00
YunQiang Su
c01ddbe916
RISC-V: Select FCANONICALIZE (#112083)
We can use `FMIN.x OP,OP` to canonicalize a float.
2024-10-14 14:12:36 +08:00
Jim Lin
dba54fb074
[RISCV] Add support for inline asm constraint vd (#111653)
It constrains vector registers excluding v0. Refer to
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html RISC-V part.

This patch also adds a testcase for constraints vr, vd and vm.
2024-10-14 10:47:59 +08:00
Craig Topper
902520256b [RISCV] Make (sext_inreg X, i1) legal for XTHeadBb to cover the existing isel pattern.
I just happened to notice the untested isel pattern.
2024-10-11 16:16:07 -07:00
Craig Topper
8b46d40221 [RISCV] Re-generate orc-b-patterns.ll for store clustering. NFC
The patch that added orc-b-patterns.ll landed while store clustering was
still in review.
2024-10-11 14:28:51 -07:00
Alex Bradbury
2967e5f800
[RISCV] Enable store clustering by default (#73796)
Builds on #73789, enabling store clustering by default using the same
heuristic.
2024-10-11 20:25:53 +01:00
Michael Maitland
1c94388f38
[RISCV] Introduce VLOptimizer pass (#108640)
The purpose of this optimization is to make the VL argument, for
instructions that have a VL argument, as small as possible. This is
implemented by visiting each instruction in reverse order and checking,
if it has a VL argument, whether that VL can be reduced.

By putting this pass before VSETVLI insertion, we see three kinds of
changes to generated code:
1. Eliminate VSETVLI instructions
2. Reduce the VL toggle on VSETVLI instructions that also change vtype
3. Reduce the VL set by a VSETVLI instruction

The list of supported instructions is currently whitelisted for safety.
In the future, we could add more instructions to `isSupportedInstr` to
support even more VL optimization.

We originally wrote this pass because vector GEP instructions do not
take a VL, which leads us to emit code that uses VL=VLMAX to implement
GEP in the RISC-V backend. As a result, some of the vector instructions
will write to lanes, specifically between the intended VL and VLMAX,
that will never be read. As an alternative to this pass, we considered
adding a vector predicated GEP instruction, but this would not fit well
into the intrinsic type system since GEP has a variable number of
arguments, each with arbitrary types. The second approach we considered
was to put this pass after VSETVLI insertion, but we found that it was
more difficult to recognize optimization opportunities, especially
across basic block boundaries -- the data flow analysis was also a bit
more expensive and complex.

While this pass solves the GEP problem, we have expanded it to handle
more cases of VL optimization, and there is opportunity for the analysis
to be improved to enable even more optimization. We have a few follow up
patches to post, but figured this would be a good start.
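
The core idea can be sketched as a toy Python model. This is not the pass itself: `Instr`, `reduce_vls` and the `supported` flag are made-up stand-ins, and the model assumes every user reads its operands element-wise up to its own VL.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    name: str
    vl: int
    uses: list = field(default_factory=list)  # names of producer instructions
    supported: bool = True                     # stand-in for the allow-list check


def reduce_vls(program):
    """Visit instructions in reverse and shrink each VL to what users read."""
    for instr in reversed(program):
        users = [u for u in program if instr.name in u.uses]
        if not instr.supported or not users:
            continue
        # Only the first max(user VLs) lanes of the result are ever read.
        demanded = max(u.vl for u in users)
        if demanded < instr.vl:
            instr.vl = demanded
    return program


VLMAX = 16  # hypothetical VLMAX for this example
program = reduce_vls([
    Instr("gep", VLMAX),
    Instr("add", VLMAX, uses=["gep"]),
    Instr("store", 4, uses=["add"]),
])
```

Visiting in reverse means each user has already been reduced by the time its producers are examined, so the smaller VL of the final store propagates back through `add` to `gep`.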

---------

Co-authored-by: Craig Topper <craig.topper@sifive.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
2024-10-11 09:45:35 -04:00