1014 Commits

Author SHA1 Message Date
Ben Shi
2e43eea2da [RISCV] Optimize multiplication with immediates
The optimization of (mul x, c) to (ADD (SLLI x, i0), (SLLI x, i1))
is only enabled for i32 multiplication on rv64, because of
the regression in i64 multiplication on rv32.

However, we can change the condition to require that the immediate 'c'
is used only once. The above regression is then also avoided, and
other optimization opportunities are enabled.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147410
2023-04-15 17:29:06 +08:00
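The (mul x, c) rewrite described in the commit above applies when c is the sum of two powers of two. A minimal C++ sketch of that arithmetic follows; it is a standalone model, not the LLVM implementation, and the helper names `splitIntoTwoShifts` and `mulViaShifts` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM code): succeeds when c == 2^i0 + 2^i1
// with i0 > i1, i.e. when c has exactly two bits set.
bool splitIntoTwoShifts(uint64_t c, unsigned &i0, unsigned &i1) {
  unsigned exps[2] = {0, 0};
  unsigned count = 0;
  for (unsigned bit = 0; bit < 64; ++bit) {
    if ((c >> bit) & 1) {
      if (count == 2)
        return false; // more than two set bits
      exps[count++] = bit;
    }
  }
  if (count != 2)
    return false; // fewer than two set bits
  i0 = exps[1];
  i1 = exps[0];
  return true;
}

// Models (ADD (SLLI x, i0), (SLLI x, i1)); falls back to a plain multiply
// when the constant does not have the required shape.
uint64_t mulViaShifts(uint64_t x, uint64_t c) {
  unsigned i0 = 0, i1 = 0;
  if (!splitIntoTwoShifts(c, i0, i1))
    return x * c;
  return (x << i0) + (x << i1);
}
```

For example, c = 10 splits into shifts 3 and 1, so `mulViaShifts(7, 10)` computes (7 << 3) + (7 << 1).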
Yeting Kuo
d83620d101 [RISCV] Support vector strict_fsetcc/fsetccs.
The patch supports vector strict_fsetcc/fsetccs. Instead of preserving fflags,
the method used to implement scalar quiet compares, the patch implements quiet
compares by masking the signaling compares when either input is NaN [0].

[0]: https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#vector-floating-point-compare-instructions

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147998
2023-04-14 09:10:41 +08:00
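The masking idea in the commit above can be modeled element by element: first compute a mask of lanes where neither input is NaN, then perform the compare only on those lanes. The C++ below is a scalar sketch under that assumption; `quietLessThan` is a hypothetical name and the code does not model fflags or the actual RVV instructions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Scalar model of a masked compare: lanes with a NaN input are masked off
// and report false, so the (signaling) compare never sees a NaN.
std::vector<bool> quietLessThan(const std::vector<double> &a,
                                const std::vector<double> &b) {
  std::vector<bool> result(a.size(), false);
  for (std::size_t i = 0; i < a.size() && i < b.size(); ++i) {
    // Mask bit: true only when both inputs are ordered (not NaN).
    bool ordered = !std::isnan(a[i]) && !std::isnan(b[i]);
    if (ordered)
      result[i] = a[i] < b[i];
  }
  return result;
}
```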
Craig Topper
d7c97e9129 [RISCV] Support llvm.lround intrinsics with i32 return type on RV64.
It seems that flang uses this for "nint" and expects this i32
to work. On the C side we think lround should only work for "long"
which is i64 on rv64.

It's easy for us to support i32 when we have native FP instructions.
I fell back to i64 and truncated the result otherwise. The
documentation for lround says it returns an unspecified value if
it doesn't fit in the integer type. I have no idea what flang is
expecting. I really only did the libcall to avoid forking a test.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D147195
2023-04-12 13:15:59 -07:00
Craig Topper
64d29e8ecb [RISCV] Add segment load/store to getTgtMemIntrinsic. 2023-04-11 11:30:15 -07:00
Kazu Hirata
53ead5215b [Target] Use isNullConstant and isOneConstant (NFC) 2023-04-10 18:23:07 -07:00
LiaoChunyu
b6ea46fe72 [RISCV] Add DAG combine to fold (sub 0, (setcc x, 0, setlt)) -> (sra x , xlen - 1)
The result of sub + setcc has all bits set to 0 or all bits set to 1.
The sra instruction produces the same result.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147538
2023-04-07 08:59:28 +08:00
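The equivalence behind the fold in the commit above can be checked with plain 64-bit integers, taking xlen = 64. This is a sketch of the arithmetic only; the function names are hypothetical, and it assumes `>>` on a negative signed value is an arithmetic shift (guaranteed from C++20, implementation-defined but near-universal before that).

```cpp
#include <cstdint>

// (sub 0, (setcc x, 0, setlt)): 0 when x >= 0, all-ones (-1) when x < 0.
int64_t viaSubSetcc(int64_t x) { return 0 - static_cast<int64_t>(x < 0); }

// (sra x, xlen - 1) with xlen = 64: replicates the sign bit into every bit.
int64_t viaSra(int64_t x) { return x >> 63; }
```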
Luke Lau
ce397e500d [RISCV] Lower scalar_to_vector
Loads of fixed length vectors with irregular element counts are
sometimes emitted as a scalar load + scalar_to_vector.
Previously the scalar_to_vector wasn't legal and so was scalarized
further. This patch handles it by lowering it to a vmv.s.x.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147608
2023-04-06 17:39:19 +01:00
Luke Lau
2b24e7b5f7 [RISCV] Use tail agnostic policy more often when lowering insert_subvector
If we're inserting a fixed length subvector into a fixed length vector,
then we can use a tail agnostic policy as long as we're inserting up to
or past the end of the main vector.
I.e., because we're overwriting all of the main vector's tail elements,
and we don't care what the elements after that are.
As noted by Philip in https://reviews.llvm.org/D146711#4220341

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147347
2023-04-06 10:31:45 +01:00
Craig Topper
2c57868e2e [RISCV] Add vector load/store intrinsics to getTgtMemIntrinsic.
This constructs a proper memory operand for these intrinsics.

Segment load/store will be added in a separate patch.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D147119
2023-04-05 19:28:05 -07:00
Yeting Kuo
6858a920b8 [RISCV] Support vector type strict_[su]int_to_fp and strict_fp_to_[su]int.
Also, the patch loosens the fixed vector constraint in llvm/lib/IR/Verifier.cpp.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147380
2023-04-06 10:09:44 +08:00
Dinar Temirbulatov
7f05bdf4ee [AArch64][SME] Fix an infinite loop in DAGCombine related to adding -force-streaming-compatible-sve flag.
The compiler hits an infinite loop in DAGCombine. In force-streaming-compatible-sve
mode we have custom lowering for 128-bit vector splats, and later in
DAGCombiner::SimplifyVCastOp() we scalarized SPLAT because we have custom
lowering for SME. Later, we restored the SPLAT operation via performMulCombine().
2023-04-05 10:10:55 +00:00
Craig Topper
219ff07f72 [Targets] Rename Flag->Glue. NFC
Long long ago Glue was called Flag, and it was never completely
renamed.
2023-04-02 19:28:51 -07:00
Luke Lau
ec26c9cdc0 [RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN
This enables the interleaved access pass on O1 and above, and causes
interleaving/deinterleaving shuffles of fixed length vectors with
stores/loads to be lowered into vssegN/vlsegN.

We need to be careful and make sure that we only lower vsseg/vlseg
whenever we know the fixed vector type will fit within the minimum vlen,
and that the interleaving factor is supported for the given LMUL.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145085
2023-04-02 16:47:44 +01:00
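The two conditions named in the commit above (the fixed vector fits within the minimum VLEN budget, and the factor is supported for the LMUL) can be sketched as a small predicate. This is a simplified model, not the check LLVM performs: `isLegalSegmentAccess` and its parameters are hypothetical, fractional LMUL is treated as 1, and the limit used is the RVV constraint that the number of fields times LMUL must not exceed 8.

```cpp
// vecBits is the size in bits of one field's fixed-length vector type;
// minVLen is the guaranteed minimum VLEN in bits (both assumed inputs).
bool isLegalSegmentAccess(unsigned factor, unsigned vecBits, unsigned minVLen) {
  if (factor < 2 || factor > 8 || vecBits == 0 || minVLen == 0)
    return false;
  // Number of vector registers needed to hold one field, rounded up.
  unsigned regs = vecBits / minVLen + (vecBits % minVLen != 0 ? 1 : 0);
  if (regs > 8)
    return false;
  // LMUL must be a power of two; fractional LMUL is modeled as 1 here.
  unsigned lmul = 1;
  while (lmul < regs)
    lmul *= 2;
  // Segment accesses require factor * LMUL <= 8.
  return factor * lmul <= 8;
}
```

With a minimum VLEN of 128, a factor-2 access on 256-bit fields needs LMUL 2 and passes, while a factor-4 access on 512-bit fields needs LMUL 4 and is rejected.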
Luke Lau
80f3be9603 Revert "[RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN"
This reverts commit b95913e8c3a3521b85d689a358e620d89a4e83de.
2023-04-02 15:56:24 +01:00
Luke Lau
b95913e8c3 [RISCV] Lower fixed length interleaved accesses via vssegN/vlsegN
This enables the interleaved access pass on O1 and above, and causes
interleaving/deinterleaving shuffles of fixed length vectors with
stores/loads to be lowered into vssegN/vlsegN.

We need to be careful and make sure that we only lower vsseg/vlseg
whenever we know the fixed vector type will fit within the minimum vlen,
and that the interleaving factor is supported for the given LMUL.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145085
2023-04-02 15:20:21 +01:00
Craig Topper
241ad16eb0 [RISCV] Add special case for i32 uaddo X, -1 on RV64.
uaddo X, -1 overflows if X is non-zero.

Matches what we do for i32 uaddo X, -1 on RV32.

Fixes #61891.
2023-04-01 18:54:03 -07:00
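The special case in the commit above rests on a simple identity: adding the all-ones value to an unsigned 32-bit X wraps around unless X is zero. A small C++ sketch of that identity, with hypothetical function names:

```cpp
#include <cstdint>

// Generic unsigned-add overflow check: the wrapped sum is smaller than an operand.
bool uaddoGeneric(uint32_t x, uint32_t y) {
  uint32_t sum = x + y; // wraps modulo 2^32
  return sum < x;
}

// Special case for y == -1 (0xFFFFFFFF): overflow happens exactly when x != 0.
bool uaddoAllOnes(uint32_t x) { return x != 0; }
```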
Simon Pilgrim
8153b92d9b [DAG] Add SelectionDAG::SplitScalar helper
Similar to the existing SelectionDAG::SplitVector helper, this helper creates the EXTRACT_ELEMENT nodes for the LO/HI halves of the scalar source.

Differential Revision: https://reviews.llvm.org/D147264
2023-03-31 18:35:40 +01:00
Craig Topper
f2315545b2 [RISCV] Correct the EvenSrc/OddSrc computation in isInterleaveShuffle.
StartIndexes[0] tells exactly which source element is in element 0,
the even source. Nothing needs to be swapped.

Since we're dealing with power of 2 vector lengths, StartIndexes[0]
is almost always even so the condition here was never true. The
exception is when we're interleaving two 1 element vectors. In that
case StartIndexes[0] could be 1.

We recently hit a failure from this on a pulldown. I don't have
the reduced reproducer yet, and my naive attempts at making an
interleave of 1 element vectors produce a slideup instead, so they don't
go through this path.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D147268
2023-03-30 15:52:24 -07:00
Alex Bradbury
a755e80ed1 [RISCV] Add codegen for the experimental zicond extension
This directly matches the codegen for xventanacondops with vt.maskcn =>
czero.nez and vt.maskc => czero.eqz. An additional difference is that
zicond is available on RV32 in addition to RV64 (xventanacondops is RV64
only).

Differential Revision: https://reviews.llvm.org/D147147
2023-03-30 21:05:22 +01:00
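For readers unfamiliar with the two instructions named in the commit above, their semantics can be modeled in a few lines of C++. This is a sketch of the instruction behavior and of one common way to build a select from it; the function names are hypothetical and this is not the codegen pattern itself.

```cpp
#include <cstdint>

// czero.eqz rd, rs1, rs2: rd is 0 when rs2 == 0, otherwise rs1.
uint64_t czeroEqz(uint64_t rs1, uint64_t rs2) { return rs2 == 0 ? 0 : rs1; }

// czero.nez rd, rs1, rs2: rd is 0 when rs2 != 0, otherwise rs1.
uint64_t czeroNez(uint64_t rs1, uint64_t rs2) { return rs2 != 0 ? 0 : rs1; }

// A branchless select: exactly one of the two terms is zeroed, so OR-ing
// them yields the chosen value.
uint64_t selectViaZicond(uint64_t cond, uint64_t ifTrue, uint64_t ifFalse) {
  return czeroEqz(ifTrue, cond) | czeroNez(ifFalse, cond);
}
```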
Yeting Kuo
84c8c2b4b4 [DAG][RISCV] Allow scalable vector ISD::STRICT_FP_ROUND and support vector ISD::STRICT_FP_ROUND for RISC-V.
The patch custom lowers vector type ISD::STRICT_FP_ROUND to RISCVISD::STRICT_FP_ROUND.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D147113
2023-03-30 08:20:02 +08:00
Craig Topper
98798a5a90 [RISCV] Add helper function for RVV intrinsics in getTgtMemIntrinsic. NFC
Preparation for adding the other RVV load/store intrinsics we use
for the C API.

Reviewed By: asb, kito-cheng

Differential Revision: https://reviews.llvm.org/D147004
2023-03-29 15:34:25 -07:00
Vitaly Cheptsov
f5e63f8fc9 [RISCV] Support emulated TLS
As discussed earlier in the [GitHub
issue](https://github.com/llvm/llvm-project/issues/59500), currently
LLVM generates invalid code when emulated TLS is used. There were
attempts to resolve this previously (D102527), but they were not merged
because the component owners raised concerns about emulated TLS
efficiency.

The current state of the art is that:

- OpenBSD team, which raised the initial issue, simply has [patches
  downstream](https://github.com/openbsd/src/blob/a0747c9/gnu/llvm/llvm/lib/Target/RISCV/RISCVISelLowering.cpp#L2850-L2852).
- Our team, which raised the GH issue, has patches downstream as well.
  We also do not use `malloc` or any [dynamic
  allocations](https://github.com/llvm/llvm-project/issues/59500#issuecomment-1349046835)
  with emulated TLS, so the concerns raised in the original issue do not
  apply to us.
- GCC compatibility is broken, because GCC supports emulated TLS.
- RISC-V is the only architecture in LLVM that does not support emulated
  TLS, and work is being done to at least warn the users about it
  (D143619).

With all this in mind, I believe it is important to address the
consumers' needs, especially given that there are little to no maintenance
downsides.

Differential Revision: https://reviews.llvm.org/D143708
2023-03-29 20:55:51 +01:00
Yeting Kuo
0676c6d91f [RISCV] Support vector type strict_fma.
Like D145900, the patch also supports fixed vector strict_fma nodes in RISC-V by
custom lowering them to riscv_strict_vfmadd_vl nodes. riscv_strict_vfmadd_vl
is created to prevent some riscv_vfmadd_vl optimizations from being applied to the
original strict_fma nodes. The patch also adds combine patterns for
riscv_strict_vfmadd_vl nodes with negation operands.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146939
2023-03-28 09:01:46 +08:00
Craig Topper
29463612d2 [RISCV] Replace RISCV -> RISC-V in comments. NFC
To be consistent with RISC-V branding guidelines
https://riscv.org/about/risc-v-branding-guidelines/
Think we should be using RISC-V where possible.

More patches will follow.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D146449
2023-03-27 09:50:17 -07:00
Yeting Kuo
946d29e7e9 [RISCV] Support vector type strict_fsqrt.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146911
2023-03-27 14:02:22 +08:00
Kazu Hirata
1a8668cf0c [Target] Use isAllOnesConstant (NFC) 2023-03-26 22:57:39 -07:00
Nitin John Raj
7da272af89 [RISCV][RISCVISelLowering] Add tail agnostic policy operand to VECREDUCE instructions
Differential Revision: https://reviews.llvm.org/D146752
2023-03-25 02:42:13 -07:00
Luke Lau
6e9c24edf0 [RISCV] Lower insert subvector shuffles as vslideups
A shuffle with an insert subvector mask is functionally equivalent to:
(insert_subvector v0, (extract_subvector v1, len), index)
We can emulate by doing a vslideup on v1 into the right index, and
carefully selecting VL so that we don't overwrite any more destination
elements than what we have to.
This avoids the need for a select with a mask.
2023-03-24 17:30:31 +00:00
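The vslideup trick in the commit above can be modeled with plain arrays. The sketch below assumes a tail-undisturbed slide, where elements below the offset and at or beyond VL keep their old values. Both function names are hypothetical, and this is a behavioral model rather than the lowering code.

```cpp
#include <cstddef>
#include <vector>

// Model of vslideup with tail undisturbed: dest[i] = src[i - offset] for
// offset <= i < vl; every other element of dest is left unchanged.
std::vector<int> slideUp(std::vector<int> dest, const std::vector<int> &src,
                         std::size_t offset, std::size_t vl) {
  for (std::size_t i = offset;
       i < vl && i < dest.size() && i - offset < src.size(); ++i)
    dest[i] = src[i - offset];
  return dest;
}

// (insert_subvector v0, (extract_subvector v1, len), index) expressed as a
// slide: VL is index + len, so nothing past the inserted range is touched.
std::vector<int> insertSubvectorViaSlideUp(const std::vector<int> &v0,
                                           const std::vector<int> &v1,
                                           std::size_t len, std::size_t index) {
  return slideUp(v0, v1, index, index + len);
}
```

Inserting the first two elements of v1 at index 4 of an 8-element v0 changes only elements 4 and 5, which is why no select with a mask is needed.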
Job Noorman
c39dd7c1db [RISCV][MC] Add support for RV64E
Implement MC support for the recently ratified RV64E base instruction
set.

Differential Revision: https://reviews.llvm.org/D143570
2023-03-23 12:32:25 +00:00
Craig Topper
b50c6857a4 [RISCV] Move fli selection in RISCVISelDAGToDAG.cpp. NFC
We have custom isel for ConstantFP that has higher priority than isel
patterns. We were previously detecting valid FP constants for fli
to early exit from the custom code. This detection called
getLoadFPImm. Then we would run the isel patterns which would call
getLoadFPImm a second time.

With a little bit more code we can directly select the fli instruction
in the custom handler and avoid a second call.

Remove the incorrect mayRaiseFPException flag from the FLI instructions.

Reviewed By: joshua-arch1

Differential Revision: https://reviews.llvm.org/D146093
2023-03-21 19:33:27 -07:00
LiaoChunyu
fc9730376c [RISCV]Optimize (riscvisd::select_cc x, 0, ne, x, 1)
This patch reduces the number of unpredictable branches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146117
2023-03-16 10:56:26 +08:00
Kito Cheng
cf40b8a4dd [RISCV] Pass vector argument by stack correctly.
We have argument lowering logic to prevent a floating-point value from being
passed with bit-conversion, but that rule should not be applied to vector
arguments.

---

How arguments are passed to `foo`:

```
tail call void @foo(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0,
                    <vscale x 16 x float> zeroinitializer,
                    <vscale x 16 x float> zeroinitializer,
                    <vscale x 16 x float> zeroinitializer)
```

`foo` takes 11 arguments: the first 8 arguments are passed in GPRs, and the next
2 LMUL 8 vector arguments are passed in v8-v23. At that point we have run out of
argument registers, both GPRs and vector registers, so we must pass the last
LMUL 8 vector argument on the stack.

This means we should reserve `vlenb * 8` bytes of stack for the last
vector argument.

Reviewed By: craig.topper, asb

Differential Revision: https://reviews.llvm.org/D145938
2023-03-15 17:22:47 +08:00
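The `vlenb * 8` figure in the commit above follows from the size of the argument type. A minimal sketch of the arithmetic, assuming each `vscale` unit corresponds to a 64-bit block of a vector register; the helper names are hypothetical and fractional LMUL is rounded up to one register.

```cpp
#include <cstdint>

// Registers (LMUL) needed for <vscale x minElts x T> with eltBits-wide
// elements, assuming 64 bits per vscale unit per register.
unsigned lmulFor(unsigned minElts, unsigned eltBits) {
  unsigned lmul = (minElts * eltBits) / 64;
  return lmul == 0 ? 1 : lmul; // a fractional LMUL still occupies one register
}

// Stack bytes to reserve: one register is vlenb bytes, so LMUL registers
// need vlenb * LMUL bytes.
uint64_t stackBytesFor(unsigned minElts, unsigned eltBits, uint64_t vlenb) {
  return vlenb * lmulFor(minElts, eltBits);
}
```

For `<vscale x 16 x float>` this gives LMUL 8, so with VLEN = 128 (vlenb = 16) the reserved area is 128 bytes.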
Yeting Kuo
9637e950cb [RISCV] Support ISD::STRICT_FADD/FSUB/FMUL/FDIV for vector types.
The patch handles fixed vector type strict-fp operations with new
RISCVISD::STRICT_ prefixed ISD nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145900
2023-03-15 07:47:16 +08:00
Craig Topper
a1e39f35c5 [RISCV] Merge getLoadFP*Imm into a single function.
We currently have 3 functions and 3 lookup tables. This was the
most expedient and obvious way to fix several bugs.

This patch uses a single function and single lookup
table. It uses APFloat::convert to convert from the half or double
to single precision. If the conversion doesn't have any errors or
lose any information we use the f32 table to finish the lookup.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D145897
2023-03-14 13:11:11 -07:00
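The single-table approach in the commit above relies on a lossless narrowing check before the lookup. The sketch below shows that idea with built-in `double` and `float` instead of APFloat. The function name is hypothetical, the three-entry table is only an illustrative subset, and the returned index is a position in this toy table, not a real fli encoding.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Returns the index of value in a toy f32 table, or -1 when the value
// cannot be narrowed to float without loss or is not in the table.
int lookupViaF32Table(double value) {
  // Reject NaN, infinities, and anything outside the finite float range.
  if (!(std::fabs(value) <= std::numeric_limits<float>::max()))
    return -1;
  float narrowed = static_cast<float>(value);
  // The round trip must reproduce the input exactly, i.e. no information lost.
  if (static_cast<double>(narrowed) != value)
    return -1;
  static const float table[] = {0.5f, 1.0f, 2.0f}; // illustrative subset only
  for (std::size_t i = 0; i < sizeof(table) / sizeof(table[0]); ++i)
    if (table[i] == narrowed)
      return static_cast<int>(i);
  return -1;
}
```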
Luke Lau
a9d9616c0d [RISCV][NFC] Share interleave mask checking logic
This adds two new methods to ShuffleVectorInst, isInterleave and
isInterleaveMask, so that the logic to check if a shuffle mask is an
interleave can be shared across the TTI, codegen and the interleaved
access pass.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145971
2023-03-14 11:02:52 +00:00
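To illustrate what an interleave mask looks like, here is a simplified check written as standalone C++. It only accepts the canonical form in which lane k contributes elements starting at k * laneLen, so it is narrower than whatever ShuffleVectorInst::isInterleaveMask accepts. The function name and signature are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Simplified model: for `factor` source lanes of `laneLen` elements each,
// result element i must come from lane (i % factor) at position (i / factor).
bool isSimpleInterleaveMask(const std::vector<int> &mask, unsigned factor,
                            unsigned laneLen) {
  if (factor < 2 ||
      mask.size() != static_cast<std::size_t>(factor) * laneLen)
    return false;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    int expected = static_cast<int>((i % factor) * laneLen + i / factor);
    if (mask[i] != expected)
      return false;
  }
  return true;
}
```

Interleaving two 4-element vectors gives the mask <0, 4, 1, 5, 2, 6, 3, 7>, which this check accepts with factor 2 and laneLen 4.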
Craig Topper
c0c4c725e9 [RISCV] Return false for unsupported VTs in isFPImmLegal.
I don't have a test case that fails for this, but it seemed like
we should only handle legal types. The callers I looked at in
DAGCombine either check the type is legal or don't even call
isFPImmLegal unless LegalOperations is true.

Written in a slightly odd way because switches on EVT require
an additional isSimple check so an if/else chain is easier. Used a bool
to shorten the code instead of having multiple ifs and returns.
AArch64 uses a similarish structure.
2023-03-12 23:34:13 -07:00
Craig Topper
30705e9770 [RISCV] Support Zfa fli instructions with vector splats.
-Return false from RISCVDAGToDAGISel::selectFPImm for fli
 constants so we don't try to use integer expansion.
-Support fli.h with Zvfh+Zfhmin.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D145766
2023-03-10 09:16:21 -08:00
Yeting Kuo
b2c48559c8 [IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND.
The patch mainly does two things. The first is allowing scalable vector
ISD::STRICT_FP_EXTEND. The second is making RISC-V custom lower
strict_fpextend to riscv_strict_fpextend_vl, the strict version of
riscv_fpextend_vl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145548
2023-03-09 17:37:59 +08:00
Craig Topper
17e0926d6a [RISCV] Don't try to use fli.h with Zfa+Zfhmin.
fli.h requires Zfh or Zvfh. We need to check for this in
isFPImmLegal. Zvfh support will come in another patch.

I had to split the test file because there are other issues with
Zfhmin and some intrinsics.
2023-03-08 22:54:25 -08:00
Craig Topper
006f88d05d [RISCV] Remove seemingly unneeded !isPosZero from Zfa code in isFPImmLegal.
This was added after the patch was approved. I'm not sure why it's
there. It doesn't fire in any lit test.
2023-03-08 22:06:05 -08:00
Craig Topper
08b65c5c9e [RISCV] Remove some trailing whitespace. NFC 2023-03-08 21:34:10 -08:00
Luke Lau
d610c6c9c7 [RISCV] Add vsseg intrinsic for fixed length vectors
These intrinsics are equivalent to the regular @llvm.riscv.vssegNF
intrinsics, only they accept fixed length vectors in their overloaded
types: The regular intrinsics only operate on scalable vectors.
These intrinsics convert the fixed length vectors to scalable ones, and
then lower them to the regular scalable intrinsic.

This mirrors the intrinsics added in 0803dba7dd998ad073d75a32b65296734c10ae70
This will be used in a later patch with the Interleaved Access pass.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145022
2023-03-08 17:19:03 +00:00
Jun Sha (Joshua)
ada2641460 [RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa extension
This patch implements experimental support for the RISCV Zfa extension as specified here: https://github.com/riscv/riscv-isa-manual/releases/download/draft-20221119-5234c63/riscv-spec.pdf, Ch. 25. This extension has not been ratified. Once ratified, it'll move out of experimental status.

This change adds codegen support for load-immediate instructions (fli.s/fli.d/fli.h).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141560
2023-03-07 14:27:48 +08:00
Philipp Tomsich
f68f04d07c [RISCV] Add vendor-defined XTheadCondMov (conditional move) extension
The vendor-defined XTheadCondMov (somewhat related to the upcoming
Zicond and XVentanaCondOps) extension adds conditional move
instructions with $rd being both an input and an output.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=73442230966a22b3238b2074691a71d7b4ed914a

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144681
2023-02-24 21:40:42 +01:00
Manolis Tsamis
f6262201d8 [RISCV] Add vendor-defined XTheadMemIdx (Indexed Memory Operations) extension
The vendor-defined XTHeadMemIdx (no comparable standard extension exists
at the time of writing) extension adds indexed load/store instructions
as well as load/store and update register instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=27cfd142d0a7e378d19aa9a1278e2137f849b71b

Depends on D144002

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144249
2023-02-24 00:17:58 +01:00
Luke Lau
e340e9e632 [RISCV][NFC] Reuse getDeinterleaveViaVNSRL to lower deinterleave intrinsics
This modifies it to work on both scalable and fixed vectors

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D144584
2023-02-23 16:23:05 +00:00
Luke Lau
8d15e7275f [RISCV] Lower interleave and deinterleave intrinsics
Lower the two intrinsics introduced in D141924.

These intrinsics can be combined with loads and stores into the much more efficient segmented load and store instructions in a following patch.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D144092
2023-02-23 16:23:02 +00:00
Manolis Tsamis
a6446668a3 [RISCV] XTHeadMemPair: Fix invalid mempair combine for types other than i32/i64
A mistake in the control flow of performMemPairCombine resulted in paired
loads/stores for types that were not supported by the instructions (i8/i16).
These loads/stores could not match the constraints of the patterns defined
in the THead td file and the compiler would throw a 'Cannot select' error.

This is now fixed and two new test functions have been added in xtheadmempair.ll
which would previously crash the compiler. The compiler was additionally tested
with a wide range of benchmarks and no issues were observed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144559
2023-02-22 19:57:37 +01:00
Philip Reames
f5a656050a [RISCV] Reorganize deinterleave lowering for reuse [nfc]
Not entirely sure we'll end up reusing the body of the transform, but personally I find this structure easier to follow anyways.

Differential Revision: https://reviews.llvm.org/D144532
2023-02-22 09:45:57 -08:00
Philip Reames
ac35c1d859 [RISCV] Minor style cleanup in lowerVECTOR_SHUFFLEAsVNSRL [nfc] 2023-02-21 12:06:43 -08:00