1186 Commits

Author SHA1 Message Date
Craig Topper
56b54eda52 [RISCV] Combine setOperationAction code for ISD::CTLZ for Zbb and XTheadBB. NFC
This avoids needing to change ISD::CTLZ back to Legal after earlier
code set it to Expand.
2023-07-30 23:01:59 -07:00
Jun Sha (Joshua)
2b6df4a336 [RISCV] Add codegen support for bf16 vector
This patch adds codegen support for vector with bfloat16 type in llvm backend.
With this patch, Zvbfmin/Zvbfwma instructions as well as vle16/vse16 can generated from newly added bf16 IR intrinsics.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D156287
2023-07-28 09:54:23 +08:00
Jianjian GUAN
5d6d6493ff [RISCV][NFC] Simplify lowerVPOp.
This patch is similar to https://reviews.llvm.org/D153948, using helper function to get ISD and information.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154411
2023-07-27 16:42:20 +08:00
Craig Topper
e28307e93a [RISCV] Handle seteq/setne conditions for CZERO_NEZ/CZERO_EQZ during isel.
This removes selectSETCC and adds isel patterns for seteq/setne
conditions.

This removes the duplication of selectSETCC between lowering and
isel. This also gets some cases in xaluo.ll that we missed previously.

Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D156250
2023-07-26 10:06:08 -07:00
Jianjian GUAN
84d4618a02 [RISCV] Fix the check assertion in hasMergeOp and hasMaskOp
Because we have STRICT_FCVT_W_RV64 equal to ISD::FIRST_TARGET_STRICTFP_OPCODE, the check needs to be splitted into 2 parts.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155683
2023-07-26 11:21:50 +08:00
Craig Topper
f6dc75cdd8 [RISCV] Add DAG combine to pull xor with 1 through select idiom that uses czero_eqz/nez.
If we are selecting between two setccs that need to be legalized
with xor, the select will be legalized first. Detect this pattern
so we can pull the xor through to expose it to additional
optimizations.

We could generalize this to other operations, but those normally
get handled in DAG combine before select legalization.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D156159
2023-07-25 09:13:24 -07:00
Craig Topper
b34a8b3a52 [RISCV] Generalize combineAddOfBooleanXor to support any boolean not just setcc.
Instead of checking for setcc, look for any 0/1 value.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D156153
2023-07-25 09:04:49 -07:00
Craig Topper
5ff5dac852 [RISCV] Add simple DAG combine to pull xor with 1 through select_cc.
If we're selecting the result of two setccs that have been legalized
by introducing an xor with 1, we can pull the xor with 1 through the
select to enable more optimizations.

We could generalize this to other binary operators with identical
conditions, but those are usually caught before we legalize the select.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D156144
2023-07-25 09:03:45 -07:00
Craig Topper
49429783b0 [RISCV] Add lowering for scalar fmaximum/fminimum.
Unlike fmaxnum and fminnum, these operations propagate nan and
consider -0.0 to be less than +0.0.

Without Zfa, we don't have a single instruction for this. The
lowering I've used forces the other input to nan if one input
is a nan. If both inputs are nan, they get swapped. Then use
the fmax or fmin instruction.

New ISD nodes are needed because fmaxnum/fminnum to not define
the order of -0.0 and +0.0.

This lowering ensures the snans are quieted though that is probably not
required in default environment). Also ensures non-canonical nans
are canonicalized, though I'm also not sure that's needed.

Another option could be to use fmax/fmin and then overwrite the
result based on the inputs being nan, but I'm not sure we can do
that with any less code.

Future work will handle nonans FMF, and handling the case where
we can prove the input isn't nan.

This does fix the crash in #64022, but we need to do more work
to avoid scalarization.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D156069
2023-07-24 13:46:35 -07:00
Craig Topper
9d9cde5a90 [RISCV] Remove combineCmpOp and associated code. NFCI
This code was originally added in D134277. This transform is now
available in target independent DAG combine after D153502.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D156075
2023-07-24 09:05:44 -07:00
Craig Topper
5990199e2c [RISCV] Add CZERO_EQZ/CZERO_NEZ to ComputeNumSignBitsForTargetNode.
Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D156082
2023-07-24 07:43:02 -07:00
Craig Topper
9da0db4dd8 [RISCV] Add CZERO_EQZ/CZERO_NEZ to computeKnownBitsForTargetNode.
Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D156081
2023-07-24 07:38:12 -07:00
Luke Lau
5b95bba6fe [RISCV] Set Fast flag for unaligned memory accesses
The +unaligned-scalar-mem and +unaligned-vector-mem features were added in
D126085 and D149375 respectively to allow subtargets to indicate that
they supported misaligned loads/stores with "sufficient" performance.
This is separate from whether or not the target actually supports
misaligned accesses, which could be determined from Zicclsm.

This patch enables the Fast flag under the assumption that any subtarget
that declares support for +unaligned-*-mem will want to opt into
optimisations that take advantage of misaligned scalar accesses, such as
store merging.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D150771
2023-07-24 10:58:57 +01:00
eopXD
78d91df452 [RISCV] Support register allocation for GHC when f/d is not specified in the architecture
This patch supports register allocation for floating-point types when
`zfinx` and `zdinx` is specified in the architecture for the GHC
calling convention.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155910
2023-07-23 22:40:10 -07:00
Jun Sha (Joshua)
f375ee36c4 [RISCV] Add codegen for Zfbfmin instructions
The implementation in https://reviews.llvm.org/D151313 is done for the circumstance without Zfbfmin. This patch adds codegen support for the 6 instructions provided in Zfbfmin extension.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153234
2023-07-24 10:37:58 +08:00
Craig Topper
ac715f7f5b [RISCV] Simplify setOperationAction for f64 ceil/floor/round/trunc/etc. NFC
We were setting the operations as Legal for Zfa in two places. Use
an else to avoid this.
2023-07-22 12:13:33 -07:00
Luke Lau
33a83c5486 [RISCV] Add SDNode patterns for vrol.[vv,vx] and vror.[vv,vx,vi]
These correspond to ROTL/ROTR nodes

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155439
2023-07-21 10:22:46 +01:00
Craig Topper
7dfe62327d [RISCV] Add a DAG combine for (czero_eq X, (xor Y, 1)) -> (czero_ne X, Y) if Y is 0 or 1.
This is an alternative to D155288 that can handle other sources of
xori like FP compares. Unfortunately, it misses the i64 setge case
on RV32 in condops.ll.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D155328
2023-07-19 12:33:08 -07:00
Djordje Todorovic
80e20c8a8d [RISCV] Add DAG combine for CTTZ/CTLZ in the case of input 0
Within the AggressiveInstCombine Pass we have
an analysis/optimization that matches that
pattern of the Table Based CTZ. Some Targets do
not support/define ctz(0), but since the
AggressiveInstCombine is just an extension of
InstCombine, it should be a target-independent
canonicalization Pass, and therefore, we decided
to introduce several instructions, such as select
and compare that produce canonical IR, even if
the input is 0. The task for the Targets that do
support that input is to handle such a case and
to produce an optimal assembly.

This patch optimizes the CTTZ/CTLZ instructions
if the input is 0 by performing the`DAG combine`,
by generating the cttz(x) & 0x1f pattern (the
same goes for ctlz as well).

Differential Revision: https://reviews.llvm.org/D151449
2023-07-19 16:22:04 +02:00
eopXD
32c257d384 [RISCV] Use the stack for MVT::f16 for fastcc when there are no other registers available
In D155502, we added code for the compiler to check GPR-s for f16
under zhinx. This commit adds code to hit the stack when we run out of
GPR-s.

With this patch and D155502, resolves #63922

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155507
2023-07-18 19:49:17 -07:00
Craig Topper
ea3683e98f [RISCV] Improve type promotion for i32 clmulr/clmulh on RV64.
Instead of zero extending the inputs by masking. We can shift them
left instead. This is cheaper when we don't zext.w instruction.

This does make the case where the inputs are already zero extended
or freely zero extendable worse though.

Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D155530
2023-07-18 10:39:25 -07:00
Craig Topper
0c055286b2 [RISCV] Use RISCVISD::CZERO_EQZ/CZERO_NEZ for XVentanaCondOps.
This makes Zicond and XVentanaCondOps use the same code path.
The instructions have identical semantics.

Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D155391
2023-07-18 10:18:02 -07:00
eopXD
eb89bf8d0d [RISCV] Do not use FPR registers for fastcc if zfh/f/d is not specified in the architecture
Resolves #63917.

Also lets the compiler check for available GPR before hitting the stack.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D155502
2023-07-18 10:03:04 -07:00
LiaoChunyu
65ffcc099c [RISCV] Lower VP_CTLZ_ZERO_UNDEF/VP_CTTZ_ZERO_UNDEF/VP_CTLZ by converting to FP and extracting the exponent.
D111904, D141585 made RISC-V customized lower vector ISD::CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF/CTLZ
by converting to float and using the float result.

Perhaps VP_CTLZ_ZERO_UNDEF/VP_CTTZ_ZERO_UNDEF/VP_CTLZ could use the similar feature.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155150
2023-07-18 15:25:59 +08:00
Craig Topper
a64b3e92c7 [RISCV] Re-define sha256, Zksed, and Zksh intrinsics to use i32 types.
Previously we returned i32 on RV32 and i64 on RV64. The instructions
only consume 32 bits and only produce 32 bits. For RV64, the result
is sign extended to 64 bits like *W instructions.

This patch removes this detail from the interface to improve
portability and consistency. This matches the proposal for scalar
intrinsics here https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44

I've included IR autoupgrade support as well.

I'll be doing this for other builtins/intrinsics that currently use
'long' in other patches.

Reviewed By: VincentWu

Differential Revision: https://reviews.llvm.org/D154647
2023-07-17 08:58:29 -07:00
Luke Lau
b5bcd4f60b [RISCV] Add VL nodes and VP patterns for unary zvbb instructions
This follows the pattern of lowering VP nodes to equivalent
RISCVISD::*_VL nodes. The nodes are modelled after the VP ISD nodes rather
than the actual zvbb instructions, and I've included a merge operand to be
consistent with the underlying pseudos that were recently refactored.

I've defined the nodes in RISCVInstrInfoVVLpatterns.td as the nodes aren't Zvk
specific, but the patterns are in RISCVInstrInfoZvk.td.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155229
2023-07-17 09:17:58 +01:00
Craig Topper
ce70578303 [RISCV] Move comments before 'if' instead of after. NFC
This allows us to remove some curly braces around the if body.
The code wasn't consistent about it anyway. Comments before is
used in other places in this file already.

Reviewed By: wangpc, MaskRay

Differential Revision: https://reviews.llvm.org/D155390
2023-07-16 12:57:49 -07:00
Craig Topper
2a33f47912 [RISCV] Make selectSETCC return SDValue instead of bool. NFC
We can use a null SDValue for the 'false' case. This avoids the
need for an output parameter. This is consistent with other
SelectionDAG code.

Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D155388
2023-07-16 12:56:32 -07:00
Craig Topper
48ee319378 Revert "[RISCV] Move comments before 'if' instead of after. NFC"
This reverts commit ef1ccc493e6167488ac10da2842fa7cac2746565.

Committed by mistake.
2023-07-15 22:54:06 -07:00
Craig Topper
d09109aa1e [RISCV] Use isScalarInteger instead of isInteger. NFC
The type should only be scalar here and the isScalarInteger
should be a simpler check.
2023-07-15 22:52:43 -07:00
Craig Topper
ef1ccc493e [RISCV] Move comments before 'if' instead of after. NFC
This allows us to remove some curly braces around the if body.
The code wasn't consistent about it anyway. Comments before is
used in other places in this file already.

Differential Revision: https://reviews.llvm.org/D155390
2023-07-15 22:47:52 -07:00
Craig Topper
2b0b85c05e [RISCV] Move vector handling earlier in lowerSELECT. NFC
This keeps all the scalar code together.
2023-07-15 22:34:19 -07:00
Craig Topper
12c669a869 [RISCV] Remove 'else' after 'return'. NFC 2023-07-15 22:25:33 -07:00
Mikhail Gudim
c158ddd99e Reapply [RISCV] Fold binary op into select if profitable.
This fixes some bugs in the original commit:
  (1) Operands are passed in correct order when creating new constant
  and the binary operator. New tests were added to cover these cases.
  (2) Check was added to see if it is safe to commute the select and the binary operator.

Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D152147
2023-07-14 15:30:54 -04:00
Craig Topper
3a0a25f9b6 [RISCV] Support i32 clmul* intrinsics on RV64.
We can use an i64 clmul to emulate i32 clmul.
For clmulh and clmulr we need to zero extend the 32 bit input
to 64 bits then extract either bits [63:32] or [62:31].

Unfortunately, without Zba we need to use 2 shifts for the
zero extends. These can be optimized out later if the producing
instruction already zeroed the upper bits or if we can use lwu.

There are alternative sequences we can use for clmulh/clmulr
when the zero extend isn't free, but those are best handled by
a DAG combine to give the best opportunity for removing the extend.

This allows us to implement i32 clmul C intrinsics proposed in
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/44.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154729
2023-07-14 11:20:03 -07:00
Alex Bradbury
5c5a1a2927 [RISCV] Introduce RISCVISD::CZERO_{EQZ,NEZ} nodes produce them when zicond is present in lowerSELECT
This patch is a step towards altering how we handle the emission of
condops. Marking ISD::SELECT as legal is a major change in the codegen
path, and gives few options for maintaining the old codegen path when
it is believed to be better (e.g. a better branchless sequence is
possible using non-zicond instructions, or the branch-based sequence is
preferable).

This removes the existing SelectionDAG patterns and moves the logic into
lowerSELECT. Along some small codegen changes you'll note a few minor
regressions in the generated code quality - this are due to the fact
that by lowering the SELECT node early we miss out on combines that
would kick in later when setcc condcodes that aren't natively supported
have been expanded (thus exposing opportunities for optimisation by
performing logical negation and swapping truev/falsev). I've opted to
split out work that addresses these into follow-on patches (especially
as zicond is still 'experimental').

matchSetCC is a straight-forward translation from the version in
RISCVISelDAGToDAG. Ideally, in the future it can be converted to a
helper shared between both files.

Differential Revision: https://reviews.llvm.org/D155083
2023-07-14 11:31:27 +01:00
Yeting Kuo
2ac99205ee [RISCV] Narrow types of index operand matched pattern (shl (zext), C).
(shl (zext to iXLenVec), C) is a possible pattern in auto-vectorized code for
indexed loads/stores. But extending to iXLen might be too aggressive, RVV
indexed load/store instructions zero extend their indexed operand to XLEN.
The patch tries to narrow the type of the zero extension. It's benefit to
decrease register pressure.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154687
2023-07-14 15:45:44 +08:00
Luke Lau
55e2772e9f [RISCV] Add initial SDNode patterns for unary zvbb instructions
This patch adds pseudos and SDNode patterns for vbrev.v, vrev8.v, vclz.v,
vctz.v and vcpop.v.
I've only added them for integer element types so far since we're lacking tests
for floats.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D155216
2023-07-13 19:39:04 +01:00
eopXD
5d18d43f26 [7/8][RISCV] Add rounding mode control variant for conversion intrinsics between floating-point and integer
Depends on D154634

For the cover letter of the patch-set, please checkout D154628.

This is the 7th patch of the patch-set. This patch includes change to
vfcvt_x_f, vfcvt_xu_f, vfwcvt_x_f, vfwcvt_xu_f, vfncvt_x_f, vfncvt_xu_f
vfcvt_f_x, vfcvt_f_xu, vfncvt_f_x vfncvt_f_xu, vfncvt_f_f

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154635
2023-07-13 00:54:07 -07:00
eopXD
76482078cd [RISCV][POC] Model frm control for vfadd
Depends on D152879.

Specification PR: riscv-non-isa/rvv-intrinsic-doc#226

This patch adds variant of `vfadd` that models the rounding mode control.
The added variant has suffix `_rm` appended to differentiate from the
existing ones that does not alternate `frm` and uses whatever is inside.

The value `7` is used to indicate no rounding mode change. Reusing the
semantic from the rounding mode encoding for scalar floating-point
instructions.

Additional data member `HasFRMRoundModeOp` is added so we can append
`_rm` suffix for the fadd variants that models rounding mode control.

Additional data member `IsRVVFixedPoint` is added so we can define
pseudo instructions with rounding mode operand and distinguish the
instructions between fixed-point and floating-point.

Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D152996
2023-07-13 00:34:00 -07:00
Mikhail Gudim
17e2df6695 [RISCV] Removed the requirement of XLenVT for performSELECTCombine.
Reviewed By: Craig Topper

Differential Revision: https://reviews.llvm.org/D153044
2023-07-12 16:29:09 -04:00
Craig Topper
dbd47c4489 [RISCV] Don't allow X0 to be used for 'r' constraint in inline assembly
Some instructions treat x0 as a special encoding rather than as a
value of 0. Since we don't parse the inline assembly to know what
the instruction is, chooser the safest option of never using x0.

Fixes #63747.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154744
2023-07-10 13:25:17 -07:00
Craig Topper
6f90808074 [RISCV] Add a guard condition to orc_b/brev8 handling in ReplaceNodeResults.
The orc_b and brev8 intrinsics are type overloaded, but only
i32 and XLen are supported types. The type legalization code in
ReplaceNodeResults only handles the i32 case on RV64. Add some
checks so we will fail type legalization for other types.
2023-07-07 08:51:46 -07:00
Craig Topper
427278d11a [RISCV] Remove pseudos for vwcvt.f.x(u) with rounding mode.
vwcvt.f.x doesn't use rounding mode. The integer value fits in
the mantissa of a 2x larger FP type so no rounding is required.

I've remove the Uses = [FRM] that is also not needed.

I deleted the isel patterns. Alternatively, we could keep them and
drop the rounding mode immediate. The patterns are currently untested
so I chose to delete them. If they become needed in the future, we
can decide then if we should have the patterns or teach the node
creation to use the non-RM form for widening.

This reverts part of D142102.

Reviewed By: luke

Differential Revision: https://reviews.llvm.org/D154653
2023-07-07 08:38:20 -07:00
Luke Lau
02bb33c3ce [RISCV] Check for alignment when lowering interleaved/deinterleaved loads/stores
As noted by @reames, we should be checking that the memory access is aligned to
the element size (or the unaligned vector memory access feature is enabled)
before lowering vlseg/vsseg intrinsics via the interleaved access pass.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D154536
2023-07-07 15:34:24 +01:00
Yeting Kuo
74eac85dae [RISCV] Add riscv_vsoxei_mask/riscv_vsuxei_mask to getTgtMemIntrinsic.
This constructs a proper memory operand for riscv_vsoxei_mask and riscv_vsuxei_mask.
I think they are missed in D147119.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D154694
2023-07-07 17:52:11 +08:00
Craig Topper
a403124998 [RISCV] Don't sink i1 vectors in shouldSinkOperands.
These can't create .vx instructions so there's no reason to sink them.
2023-07-06 20:36:55 -07:00
Craig Topper
be253cb987 [RISCV] Support i32 brev8 intrinsic on RV64.
Similar to what we do for orc.b. Another patch will expose this
as a builtin in clang.
2023-07-06 17:24:53 -07:00
Craig Topper
ee34fa0032 [RISCV] Add DAG combine for (fmv_w_x_rv64 (fmv_x_anyextw_rv64 X))
This pattern started showing up more after D151284
2023-07-05 19:35:13 -07:00
Luke Lau
ea62fc79e7 [RISCV] Lower deinterleave2 intrinsics to vlseg2
Following from D153864, this patch implements the lowerDeinterleaveIntrinsic
hook to lower deinterleaves of loads into vlseg2 intrinsics.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D153876
2023-07-05 19:24:15 +01:00