835 Commits

Author SHA1 Message Date
Craig Topper
e94dc58dff [RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven.
This avoids the call overhead as well as the the save/restore of
fflags and the snan handling in the libm function.

The save/restore of fflags and snan handling are needed to be
correct for -ftrapping-math. I think we can ignore them in the
default environment.

The inline sequence will generate an invalid exception for nan
and an inexact exception if fractional bits are discarded.

I've used a custom inserter to explicitly create the control flow
around the float->int->float conversion.

We can probably avoid the final fsgnj after the conversion for
no signed zeros FMF, but I'll leave that for future work.

Note the comparison constant is slightly different than glibc uses.
They use 1<<53 for double, I'm using 1<<52. I believe either are valid.
Numbers >= 1<<52 can't have any fractional bits. It's ok to do the
float->int->float conversion on numbers between 1<<53 and 1<<52 since
they will all fit in 64. We only have a problem if the double can't fit
in i64

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136508
2022-10-26 14:36:49 -07:00
Craig Topper
a61b74889f [RISCV] Use vslide1down for i64 insertelt on RV32.
Instead of using vslide1up, use vslide1down and build the other
direction. This avoids the overlap constraint early clobber of
vslide1up.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136735
2022-10-26 09:43:12 -07:00
Craig Topper
63ed3d0eeb [RISCV] Rename lowerFTRUNC_FCEIL_FFLOOR_FROUND to lowerVectorFTRUNC_FCEIL_FFLOOR_FROUND. NFC
Extracted from D136508.
2022-10-24 20:32:22 -07:00
Craig Topper
ef72ff7b15 [RISCV] Fix unused variable warning. NFC 2022-10-22 22:29:03 -07:00
Craig Topper
7a4e56acac [RISCV] Add an early out to lowerVECTOR_SHUFFLEAsVSlidedown. NFC
If Mask[0] is 0, then we're never going to match a slidedown. If
we get through the for loop, then it's an identity mask which should
have already been optimized out. Otherwise it's some non-contiguous
mask that will fail out of the lop. Might as well not bother entering
the loop.
2022-10-18 21:35:15 -07:00
Han-Kuan Chen
615af94dc2 [RISCV] Lower VECTOR_SHUFFLE to VSLIDEDOWN_VL.
Differential Revision: https://reviews.llvm.org/D136136
2022-10-18 08:58:39 -07:00
LiaoChunyu
7b970290c0 [RISCV] Optimize SELECT_CC when the true value of select is Constant
(select (setcc lhs, rhs, CC), constant, falsev) -> (select (setcc lhs, rhs, InverseCC), falsev, constant)

This patch removes unnecessary copies

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D129757
2022-10-18 09:24:17 +08:00
Craig Topper
2b32e4f98b [RISCV] Add basic support for the sifive-7-series short forward branch optimization.
sifive-7-series has macrofusion support to convert a branch over
a single instruction into a conditional instruction. This can be
an improvement if the branch is hard to predict.

This patch adds support for the most basic case, a branch over a
move instruction. This is implemented as a pseudo instruction so
we can hide the control flow until all code motion passes complete.

I've disabled a recent select optimization if this feature is enabled
in the subtarget.

Related gcc patch for the same optimization https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg211045.html

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135814
2022-10-17 13:56:22 -07:00
Craig Topper
e68b0d5875 [RISCV] Match (select C, -1, X)->(or -C, X) during lowerSelect
Same with (select C, X, -1), (select C, 0, X), and (select C, X, 0).

There's a DAGCombine after we turn the select into select_cc, but
that may introduce a setcc that didn't previously exist. We could
add more DAGCombines to remove the extra setcc, but this seemed lower
effort.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135833
2022-10-13 09:06:12 -07:00
Philip Reames
1c41d0cb62 [RISCV] Use branchless form for selects with 0 in either arm
Continuing the theme of adding branchless lowerings for simple selects, this time handle the 0 arm case. This is very common for various umin idioms, etc..

Differential Revision: https://reviews.llvm.org/D135600
2022-10-12 13:51:52 -07:00
Yeting Kuo
7329dc0cc3 [RISCV][NFC] Fix unused variable warning.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135365
2022-10-09 21:39:30 +08:00
Craig Topper
f749b2d9a5 [RISCV] Fix incorrect parenthese placement in comment. NFC 2022-10-07 17:16:38 -07:00
Craig Topper
9f67047cf0 [VP][RISCV] Add vp.smax/smin/umax/umin intrinsics
Differential Revision: https://reviews.llvm.org/D135418
2022-10-07 17:14:31 -07:00
eopXD
dbc681c98e [VP][RISCV] Add vp.roundtozero and its RISC-V support
The scalar instruction of this is `llvm.trunc`. However the naming of
ISD::VP_TRUNC is already taken by `trunc` of the LLVM IR. Naming this as
`vp.ftrunc` would likely cause confusion with `vp.fptrunc`. So adding
`vp.roundtozero` that will look similar to `vp.roundeven`.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D135233
2022-10-07 02:15:23 -07:00
Philip Reames
79f0413e5e [RISCV] Use branchless form for selects with -1 in either arm
We can lower these as an or with the negative of the condition value. This appears to result in significantly less branch-y code on multiple common idioms (as seen in tests).

Differential Revision: https://reviews.llvm.org/D135316
2022-10-06 15:18:43 -07:00
Philip Reames
04bb32e58a [DAG] Extract helper for (neg x) [nfc]
This is a frequently reoccurring pattern, let's factor it out.

Differential Revision: https://reviews.llvm.org/D135301
2022-10-06 13:23:52 -07:00
Quentin Colombet
6e440ee2aa [RISCV][ISel] Finally fix the UBSan error
Forgot another SDValue check and a boolean initialization.
2022-10-05 21:43:09 +00:00
Quentin Colombet
6bbe7d376e [RISCV][ISel] Attempt to fix UBSan error
Explicitly check an SDValue with the invalid SDValue.

UBSan reports:
runtime error: load of value 36, which is not a valid value for type
'bool'

https://lab.llvm.org/buildbot/#/builders/85/builds/11231
2022-10-05 20:59:28 +00:00
Quentin Colombet
c5c2de287e [RISCV][ISel] Fold extensions when all the users can consume them
This patch allows the combines that fold extensions in binary operations
to have more than one use.
The approach here is pretty conservative: if all the users of an
extension can fold the extension, then the folding is done, otherwise we
don't fold.
This is the first step towards avoiding the one-use limitation.

As a result, we make a decision to fold/don't fold for a web of
instructions. An instruction is part of the web of instructions as soon
as it consumes an extension that needs to be folded for all its users.

Because of how SDISel works a web of instructions can be visited over
and over. More precisely, if the folding happens, it happens for the
whole web and that's the end of it, but if the folding fails, the whole
web may be revisited when another member of the web is visited.

To avoid a compile time explosion in pathological cases, we bail out
earlier for webs that are bigger than a given threshold (arbitrarily set
at 18 for now.) This size can be changed using
`--riscv-lower-ext-max-web-size=<maxWebSize>`.

At the current time, I didn't see a better scheme for that. Assuming we
want to stick with doing that in SDISel.

Differential Revision: https://reviews.llvm.org/D133739
2022-10-05 20:49:21 +00:00
Quentin Colombet
4852f26acd [RISCV][ISel] Refactor the formation of VW operations
This patch centralizes all the combines of add|sub|mul with extended
operands in one "framework".
The rationale for this change is to offer a one-stop-shop for all these
transformations so that, in the future, it is easier to make combine
decisions for a web of instructions (i.e., instructions connected
through s|zext operands).

Technically this patch is not NFC because the new version is more
powerful than the previous version.
In particular, it diverges in two cases:
- VWMULSU can now also be produced from `mul(splat, zext)`, whereas
  previously only `mul(sext, splat)` were supported when `splat`s were
  involved. (As demonstrated in rvv/fixed-vectors-vwmulsu.ll)
- VWSUB(U) can now also be produced from `sub(splat, ext)`, whereas
  previously only `sub(ext, splat)` were supported when `splat`s were
  involved. (As demonstrated in rvv/fixed-vectors-vwsub.ll)

If we wanted, we could block these transformations to make this
patch really NFC. For instance, we could do something similar to
`AllowSplatInVW_W`, which prevents the combines to form vw(add|sub)(u)_w
when the RHS is a splat.

Regarding the "framework" itself, the bulk of the patch is some
boilderplate code that abstracts away the actual extensions that are
present in the DAG. This allows us to handle `vwadd_w(ext a, b)` as if
it was a regular `add(ext a, ext b)`. Since the node `ext b` doesn't
actually exist in the DAG, we have a bunch of methods (all in the
NodeExtensionHelper class) that fake all that for us.

The other half of the change is around `CombineToTry` and
`CombineResult`. These helper structures respectively:
- Represent the kind of combines that can be applied to a node, and
- Store what needs to happen to do that combine.

This can be viewed as a two step approach:
- First, check if a pattern applies, and
- Second apply it.

The checks and the materialization of the combines are decoupled so that
in the future we can perform several checks and do all the related
applies in one go.

Differential Revision: https://reviews.llvm.org/D134703
2022-10-05 17:43:48 +00:00
Craig Topper
ece4bb5ab8 [RISCV] Teach SExtWRemoval to recognize sign extended values that come from arguments.
This information is not preserved in MIR today. So this patch adds
information to RISCVMachineFunctionInfo when the vreg is created for
the argument.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D134621
2022-10-04 15:39:10 -07:00
Craig Topper
b41fe90dc3 [RISCV] Correct the setcc in vp.floor/ceil/round/roundeven lowering.
We want to emit a masked setcc that preserves zeros in all of the bits
where the original mask is zero. To do this we need to pass the original
mask as the passthru operand as well. Otherwise, we'll use the mask agnostic
policy and replace the zeros with 1s on some CPUs.

Differential Revision: https://reviews.llvm.org/D135122
2022-10-03 20:58:05 -07:00
Philip Reames
e884324145 [RISCV] Generalize select (and (x , 0x1) == 0), y, (z ^ y) ) and select (and (x , 0x1) == 0), y, (z | y) ) transforms by removing and-clause
These transforms were recently added (by me) in D134881. Looking at the code again, I realized we don't need the (and x, 0x1) portion of the pattern, we just need to know that the result of that sub-tree is either 0 or 1. Checking for this directly allows us to match slightly more broadly. The test changes are zext i1 arguments, but this could also kick in for e.g. shifts of high bits, or any other source of known bits.

Differential Revision: https://reviews.llvm.org/D135081
2022-10-03 13:57:38 -07:00
Philip Reames
a200b0fc25 [DAG] Introduce getSplat utility for common dispatch pattern [nfc]
We have a very common pattern of dispatching between BUILD_VECTOR and SPLAT_VECTOR creation repeated in many cases in code.  Common the pattern into a utility function.
2022-10-03 12:49:39 -07:00
Craig Topper
f3e87a63e5 [RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes.
VMV_V_X_VL nodes should always have a passthru, a splat, and a VL.
We were sometimes missing the VL.

This went unnoticed because these cases were all selected into the
following node to form a .vx or .vi instruction. The ComplexPattern
that does this, doesn't check the VL operand. I've added an assert
to the ComplexPattern to catch if the operand is missing.

@qcolombet spotted some of these in D134703.
2022-10-03 12:21:05 -07:00
Craig Topper
5b06ccb611 Revert "foo"
This reverts commit 2138ef354a2a8be9f273804e1133a3b97a493f03.

Forgot to squash
2022-10-03 12:15:41 -07:00
Craig Topper
a55cdcae3e Revert "[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes."
This reverts commit 4c03c9f375f326a87065443d649c6568a4b7dd67.

Forgot to squash
2022-10-03 12:15:28 -07:00
Craig Topper
4c03c9f375 [RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes.
VMV_V_X_VL nodes should always have a passthru, a splat, and a VL.
We were sometimes missing the VL.

This went unnoticed because these cases were all selected into the
following node to form a .vx or .vi instruction. The ComplexPattern
that does this, doesn't check the VL operand. I've added an assert
to the ComplexPattern to catch if the operand is missing.

@qcolombet spotted some of these in D134703.
2022-10-03 12:13:21 -07:00
Craig Topper
2138ef354a foo 2022-10-03 12:13:20 -07:00
Yeting Kuo
cefb7aab61 [VP][RISCV] Add vp.copysign and RISC-V support.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134935
2022-10-01 10:19:10 +08:00
Philip Reames
1e3c179519 [RISCV] Address post commit review comments from D134881 2022-09-30 08:31:40 -07:00
Philip Reames
2b5960028e [RISCV] Branchless lowering for select (and (x , 0x1) == 0), y, (z ^ y) ) and select (and (x , 0x1) == 0), y, (z | y) )
This code is directly ported from the X86 backend which applies the same rewrite (along with several others). Planning on looking more closely at the other branchless variants from x86 to see if any are worth porting in future changes.

Motivation here is the coremark crc8 routine from https://github.com/eembc/coremark/blob/main/core_util.c#L165. This patch significantly reduces the number of unpredictable branches in the workload.

Differential Revision: https://reviews.llvm.org/D134881
2022-09-30 08:24:32 -07:00
Ray Wang
4c786c9747 [RISCV] Remove some unused var decl. NFC
Differential Revision: https://reviews.llvm.org/D134707
2022-09-30 08:08:15 -07:00
Yeting Kuo
1cc02b05b7 [SelectionDAG] Add helper function to check whether a SDValue is neutral element. NFC.
Using this helper makes work about neutral elements more easier. Although I only
find one case now, I think it will have more chance to be used since so many
combine works are related to neutral elements.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133866
2022-09-30 11:29:11 +08:00
eopXD
02a982829c [RISCV] Add lowering for llvm.roundeven
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134785
2022-09-29 06:08:14 -07:00
eopXD
9677d70eb2 [VP][RISCV] Add vp.floor, vp.round, vp.roundeven and their RISC-V support
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134759
2022-09-27 19:45:58 -07:00
Han-Kuan Chen
c595c874cb [RISCV] Lower BUILD_VECTOR to RISCVISD::VID_VL if it is floating-point type.
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D133688
2022-09-27 17:25:34 -07:00
eopXD
163cb33854 [VP][RISCV] Add vp.ceil and RISC-V support
Previous commit 8b00b24f8505 missed to add `int_ceil` anchor for the
llvm.ceil.* section under LangRef.rst

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134586
2022-09-27 12:04:09 -07:00
eopXD
384b8b3da7 Revert "[VP][RISCV] Add vp.ceil and RISC-V support"
This reverts commit 8b00b24f8505970f54eab85aad8db5845a635850.
2022-09-27 11:12:57 -07:00
eopXD
8b00b24f85 [VP][RISCV] Add vp.ceil and RISC-V support
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134586
2022-09-27 11:08:27 -07:00
Yeting Kuo
04e1301f3d [VP][RISCV] Add vp.maxnum and vp.minnum intrinsics and RISC-V support.
Add vp.maxnum and vp.minnum which are vector predicted intrinsics of llvm.maxnum
and llvm.minnum.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D134639
2022-09-27 13:36:45 +08:00
Yeting Kuo
43c5fbdd3a [VP][RISCV] Add vp.sqrt intrinsic and RISC-V support.
The patch modeled vp.fabs patch D132793.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D133690
2022-09-26 10:47:40 +08:00
Philip Reames
6e7c54ecaf [RISCV] Add lowering for scalable @llvm.riscv.masked.strided.load/store
The code previously assumed fixed length vectors; make the relevant code conditional.

Having the lowering in place is neccessary for an upcoming change to generalize scatter/gather matching to scalable vectors.

Differential Revision: https://reviews.llvm.org/D134489
2022-09-24 17:41:57 -07:00
Craig Topper
19850cc2d8 Revert "[RISCV] Lower BUILD_VECTOR to RISCVISD::VID_VL if it is floating-point type."
This reverts commit dd53a0bb30413d26c3ab63a5c57c3cdf301be7a1.

We have seen crashes from this internally. Probably due to the use
of RoundingMode::Dynamic.
2022-09-23 18:41:41 -07:00
Craig Topper
90a5d8499a [RISCV] Promote f16 STRICT_FCEIL/FLOOR/TRUNC/NEARBYINT/RINT/ROUND,ROUNDEVEN to f32. 2022-09-23 14:01:51 -07:00
Philip Reames
60c91fd364 [RISCV] Disallow scale for scatter/gather
RISCV doesn't actually support a scaled form of indexed load and store. We previously handled this by forming the scaled SDNode, and then doing custom legalization during lowering. This patch instead adds a callback via TLI to prevent formation entirely.

This has two effects:
* First, the GEP gets expanded (and used). Instead of the shift being created with an SDLoc of the memory operation, it has the SDLoc of the GEP instruction. This avoids the scheduler perturbing IR order when there's no reason to.
* Second, we fix what appears to be a bug in index calculation with RV32. The rules for GEPs require index calculation be done in particular bitwidth, and it appears the custom legalization code got this wrong for the case where index type exceeds pointer width. (Or at least, I trust the generic GEP lowering to be correct a lot more.)

The DAGCombiner change to handle VPScatter/VPGather is technically separate, but is required to prevent a regression on those intrinsics.

Differential Revision: https://reviews.llvm.org/D134382
2022-09-22 15:31:26 -07:00
Craig Topper
52708be182 [RISCV] Remove support for the unratified Zbe, Zbf, and Zbm extensions.
These extensions do not appear to be on their way to ratification.
2022-09-22 13:04:41 -07:00
Fraser Cormack
92d71c615d [RISCV] Use structured bindings in common RVV lowering code
This patch uses structured bindings to simplify a couple of specific
cases when lowering RVV operations where we commonly declare two
SDValues and immediately 'tie' them to the mask and vector length.

There's also a couple places where we split vectors that structured
bindings make sense to use.

This patch tries to keep these sorts of changes minimal and to cases
where the returned types are commonly understood, rather than applying
this wholesale to the RISCV backend.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D134442
2022-09-22 16:38:40 +01:00
Craig Topper
bf7c7696fe [RISCV] Improve support for vector fp_to_sint_sat/uint_sat.
The default fixed vector legalization is to unroll. The default
scalable vector legalization is to clamp in the FP domain. The
RVV vfcvt instructions have saturating behavior so we can use them
directly. The only difference is that RVV instruction turn nan into
the max value, but the _SAT intrinsics want 0.

I'm only supporting 1 step of narrowing for now. I think we can
support more steps by using VNCLIP to saturate and narrower.

The only case that needs 2 steps of widening is f16->i64 which we can
do as f16->f32->i64.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D134400
2022-09-22 08:13:48 -07:00
Craig Topper
8b8e18e11f [RISCV] Replace RISCVISD::GREV/GORC/SHFL/UNSHFL with BREV8/ORC_B/ZIP/UNZIP.
With Zbp removed, we no longer need the generalized forms.

The computeKnownBitsForTargetNode code brev8/orc.b is still based
on the general form with the shift amount forced to 7.
2022-09-21 21:57:59 -07:00