2060 Commits

Author SHA1 Message Date
Craig Topper
e94dc58dff [RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven.
This avoids the call overhead as well as the save/restore of
fflags and the snan handling in the libm function.

The save/restore of fflags and snan handling are needed to be
correct for -ftrapping-math. I think we can ignore them in the
default environment.

The inline sequence will generate an invalid exception for nan
and an inexact exception if fractional bits are discarded.

I've used a custom inserter to explicitly create the control flow
around the float->int->float conversion.

We can probably avoid the final fsgnj after the conversion for
no signed zeros FMF, but I'll leave that for future work.

Note the comparison constant is slightly different from the one glibc uses.
They use 1<<53 for double, I'm using 1<<52. I believe either is valid.
Numbers >= 1<<52 can't have any fractional bits. It's ok to do the
float->int->float conversion on numbers between 1<<52 and 1<<53 since
they will all fit in 64 bits. We only have a problem if the double can't fit
in i64.
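
As a rough illustration of the threshold argument above, the following Python sketch models the inline sequence for trunc on double. The name `inline_trunc` is hypothetical, and Python's `int()` stands in for a round-toward-zero float->int conversion; this is not the code the custom inserter emits, and it does not model fflags.

```python
import math

TWO_52 = float(1 << 52)  # doubles with magnitude >= 2**52 have no fractional bits

def inline_trunc(x: float) -> float:
    # NaN fails the comparison, so NaN and large magnitudes (including
    # infinities) are returned unchanged without any conversion.
    if not (abs(x) < TWO_52):
        return x
    i = int(x)                          # float -> int, rounding toward zero
    assert -(1 << 63) <= i < (1 << 63)  # always fits in i64 on this path
    r = float(i)                        # int -> float
    return math.copysign(r, x)          # restore the sign so -0.5 gives -0.0
```

The final `copysign` plays the role of the fsgnj mentioned above: without it, truncating -0.5 would produce +0.0.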

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136508
2022-10-26 14:36:49 -07:00
Craig Topper
0a03240fb4 [RISCV] Add tests for fixed vector sshl_sat/ushl_sat. NFC 2022-10-26 14:15:47 -07:00
Piyou Chen
7d7940fd77 [RISCV] add svinval extension
1. Add the svinval extension support
2. Add the svinval Predicates for its instruction

Note: the svinval instructions defined in https://reviews.llvm.org/D117654

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136571
2022-10-26 09:45:30 -07:00
Craig Topper
a61b74889f [RISCV] Use vslide1down for i64 insertelt on RV32.
Instead of using vslide1up, use vslide1down and build the other
direction. This avoids the overlap constraint early clobber of
vslide1up.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136735
2022-10-26 09:43:12 -07:00
Craig Topper
a54f3347e8 [RISCV] Add shift amount operands of shift, rotate, and Zbs instructions to hasAllNBitUsers. 2022-10-24 22:07:22 -07:00
Craig Topper
223f466f4f [RISCV] Add ORI to hasAllNBitUsers.
If the immediate is negative with sufficient leading ones, then
the upper bits of the other operand aren't demanded.
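
A small Python model of this reasoning, assuming RV64 and a sign-extended 12-bit immediate. The helper names `ori` and `bits_demanded_from_x` are made up for this sketch and are not LLVM functions.

```python
MASK64 = (1 << 64) - 1

def ori(x: int, imm12: int) -> int:
    # ORI sign-extends its 12-bit immediate to XLEN (64 here) before the OR.
    assert -2048 <= imm12 <= 2047
    return (x | (imm12 & MASK64)) & MASK64

def bits_demanded_from_x(imm12: int) -> int:
    # Only the bit positions where the sign-extended immediate is 0
    # can be influenced by x.
    return ~(imm12 & MASK64) & MASK64
```

With an immediate of -256 the upper 56 bits of the result are forced to one, so two inputs that agree in their low 8 bits give the same result.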
2022-10-24 21:33:17 -07:00
Craig Topper
1fa8fd4c33 Recommit "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant."
This reverts commit 65aaecca8842dec30d03734a7fe8ce33c5afec81.

There was an ordering problem in the calculation of the partial
remainder.

Original commit message:

If the divisor is even, we can first shift the dividend and divisor
right by the number of trailing zeros. Now the divisor is odd and we
can do the original algorithm to calculate a remainder. Then we shift
that remainder left by the number of trailing zeros and add the bits
that were shifted out of the dividend.
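
The arithmetic described above can be sketched in Python as follows. The function name is hypothetical, and the `%` on the odd divisor is only a stand-in for the existing odd-divisor expansion; this is not the TargetLowering code.

```python
def urem_by_even_constant(n: int, d: int) -> int:
    # n is an unsigned dividend, d is a positive constant divisor.
    assert n >= 0 and d > 0
    tz = (d & -d).bit_length() - 1       # trailing zeros of the divisor
    shifted_out = n & ((1 << tz) - 1)    # dividend bits lost by the shift
    odd_rem = (n >> tz) % (d >> tz)      # stand-in for the odd-divisor algorithm
    # Shift the partial remainder back and add the bits shifted out.
    return (odd_rem << tz) + shifted_out
```

For example, with n = 1000 and d = 12, tz is 2, the odd-divisor step computes 250 % 3 = 1, and the result is (1 << 2) + 0 = 4, which matches 1000 % 12.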

Differential Revision: https://reviews.llvm.org/D135541
2022-10-24 10:08:50 -07:00
Craig Topper
65aaecca88 Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant."
This reverts commit f6a7b47820904c5e69cc4f133d382c74a87c44e8.

I received a report that this fails on 32-bit X86.
2022-10-24 07:12:54 -07:00
Piyou Chen
f8b8426861 [RISCV] Add Svnapot extension
Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D136570
2022-10-24 01:27:04 -07:00
Craig Topper
a41c1f3168 [RISCV] Make selectShiftMask look for negate opportunities after looking through AND.
Previously we would only look for an AND or a negate. But it's
possible there is a negate after looking through the AND.
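
A hedged Python model of why both simplifications are sound, assuming RV64 where a register shift reads only the low 6 bits of the shift amount. The exact patterns selectShiftMask matches are not spelled out in this message, so the two forms checked here are assumptions chosen for illustration.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def sll(x: int, amt: int) -> int:
    # Register shifts read only the low log2(XLEN) bits of the amount.
    return (x << (amt & (XLEN - 1))) & MASK

def and_is_redundant(x: int, y: int) -> bool:
    # Masking the amount with XLEN-1 does not change the shift.
    return sll(x, y & (XLEN - 1)) == sll(x, y)

def sub_after_and_is_negate(x: int, y: int) -> bool:
    # XLEN minus the masked amount behaves like a plain negate of y.
    return sll(x, XLEN - (y & (XLEN - 1))) == sll(x, -y)
```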
2022-10-23 14:23:13 -07:00
Craig Topper
f6a7b47820 [TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant.
If the divisor is even, we can first shift the dividend and divisor
right by the number of trailing zeros. Now the divisor is odd and we
can do the original algorithm to calculate a remainder. Then we shift
that remainder left by the number of trailing zeros and add the bits
that were shifted out of the dividend.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135541
2022-10-22 23:35:33 -07:00
Craig Topper
db25f51e37 Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))"
This reverts commit e8b3ffa532b8ebac5dcdf17bb91b47817382c14d.

The AMDGPU/mad_64_32.ll seems to fail on some of the build bots but
passes locally. I'm really confused.
2022-10-22 22:50:43 -07:00
Craig Topper
e8b3ffa532 [DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))
(sra X, BW-1) is either 0 or -1. So the multiply is a conditional
negate of Y.
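
The identity can be checked with a short Python model of fixed-width arithmetic. BW = 32 is an arbitrary choice for the sketch, and the helper names are hypothetical.

```python
BW = 32
MASK = (1 << BW) - 1

def sra(x: int, amt: int) -> int:
    # Arithmetic shift right of a BW-bit value held as an unsigned int.
    signed = x - (1 << BW) if x & (1 << (BW - 1)) else x
    return (signed >> amt) & MASK

def mul_form(x: int, y: int) -> int:
    return (sra(x, BW - 1) * y) & MASK

def neg_and_form(x: int, y: int) -> int:
    return (-(sra(x, BW - 1) & y)) & MASK
```

When X is negative the shift yields all ones, so both forms produce -Y modulo 2^BW; otherwise both produce 0.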

This pattern shows up when type legalizing wide multiplies involving
a sign extended value.

Fixes PR57549.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133399
2022-10-22 21:51:45 -07:00
Craig Topper
00816714f9 [DAGCombiner][RISCV] Make foldBinOpIntoSelect work correctly with opaque constants.
The CanFoldNonConst doesn't work correctly with opaque constants
because getNode won't constant fold constants if one is opaque. Even
if the operation is AND/OR. This can lead to infinite loops.

This patch does the folding manually in the DAGCombine. Alternatively,
we could improve getNode but that seemed likely to have bigger impact
and possibly increase compile time for the additional checks. We wouldn't
want to directly constant fold because we need to preserve the opaque flag.

Fixes PR58511.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D136472
2022-10-22 19:10:33 -07:00
Craig Topper
4830fa18aa [RISCV] Make sure we always call tryShrinkShlLogicImm for ISD:AND during isel.
There was an early out that prevented us from calling this for
(and (sext_inreg (shl X, C1), i32), C2).
2022-10-22 14:30:13 -07:00
Anton Sidorenko
14a5b9cdae [MachineCombiner][RISCV] Relax optimization level requirement
Enable Machine Combiner for O1/O2/O3 optimization levels. It makes RISCV
consistent with other targets running Machine Combiner.

Originally it was enabled only for -O3, however I looked through time reports
and usually it takes 0.1%-0.4% of total time, and never takes more than 1.0%.

Differential Revision: https://reviews.llvm.org/D136339
2022-10-21 13:25:28 +03:00
Craig Topper
2c82080f09 [MachineFrameInfo][RISCV] Call ensureStackAlignment for objects created with scalable vector stack id.
This is an alternative to fix PR57939 for RISC-V. It definitely
can be argued that the stack temporaries for RISC-V are being created
with an unnecessarily large alignment. But ignoring the alignment
in MachineFrameInfo also seems bad.

Looking at the test updates that go with the current ID==0 check,
it was intending to exclude things like the NoAlloc stackid. So I'm
not sure if scalable vectors are intentionally being excluded.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135913
2022-10-20 14:05:46 -07:00
Yeting Kuo
55ae180a4c [VP] Teach isVPBinaryOp to recognize vp.smin/smax/umin/umax/minnum/maxnum.
Those vp intrinsics should be vp binary operations.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135753
2022-10-20 07:21:13 +08:00
Craig Topper
061566d954 [RISCV] Remove -enable-unsafe-fp-math from machine combiner tests. NFC
The optimization is using fast math flags on the instructions instead.
2022-10-19 15:54:33 -07:00
Craig Topper
e0afb72e82 [RISCV] Add more check prefixes to extractelt-fp.ll to fix a conflicting case.
The existing prefix conflicted and the script silently dropped the checks.
2022-10-19 12:12:39 -07:00
luxufan
82c820b95c [RISCV] Enable the LocalStackSlotAllocation pass support
For RISC-V, load/store instructions (excluding vector loads/stores) only
have a 12-bit immediate operand. If the offset is out of range, a
temporary register must be used to materialize it. If several such
offsets are within a small (IsInt<12>) relative offset of each other, the
LocalStackSlotAllocation pass can pick one value as the frame base
register's value and replace each original offset with this register's
value plus the relative offset.
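
A simplified Python sketch of the idea, not the pass's actual algorithm: offsets that already fit a signed 12-bit immediate are used directly, and out-of-range offsets are grouped under a shared base when their relative offset fits. The names `fits_simm12` and `assign_bases` and the greedy grouping are assumptions for illustration.

```python
def fits_simm12(v: int) -> bool:
    # Range of a signed 12-bit immediate.
    return -2048 <= v <= 2047

def assign_bases(offsets):
    # Map each frame offset to (base, relative offset).
    plan = {}
    base = None
    for off in sorted(offsets):
        if fits_simm12(off):
            plan[off] = ("fp", off)          # directly addressable
            continue
        if base is None or not fits_simm12(off - base):
            base = off                       # start a new base register
        plan[off] = (f"base@{base}", off - base)
    return plan
```

For offsets 16, 5000, 5004 and 5008, only one extra base (at 5000) is needed, and the three large offsets become relative offsets 0, 4 and 8 from it.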

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D98101
2022-10-19 16:15:14 +08:00
Han-Kuan Chen
615af94dc2 [RISCV] Lower VECTOR_SHUFFLE to VSLIDEDOWN_VL.
Differential Revision: https://reviews.llvm.org/D136136
2022-10-18 08:58:39 -07:00
Han-Kuan Chen
8d0246a926 [RISCV] Pre-commit tests for lowering VECTOR_SHUFFLE to VSLIDEDOWN_VL.
Differential Revision: https://reviews.llvm.org/D136135
2022-10-18 08:58:38 -07:00
Anton Sidorenko
1978b4d968 [MachineCombiner][RISCV] Enable MachineCombiner for RISCV
Initial implementation to match basic FP reassociation patterns.

Differential Revision: https://reviews.llvm.org/D135264
2022-10-18 18:56:32 +03:00
Anton Afanasyev
e175f99c49 Revert "[MachineCombiner][RISCV] Enable MachineCombiner for RISCV"
This reverts commit 3112cf3b00fe45a0911ec0c2e6706ef1f8a9b972.
Test breakage: https://lab.llvm.org/buildbot/#/builders/16/builds/36631
2022-10-18 15:57:11 +03:00
Anton Sidorenko
3112cf3b00 [MachineCombiner][RISCV] Enable MachineCombiner for RISCV
Initial implementation to match basic FP reassociation patterns.

Differential Revision: https://reviews.llvm.org/D135264
2022-10-18 15:31:03 +03:00
Anton Sidorenko
bd6bf3499f [MachineCombiner][RISCV] Precommit test for D135264 2022-10-18 12:53:07 +03:00
LiaoChunyu
7b970290c0 [RISCV] Optimize SELECT_CC when the true value of select is Constant
(select (setcc lhs, rhs, CC), constant, falsev) -> (select (setcc lhs, rhs, InverseCC), falsev, constant)

This patch removes unnecessary copies

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D129757
2022-10-18 09:24:17 +08:00
Craig Topper
2b32e4f98b [RISCV] Add basic support for the sifive-7-series short forward branch optimization.
sifive-7-series has macrofusion support to convert a branch over
a single instruction into a conditional instruction. This can be
an improvement if the branch is hard to predict.

This patch adds support for the most basic case, a branch over a
move instruction. This is implemented as a pseudo instruction so
we can hide the control flow until all code motion passes complete.

I've disabled a recent select optimization if this feature is enabled
in the subtarget.

Related gcc patch for the same optimization https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg211045.html

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135814
2022-10-17 13:56:22 -07:00
Craig Topper
30305d7948 [TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF.
The code incorrectly checked for CTLZ_ZERO_UNDEF instead of
CTTZ_ZERO_UNDEF.
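
To show why the zero check only matters for plain CTTZ, here is a Python sketch of a table-lookup count-trailing-zeros for 32-bit values. It uses the well-known de Bruijn constant 0x077CB531; whether CTTZTableLookup uses this exact constant and table layout is an assumption, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
DEBRUIJN32 = 0x077CB531

# Build the lookup table: each single-bit input maps to a unique 5-bit index.
TABLE = [0] * 32
for i in range(32):
    TABLE[((DEBRUIJN32 << i) & MASK32) >> 27] = i

def cttz_zero_undef(x: int) -> int:
    # The result is meaningless for x == 0 (the lookup returns TABLE[0]).
    isolated = x & -x & MASK32                 # lowest set bit
    return TABLE[((isolated * DEBRUIJN32) & MASK32) >> 27]

def cttz(x: int) -> int:
    # Plain CTTZ needs the extra zero check to return the bit width.
    return 32 if x == 0 else cttz_zero_undef(x)
```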

While I was there I flipped the condition into an early out.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D136010
2022-10-17 10:15:39 -07:00
Peter Rong
c2e7c9cb33 [CodeGen] Using ZExt for extractelement indices.
In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`.
This is because IRTranslator uses SExt for indices.

In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt.
This change includes both documentation, SelectionDAG and IRTranslator.
We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86
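
The difference between the two extensions for an `i1 true` index can be shown with a minimal Python model. The helpers `zext` and `sext` are written for this sketch and return the extended value as a Python integer.

```python
def zext(value: int, from_bits: int) -> int:
    # Zero-extension keeps the low bits and treats them as unsigned.
    return value & ((1 << from_bits) - 1)

def sext(value: int, from_bits: int) -> int:
    # Sign-extension replicates the top bit of the narrow value.
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    return (value ^ sign) - sign
```

For a 1-bit value of 1, `sext` yields -1 while `zext` yields 1, so only the zero-extended index selects element 1 of a two-element vector.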

This patch fixes issue #57452.

Differential Revision: https://reviews.llvm.org/D132978
2022-10-15 15:45:35 -07:00
Philip Reames
d91b0d6816 [RISCV] Merge rv32 and rv64 fixed vector stepvector tests 2022-10-14 14:54:37 -07:00
Craig Topper
e68b0d5875 [RISCV] Match (select C, -1, X)->(or -C, X) during lowerSelect
Same with (select C, X, -1), (select C, 0, X), and (select C, X, 0).

There's a DAGCombine after we turn the select into select_cc, but
that may introduce a setcc that didn't previously exist. We could
add more DAGCombines to remove the extra setcc, but this seemed lower
effort.
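
A Python check of the four branchless forms, assuming C is a 0/1 value and 64-bit registers. Only the first form is written out in this message; the exact expressions used here for the other three are assumptions that satisfy the same select semantics.

```python
MASK64 = (1 << 64) - 1
ALL_ONES = MASK64  # -1 as a 64-bit value

def select(c: int, t: int, f: int) -> int:
    return t if c else f

def sel_neg1_x(c: int, x: int) -> int:   # (select C, -1, X) -> (or -C, X)
    return ((-c) | x) & MASK64

def sel_x_neg1(c: int, x: int) -> int:   # (select C, X, -1) -> (or C-1, X)
    return ((c - 1) | x) & MASK64

def sel_0_x(c: int, x: int) -> int:      # (select C, 0, X) -> (and C-1, X)
    return ((c - 1) & x) & MASK64

def sel_x_0(c: int, x: int) -> int:      # (select C, X, 0) -> (and -C, X)
    return ((-c) & x) & MASK64
```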

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135833
2022-10-13 09:06:12 -07:00
Philip Reames
1c41d0cb62 [RISCV] Use branchless form for selects with 0 in either arm
Continuing the theme of adding branchless lowerings for simple selects, this time handle the 0 arm case. This is very common for various umin idioms, etc.

Differential Revision: https://reviews.llvm.org/D135600
2022-10-12 13:51:52 -07:00
Yeting Kuo
2749b942e9 [RISCV] Add isel patterns for vmacc, vnmsac.
The patch selects VSELECT/VP_MERGE_VL which uses fmadd/fnmsub as the true operand
and the addend of the fmadd/fnmsub as the false operand.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135330
2022-10-12 09:19:01 +08:00
Craig Topper
1bdf21d55c [RISCV] Use mask/tail agnostic if tied source is IMPLICIT_DEF regardless of the policy operand.
If the source is implicit_def, the register allocator won't have
any constraint on what register it picks for the destination. This
doesn't give the user much control of what register is being used.

So in my mind that means the only reason to honor the policy operand
is to control what policy is used in vsetvli to maybe avoid a vtype
change. Given the other optimizations we do on the policy field, I
don't think allowing the user this control is reliable.

Therefore, I think we should use agnostic policies if the source is
undef.

This should give better performance on some CPUs for VP intrinsics where
there is no merge operand and the backend adds IMPLICIT_DEF to the instruction.

Differential Revision: https://reviews.llvm.org/D135396
2022-10-11 16:40:16 -07:00
Craig Topper
ac9209751a Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))"
This reverts commit 0148df8157f05ecf3b1064508e6f012aefb87dad.

Getting lit test failures on AMDGPU but I can't reproduce them so far.
Reverting to investigate.
2022-10-11 16:30:40 -07:00
Craig Topper
0148df8157 [DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))
(sra X, BW-1) is either 0 or -1. So the multiply is a conditional
negate of Y.

This pattern shows up when type legalizing wide multiplies involving
a sign extended value.

Fixes PR57549.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D133399
2022-10-11 16:20:55 -07:00
Craig Topper
0121b1a4ac Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant."
This reverts commit d4facda414b6b9b8b1a34bc7e6b7c15172775318.

This has been reported to cause failures. Reverting while I investigate.
2022-10-10 14:53:29 -07:00
Craig Topper
d4facda414 [TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant.
If the divisor is even, we can first shift the dividend and divisor
right by the number of trailing zeros. Now the divisor is odd and we
can do the original algorithm to calculate a remainder. Then we shift
that remainder left by the number of trailing zeros and add the bits
that were shifted out of the dividend.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135541
2022-10-10 11:02:22 -07:00
LiaoChunyu
a835b92e6c [RISCV] Use hasAllWUsers to recover XORI/ORI
reference 0fbe71e91f44.

Also add testcase for addi.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135538
2022-10-10 14:16:50 +08:00
Craig Topper
b0c2f90453 [RISCV] Merge more rv32/rv64 vector intrinsic tests that contain the same content. 2022-10-08 18:30:40 -07:00
Craig Topper
39532ea073 [RISCV] Add signext attribute to i32 arguments in some tests. NFC 2022-10-08 10:50:16 -07:00
Craig Topper
9f67047cf0 [VP][RISCV] Add vp.smax/smin/umax/umin intrinsics
Differential Revision: https://reviews.llvm.org/D135418
2022-10-07 17:14:31 -07:00
eopXD
dbc681c98e [VP][RISCV] Add vp.roundtozero and its RISC-V support
The scalar instruction of this is `llvm.trunc`. However the naming of
ISD::VP_TRUNC is already taken by `trunc` of the LLVM IR. Naming this as
`vp.ftrunc` would likely cause confusion with `vp.fptrunc`. So adding
`vp.roundtozero` that will look similar to `vp.roundeven`.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D135233
2022-10-07 02:15:23 -07:00
Craig Topper
3b20765cf7 [RISCV] Use mask agnostic policy for isel patterns where the merge operand is IMPLICIT_DEF.
I tend to think we should ignore the policy bit in vsetvli insertion
if the tied operand is IMPLICIT_DEF. But that raises questions about
what the policy operand on RVV intrinsics means if you also pass
vundefined().

This change at least fixes some cases. I'll post a separate patch
for vsetvli insertion for discussion.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135386
2022-10-06 15:44:39 -07:00
Philip Reames
79f0413e5e [RISCV] Use branchless form for selects with -1 in either arm
We can lower these as an or with the negative of the condition value. This appears to result in significantly less branch-y code on multiple common idioms (as seen in tests).

Differential Revision: https://reviews.llvm.org/D135316
2022-10-06 15:18:43 -07:00
Craig Topper
dc2b8fb965 [RISCV] Use fixed vector types in fixed-vectors-vfnmsac-vp.ll. NFC 2022-10-06 11:02:13 -07:00
Craig Topper
3d6c63d413 [RISCV] Cleanup some vector tests. NFC
Some tests had scalable vector intrinsic names with fixed vector types.
Some had types in the wrong order.

Remove scalable vector test from fixed vector files.

Also replace insert+shuffle constexprs with fixed constant vectors.
2022-10-06 10:51:39 -07:00
Ivan Tetyushkin
0e6c1576e6 [RISCV] Optimization for using compressed beqz and bnez PR#56391
Optimization for using compressed beqz and bnez

If there is a pattern
```
br_cc val1 constval eq/neq place
select_cc val1 constval eq/neq trueval falseval
```
and constval does not fit in the compressed imm format (6 bits) but does fit
in the imm format (12 bits), we can replace it with a non-compressed
subtraction (an addi of -constval) and a compressed c.beqz/c.bnez:

```
addi val val -constval
c.beqz val place
```
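
The rewrite relies on `val == constval` being equivalent to `val - constval == 0` in wrapping arithmetic. A minimal Python model, assuming RV64 and a signed 12-bit addi immediate; the helper names are hypothetical.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def addi(val: int, imm: int) -> int:
    # addi takes a signed 12-bit immediate and wraps at XLEN bits.
    assert -2048 <= imm <= 2047
    return (val + imm) & MASK

def beqz_taken(val: int) -> bool:
    # c.beqz compares a single register against zero.
    return val == 0

def eq_const_via_addi(val: int, constval: int) -> bool:
    # addi val, val, -constval ; c.beqz val, place
    return beqz_taken(addi(val, -constval))
```

For constval = 100, which is outside the signed 6-bit range but inside the signed 12-bit range, `eq_const_via_addi(100, 100)` is true and any other input value gives false.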

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D132839
2022-10-06 09:33:32 -07:00