1560 Commits

Author SHA1 Message Date
Craig Topper
060b3023e1
[RISCV] Move TRUNCATE_VECTOR_VL combine into a helper function. NFC (#93574)
I plan to add other combines on TRUNCATE_VECTOR_VL.
2024-05-28 14:49:57 -07:00
Craig Topper
d490ce22e9
[RISCV] Use mask undisturbed policy when silencing sNans for strict rounding ops. (#93356)
The elements that aren't sNans need to get passed through this fadd
instruction unchanged. With the agnostic mask policy they might be
forced to all ones.
2024-05-28 08:51:42 -07:00
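The policy difference described in the commit above can be illustrated with a small model. This is only a sketch, not LLVM or hardware code: `masked_op`, its parameters, the list-based vector representation, and the 32-bit element width are assumptions made for illustration. The agnostic case is modelled by one outcome the policy permits, namely inactive elements overwritten with all ones.

```python
ALL_ONES = 0xFFFFFFFF  # pattern an agnostic policy may write (32-bit elements assumed)


def masked_op(dest, src, mask, op, policy):
    """Apply op to active elements; inactive elements follow the mask policy."""
    out = []
    for old, value, active in zip(dest, src, mask):
        if active:
            out.append(op(value))
        elif policy == "undisturbed":
            out.append(old)        # inactive element keeps its previous value
        elif policy == "agnostic":
            out.append(ALL_ONES)   # permitted outcome: overwritten with all ones
        else:
            raise ValueError(policy)
    return out


def quiet(bits):
    """Set the f32 quiet bit, standing in for the fadd that silences an sNaN."""
    return bits | 0x00400000


values = [0x3F800000, 0x7FA00000]   # 1.0 and a signalling NaN, as f32 bit patterns
is_nan = [False, True]              # mask selecting only the NaN element

undisturbed = masked_op(values, values, is_nan, quiet, "undisturbed")
agnostic = masked_op(values, values, is_nan, quiet, "agnostic")
```

With the undisturbed policy the 1.0 element survives; in the modelled agnostic outcome it is replaced by all ones.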
Craig Topper
a1c9b9673c
[SelectionDAG][RISCV][VE] Rename VP_ASHR->VP_SRA VP_LSHR->VP_SRL. (#93221)
This maintains consistency with the non-VP ISD opcodes.
2024-05-24 09:03:19 -07:00
Yingwei Zheng
557bf3835b
[RISCV][ISel] Allow opaque constants in hasAndNotCompare (#92926)
See the following code:

4ae896fe97/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L9334-L9357)

> Combining: t47: i64 = xor t43, OpaqueConstant:i64<31808>
X: i64 = Constant<0>
Y: i64 = OpaqueConstant<31808>

The assertion failed because both `X` and `Y` are constants.
This patch allows opaque constants in `hasAndNotCompare` to fix the
issue.

Fixes https://github.com/llvm/llvm-project/issues/90730.
2024-05-22 00:48:26 +08:00
Yingwei Zheng
76748119bf
[GISel][RISCV] Add irtranslator/legalizer/selector support for G_FREEZE. (#92744)
This patch adds support for G_FREEZE on riscv. It will be selected into
a copy instruction.
 
The ll test is copied from the AArch64 patch:
665da59685.
2024-05-21 23:59:51 +08:00
Craig Topper
6246b495ad
[RISCV] Select ISD::AVGCEILS/AVGFLOORS as vaadd. (#92839)
I think the behaviors are the same if this describes their behavior.

AVGFLOORS sign extends the inputs by 1 bit, adds them, then does an
arithmetic shift right by 1 before truncating to the original bit width.
This is vaadd with rdn rounding mode.

AVGCEILS sign extends the inputs by 1 bit, adds them, then does an
arithmetic shift right by 1. If the bit shifted out is 1, it adds 1 to
the shifted value. Then truncates to the original bit width. This is vaadd
with rnu rounding mode.

I think this wasn't implemented previously because there was some
confusion about what average means. Some may expect average to round
towards zero, but there is no way to do that in RISC-V or with the
SelectionDAG nodes. Related issue
https://github.com/riscv/riscv-v-spec/issues/935
2024-05-20 23:24:22 -07:00
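The equivalence claimed in the commit above can be checked with a small scalar model. This is a sketch under stated assumptions: the function names, the 8-bit default width, and the string names for the rounding modes are illustrative, and `avgfloors`/`avgceils` are written as plain floor and ceiling of the exact average rather than taken from LLVM.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def vaadd(a, b, rounding, bits=8):
    """Averaging add: (a + b) >> 1 with a rounding increment chosen by mode."""
    wide = to_signed(a, bits) + to_signed(b, bits)  # sum with one extra bit
    shifted = wide >> 1                             # arithmetic shift right by 1
    if rounding == "rnu":
        shifted += wide & 1                         # add the bit shifted out
    elif rounding != "rdn":                         # rdn adds nothing
        raise ValueError(rounding)
    return to_signed(shifted, bits)


def avgfloors(a, b, bits=8):
    """Floor of the exact signed average."""
    return (to_signed(a, bits) + to_signed(b, bits)) // 2


def avgceils(a, b, bits=8):
    """Ceiling of the exact signed average."""
    return -(-(to_signed(a, bits) + to_signed(b, bits)) // 2)
```

For example, averaging -3 and 0 gives -2 with floor rounding and -1 with ceiling rounding; neither mode rounds toward zero for both signs.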
Kazu Hirata
79a6a7e28f [RISCV] Fix a warning
This patch fixes:

  llvm/lib/Target/RISCV/RISCVISelLowering.cpp:19848:11: error:
  enumeration value 'SW_GUARDED_BRIND' not handled in switch
  [-Werror,-Wswitch]
2024-05-14 00:48:56 -07:00
Yeting Kuo
d488a54b40
[RISCV] Use software guarded branch for indirect jump table branch. (#66762)
When Zicfilp is enabled, an indirect jump table branch should be a software
guarded branch.
2024-05-14 14:44:25 +08:00
Philip Reames
6140b5bae4
[RISCV] Use RISCVISD::SHL_ADD in transformAddShlImm (#89832)
Doing so avoids negative interactions with other combines which don't
know the shl_add is a single instruction. From the commit log, we've had
several combine loops already.

This was originally posted as part of #88791, where a bug was pointed
out. That bug was fixed by #89789 which hits the same issue from another
angle. To confirm the fix, I included the reduced test case here.
2024-05-13 09:48:46 -07:00
Paul Kirth
d95f7c9cab
[RISCV] Use the thread local stack protector for Android targets (#87672)
Android supports per thread stack protectors that are individually managed
and initialized, which can provide stronger protections than using the
global stack protector cookie. This patch matches the convention for other
architectures targeting Android platforms.
2024-05-13 08:52:59 -07:00
Min-Yih Hsu
f8063ffe73
[VP][RISCV] Add vp.reduce.fmaximum/fminimum and its RISC-V codegen (#91782)
`vp.reduce.fmaximum/fminimum` are the VP version of
`vector.reduce.fmaximum/fminimum`.
2024-05-10 16:01:47 -07:00
Luke Lau
d24eaef925
[RISCV] Sink vector select splat operands (#91554)
vmerge.vxm allows us to splat the true operand of a select, so sink it
where possible to reduce vector register pressure.
2024-05-10 10:01:23 +08:00
Harald van Dijk
8fd838a8c4
[RISC-V] Limit vscale interleaving to addrspace 0. (#91573)
The vlseg and vsseg intrinsic functions are not overloaded on pointer
type, so cannot handle non-default address spaces.

This fixes an error we see after #90583.
2024-05-09 19:15:42 +01:00
Philip Reames
4298fc5eb5
[RISCV] Move strength reduction of mul X, 3/5/9*2^N to combine (#89966)
This moves our last major category of tablegen-driven multiply strength
reduction into the post legalize combine framework. The one slightly
tricky bit is making sure that we use a leading shl if we can form a
slli.uw, and trailing shl otherwise. Having the trailing shl is critical
for shNadd matching, and folding any following sext.w.

As can be seen in the TD deltas, this allows us to kill off both the
actual multiply patterns and the explicit add (mul X, C) Y patterns. The
latter are now handled by the generic shNadd matching code, with the
exception of the THead only C=200 case because we don't (yet) have a
multiply expansion with two shNadd + a shift.

---------

Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
2024-05-08 10:13:01 -07:00
Liao Chunyu
f4d2f7a3b7
[RISCV] Codegen support for XCVbi extension (#89719)
spec:
https://github.com/openhwgroup/cv32e40p/blob/master/docs/source/instruction_set_extensions.rst#immediate-branching-operations

Contributors: @CharKeaney, @jeremybennett, @lewis-revill, @NandniJamnadas,
@PaoloS02, @simonpcook, @xingmingjie, @realqhc, @PhilippvK, @melonedo
2024-05-08 11:22:16 +08:00
Jianjian Guan
37fcb323f6
[RISCV] Add codegen support for Zvfbfmin (#87911)
This patch adds basic codegen support for Zvfbfmin extension.
2024-05-07 10:25:06 +08:00
Yingwei Zheng
2647bd7369
[RISCV][ISel] Fix types in tryFoldSelectIntoOp (#90659)
```
SelectionDAG has 17 nodes:
  t0: ch,glue = EntryToken
    t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t8: i1 = truncate t6
          t4: i64,ch = CopyFromReg t0, Register:i64 %1
        t7: i1 = truncate t4
            t2: i64,ch = CopyFromReg t0, Register:i64 %0
          t10: i64,i1 = saddo t2, Constant:i64<1>
        t11: i1 = or t8, t10:1
      t12: i1 = select t7, t8, t11
    t13: i64 = any_extend t12
  t15: ch,glue = CopyToReg t0, Register:i64 $x10, t13
  t16: ch = RISCVISD::RET_GLUE t15, Register:i64 $x10, t15:1
```

`OtherOpVT` should be i1, but `OtherOp->getValueType(0)` returns `i64`,
which ignores `ResNo` in `SDValue`.

Fix https://github.com/llvm/llvm-project/issues/90652.
2024-05-01 06:51:36 +08:00
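The type mix-up in the commit above comes from asking a multi-result node for its first result type instead of the type of the result actually referenced. A minimal model, assuming hypothetical `Node` and `Value` classes that stand in for SDNode and SDValue (they are not the LLVM classes):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    value_types: tuple  # one type per result, e.g. ("i64", "i1") for saddo


@dataclass(frozen=True)
class Value:
    node: Node
    res_no: int = 0  # which result of the node this value refers to

    def value_type(self):
        """Type of the referenced result, honouring res_no."""
        return self.node.value_types[self.res_no]


saddo = Node("t10", ("i64", "i1"))
overflow = Value(saddo, 1)                # corresponds to t10:1 in the dump

wrong_vt = overflow.node.value_types[0]   # ignores res_no, like getValueType(0) on the node
right_vt = overflow.value_type()          # honours res_no
```

Here `wrong_vt` is `i64` while `right_vt` is `i1`, which matches the mismatch the commit describes.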
Luke Lau
f565b79f9f
[RISCV] Handle fixed length vectors with exact VLEN in lowerINSERT_SUBVECTOR (#84107)
This is the insert_subvector equivalent to #79949, where we can avoid
sliding up by the full LMUL amount if we know the exact subregister the
subvector will be inserted into.

This mirrors the lowerEXTRACT_SUBVECTOR changes in that we handle this
in two parts:

- We handle fixed length subvector types by converting the subvector to
a scalable vector. But unlike EXTRACT_SUBVECTOR, we may also need to
convert the vector being inserted into too.

- Whenever we don't need a vslideup because either the subvector fits
exactly into a vector register group *or* the vector is undef, we need
to emit an insert_subreg ourselves because RISCVISelDAGToDAG::Select
doesn't correctly handle fixed length subvectors yet: see d7a28f7ad

A subvector exactly fits into a vector register group if its size is a
known multiple of the size of a vector register, and this adds a new
overload for TypeSize::isKnownMultipleOf for scalable to scalable
comparisons to help reason about this.

I've left RISCVISelDAGToDAG::Select untouched for now (minus relaxing an
invariant), so that the insert_subvector and extract_subvector code
paths are the same.

We should teach it to properly handle fixed length subvectors in a
follow-up patch, so that the "exact subregister" logic is handled in one
place instead of being spread across both RISCVISelDAGToDAG.cpp and
RISCVISelLowering.cpp.
2024-05-01 01:35:13 +08:00
Min-Yih Hsu
539f626ecd
[VP][RISCV] Add vp.cttz.elts intrinsic and its RISC-V codegen (#90502)
This intrinsic is the VP version of `experimental.cttz.elts`.
2024-04-30 09:27:10 -07:00
Craig Topper
2524146b25
[RISCV] Add DAG combine for (vmv_s_x_vl (undef) (vmv_x_s X). (#90524)
We can use the original vector as long as the type of X matches the
result type of the vmv_s_x_vl.
2024-04-29 23:35:30 -07:00
Craig Topper
f9d4d54aa0
[RISCV] Break the (czero_eqz x, (setne x, 0)) -> x combine into 2 combines. (#90428)
We can think of this as two separate combines

(czero_eqz x, (setne y, 0)) -> (czero_eqz x, y)
and
(czero_eqz x, x) -> x

Similarly, the (czero_nez x, (seteq x, 0)) -> x combine can be broken into

(czero_nez x, (seteq y, 0)) -> (czero_eqz x, y)
and
(czero_eqz x, x) -> x

isel already does the (czero_eqz x, (setne y, 0)) -> (czero_eqz x, y)
and (czero_nez x, (seteq y, 0)) -> (czero_eqz x, y) combines, but doing
them early could expose other opportunities.
2024-04-29 10:15:57 -07:00
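The identities listed in the commit above can be checked with a scalar model of the conditional-zero operations. This is a sketch: the Python functions below are illustrative stand-ins for the DAG nodes, with the condition passed as the second argument.

```python
def czero_eqz(x, cond):
    """Result is 0 when cond == 0, otherwise x."""
    return 0 if cond == 0 else x


def czero_nez(x, cond):
    """Result is 0 when cond != 0, otherwise x."""
    return 0 if cond != 0 else x


def setne(a, b):
    return int(a != b)


def seteq(a, b):
    return int(a == b)


def check_identities(values):
    """Check the two-step decomposition on every pair drawn from values."""
    for x in values:
        for y in values:
            # Step 1: the comparison against zero folds into the condition operand.
            assert czero_eqz(x, setne(y, 0)) == czero_eqz(x, y)
            assert czero_nez(x, seteq(y, 0)) == czero_eqz(x, y)
        # Step 2: using x as its own condition returns x (0 stays 0).
        assert czero_eqz(x, x) == x
    return True
```

Calling `check_identities(range(-4, 5))` exercises zero, positive, and negative inputs.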
Maciej Gabka
bfc0317153
Move several vector intrinsics out of experimental namespace (#88748)
This patch moves the following intrinsics out of the experimental namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

All of these intrinsics have existed in LLVM for more than a year now and are
widely used, so they should no longer be considered experimental.
2024-04-29 10:16:45 +01:00
Qiu Chaofan
4a8f2f2e1a
[Legalizer] Expand fmaximum and fminimum (#67301)
According to the LangRef, llvm.maximum/minimum treat -0.0 as less than +0.0
and propagate NaN.

Expand the nodes on targets that do not support the operation by adding an
extra check for NaN and using is_fpclass to check the signs of zeros.
2024-04-29 15:09:54 +08:00
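The semantics that the expansion in the commit above has to preserve can be written as a scalar reference model. This is a sketch of the intended behaviour only, not the legalizer's node sequence; the function names are illustrative and `math.copysign` stands in for the sign test that the commit performs with is_fpclass.

```python
import math


def fmaximum(a, b):
    """Maximum that propagates NaN and treats -0.0 as less than +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == 0.0 and b == 0.0:
        # Both are zeros: prefer the one with a positive sign.
        return a if math.copysign(1.0, a) > 0 else b
    return a if a > b else b


def fminimum(a, b):
    """Minimum that propagates NaN and treats -0.0 as less than +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == 0.0 and b == 0.0:
        # Both are zeros: prefer the one with a negative sign.
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b
```

A plain `a > b` comparison is not enough on its own because -0.0 and +0.0 compare equal and any comparison with NaN is false.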
Zhijin Zeng
37eb9c9632
[RISC-V][ISel] Remove redundant czero.eqz like 'czero.eqz a0, a0, a0' (#90208)
In RISC-V ISel, the instruction `czero.eqz a0, a0, a0` is meaningless.
This patch does the following folds in ISel:
```
czero_eqz x, (setcc x, 0, ne) -> x
czero_nez x, (setcc x, 0, eq) -> x
```

---------

Signed-off-by: Zhijin Zeng <zhijin.zeng@spacemit.com>
2024-04-28 13:28:14 +08:00
Alex Bradbury
357530f113 Revert "[llvm][RISCV] Enable trailing fences for seq-cst stores by default (#87376)"
This reverts commit 733b271db793ce30c504a1b5c4ae7a8775b0a6a2.

Reverting in order to revert the companion patch adding the atomics ABI
ELF attributes due to the reported incompatibility with GNU ld.
https://github.com/llvm/llvm-project/pull/84597#issuecomment-2079128332
2024-04-26 12:16:53 +01:00
Paul Kirth
733b271db7
[llvm][RISCV] Enable trailing fences for seq-cst stores by default (#87376)
With the tag merging in place, we can safely enable
+seq-cst-trailing-fence by default, according to the recommendation
in

https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-atomic.adoc

This tag changes the default for the feature flag, and moves to more
consistent naming with respect to existing features.
2024-04-25 16:33:10 -07:00
Philip Reames
2e77aea22f
[RISCV] Give up on correct undef semantics in mul strength reduction (#90097)
This is a change I really don't like posting, but I think we're out of
other options.  As can be seen in the test differences, we have cases
where adding the freeze inhibits real optimizations.
    
Given no other target handles the undef semantics correctly here, I
think the practical answer is that we shouldn't either.  Yuck.
    
As examples, consider:
* combineMulSpecial in X86.
* performMulCombine in AArch64
    
The only other real option I see here is to move all of the strength
reduction code out of ISEL.  We could do this either via tablegen rules,
or as an MI pass, but other than shifting the point where we ignore
undef semantics, I don't think this is meaningfully different.
    
Note that the particular tests included here would be fixed if we added
SHA/SHL to canCreateUndefOrPoison. However, a) that's already been tried
twice and exposes its own set of regressions, and b) these are simply
examples.  You can create many alternate examples.
2024-04-25 11:52:40 -07:00
Michael Maitland
80f510bbc9 [RISCV] Use lookup tables to find CVTFOpc 2024-04-24 09:29:21 -07:00
Philip Reames
b4f923e912
[RISCV] Strength reduce mul by 2^M - 3/5/9 (#88993)
We can expand these as the three-instruction sequence: (sub (shl X, C1), (shXadd X, X)).
2024-04-24 07:55:02 -07:00
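The three-instruction sequence in the commit above can be checked with 64-bit wrapping arithmetic. This is a sketch: `shl`, `shxadd`, and `mul_pow2_minus_shadd` are illustrative helpers that model the instructions, not LLVM code, and the 64-bit width is an assumption (RV64).

```python
MASK64 = (1 << 64) - 1


def shl(x, amount):
    """Logical shift left, wrapping at 64 bits."""
    return (x << amount) & MASK64


def shxadd(shift, x, y):
    """Model of sh1add/sh2add/sh3add: (x << shift) + y, wrapping at 64 bits."""
    assert shift in (1, 2, 3)
    return (shl(x, shift) + y) & MASK64


def mul_pow2_minus_shadd(x, m, shift):
    """x * (2**m - (2**shift + 1)) as (sub (shl x, m), (shXadd x, x))."""
    return (shl(x, m) - shxadd(shift, x, x)) & MASK64
```

For instance, multiplying by 29 = 2^5 - 3 becomes `mul_pow2_minus_shadd(x, 5, 1)`: one shift, one sh1add, and one subtract.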
Xu Zhang
f6d431f208
[CodeGen] Make the parameter TRI required in some functions. (#85968)
Fixes #82659

There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.

Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.

After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
2024-04-24 14:24:14 +01:00
Pengcheng Wang
6493da7356
[RISCV] Use the store value's VT as the MemoryVT after combining riscv.masked.strided.store (#89874)
According to `RISCVTargetLowering::getTgtMemIntrinsic`, the MemoryVT
is the scalar element VT for strided store and the MemoryVT is the
same as the store value's VT for unit-stride store.

After combining `riscv.masked.strided.store` to `masked.store`, we
just use the scalar element VT to construct `masked.store`, which is
wrong.

With wrong MemoryVT, the DAGCombiner will combine `trunc+masked.store`
to truncated `masked.store` because `TLI.canCombineTruncStore` returns
true.

So, we should use the store value's VT as the MemoryVT.

This fixes #89833.
2024-04-24 14:32:06 +08:00
Philip Reames
0c032fd542
[RISCV] Use SHL_ADD in remaining strength reduce cases for MUL (#89789)
The interesting bit is the zext folding. This is the first case where we
end up with a profitable fold of shNadd (zext x), y to shNadd.uw x, y.
See zext_mul68 from rv64zba.ll.

The test differences are cases where we can legally fold (only because
there's no one use check). These are not profitable or harmful, but we
can't add a oneuse check without breaking the zext_mul68 case.

Note that XTHeadBa doesn't appear to have the equivalent patterns so
this only shows up in Zba.
2024-04-23 12:40:55 -07:00
Philip Reames
03760ad09d Reapply "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (#89263)"
Changes since original commit:
* Rebase over improved test coverage for theadba
* Revert change to use TargetConstant as it appears to prevent the uimm2
  clause from matching in the XTheadBa patterns.
* Fix an order of operands bug in the THeadBa pattern visible in the new
  test coverage.

Original commit message follows:

This implements a RISCV specific version of the SHL_ADD node proposed in
https://github.com/llvm/llvm-project/pull/88791.

If that lands, the infrastructure from this patch should seamlessly
switch over to the generic DAG node. I'm posting this separately because
I've run out of useful multiply strength reduction work to do without
having a way to represent MUL X, 3/5/9 as a single instruction.

The majority of this change is moving two sets of patterns out of
tablegen and into the post-legalize combine. The major reason for this is
that I have an upcoming change which needs to reuse the expansion logic,
but it also helps common up some code between zba and the THeadBa
variants.

On the test changes, there's a couple major categories:
* We chose a different lowering for mul x, 25. The new lowering involves
  one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
  cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction
  doesn't really make sense to me, and the test changes appear positive.
2024-04-23 08:30:38 -07:00
Philip Reames
dc3f94384d Revert "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (#89263)"
This reverts commit 5a7c80ca58c628fab80aa4f95bb6d18598c70c80.  Noticed failures
with the following command:
$ llc -mtriple=riscv64 -mattr=+m,+xtheadba -verify-machineinstrs < test/CodeGen/RISCV/rv64zba.ll

I think I know the cause and will likely reland with a fix tomorrow.
2024-04-22 17:25:59 -07:00
Philip Reames
5a7c80ca58
[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (#89263)
This implements a RISCV specific version of the SHL_ADD node proposed in
https://github.com/llvm/llvm-project/pull/88791.

If that lands, the infrastructure from this patch should seamlessly
switch over to the generic DAG node. I'm posting this separately because
I've run out of useful multiply strength reduction work to do without
having a way to represent MUL X, 3/5/9 as a single instruction.

The majority of this change is moving two sets of patterns out of
tablegen and into the post-legalize combine. The major reason for this is
that I have an upcoming change which needs to reuse the expansion logic,
but it also helps common up some code between zba and the THeadBa
variants.

On the test changes, there's a couple major categories:
* We chose a different lowering for mul x, 25. The new lowering involves
one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction
doesn't really make sense to me, and the test changes appear positive.
2024-04-22 13:41:27 -07:00
Philip Reames
9a3595167d
[RISCV] Add freeze when expanding mul by constant to two or more uses (#89290)
topperc pointed this out in review of
https://github.com/llvm/llvm-project/pull/88791, but I believe the
problem applies here as well. Worth noting is that the code I introduced
with this bug was mostly copied from other targets - which also have
this bug.
2024-04-22 11:40:48 -07:00
Craig Topper
016ce9ed5c
[RISCV] Rename FeatureRVE to FeatureStdExtE. NFC (#89174)
Planning to declare all extensions in tablegen so we can generate the
tables for RISCVISAInfo.cpp. This requires making "e" consistent with
other extensions.
2024-04-19 12:39:32 -07:00
Craig Topper
9067070d91
[RISCV] Re-separate unaligned scalar and vector memory features in the backend. (#88954)
This is largely a revert of commit
e81796671890b59c110f8e41adc7ca26f8484d20.

As #88029 shows, there exists hardware that only supports unaligned
scalar.

I'm leaving how this gets exposed to the clang interface to a future
patch.
2024-04-16 15:40:32 -07:00
Philip Reames
885b8d9bb5 [RISCV] Enable mul strength reduction for XTheadBa
This vendor extension has the same shift_add as zba, and most of the same
patterns are duplicated.  Enable it here too so the configurations don't
diverge.
2024-04-16 13:28:11 -07:00
Philip Reames
6b83fe5529
[RISCV] Strength reduce mul by 2^n + 2/4/8 + 1 (#88911)
With zba, we can expand this to (add (shl X, C1), (shXadd X, X)).

Note that this is our first expansion to a three instruction sequence. I
believe this to generally be a reasonable tradeoff for most architectures,
but we may want to (someday) consider a tuning flag here.

I plan to support 2^n + (2/4/8 + 1) eventually as well, but that comes
behind 2^N - 2^M. Both are also three instruction sequences.

---------

Co-authored-by: Min-Yih Hsu <min@myhsu.dev>
2024-04-16 11:03:53 -07:00
Philip Reames
184ba038ac
[RISCV] Avoid matching 3/5/9 * 2^N as 2^N + 2/4/8 (e.g. 24) (#88937)
The former is better because a zero extend can be folded into the sll,
whereas the latter currently produces a separate zext.w due to bad
interactions with other combines.
2024-04-16 10:46:27 -07:00
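The overlap that the commit above resolves can be seen by matching a constant against both forms. This is a sketch: the two matchers, the `choose` helper, and the strategy labels are hypothetical and only model the ordering decision, not the combine itself.

```python
def match_shadd_times_pow2(c):
    """Return (shift, n) when c == (2**shift + 1) * 2**n, i.e. 3/5/9 * 2^N."""
    if c <= 0:
        return None
    n = (c & -c).bit_length() - 1      # number of trailing zero bits
    odd = c >> n
    for shift in (1, 2, 3):
        if odd == (1 << shift) + 1:
            return shift, n
    return None


def match_pow2_plus_small(c):
    """Return (n, shift) when c == 2**n + 2**shift, i.e. 2^N + 2/4/8."""
    for shift in (1, 2, 3):
        rest = c - (1 << shift)
        if rest > 0 and (rest & (rest - 1)) == 0:   # rest is a power of two
            return rest.bit_length() - 1, shift
    return None


def choose(c):
    """Prefer the 3/5/9 * 2^N form when both forms match."""
    first = match_shadd_times_pow2(c)
    if first is not None:
        return ("shNadd_then_shl", first)
    second = match_pow2_plus_small(c)
    if second is not None:
        return ("shl_then_shNadd", second)
    return None
```

The constant 24 matches both forms (3 * 2^3 and 2^4 + 8), and trying the 3/5/9 * 2^N matcher first selects the former.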
Brandon Wu
91dd844aa4
Recommit [RISCV] RISCV vector calling convention (2/2) (#79096) (#87736)
Bug fix: Handle RVV return type in calling convention correctly.
Return values are handled in the same way as function arguments.
One thing to mention is that if a type can be broken down into
homogeneous vector types, e.g.
{<vscale x 4 x i32>, {<vscale x 4 x i32>, <vscale x 4 x i32>}},
it is considered a vector tuple type and needs to be handled by the
tuple type rule.
2024-04-16 19:59:36 +08:00
Craig Topper
17d6bf046c
[RISCV] Change how MMO is rebuilt in lowerFixedLengthVectorLoadToRVV/lowerFixedLengthVectorStoreToRVV (#88811)
Copy the pointer info, flags, alignment, AAInfo, and ranges, but let
getLoad rebuild the MMO using the scalable type used for the new
load/store. This makes sure the LLT minimum size matches the ContainerVT
minimum size. This is important since vscale_range may have been used to
determine that the fixed vector was the exact size of a scalable vector.

Fixes #88799
2024-04-15 22:39:09 -07:00
Craig Topper
5b9af38a03
[RISCV] Provide a more efficient lowering for experimental.cttz.elts. (#88552)
For experimental.cttz.elts, we can use a vfirst instruction, but we need
to correct the result if the input vector can be all zeros: in that case
cttz.elts returns the vector length while vfirst returns -1.
2024-04-15 18:38:54 -07:00
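The correction mentioned in the commit above can be sketched with a list-based model. The helper names and the list representation are assumptions for illustration; this is not the lowering code, and it does not model any additional intrinsic operands.

```python
def vfirst(mask_bits):
    """Index of the first set mask element, or -1 when none is set."""
    for index, bit in enumerate(mask_bits):
        if bit:
            return index
    return -1


def cttz_elts(elements):
    """Number of zero elements before the first non-zero one.

    Returns the vector length when every element is zero, which is the
    case where the raw vfirst result (-1) has to be corrected.
    """
    index = vfirst([element != 0 for element in elements])
    return len(elements) if index < 0 else index
```

When the input is known to contain a non-zero element, the `index < 0` correction is never taken and the vfirst result can be used directly.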
Philip Reames
2b06ff555a
[RISCV] Expand mul to shNadd x, (slli x, c) in DAGCombine (#88524)
This expansion is directly inspired by the analogous code in the x86
backend for LEA. shXadd and (this sub-case of) LEA are largely
equivalent.

This is an alternative to
https://github.com/llvm/llvm-project/pull/87105.

This expansion is also supported via the decomposeMulByConstant
callback, but restricted because of interactions with other combines
since that code runs before legalization. As discussed in the other
review, my original plan had been to support post legalization expansion
through the same interface, but that ended up being more complicated
than seems justified.

Instead, let's go ahead and do the general expansion post-legalize. Other
targets use the combine approach, and matching that structure makes it
easier for us to adapt ideas from other targets to RISCV.
2024-04-15 17:38:39 -07:00
Michael Maitland
60a1158f31 [RISCV] Split Widening convert to FP pseudos by SEW 2024-04-15 06:08:52 -07:00
Brandon Wu
3fa830804e
Revert "[RISCV] RISCV vector calling convention (2/2) (#79096)" (#88511)
This reverts commit 29e8bfc13c6078ed07e6474e8c9634c42aa2f6f4.
This patch didn't handle vector return type correctly.
2024-04-12 21:11:45 +08:00
Sacha Coppey
53003e36e9
[RISCV] Implement Statepoint and Patchpoint lowering to call instructions (#77337)
This patch adds stackmap support for RISC-V with call targets.

Based on patch from https://reviews.llvm.org/D129848.
2024-04-11 12:19:56 +08:00
Craig Topper
999b9e6ddb [RISCV] Use vector getConstant instead of getSplatVector+getConstant. NFC 2024-04-10 19:39:41 -07:00
Jianjian Guan
fd50151180
[RISCV] Only support SPLAT_VECTOR for Zvfhmin when also enable the scalar extension of half fp (#88275) 2024-04-11 10:23:26 +08:00