467 Commits

Author SHA1 Message Date
Roger Ferrer Ibáñez
9d469b5988
[RISCV] Implement trampolines for rv64 (#96309)
This implementation is based on what the X86 target does, but
it emits the instructions that GCC emits for rv64.
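
As an illustration (not part of the patch), a minimal sketch using the generic trampoline intrinsics this lowering supports; the function and value names are made up:

```
declare void @llvm.init.trampoline(ptr, ptr, ptr)
declare ptr @llvm.adjust.trampoline(ptr)

; hypothetical nested function taking its environment via the 'nest' argument
define internal void @callee(ptr nest %env) {
  ret void
}

define ptr @make_closure(ptr %tramp_mem, ptr %env) {
  ; %tramp_mem must point to writable, executable memory of sufficient size
  call void @llvm.init.trampoline(ptr %tramp_mem, ptr @callee, ptr %env)
  %fn = call ptr @llvm.adjust.trampoline(ptr %tramp_mem)
  ret ptr %fn
}
```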

---------

Co-authored-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
2024-10-18 08:06:47 +02:00
Jeffrey Byrnes
853c43d04a
[TTI] NFC: Port TLI.shouldSinkOperands to TTI (#110564)
Porting to TTI provides direct access to the instruction cost model,
which can enable instruction cost based sinking without introducing code
duplication.
2024-10-09 14:30:09 -07:00
Jesse Huang
9bdcf7aa18
[RISCV] Software guard direct calls in large code model (#109377)
Support for the large code model was added recently, and semantically
direct calls are lowered to an indirect branch with a constant pool target.
By default this does not use the x7 register, which is suboptimal with
Zicfilp because it introduces a landing pad check that is unnecessary,
since the constant pool is read-only and unlikely to be tampered with.

Change direct calls and tail calls to use x7 as the scratch
register (a.k.a. software guarded branch in the CFI spec)
2024-09-27 13:04:16 +08:00
Alex Bradbury
0ee10e9466
[RISCV] Add additional fence for amocas when required by recent ABI change (#101023)
A recent atomics ABI change / fix requires that for the "A6C" and "A6S"
atomics ABIs (i.e. both of those supported by LLVM currently), an
additional fence is inserted for an atomic_compare_exchange with seq_cst
failure ordering.
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/445>

This isn't trivial to support through the hooks used by AtomicExpandPass
because that pass assumes that when fences are inserted, the original
atomics ordering information can be removed from the instruction. Rather
than try to change and complicate that API, this patch implements the
needed fence insertion through a small special purpose pass.
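
For reference, a tiny IR example of the affected shape (illustrative only; the extra fence is inserted by the new pass, not written by hand):

```
define i32 @cas_seq_cst_failure(ptr %p, i32 %expected, i32 %desired) {
  ; seq_cst failure ordering is the case that now gets the additional fence
  %pair = cmpxchg ptr %p, i32 %expected, i32 %desired seq_cst seq_cst
  %old = extractvalue { i32, i1 } %pair, 0
  ret i32 %old
}
```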
2024-09-19 13:39:56 +01:00
Craig Topper
de6d7a6c30
[RISCV] Expand Zfa fli+fneg cases during lowering instead of during isel. (#108316)
Most of the constants fli can generate are positive numbers. We can use
fli+fneg to generate their negative versions.

Previously, we considered such negative constants as "legal" and let
isel generate the fli+fneg. However, it is useful to expose the fneg to
DAG combines to fold with fadd to produce fsub or with fma to produce
fnmadd, fnmsub, or fmsub.

This patch moves the fneg creation to lowering so that the fneg will be
visible to the last DAG combine.

I might move the rest of Zfa handling from isel to lowering as a follow
up.

Fixes #107772.
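
A hypothetical illustration of the kind of fold this exposes (a sketch, assuming +0.5 is an fli-loadable Zfa constant):

```
define double @fadd_neg_fli(double %x) {
  ; -0.5 is (fneg 0.5); exposing the fneg lets DAG combine turn
  ; fadd x, (fneg 0.5) into fsub x, 0.5 instead of fli+fneg+fadd
  %r = fadd double %x, -5.000000e-01
  ret double %r
}
```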
2024-09-11 22:31:45 -07:00
Craig Topper
093b8bfe6b
[RISCV] Separate the calling convention handlers into their own file. NFC (#107484)
These are used by both SelectionDAG and GlobalISel and are separate from
RISCVTargetLowering.

Having a separate file is how other targets are structured. Though other
targets generate most of their calling convention code through tablegen.

I moved the `CC_RISCV` functions from the `llvm::RISCV` namespace to
`llvm::`. That's what the tablegen code on other targets does and the
functions already have RISCV in their name. `RISCVCCAssignFn` is moved
from `RISCVTargetLowering` to the `llvm` namespace.
2024-09-05 22:29:23 -07:00
Craig Topper
36c210bb34
[RISCV] Remove pre-assignment of mask vectors during call lowering. NFC (#107192)
The first mask vector operand is supposed to be assigned to V0. No other
vector types will be assigned to V0. We don't need to pre-assign, we can
just try V0 first for any mask vectors in the normal processing.
2024-09-04 11:14:31 -07:00
Craig Topper
a5ce66423b [RISCV] Remove RISCVISD::FP_ROUND_BF16.
Use isel patterns on regular FP_ROUND. For double->bf16 we need
to emit two instructions. Note the double->bf16 conversion does
double rounding, but I don't know a good way to fix that.
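
For context, a sketch of the conversion in question:

```
define bfloat @round_f64_to_bf16(double %x) {
  ; selected as two instructions (f64 -> f32, then f32 -> bf16),
  ; which performs the double rounding noted above
  %r = fptrunc double %x to bfloat
  ret bfloat %r
}
```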
2024-09-03 20:18:01 -07:00
Craig Topper
55eb93b268
[RISCV] Remove RISCVISD::FP_EXTEND_BF16. (#106939)
I don't think we need this node. We can isel fp_extend directly.
fp_extend to f64 requires two instructions, but we can emit them with an
isel pattern.

I have not removed RISCVISD::FP_ROUND_BF16 because f64->bf16 needs more
work to fix the double rounding.
2024-09-02 10:14:04 -07:00
Brandon Wu
22f98740b6
[llvm][RISCV] Support RISCV vector tuple CodeGen and Calling Convention (#97995)
This patch handles target lowering and calling convention.

For target lowering, the vector tuple type represented as multiple
scalable vectors is now changed to a single `MVT`, each `MVT` has a
corresponding register class.

Loads/stores of vector tuples are handled in the same way, but need
additional vector insert/extract instructions to get the sub-register groups.

Inline assembly constraint for vector tuple type can directly be modeled
as "vr" which is identical to normal vector registers.

For the calling convention, it no longer needs an alternative algorithm to
handle register allocation; this makes the code easier to maintain and
read.

Stacked on https://github.com/llvm/llvm-project/pull/97994
2024-08-31 19:28:36 +08:00
Brandon Wu
db67a66e8e
Revert "[RISCV] RISCV vector calling convention (2/2)" (#97994)
This reverts commit 91dd844aa499d69c7ff75bf3156e2e3593a88057.

Stacked on https://github.com/llvm/llvm-project/pull/97993
2024-08-31 19:02:35 +08:00
Brandon Wu
579fd59ab9
[RISCV][ISel] Move VCIX ISDs to correct position. NFC (#105934)
The current VCIX ISDs are placed after FIRST_TARGET_STRICTFP_OPCODE, which is
not expected; they should be in the normal OPCODE area.
2024-08-25 14:40:03 +08:00
Hassnaa Hamdi
3176f255c9
[IA][AArch64]: Construct (de)interleave4 out of (de)interleave2 (#89276)
- [AArch64]: TargetLowering is updated to spot load/store (de)interleave4 like sequences using PatternMatch,
   and emit equivalent sve.ld4 and sve.st4 intrinsics.
2024-08-12 17:23:00 +01:00
Luke Lau
b1542afd0b
[RISCV] Rename merge operand -> passthru. NFC (#100330)
We sometimes call the first tied dest operand in vector pseudos the
merge operand, and other times the passthru.

Passthru seems to be more common, and it's what the C intrinsics call
it[^1], so this renames all usages of merge to passthru to be
consistent. It also helps prevent confusion with vmerge.vvm in some of
the peephole optimisations.

[^1]:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#the-passthrough-vd-argument-in-the-intrinsics
2024-07-30 17:47:00 +08:00
Yingwei Zheng
13996378d8
[RISCV][ISel] Fold FSGNJX idioms (#100718)
This patch folds `fmul X, (fcopysign 1.0, Y)` into `fsgnjx X, Y`. This
pattern exists in some graphics applications/math libraries.
Alive2: https://alive2.llvm.org/ce/z/epyL33

Since fpimm +1.0 is lowered to a load from constant pool after
OpLegalization, I have to introduce a new RISCVISD node FSGNJX and fold
this pattern in DAGCombine.

Closes https://github.com/dtcxzyw/llvm-opt-benchmark/issues/1072.
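
An illustrative IR form of the folded idiom (the function name is made up):

```
declare double @llvm.copysign.f64(double, double)

define double @fsgnjx_idiom(double %x, double %y) {
  ; fmul X, (fcopysign 1.0, Y) keeps X's magnitude and xors the signs,
  ; which this patch selects as fsgnjx.d
  %sign = call double @llvm.copysign.f64(double 1.0, double %y)
  %r = fmul double %x, %sign
  ret double %r
}
```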
2024-07-27 12:51:58 +08:00
Craig Topper
caaba2a883
[RISCV] Replace VNCLIP RISCVISD opcodes with TRUNCATE_VECTOR_VL_SSAT/USAT opcodes (#100173)
These new opcodes drop the shift amount, rounding mode, and passthru,
making them exactly like TRUNCATE_VECTOR_VL. The shift amount, rounding
mode, and passthru are added in isel patterns similar to how we
translate TRUNCATE_VECTOR_VL to vnsrl with a shift of 0.

This should simplify #99418 a little.
2024-07-23 14:57:31 -07:00
Yeting Kuo
746cea3eb7
[VP][RISCV] Introduce vp.splat and RISC-V. (#98731)
This patch introduces a vp intrinsic for splat. It's helpful for
IR-level passes to create a splat with a specific vector length.
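
A sketch of what such a call might look like; the exact mangled intrinsic name and signature here are assumptions modeled on other VP intrinsics:

```
declare <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32, <vscale x 4 x i1>, i32)

define <vscale x 4 x i32> @splat_with_evl(i32 %v, <vscale x 4 x i1> %m, i32 %evl) {
  ; splat %v into the lanes selected by %m, up to the explicit vector length %evl
  %s = call <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 %v, <vscale x 4 x i1> %m, i32 %evl)
  ret <vscale x 4 x i32> %s
}
```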
2024-07-17 08:40:42 +08:00
Froster
c8dc21d77f
[SelectionDAG][RISCV] Fix break of vnsrl pattern in issue #94265 (#95563)
Added a RISCV overload of `isTruncateFree` to fix the break of vnsrl described in issue #94265.

Fixes #94265
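
For context, a sketch of the kind of narrowing-shift pattern involved (per issue #94265 this should keep selecting a single vnsrl):

```
define <8 x i16> @narrowing_shift(<8 x i32> %x) {
  ; lshr by 16 followed by a truncate is the vnsrl.wi idiom
  %sh = lshr <8 x i32> %x, <i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16, i32 16>
  %tr = trunc <8 x i32> %sh to <8 x i16>
  ret <8 x i16> %tr
}
```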
2024-07-14 12:09:37 +01:00
Roger Ferrer Ibáñez
5ef02d9963
[RISCV] Lower llvm.clear_cache to __riscv_flush_icache for glibc targets (#93481)
This change is a preliminary step to support trampolines on RISC-V. Trampolines are used by flang to implement obtaining the address of an internal program (i.e., a nested function in Fortran parlance).

In this change we lower `llvm.clear_cache` intrinsic on glibc targets to
`__riscv_flush_icache` which is what GCC is currently doing for Linux targets.
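
For reference, a minimal, illustrative use of the intrinsic:

```
declare void @llvm.clear_cache(ptr, ptr)

define void @flush_range(ptr %begin, ptr %end) {
  ; on glibc RISC-V targets this now lowers to a call to __riscv_flush_icache
  call void @llvm.clear_cache(ptr %begin, ptr %end)
  ret void
}
```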
2024-06-20 07:27:07 +02:00
Craig Topper
ec8fe598a9
[RISCV] Move vnclipu patterns into DAGCombiner. (#93596)
I plan to add support for multiple layers of vnclipu. For example,
i32->i8 using 2 vnclipu instructions. First clipping to 65535, then
clipping to 255. Similar for signed vnclip.
    
This scales poorly if we need to add patterns with 2 or 3 truncates.
Instead, move the code to DAGCombiner with new ISD opcodes to represent
VCLIP(U).
    
This patch just moves the existing patterns into DAG combine. Support
for multiple truncates will come as a follow up. A similar patch series will
be made for the signed vnclip.
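
As an illustration of the idiom involved (a sketch; the multi-step i32->i8 case described above is the follow-up work):

```
declare <8 x i32> @llvm.umin.v8i32(<8 x i32>, <8 x i32>)

define <8 x i16> @trunc_usat_i16(<8 x i32> %x) {
  ; clamp to 65535 and truncate: the unsigned saturating truncate
  ; that maps onto a single vnclipu
  %clamped = call <8 x i32> @llvm.umin.v8i32(<8 x i32> %x, <8 x i32> <i32 65535, i32 65535, i32 65535, i32 65535, i32 65535, i32 65535, i32 65535, i32 65535>)
  %tr = trunc <8 x i32> %clamped to <8 x i16>
  ret <8 x i16> %tr
}
```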
2024-05-29 13:00:15 -07:00
Craig Topper
6246b495ad
[RISCV] Select ISD::AVGCEILS/AVGFLOORS as vaadd. (#92839)
I think the behaviors match, assuming the following correctly describes them.

AVGFLOORS sign extends the inputs by 1 bit, adds them, then does an
arithmetic shift right by 1 before truncating to the original bit width.
This is vaadd with rdn rounding mode.

AVGCEILS sign extends the inputs by 1 bit, adds them, then does an
arithmetic shift right by 1. If the bit shifted out is 1, it adds 1 to
the shifted value. Then truncates to the original bit width. This is vaadd
with rnu rounding mode.

I think this wasn't implemented previously because there was some
confusion about what average means. Some may expect average to round
towards zero, but there is no way to do that in RISC-V or with the
SelectionDAG nodes. Related issue
https://github.com/riscv/riscv-v-spec/issues/935
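
For reference, a sketch of the signed floor-average idiom described above (which should now become vaadd with the rdn rounding mode):

```
define <8 x i8> @avgfloors(<8 x i8> %x, <8 x i8> %y) {
  ; sign extend, add, arithmetic shift right by 1, truncate
  %xs = sext <8 x i8> %x to <8 x i16>
  %ys = sext <8 x i8> %y to <8 x i16>
  %sum = add nsw <8 x i16> %xs, %ys
  %sh = ashr <8 x i16> %sum, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
  %r = trunc <8 x i16> %sh to <8 x i8>
  ret <8 x i8> %r
}
```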
2024-05-20 23:24:22 -07:00
Craig Topper
888e087b09 [RISCV] Remove unused function declaration. NFC 2024-05-20 16:17:53 -07:00
Min-Yih Hsu
4c68de5a00
[RISCV][CostModel] Add cost model for experimental.cttz.elts (#91778)
The cost of `experimental.cttz.elts` in RISC-V equals the cost of
vfirst when the zero_is_poison argument is true. Otherwise, we add
additional costs of cmp + select to convert the -1 result from vfirst to
EVL.
2024-05-14 09:18:08 -07:00
Yeting Kuo
d488a54b40
[RISCV] Use software guarded branch for indirect jump table branch. (#66762)
When Zicfilp is enabled, the indirect jump table branch should be a software
guarded branch.
2024-05-14 14:44:25 +08:00
Min-Yih Hsu
539f626ecd
[VP][RISCV] Add vp.cttz.elts intrinsic and its RISC-V codegen (#90502)
This intrinsic is the VP version of `experimental.cttz.elts`.
2024-04-30 09:27:10 -07:00
Philip Reames
03760ad09d Reapply "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (#89263)"
Changes since original commit:
* Rebase over improved test coverage for theadba
* Revert change to use TargetConstant as it appears to prevent the uimm2
  clause from matching in the XTheadBa patterns.
* Fix an order of operands bug in the THeadBa pattern visible in the new
  test coverage.

Original commit message follows:

This implements a RISCV specific version of the SHL_ADD node proposed in
https://github.com/llvm/llvm-project/pull/88791.

If that lands, the infrastructure from this patch should seamlessly
switch over to the generic DAG node. I'm posting this separately because
I've run out of useful multiply strength reduction work to do without
having a way to represent MUL X, 3/5/9 as a single instruction.

The majority of this change is moving two sets of patterns out of
tablegen and into the post-legalize combine. The major reason for this is
that I have an upcoming change which needs to reuse the expansion logic,
but it also helps common up some code between zba and the THeadBa
variants.

On the test changes, there are a couple of major categories:
* We chose a different lowering for mul x, 25. The new lowering involves
  one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
  cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction
  doesn't really make sense to me, and the test changes appear positive.
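
A small example of the strength reduction this node enables (a sketch; with +zba, mul x, 5 can become a single sh2add since 5*x = (x << 2) + x):

```
define i64 @mul5(i64 %x) {
  ; expected to lower to: sh2add a0, a0, a0
  %r = mul i64 %x, 5
  ret i64 %r
}
```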
2024-04-23 08:30:38 -07:00
Philip Reames
dc3f94384d Revert "[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (#89263)"
This reverts commit 5a7c80ca58c628fab80aa4f95bb6d18598c70c80.  Noticed failures
with the following command:
$ llc -mtriple=riscv64 -mattr=+m,+xtheadba -verify-machineinstrs < test/CodeGen/RISCV/rv64zba.ll

I think I know the cause and will likely reland with a fix tomorrow.
2024-04-22 17:25:59 -07:00
Philip Reames
5a7c80ca58
[RISCV] Implement RISCVISD::SHL_ADD and move patterns into combine (#89263)
This implements a RISCV specific version of the SHL_ADD node proposed in
https://github.com/llvm/llvm-project/pull/88791.

If that lands, the infrastructure from this patch should seamlessly
switch over to the generic DAG node. I'm posting this separately because
I've run out of useful multiply strength reduction work to do without
having a way to represent MUL X, 3/5/9 as a single instruction.

The majority of this change is moving two sets of patterns out of
tablegen and into the post-legalize combine. The major reason for this is
that I have an upcoming change which needs to reuse the expansion logic,
but it also helps common up some code between zba and the THeadBa
variants.

On the test changes, there are a couple of major categories:
* We chose a different lowering for mul x, 25. The new lowering involves
one fewer register and the same critical path, so this seems like a win.
* The order of the two multiplies changes in (3,5,9)*(3,5,9) in some
cases. I don't believe this matters.
* I'm removing the one use restriction on the multiply. This restriction
doesn't really make sense to me, and the test changes appear positive.
2024-04-22 13:41:27 -07:00
Brandon Wu
91dd844aa4
Recommit [RISCV] RISCV vector calling convention (2/2) (#79096) (#87736)
Bug fix: Handle the RVV return type in the calling convention correctly.
Return values are handled in the same way as function arguments.
One thing to mention is that if a type can be broken down into homogeneous
vector types, e.g. {<vscale x 4 x i32>, {<vscale x 4 x i32>, <vscale x 4 x i32>}},
it is considered a vector tuple type and needs to be handled by the tuple
type rule.
2024-04-16 19:59:36 +08:00
Craig Topper
5b9af38a03
[RISCV] Provide a more efficient lowering for experimental.cttz.elts. (#88552)
For experimental.cttz.elts, we can use a vfirst instruction, but we need
to correct the result if the input vector can be all zeros: cttz.elts returns
the vector length while vfirst returns -1.
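
For context, a sketch of the intrinsic being lowered (the exact mangled name here is an assumption):

```
declare i64 @llvm.experimental.cttz.elts.i64.nxv8i1(<vscale x 8 x i1>, i1)

define i64 @first_set_index(<vscale x 8 x i1> %mask) {
  ; zero_is_poison = false: an all-zero mask must yield the vector length,
  ; while vfirst returns -1, hence the correction described above
  %r = call i64 @llvm.experimental.cttz.elts.i64.nxv8i1(<vscale x 8 x i1> %mask, i1 false)
  ret i64 %r
}
```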
2024-04-15 18:38:54 -07:00
Brandon Wu
3fa830804e
Revert "[RISCV] RISCV vector calling convention (2/2) (#79096)" (#88511)
This reverts commit 29e8bfc13c6078ed07e6474e8c9634c42aa2f6f4.
This patch didn't handle vector return type correctly.
2024-04-12 21:11:45 +08:00
Brandon Wu
29e8bfc13c
[RISCV] RISCV vector calling convention (2/2) (#79096)
This commit handles vector arguments/returns for function definitions/calls.
The new class RVVArgDispatcher is added to do all vector register
assignment, including mask types and data types as well as tuple types.
It precomputes the register number for each argument as per
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#standard-vector-calling-convention-variant
and is passed to the calling convention function to handle all vector arguments.

Depends on: #78550
2024-03-30 21:05:33 +08:00
hchandel
5dfc446d75
[RISCV] Remove Unnecessary Semicolon. NFC (#86911)
Removes an unnecessary semicolon.

Co-authored-by: Harsh Chandel <hchandel@hu-hchandel-hyd.qualcomm.com>
2024-03-27 23:13:47 -07:00
Craig Topper
ce37a7131f
[RISCV] Add integer RISCVISD::SELECT_CC to canCreateUndefOrPoison and isGuaranteedNotToBeUndefOrPoison. (#84693)
Integer RISCVISD::SELECT_CC doesn't create poison. If none of the
operands are poison, the result is not poison.

This allows ISD::FREEZE to be hoisted above RISCVISD::SELECT_CC.
2024-03-25 11:10:58 -07:00
Wang Pengcheng
85388a06b6
[RISCV] Move RISCVVType namespace to TargetParser (#83222)
Clang and some middle-end optimizations may need these helper
functions.

This can reduce some duplications.
2024-03-06 10:56:19 +08:00
Wang Pengcheng
a445474d3f
[RISCV] Use TImmLeaf for csr_sysreg (#82463)
And use `getTargetConstant` to create operands.

This PR addresses comments after committing #82322.
2024-02-21 15:04:29 +08:00
Wang Pengcheng
b8ed69ecc0 [RISCV] Support llvm.readsteadycounter intrinsic
This intrinsic, which is a lot like `llvm.readcyclecounter`, was
introduced by #81331.

For the RISCV implementation, we rename `ReadCycleWide` pseudo to
`ReadCounterWide` and make it accept two operands (the low and high
parts of the counter). As for legalization and lowering parts, we
reuse the code of `ISD::READCYCLECOUNTER` (make it able to handle
both intrinsics), and we use `time` CSR for `ISD::READSTEADYCOUNTER`.

Tests using Clang builtins were run on real hardware and it works
as expected.
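
For reference, a minimal, illustrative use of the intrinsic:

```
declare i64 @llvm.readsteadycounter()

define i64 @steady_counter() {
  ; per the message, RISC-V reads the time CSR here
  ; (via the ReadCounterWide pseudo on RV32)
  %t = call i64 @llvm.readsteadycounter()
  ret i64 %t
}
```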

Reviewers: asb, MaskRay, dtcxzyw, preames, topperc, jhuber6

Reviewed By: jhuber6, asb, MaskRay, dtcxzyw

Pull Request: https://github.com/llvm/llvm-project/pull/82322
2024-02-21 13:12:14 +08:00
Craig Topper
9179d87abc [RISCV] Remove unused RISCVISD opcodes. NFC
These were left behind after fb94c6491a114ebd5815b1d42665a8f6bcd9d639
2024-01-30 20:46:01 -08:00
Jivan Hakobyan
0461448313
[RISCV][ISel] Add ISel support for experimental Zimop extension (#77089)
This implements ISel support for the mopr[0-31] and moprr[0-7] instructions
for 32 and 64 bits.

---------

Co-authored-by: ln8-8 <lyut.nersisyan@gmail.com>
2024-01-29 15:24:00 -08:00
Brandon Wu
33d804c6c2
[RISCV] Allow VCIX with SE to reorder (#77049)
This patch allows VCIX instructions that have side effects to be reordered
with memory and other side-effecting instructions. However, we don't want
VCIX instructions to be reordered with each other, so we propose a dummy
register called VCIX_STATE and make these instructions implicitly define
and use it.
2024-01-24 11:30:12 +08:00
Paul Kirth
03a61d34eb
[RISCV] Support TLSDESC in the RISC-V backend (#66915)
This patch adds basic TLSDESC support in the RISC-V backend.

Specifically, we add new relocation types for TLSDESC, as prescribed in 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a
new pseudo instruction to simplify code generation.

This patch does not try to optimize the local dynamic case, which can be
improved in separate patches. 

Linker side changes will also be handled separately.

The current implementation is only enabled when passing the new
`-enable-tlsdesc` codegen flag.
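
As an illustration, a TLS access of the kind that can go through TLSDESC when the flag is passed (a sketch):

```
@tls_var = thread_local global i32 0

define i32 @read_tls() {
  ; with -enable-tlsdesc, this access uses the new TLSDESC pseudo and relocations
  %v = load i32, ptr @tls_var
  ret i32 %v
}
```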
2024-01-23 16:16:07 -08:00
Wang Pengcheng
3ac9fe69f7
[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs (#76777)
This commit includes the necessary changes to clang and LLVM to support
codegen of `RVE` and the `ilp32e`/`lp64e` ABIs.

The differences between `RVE` and `RVI` are:
* `RVE` reduces the integer register count to 16 (x0-x15).
* The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits.

`RVE` can be combined with all current standard extensions.

The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
* Only 6 integer argument registers (rather than 8).
* Only 2 callee-saved registers (rather than 12).
* A stack alignment of 32 bits (rather than 128 bits).
* ilp32e isn't compatible with the D ISA extension.

If `ilp32e` or `lp64` is used with an ISA that has any of the registers
x16-x31 and f0-f31, then these registers are considered temporaries.

To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and we set the stack alignment
to 4 bytes for types with a length of 2*XLEN.

FastCC is also supported on RVE, while GHC isn't since there is only one
available register.

Differential Revision: https://reviews.llvm.org/D70401
2024-01-16 20:44:30 +08:00
Craig Topper
3378514a4d
[RISCV] Use any_extend for type legalizing atomic_compare_swap with Zacas. (#77669)
With Zacas we will use amocas.w which doesn't require the input to be
sign extended.
2024-01-10 12:41:11 -08:00
Chia
a79d13f12a
[RISCV][ISel] Use vaaddu with rounding mode rnu for ISD::AVGCEILU. (#77473)
Similar to #76550, but for `ISD::AVGCEILU`.
Specifically, this patch aims to use `vaaddu` with rounding mode rnu
(i.e. `vxrm[1:0] = 0b00`) for `ISD::AVGCEILU`.

### Source code 
```
define <vscale x 8 x i8> @vaaddu_vv_nxv8i8_ceil(<vscale x 8 x i8> %x, <vscale x 8 x i8> %y) {
  %xzv = zext <vscale x 8 x i8> %x to <vscale x 8 x i16>
  %yzv = zext <vscale x 8 x i8> %y to <vscale x 8 x i16>
  %add = add nuw nsw <vscale x 8 x i16> %xzv, %yzv
  %one = insertelement <vscale x 8 x i16> poison, i16 1, i32 0
  %splat = shufflevector <vscale x 8 x i16> %one, <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
  %add1 = add nuw nsw <vscale x 8 x i16> %add, %splat
  %div = lshr <vscale x 8 x i16> %add1, %splat
  %ret = trunc <vscale x 8 x i16> %div to <vscale x 8 x i8>
  ret <vscale x 8 x i8> %ret
}
```

### Before this patch 
```
vaaddu_vv_nxv8i8_ceil:
        vsetvli a0, zero, e8, m1, ta, ma
        vwaddu.vv       v10, v8, v9
        vsetvli zero, zero, e16, m2, ta, ma
        vadd.vi v10, v10, 1
        vsetvli zero, zero, e8, m1, ta, ma
        vnsrl.wi        v8, v10, 1
        ret
```
### After this patch 
```
vaaddu_vv_nxv8i8_ceil:
        vsetvli a0, zero, e8, m1, ta, ma
        csrwi vxrm, 0
        vaaddu.vv v8, v8, v9
        ret
```
2024-01-10 12:08:16 +09:00
Chia
0c24c175f2
[RISCV][ISel] Use vaaddu with rounding mode rdn for ISD::AVGFLOORU. (#76550)
This patch aims to use `vaaddu` with rounding mode rdn (i.e. `vxrm[1:0] =
0b10`) for `ISD::AVGFLOORU`.

### Source code 
```
define <8 x i8> @vaaddu_auto(ptr %x, ptr %y, ptr %z) {
  %xv = load <8 x i8>, ptr %x, align 2
  %yv = load <8 x i8>, ptr %y, align 2
  %xzv = zext <8 x i8> %xv to <8 x i16>
  %yzv = zext <8 x i8> %yv to <8 x i16>
  %add = add nuw nsw <8 x i16> %xzv, %yzv
  %div = lshr <8 x i16> %add, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
  %ret = trunc <8 x i16> %div to <8 x i8>
  ret <8 x i8> %ret 
}
```

### Before this patch 
```
vaaddu_auto: 
        vsetivli        zero, 8, e8, mf2, ta, ma
        vle8.v  v8, (a0)
        vle8.v  v9, (a1)
        vwaddu.vv       v10, v8, v9
        vnsrl.wi        v8, v10, 1
        ret
```
### After this patch 
```
vaaddu_auto: 
	vsetivli	zero, 8, e8, mf2, ta, ma
	vle8.v	v8, (a0)
	vle8.v	v9, (a1)
	csrwi	vxrm, 2
	vaaddu.vv	v8, v8, v9
	ret
```

### Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging
addition called `vaadd`.
But AFAIU, no matter in which rounding mode, we cannot achieve the
semantics of signed averaging addition through `vaadd`.
Thus this patch only introduces `vaaddu`.
2024-01-09 15:17:38 +09:00
Craig Topper
a960703466
[RISCV] Remove incomplete PRE_DEC/POST_DEC code for XTHeadMemIdx. (#76922)
As far as I can tell if getIndexedAddressParts received an ISD::SUB, the
constant would be negated. So `IsInc` should be set to true since the
SUB was effectively converted to ADD. This means we should never use
PRE_DEC/POST_DEC.

No tests are affected because DAGCombine aggressively turns SUB with
constant into ADD so no lit test has a SUB reach getIndexedAddressParts.
2024-01-04 09:48:40 -08:00
Shih-Po Hung
475890cd2e
[RISCV][CostModel] Add getRISCVInstructionCost() to TTI for CostKind (#76793)
Instruction cost for CodeSize and Latency/RecipThroughput can be very
different. Considering the diversity of CostKind and vendor-specific
cost, and how they are spread across various TTI functions, it's
becoming quite a challenge to handle. This patch adds an interface
getRISCVInstructionCost to address it.
2024-01-04 21:04:36 +08:00
Craig Topper
80889ae029
[RISCV] Remove RISCVISD::VSELECT_VL. (#76866)
We can use RISCVISD::VMERGE_VL with an undef passthru operand.

I had to rewrite the FMA patterns to handle both undef and non-undef
cases so we can get the tail policy.
2024-01-03 21:31:07 -08:00
Vitaly Buka
9c39d9bb49
Revert "[RISCV][CostModel] Add getRISCVInstructionCost() to TTI for Cost… (#73651)" (#76536)
Fails on bots https://lab.llvm.org/buildbot/#/builders/5/builds/39629

Issue #76535

This reverts commit 3e75dece919511e4a2edada82d783304cc14a9cd.
2023-12-28 13:30:56 -08:00
Shih-Po Hung
3e75dece91
[RISCV][CostModel] Add getRISCVInstructionCost() to TTI for CostKind (#73651)

Instruction cost for CodeSize and Latency/RecipThroughput can be very
different. Considering the diversity of CostKind and vendor-specific
cost, and how they are spread across various TTI functions, it's
becoming quite a challenge to handle. This patch adds an interface
getRISCVInstructionCost to address it.
2023-12-28 14:36:01 +08:00