466 Commits

Author SHA1 Message Date
Sam Elliott
408659c5b5
[RISCV] Merge GPRPair and GPRF64Pair (#116094)
As suggested by Craig, this tries to merge the two sets of register
classes created in #112983, GPRPair* and GPRF64Pair*.

- I added some explicit annotations to `RISCVInstrInfoD.td` which fixed
the type inference issues I was seeing from tablegen for select
patterns.
- I've had to make the behaviour of `splitValueIntoRegisterParts` and
`joinRegisterPartsIntoValue` cover more cases, because you cannot
bitcast to/from untyped (the bitcast would otherwise have been inserted
automatically by TargetLowering code).
- I apparently didn't need to change `getNumRegisters` again, which
continues to tell me there's a bug in the code for tied inputs. I added
some more test coverage of this case but it didn't seem to help find the
asserts I was finding before - I think the difference is between the
default behaviour for integers which doesn't apply to floats.
- There's still a difference between BuildGPRPair and BuildPairF64 (and
the same for SplitGPRPair and SplitF64). I'm not happy with this, I
think it's quite confusing, as they're very similar, just differing in
whether they give a `untyped` or a `f64`. I haven't really worked out
how the DAGCombiner copes if one meets the other, I know we have some of
this for the f64 variants already, but they're a lot more complex than
the GPRPair variants anyway.
2024-11-20 10:08:55 +00:00
Craig Topper
ac17b50f50 [RISCV] Use getSignedTargetConstant. NFC 2024-11-18 12:20:44 -08:00
Sam Elliott
4615cc38f3
[RISCV] Inline Assembly Support for GPR Pairs ('R') (#112983)
This patch adds support for getting even-odd general purpose register
pairs into and out of inline assembly using the `R` constraint as
proposed in riscv-non-isa/riscv-c-api-doc#92

There are a few different pieces to this patch, each of which need their
own explanation.

- Renames the Register Class used for f64 values on rv32i_zdinx from
  `GPRPair*` to `GPRF64Pair*`. These register classes are kept broadly
  unmodified, as their primary value type is used for type inference
  over selection patterns. This rename affects quite a lot of files.

- Adds new `GPRPair*` register classes which will be used for `R`
  constraints and for instructions that need an even-odd GPR pair. This
  new type is used for `amocas.d.*`(rv32) and `amocas.q.*`(rv64) in
  Zacas, instead of the `GPRF64Pair` class being used before.

- Marks the new `GPRPair` class legal as for holding a `MVT::Untyped`.
  Two new RISCVISD node types are added for creating and destructing a
  pair - `BuildGPRPair` and `SplitGPRPair`, and are introduced when
  bitcasting to/from the pair type and `untyped`.

- Adds functionality to `splitValueIntoRegisterParts` and
  `joinRegisterPartsIntoValue` to handle changing `i<2*xlen>` MVTs into
  `untyped` pairs.

- Adds an override for `getNumRegisters` to ensure that `i<2*xlen>`
  values, when going to/from inline assembly, only allocate one (pair)
  register (they would otherwise allocate two). This is due to a bug in
  SelectionDAGBuilder.cpp which other backends also work around.

- Ensures that Clang understands that `R` is a valid inline assembly
  constraint.

- This also allows `R` to be used for `f64` types on `rv32_zdinx`
  architectures, where doubles are stored in a GPR pair.
2024-11-18 17:45:58 +00:00
Jianjian Guan
a6f8af676a
[RISCV] Improve vmsge and vmsgeu selection (#115435)
Select vmsge(u) vs, C to vmsgt(u) vs, C-1 if C is not in the imm range
and not the minimum value.

Fix https://github.com/llvm/llvm-project/issues/114505.
2024-11-13 15:05:08 +08:00
Kazu Hirata
82d5dd28b4
[RISCV] Remove unused includes (NFC) (#115814)
Identified with misc-include-cleaner.
2024-11-11 22:54:54 -08:00
Pengcheng Wang
3850801ca5
[RISCV] Add vcpop.m/vfirst.m to RISCVMaskedPseudosTable
We seem to forget these two instructions.

Reviewers: preames, frasercrmck, lukel97, topperc

Reviewed By: lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/115162
2024-11-07 15:41:46 +08:00
Craig Topper
0167a92e28 [RISCV] Use unsigned instead of int64_t for two small positive shift amounts. NFC 2024-10-30 13:10:55 -07:00
Yingwei Zheng
944f4adcd2
[RISCV][ISel] Select binvi for pattern icmp eq/ne X, pow2 (#110957)
This patch selects `binvi` for pattern `icmp eq/ne X, pow2` when zbs is
available.
2024-10-03 15:13:02 +08:00
Craig Topper
bc91f3cdd5
[RISCV] Add 32 bit GPR sub-register for Zfinx. (#108336)
This patches adds a 32 bit register class for use with Zfinx instructions. This makes them more similar to F instructions and allows us to only spill 32 bits.
    
I've added CodeGenOnly instructions for load/store using GPRF32 as that gave better results than insert_subreg/extract_subreg.
    
Function arguments use this new GPRF32 register class for f32 arguments with Zfinx. Eliminating the need to use RISCVISD::FMV* nodes.
    
This is similar to #107446 which adds a 16 bit register class.
2024-10-01 22:09:27 -07:00
Craig Topper
8a7843ca0f
[RISCV] Add 16 bit GPR sub-register for Zhinx. (#107446)
This patches adds a 16 bit register class for use with Zhinx
instructions. This makes them more similar to Zfh instructions and
allows us to only spill 16 bits.

I've added CodeGenOnly instructions for load/store using GPRF16 as that
gave better results than insert_subreg/extract_subreg. I'm using FSGNJ
for GPRF16 copy with Zhinx as that gave better results. Zhinxmin will
use ADDI+subreg operations.

Function arguments use this new GPRF16 register class for f16 arguments
with Zhinxmin. Eliminating the need to use RISCVISD::FMV* nodes.

I plan to extend this idea to Zfinx next.
2024-09-26 22:56:12 -07:00
Piotr Fusik
cc7b24a4d1
[NFC] Fix typos in comments (#109765) 2024-09-24 11:19:56 +02:00
Craig Topper
079f31c11f
[RISCV] Move the rest of Zfa FLI instruction handling to lowerConstantFP. (#109217)
We already moved the fneg case. This moves the rest so we can drop the
custom isel.
2024-09-19 15:16:10 -07:00
Craig Topper
de6d7a6c30
[RISCV] Expand Zfa fli+fneg cases during lowering instead of during isel. (#108316)
Most of the constants fli can generate are positive numbers. We can use
fli+fneg to generate their negative versions.

Previously, we considered such negative constants as "legal" and let
isel generate the fli+fneg. However, it is useful to expose the fneg to
DAG combines to fold with fadd to produce fsub or with fma to produce
fnmadd, fnmsub, or fmsub.

This patch moves the fneg creation to lowering so that the fneg will be
visible to the last DAG combine.

I might move the rest of Zfa handling from isel to lowering as a follow
up.

Fixes #107772.
2024-09-11 22:31:45 -07:00
Craig Topper
8c17ed1512
[RISCV] Generalize RISCVDAGToDAGISel::selectFPImm to handle bitcasts from int to FP. (#108284)
selectFPImm previously handled cases where an FPImm could be
materialized in an integer register.

We can generalize this to cases where a value was in an integer register
and then copied to a scalar FP register to be used by a vector
instruction.

In the affected test, the call lowering code used up all of the FP
argument registers and started using GPRs. Now we use integer vector
instructions to consume those GPRs instead of moving them to scalar FP
first.
2024-09-11 21:13:26 -07:00
Luke Lau
2949720c2e
[RISCV] Move vmerge same mask peephole to RISCVVectorPeephole (#106108)
We currently fold a vmerge.vvm into its true operand if the true operand
is a masked pseudo with the same mask.

We can move this over to RISCVVectorPeephole by instead splitting it up
into a smaller peephole which converts it to a vmv.v.v first. The
existing foldVMV_V_V peephole will then take care of folding it if
needed.

This is very similar to the existing all-ones mask peephole and we could
potentially do it inside of it. I opted to put it in a separate peephole
to make it easier to reason about, given that the duplication is small,
but I could be persuaded either way.
2024-09-06 08:59:13 +08:00
Craig Topper
0c1500ef05 [RISCV] Fix another RV32 Zdinx load/store addressing corner case.
RISCVExpandPseudoInsts makes sure the offset is divisible by 8
so we need to enforce that during isel.
2024-09-04 23:26:40 -07:00
Brandon Wu
22f98740b6
[llvm][RISCV] Support RISCV vector tuple CodeGen and Calling Convention (#97995)
This patch handles target lowering and calling convention.

For target lowering, the vector tuple type represented as multiple
scalable vectors is now changed to a single `MVT`, each `MVT` has a
corresponding register class.

The load/store of vector tuples are handled as the same way but need
another vector insert/extract instructions to get sub-register group.

Inline assembly constraint for vector tuple type can directly be modeled
as "vr" which is identical to normal vector registers.

For calling convention, it no longer needs an alternative algorithm to
handle register allocation, this makes the code easier to maintain and
read.

Stacked on https://github.com/llvm/llvm-project/pull/97994
2024-08-31 19:28:36 +08:00
Luke Lau
dbbfc952f0
[RISCV] Separate ActiveElementsAffectResult into VL and Mask flags (#106517)
In #106110 we had to mark v[f]slide1down.vx as
ActiveElementsAffectResult since the elements in the body depend on VL.
However it doesn't depend on the mask, so this was overly conservative
and broke the vmerge peephole.

We can recover this by splitting up ActiveElementsAffectResult into VL
and Mask bits, so we can more accurately model v[f]slide1down.vx and
re-enable the peephole.
2024-08-30 07:46:06 +08:00
Craig Topper
5b41eb3a6d
[RISCV] Fix more boundary cases in immediate selection for Zdinx load/store on RV32. (#105874)
In order to support -unaligned-scalar-mem properly, we need to be more
careful with immediates of global variables. We need to guarantee that
adding 4 in RISCVExpandingPseudos won't overflow simm12. Since we don't
know what the simm12 is until link time, the only way to guarantee this
is to make sure the base address is at least 8 byte aligned.

There were also several corner cases bugs in immediate folding where we
would fold an immediate in the range [2044,2047] where adding 4 would
overflow. These are not related to unaligned-scalar-mem.
2024-08-25 14:27:08 -07:00
Craig Topper
0381e01424 Recommit "[RISCV] Add isel optimization for (and (sra y, c2), c1) to recover regression from #101751. (#104114)"
Fixed an incorrect cast.

Original message:

If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If
c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4)
followed by a SHXADD with c4 as the X amount.

Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
Alive2: https://alive2.llvm.org/ce/z/AwhheR
2024-08-23 09:37:48 -07:00
Hans Wennborg
858afe90aa Revert "[RISCV] Add isel optimization for (and (sra y, c2), c1) to recover regression from #101751. (#104114)"
This caused an assert to fire:

  llvm/include/llvm/Support/Casting.h:566:
  decltype(auto) llvm::cast(const From &) [To = llvm::ConstantSDNode, From = llvm::SDValue]:
  Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

see comment on the PR.

> If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If
> c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4)
> followed by a SHXADD with c4 as the X amount.
>
> Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
> Alive2: https://alive2.llvm.org/ce/z/AwhheR

This reverts commit 514481736cf943464125ef34570a7df0a19290de.
2024-08-23 16:42:04 +02:00
Craig Topper
514481736c
[RISCV] Add isel optimization for (and (sra y, c2), c1) to recover regression from #101751. (#104114)
If c1 is a shifted mask with c3 leading zeros and c4 trailing zeros. If
c2 is greater than c3, we can use (srli (srai y, c2 - c3), c3 + c4)
followed by a SHXADD with c4 as the X amount.

Without Zba we can use (slli (srli (srai y, c2 - c3), c3 + c4), c4).
Alive2: https://alive2.llvm.org/ce/z/AwhheR
2024-08-20 09:41:46 -07:00
Craig Topper
e6ceb29ab6 [RISCV] Use getAllOnesConstant/getSignedConstant. 2024-08-17 00:18:41 -07:00
Luke Lau
aba3476111
[RISCV] Move vmv.v.v peephole from SelectionDAG to RISCVVectorPeephole (#100367)
This is split off from #71764, and moves only the vmv.v.v part of
performCombineVMergeAndVOps to work on MachineInstrs.

In retrospect trying to handle PseudoVMV_V_V and PseudoVMERGE_VVM in the
same function makes the code quite hard to read, so this just does it in
a separate peephole.

This turns out to be simpler since for PseudoVMV_V_V we don't need to
convert the Src instruction to a masked variant, and we don't need to
create a fake all ones mask.
2024-08-17 00:49:27 +08:00
Wang Yaduo
1900810b6f
[RISCV] Simplify (srl (and X, Mask), Const) to TH_EXTU (#102802) 2024-08-16 14:23:00 +08:00
Craig Topper
294ed6a1eb [RISCV] Use if init statement to reduce scope of variable. NFC 2024-08-14 08:37:50 -07:00
Craig Topper
0a4e1c518b [RISCV] Add some Zfinx instructions to hasAllNBitUsers. 2024-08-08 17:33:54 -07:00
Craig Topper
15895daa68
[RISCV] Limit (and (sra x, c2), c1) -> (srli (srai x, c2-c3), c3) isel in some cases. (#102034)
If x is a shl by 32 and c1 is an simm12, we would prefer to use a
SRAIW+ANDI. This prevents selecting the slli to a separate slli
instruction.

Fixes regression from #101868
2024-08-06 20:27:14 -07:00
Craig Topper
cfd13cbac1 [RISCV] Improve variable scoping in custom isel for ISD::AND.
Give the (and (srl/shl X, C2), C1) handling its owns private `C1` variable
it can modify using known zeros. This may be out of sync with
N1C->getZExtValue().

Add a separate const C1 for (and (sra X, C2), C1) and (and X, C).
This copy will always be in sync with N1C->getZExtValue().

Remove the IsC1Mask and IsC1ANDI variables and compute them at their
usage.

Use N1C->getSExtValue() when calling isInt. This shouldn't be a
functional change since we already checked that it was a mask. In
order for it to be a mask and a negative number, it would need to be -1
which should have been removed by DAG combine.
2024-08-05 16:02:57 -07:00
Craig Topper
5bc99fb515
[RISCV] Select (and (sra x, c2), c1) as (srli (srai x, c2-c3), c3). (#101868)
If c1 is a mask with c3 leading zeros and c3 is larger than c2.

Fixes regression reported in #101751.
2024-08-04 22:35:38 -07:00
Luke Lau
a5b65399a7
[RISCV] Move ActiveElementsAffectResult to TSFlags. NFC (#101123)
As noted in
https://github.com/llvm/llvm-project/pull/100367/files#r1695442138,
RISCVMaskedPseudoInfo currently stores two things, whether or not a
masked pseudo has an unmasked variant, and whether or not it's
element-wise.

These are separate things, so this patch splits the latter out into the
underlying instruction's TSFlags to help make the semantics of #100367
more clear.

To the best of my knowledge the only non-element-wise instructions in V
are:

- vredsum.vs and other reductions
- vcompress.vm
- vms*f.m
- vcpop.m and vfirst.m
- viota.m

In vector crypto the instructions that operate on element groups are
conservatively marked (this might be fine to relax later given since
non-EGS multiple vls are reserved), as well as the SiFive extensions and
XTHeadVdot.
2024-08-05 12:08:03 +08:00
Craig Topper
766f68d17a [RISCV] Invert if conditions in the switch in RISCVDAGToDAGISel::hasAllNBitUsers. NFC
Make "break" consistently the "if" body and the "return false" the
last thing in each case.

This makes it easier to add different conditions for different operands
of some instructions and makes everything more consistent.
2024-08-03 17:23:55 -07:00
Craig Topper
1a9acd786d [RISCV] Capitalize some variable names. NFC 2024-08-03 13:37:53 -07:00
Luke Lau
d01c0514ab
[RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW (#101152)
As noted in
https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we
currently fold in vmerge.vvms and vmv.v.vs into their ops even if the
EEW is different which leads to an incorrect transform.

This checks the op's EEW via its simple value type for now since there
doesn't seem to be any existing information about the EEW size of
instructions. We'll probably need to encode this at some point if we
want to be able to access it at the MachineInstr level in #100367
2024-07-31 00:28:52 +08:00
Luke Lau
b1542afd0b
[RISCV] Rename merge operand -> passthru. NFC (#100330)
We sometimes call the first tied dest operand in vector pseudos the
merge operand, and other times the passthru.

Passthru seems to be more common, and it's what the C intrinsics call
it[^1], so this renames all usages of merge to passthru to be
consistent. It also helps prevent confusion with vmerge.vvm in some of
the peephole optimisations.

[^1]:
https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#the-passthrough-vd-argument-in-the-intrinsics
2024-07-30 17:47:00 +08:00
Craig Topper
ad80265874
[RISCV] Qualify all XCV predicates with !is64Bit. (#101074)
The tablegen patterns all have isRV32. I did not check if any of them
could naively support RV64.

Fixes #101067 and probably other bugs like it we haven't found yet.
2024-07-29 21:52:57 -07:00
Craig Topper
7647f88234
[RISCV] Add isel special case for (and (srl X, c2), c1) -> (slli_uw (srli x, c2+c3), c3). (#100966)
Where c1 is a shifted mask with 32 set bits and c3 trailing zeros.

Fixes #100936.
2024-07-29 09:08:39 -07:00
Luke Lau
9e9924cc2e
[RISCV] Don't fold vmerge.vvm or vmv.v.v into vredsum.vs if AVL changed (#99006)
When folding, we currently check if the pseudo's result is not lanewise
(e.g. vredsum.vs or viota.m) and bail if we're changing the mask.
However we also need to check for the AVL too.

This patch bails if the AVL changed for these pseudos, and also renames
the pseudo table property to be more explicit.
2024-07-17 12:50:17 +08:00
Luke Lau
3f83a69bcb
[RISCV] Allow folding vmerge into masked ops when mask is the same (#97989)
We currently only fold a vmerge into a masked true operand if the vmerge
has an all-ones mask, since we end up keeping the mask from the true
operand.

But if the masks are the same then we can still fold, because vmerge and
true have the same passthru. If an element was masked off in the
original vmerge, it will also be masked off in the resulting true, and
will have the same passthru value.

The motivation for this is to lower masked VP loads and stores with
passthrus to masked RVV instructions. Normally you can express a masked
RVV instruction with a mask undisturbed passthru via a combination of a
VP op with an all-ones mask and a vp.merge. But for loads and stores you
need the same mask on the VP op as well as the vp.merge.
2024-07-09 12:12:02 +08:00
Luke Lau
aa9e4f0bc9 [RISCV] Refactor mask check in performCombineVMergeAndVOps. NFC 2024-07-08 14:19:25 +08:00
Luke Lau
a348824798
[RISCV] Allow folding vmerge with implicit passthru when true has tied dest (#78565)
We currently don't fold a vmerge if it has an implicit-def passthru and
its true operand also has a passthru (i.e. tied dest).

This restriction was added in https://reviews.llvm.org/D151596, back
whenever we had separate TU/TA pseudos. It looks like it was added
because the policy might not have been handled correctly.

However the policy should be set correctly if we relax this restriction
today, since we compute the policy differently now that we have removed
the TU/TA distinction in our pseudos.

We use a TUMU policy, and relax it to TAMU iff the vmerge's passthru is
implicit-def.

The reasoning behind this being that the tail elements always come from
the vmerge's passthru[^1], so if vmerge's passthru is implicit-def then
the tail is also implicit-def. So a tail agnostic policy is OK.

[^1]: unless the VL was shrunk, but in this case which case we
conservatively use TUMU.
2024-07-06 12:43:11 +08:00
Liao Chunyu
2afea72968
[RISCV] Codegen support for XCVmem extension (#76916)
All post-Increment load/store, register-register load/store

spec:

https://github.com/openhwgroup/cv32e40p/blob/master/docs/source/instruction_set_extensions.rst

Contributors: @CharKeaney, @jeremybennett, @lewis-revill,
@NandniJamnadas, @PaoloS02, @serkm, @simonpcook, @xingmingjie, @realqhc
2024-06-07 21:45:49 +08:00
paperchalice
7652a59407
Reland "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94149)
- Fix build with `EXPENSIVE_CHECKS`
- Remove unused `PassName::ID` to resolve warning
- Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly
2024-06-04 08:10:58 +08:00
paperchalice
8917afaf0e
Revert "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94146)
This reverts commit de37c06f01772e02465ccc9f538894c76d89a7a1 to
de37c06f01772e02465ccc9f538894c76d89a7a1

It still breaks EXPENSIVE_CHECKS build. Sorry.
2024-06-02 14:31:52 +08:00
paperchalice
d2cdc8ab45
[NewPM][CodeGen] Port selection dag isel to new pass manager (#83567)
Port selection dag isel to new pass manager.
Only `AMDGPU` and `X86` support new pass version. `-verify-machineinstrs` in new pass manager belongs to verify instrumentation, it is enabled by default.
2024-06-02 09:12:33 +08:00
Alex Bradbury
90109d4448
[RISCV] Improve constant materialisation for stores of i8 negative constants (#92131)
This follows the same pattern as 20e62658735a1b03ecadc. Although we
can't reduce the number of instructions used, if we are able to use a
sign-extended 6-bit immediate then the 16-bit c.li instruction can be
selected (thus saving code size). Although this _could_ be gated so it
only happens if C is enabled, I've opted not to because at worst it's
neutral and it doesn't seem helpful to add unnecessary divergence
between the RVC and non-RVC paths.
2024-05-14 19:08:04 +01:00
Craig Topper
dfff57e751
[RISCV] Add isel special case for (and (shl X, c2), c1) -> (slli_uw (srli x, c3-c2), c3). (#91638)
Where c1 is a shifted mask with 32 set bits and c3 trailing zeros.
2024-05-09 14:39:06 -07:00
Luke Lau
bbd6a2d85c
[RISCV] Convert implicit_def tuples to noreg in post-isel peephole (#91173)
If a segmented load has an undefined passthru then it will be selected
as a reg_sequence with implicit_def operands, which currently slips
through the implicit_def -> noreg peephole.

This patch fixes this so we're able to infer if the passthru is
undefined without the need for looking through vreg definitions with
MachineRegisterInfo, which will help with moving RISCVInsertVSETVLI to
LiveIntervals in #70549
2024-05-08 15:38:13 +08:00
Luke Lau
f565b79f9f
[RISCV] Handle fixed length vectors with exact VLEN in lowerINSERT_SUBVECTOR (#84107)
This is the insert_subvector equivalent to #79949, where we can avoid
sliding up by the full LMUL amount if we know the exact subregister the
subvector will be inserted into.

This mirrors the lowerEXTRACT_SUBVECTOR changes in that we handle this
in two parts:

- We handle fixed length subvector types by converting the subvector to
a scalable vector. But unlike EXTRACT_SUBVECTOR, we may also need to
convert the vector being inserted into too.

- Whenever we don't need a vslideup because either the subvector fits
exactly into a vector register group *or* the vector is undef, we need
to emit an insert_subreg ourselves because RISCVISelDAGToDAG::Select
doesn't correctly handle fixed length subvectors yet: see d7a28f7ad

A subvector exactly fits into a vector register group if its size is a
known multiple of the size of a vector register, and this adds a new
overload for TypeSize::isKnownMultipleOf for scalable to scalable
comparisons to help reason about this.

I've left RISCVISelDAGToDAG::Select untouched for now (minus relaxing an
invariant), so that the insert_subvector and extract_subvector code
paths are the same.

We should teach it to properly handle fixed length subvectors in a
follow-up patch, so that the "exact subregsiter" logic is handled in one
place instead of being spread across both RISCVISelDAGToDAG.cpp and
RISCVISelLowering.cpp.
2024-05-01 01:35:13 +08:00
Pengcheng Wang
940ef9687f
[RISCV] Remove hasSideEffects=1 for saturating/fault-only-first instructions
Marking them as `hasSideEffects=1` stops some optimizations.

According to `Target.td`:

> // Does the instruction have side effects that are not captured by any
> // operands of the instruction or other flags?
> bit hasSideEffects = ?;

It seems we don't need to set `hasSideEffects` for vleNff since we have
modelled `vl` as an output operand.

As for saturating instructions, I think that explicit Def/Use list
is kind of side effects captured by any operands of the instruction,
so we don't need to set `hasSideEffects` either. And I have just
investigated AArch64's implementation, they don't set this flag and
don't add `Def` list.

These changes make optimizations like `performCombineVMergeAndVOps`
and MachineCSE possible for these instructions.

As a consequence, `copyprop.mir` can't test what we want to test in
https://reviews.llvm.org/D155140, so we replace `vssra.vi` with a
VCIX instruction (it has side effects).

Reviewers: jacquesguan, topperc, preames, asb, lukel97

Reviewed By: topperc, lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/90049
2024-04-30 14:14:16 +08:00