2354 Commits

Author SHA1 Message Date
Yeting Kuo
9637e950cb [RISCV] Support ISD::STRICT_FADD/FSUB/FMUL/FDIV for vector types.
The patch handles fixed type strict-fp by new RISCVISD::STRICT_ prefixed
isd nodes.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145900
2023-03-15 07:47:16 +08:00
Craig Topper
c09730c23e [RISCV] Pre-commit tests for D145897. NFC 2023-03-14 13:11:11 -07:00
Alex Bradbury
084e413893 [RISCV] Fix regression due to interaction of MachineOutliner and MachineCopyPropagation
D144535 enabled machine copy propagation for RISC-V and added it to the
pass pipeline in addPreEmitPass2 (after the MachineOutliner).
Unfortunately, the MachineCopyPropagation pass is unable to correctly
analyse outlined functions, and will delete copy instructions where a
register is set that is intended to be live-out.
RISCVInstrInfo::buildOutlinedFrame will directly insert a JALR, while a
similar function going through the normal codegen path would have a
PseudoRet with operands indicating registers that are live-out.

This patch does the simplest fix, which is to run MachineCopyPropagation
before the MachineOutliner.

Differential Revision: https://reviews.llvm.org/D146037
2023-03-14 17:55:11 +00:00
LiaoChunyu
eb54254b6e [RISCV] Return false from shouldFormOverflowOp when type is i8 and i16
i8 and i16 are not using overflow.
Reduce the number of zero extension instructions.

To reduce the uncertainty of the unknown,
most of the checks of the virtual function are kept

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143646
2023-03-14 20:42:55 +08:00
Alex Bradbury
0246c61484 [RISCV][test] Test case for regression when MachineOutliner and MachineCopyPropagation are both enabled
MachineCopyPropagation removes a register copy in the outlined function
as it doesn't see that it's live-out from the function.
2023-03-14 11:18:21 +00:00
Craig Topper
bf4b9857d0 Recommit "[RISCV] Add separate lookup tables for fli.h and fli.d."
Fix mistake in f16 table in previous patch.

Original commit message:
Use separate lookup tables instead of trying to reuse the fli.s
table.

We were missing the 2 denormal cases for fli.h. We also had an issue
where fli.d was only checking 8 bits of the 11 bit exponent.
2023-03-12 22:17:56 -07:00
Craig Topper
ffc18c3339 Revert "[RISCV] Add separate lookup tables for fli.h and fli.d."
This reverts commit ebc11b68412cdcf2a0e6e2c50df262cfd9b8f481.

I made a mistake in the f16 table. Will fix and recommit.
2023-03-12 22:06:48 -07:00
Craig Topper
ebc11b6841 [RISCV] Add separate lookup tables for fli.h and fli.d.
Use separate lookup tables instead of trying to reuse the fli.s
table.

We were missing the 2 denormal cases for fli.h. We also had an issue
where fli.d was only checking 8 bits of the 11 bit exponent.
2023-03-12 11:28:49 -07:00
Craig Topper
858451c50b [RISCV] Add test cases for fli.h and fli.d CodeGen bugs. NFC
We fail to use fli.h for the 2 denormal values.
We use fli.d for some values where the value is larger than a float
can represent due to truncating the exponent to 8 bits without checking
if it fits in 8 bits.
2023-03-12 11:28:49 -07:00
Craig Topper
30705e9770 [RISCV] Support Zfa fli instructions with vector splats.
-Return false from RISCVDAGToDAGISel::selectFPImm for fli
 constants so we don't try to use integer expansion.
-Support fli.h with Zvfh+Zfhmin.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D145766
2023-03-10 09:16:21 -08:00
Craig Topper
c2bcb21cc6 [RISCV] Print Zfa fli instruction FP values in a more adaptive way.
Previously, we printed all constants in scientific notation with
6 digits of precision. This is not enough to accurately display
the smallest value, but increasing the precision would be too much
for other values.

This patch prints values with fractional bits using only as many digits as
needed. 1*2^-15 and 1*2^-16 will be printed in scientific notation while
the others are printed without scientific notation. The integer values
are printed with a single 0 after the decimal point.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D145645
2023-03-10 07:57:44 -08:00
Jim Lin
7e9293572d [RISCV] Set how many bytes load from or store to stack slot
Refer from: https://reviews.llvm.org/D44782

After https://reviews.llvm.org/D130302, LW+SEXT.B can be folded into LB
as partially reload stack slot. This gains incorrect optimization result
from `StackSlotColoring` without given the number of bytes exactly load
from stack. LB+SW are mis-interpreted as fully reload/restore from stack
slot without the sign-extension. SW would be considered as a redundant store.

The testcase is copied from llvm/test/CodeGen/X86/pr30821.mir.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145471
2023-03-10 10:14:59 +08:00
Jim Lin
865fb2e440 [RISCV] Pre-commit test case for D140460
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145549
2023-03-10 10:14:59 +08:00
Yeting Kuo
b2c48559c8 [IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND.
The patch mainly does two things. The first is allowing scalable vector
ISD::STRICT_FP_EXTEND. The second is making RISC-V customized lower
strict_fpextend to riscv_strict_fpextend_vl, the strict version of
riscv_fpextend_vl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145548
2023-03-09 17:37:59 +08:00
Piyou Chen
365f840398 [RISCV] Enable subregister liveness by default
This commit enable the subregister liveness by default in RISC-V.

It was previously disabled in https://reviews.llvm.org/D129646 after a previous attempt to enabled it https://reviews.llvm.org/D128016.

We believe that https://reviews.llvm.org/D129735 fixes the issue that caused it to be disabled.

Reviewed By: craig.topper, kito-cheng

Differential Revision: https://reviews.llvm.org/D145546
2023-03-08 23:03:35 -08:00
Craig Topper
17e0926d6a [RISCV] Don't try to use fli.h with Zfa+Zfhmin.
fli.h requires Zfh or Zvfh. We need to check for this in
isFPImmLegal. Zvfh support will come in another patch.

I had to split the test file because there are other issues with
Zfhmin and some intrinsics.
2023-03-08 22:54:25 -08:00
LiaoChunyu
42a5dda553 [RISCV] Add more testcases for overflow-intrinsics.ll 2023-03-09 09:13:30 +08:00
Craig Topper
73516b355c [RISCV] Don't parse the decimal minimum value for fli.s/fli.d/fli.h.
There are a couple bugs in the current support for this:
-We do all the parsing in single precision so any value less than or
 equal to the minimum fp32 is accepted as the minimum value for f64.
-To support fp16 minimum value, getLoadFP32Imm has a special case, but
 that causes a miscompile in CodeGen.

Differential Revision: https://reviews.llvm.org/D145542
2023-03-08 09:24:58 -08:00
Luke Lau
d610c6c9c7 [RISCV] Add vsseg intrinsic for fixed length vectors
These intrinsics are equivalent to the regular @llvm.riscv.vssegNF
intrinsics, only they accept fixed length vectors in their overloaded
types: The regular intrinsics only operate on scalable vectors.
These intrinsics convert the fixed length vectors to scalable ones, and
then lower it on to the regular scalable intrinsic.

This mirrors the intrinsics added in 0803dba7dd998ad073d75a32b65296734c10ae70
This will be used in a later patch with the Interleaved Access pass.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145022
2023-03-08 17:19:03 +00:00
Craig Topper
f2c1b1a7f5 [RISCV] Add test case for Zfa fli.s miscompile. NFC
The f32 matching code for fli was hacked to allow the f16 minimum value
to match for the fli.h instruction in the assembler. This was done
because the assembler parses the floating point literal for fli.h,
fli.s, and fli.d as a single precision value.

Unfortunately, this function is also used by CodeGen and causes
this value to be miscompiled for f32.
2023-03-07 19:53:07 -08:00
Craig Topper
8fa1e5e673 [RISCV] Teach performCombineVMergeAndVOps to combine unmasked TU vpmerge with a masked MU TA op.
We can form a MU TU operation and remove the merge if they use the
same merge value.

My primary interest was a case involving VP intrinsics from our downstream,
but it requires another optimization that isn't in upstream yet. So I've used
RVV intrinsics to get the desired instructions.

Co-authored-by: Nitin John Raj <nitin.raj@sifive.com>

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D145272
2023-03-07 08:59:48 -08:00
wangpc
5fdab3c81b [RISCV] Enable machine copy propagation for copy-like instructions
Like what has been done in AArch64 (D125335).

We enable this under `-O2` to show the codegen diffs here but we
may only do this under `-O3` like AArch64.

There are two cases that we may produce these eliminable copies:
1. ISel of `FrameIndex`. Like `rvv/fixed-vectors-calling-conv.ll`.
2. Tail duplication. Like `select-optimize-multiple.ll`.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144535
2023-03-07 17:54:05 +08:00
Jun Sha (Joshua)
ada2641460 [RISCV][CodeGen] Add codegen pattern for FLI instruction in experimental zfa extension
This patch implements experimental support for the RISCV Zfa extension as specified here: https://github.com/riscv/riscv-isa-manual/releases/download/draft-20221119-5234c63/riscv-spec.pdf, Ch. 25. This extension has not been ratified. Once ratified, it'll move out of experimental status.

This change adds codegen support for load-immediate instructions (fli.s/fli.d/fli.h).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D141560
2023-03-07 14:27:48 +08:00
Jessica Clarke
3a748cd01b [RISCV][NFC] Add PIC RUN/CHECK lines for jumptable.ll test
The omission of checking our PIC jump table code generation is a bit of
an oversight, so belatedly add this coverage.
2023-03-03 20:19:15 +00:00
Jessica Clarke
6e8bcaafba [RISCV][NFC] Unify CHECK lines for jumptable.ll @below_threshold test
These are all identical across XLEN and code model since they're not
using a jump table, so unify them rather than having multiple copies.
2023-03-03 20:18:18 +00:00
Jessica Clarke
e0fe8e6412 [RISCV][NFC] Add signext to jumptable.ll tests to avoid irrelevant sext.w
Sign-extending the input is irrelevant to the test in question, which
merely needs some arbitrary input to switch on. This also removes the
only difference between the RV32I and RV64I cases for @below_threshold
and will allow unifying them.
2023-03-03 20:14:31 +00:00
Leonard Chan
fa6aadd6cb [llvm] Prevent building for riscv32-unknown-fuchsia
Fuchsia is exclusively 64-bit so this throw an error when using this
triple.

Differential Revision: https://reviews.llvm.org/D144998
2023-03-01 19:42:56 +00:00
Nikita Popov
ddccc5ba44 [CodeGen] Always expand division larger than i128
Default MaxDivRemBitWidthSupported to 128, so that divisions larger
than 128 bits are always expanded, without requiring additional
configuration from the target.

Note that this may still emit calls to __udivti3 on 32-bit targets,
which likely don't have an implementation of that builtin. However,
I believe this is sufficient to fix
https://github.com/llvm/llvm-project/issues/60531, because Zig must
already be defining those builtins.

Differential Revision: https://reviews.llvm.org/D144871
2023-03-01 15:33:45 +01:00
LiaoChunyu
fbace95408 [RISCV] Enable preferZeroCompareBranch to optimize branch on zero in codegenprepare
Similar to ARM and SystemZ.

Related Patchs: D101778(preferZeroCompareBranch)
https://reviews.llvm.org/rG9a9421a461166482465e786a46f8cced63cd2e9f   ( == 0 to u< 1)

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D142071
2023-02-28 14:43:40 +08:00
Philipp Tomsich
f68f04d07c [RISCV] Add vendor-defined XTheadCondMov (conditional move) extension
The vendor-defined XTheadCondMov (somewhat related to the upcoming
Zicond and XVentanaCondOps) extension add conditional move
instructions with $rd being an input and an ouput instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=73442230966a22b3238b2074691a71d7b4ed914a

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144681
2023-02-24 21:40:42 +01:00
Manolis Tsamis
7b79e8d455 [RISCV] Add vendor-defined XTheadFMemIdx (FP Indexed Memory Operations) extension
The vendor-defined XTHeadFMemIdx (no comparable standard extension exists
at the time of writing) extension adds indexed load/store instructions
for floating-point registers.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=f511f80fa3fcaf6bcbe727fb902b8bd5ec8f9c20

Depends on D144249

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144647
2023-02-24 00:35:37 +01:00
Manolis Tsamis
f6262201d8 [RISCV] Add vendor-defined XTheadMemIdx (Indexed Memory Operations) extension
The vendor-defined XTHeadMemIdx (no comparable standard extension exists
at the time of writing) extension adds indexed load/store instructions
as well as load/store and update register instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=27cfd142d0a7e378d19aa9a1278e2137f849b71b

Depends on D144002

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144249
2023-02-24 00:17:58 +01:00
Samuel Parker
f48d3b6f46 Revert "[DAGCombine] Fold redundant select"
This reverts commit c7f9344d0f8f6a00adab138037e2e7b406ef2b69.
2023-02-23 17:59:41 +00:00
Craig Topper
230e61658b [LegalizeTypes] Add a special case for (add X, 1) to ExpandIntRes_ADDSUB.
On targets without ADDCARRY or ADDE, we need to emit a separate
SETCC to determine carry from the low half to the high half. Usually
we do (setult Lo, LHSLo). If RHSLo is 1 we can instead do (seteq Lo, 0).
This can reduce the live range of LHSLo.
2023-02-23 09:47:42 -08:00
Craig Topper
2fc5a5117c [LegalizeTypes][RISCV] Add a special case to ExpandIntRes_UADDSUBO for (uaddo X, 1).
On targets that lack ADDCARRY support we split a wide uaddo into
an ADD and a SETCC that both need to be split.

For (uaddo X, 1) we can observe that when the add overflows the result
will be 0. We can emit (seteq (or Lo, Hi), 0) to detect this.

This improves D142071.

There is an alternative here. We could use either ~(lo(X) & hi(X)) == 0 or
(lo(X) & hi(X)) == -1 before the addition. That would be closer to the
code before D142071.

Reviewed By: liaolucy

Differential Revision: https://reviews.llvm.org/D144614
2023-02-23 09:16:54 -08:00
Luke Lau
8d15e7275f [RISCV] Lower interleave and deinterleave intrinsics
Lower the two intrinsics introduced in D141924.

These intrinsics can be combined with loads and stores into the much more efficient segmented load and store instructions in a following patch.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D144092
2023-02-23 16:23:02 +00:00
Yeting Kuo
419948fe67 [VP] Reorder is_int_min_poison/is_zero_poison operand before mask for vp.abs/ctlz/cttz.
The patch ensures last two operands of vp.abs/ctlz/cttz are mask and evl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144536
2023-02-23 13:58:21 +08:00
Leonard Chan
cdb9a0c086 [MC][CodeGen] Define R_RISCV_PLT32 and lower dso_local_equivalent to it
This introduces R_RISCV_PLT32, PC-relative data relocation that takes
the 32-bit relative offset to a function or its PLT entry from its
relocation location.

This is needed to support relative vtables on RISCV.

Github PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/363

The lld handling of this reloc is D143115.

Differential Revision: https://reviews.llvm.org/D143226
2023-02-23 01:26:27 +00:00
Manolis Tsamis
a6446668a3 [RISCV] XTHeadMemPair: Fix invalid mempair combine for types other than i32/i64
A mistake in the control flow of performMemPairCombine resulted in paired
loads/stores for types that were not supported by the instructions (i8/i16).
These loads/stores could not match the constraints of the patterns defined
in the THead td file and the compiler would throw a 'Cannot select' error.

This is now fixed and two new test functions have been added in xtheadmempair.ll
which would previously crash the compiler. The compiler was additionally tested
with a wide range of benchmarks and no issues were observed.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144559
2023-02-22 19:57:37 +01:00
Piyou Chen
3b8c0b342e [RISCV] Add new pass to transform undef to pseudo for vector values.
RISC-V vector instruction has register overlapping constraint for certain
instructions, and will cause illegal instruction trap if violated, we use
early clobber to model this constraint, but it can't prevent register allocator
allocated same or overlapped if the input register is undef value, so convert
IMPLICIT_DEF to temporary pseudo could prevent that happen, it's not best way
to resolve this. Ideally we should model the constraint right, but before we
model the constraint right, it's the approach to prevent that happen.

See also: https://github.com/llvm/llvm-project/issues/50157

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D129735
2023-02-22 04:03:22 -08:00
Manolis Tsamis
16a6cf6a99 [RISCV] Add vendor-defined XTheadSync (Multi-core synchronization) extension
The vendor-defined XTheadSync (no comparable standard extension exists
at the time of writing) extension adds multi-core synchronization
instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=547c18d9bb95571261dbd17f4767194037eb82bd

Depends on D144496

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144501
2023-02-22 11:15:40 +01:00
Manolis Tsamis
f5b484c56f [RISCV] Add vendor-defined XTheadCmo (Cache Management Operations) extension
The vendor-defined XTHeadCmo (there are some similarities with the
Zicbom standard extension) extension adds cache management instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=a9ba8bc2d396fb8ae2b892f3bc6be8cdfe4b555c

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144496
2023-02-22 10:57:48 +01:00
Manolis Tsamis
bbb58a2302 [RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension
The vendor-defined XTHeadMemPair (no comparable standard extension exists
at the time of writing) extension adds two-GPR load/store pair instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for this
extension is available at:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=6e17ae625570ff8f3c12c8765b8d45d4db8694bd

Depends on D143847

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144002
2023-02-21 12:21:49 +01:00
Luke Lau
b486246135 [RISCV] Use a smaller VL when interleaving fixed vectors
Interleaves generated with vwaddu.vv and vwmaccu.vx were using VLs that
were twice the number of elements actually needed in the vector.
This also pulls the interleaving logic out into its own function so it
can be reused by later patches, and adapts it for scalable vectors.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144386
2023-02-21 09:46:23 +00:00
Alex Bradbury
d41a73aa94 [RISCV][MC] Mark Zawrs extension as non-experimental
Support for the unratified 1.0-rc3 specification was introduced in
D133443. The specification has since been ratified (in November 2022
according to the recently ratified extensions list
<https://wiki.riscv.org/display/HOME/Recently+Ratified+Extensions>.

A review of the diff
<https://github.com/riscv/riscv-zawrs/compare/V1.0-rc3...main> of the
1.0-rc3 spec vs the current/ratified document shows no changes to the
instruction encoding or naming. At one point, a note was added
<e84f42406a>
indicating Zawrs depends on the Zalrsc extension (not officially
specified, but I believe to be just the LR/SC instructions from the A
extension). The final text ended up as "The instructions in the Zawrs
extension are only useful in conjunction with the LR instructions, which
are provided by the A extension, and which we also expect to be provided
by a narrower Zalrsc extension in the future." I think it's consistent
with this phrasing to not require the A extension for Zawrs, which
matches what was implemented.

No intrinsics are implemented for Zawrs currently, meaning we don't need
to additionally review whether those intrinsics can be considered
finalised and ready for exposure to end users.

Differential Revision: https://reviews.llvm.org/D143507
2023-02-19 20:43:03 +00:00
Craig Topper
3d0a5bf7de [RISCV] Add Zfa test cases for strict ONE and UEQ comparisons. NFC
These correspond to islessgreater and it inverse.
2023-02-18 17:28:10 -08:00
Philip Reames
495b653480 [RISCV] Add missing plumbing and tests for zfa
Experimental support for the zfa extension was recently added in https://reviews.llvm.org/D141984. A couple of the normal test changes and clang plumbing got missed in that change. This commit updates the usual suspects.

Differential Revision: https://reviews.llvm.org/D144288
2023-02-17 17:56:30 -08:00
Philipp Tomsich
10b7cd660c [RISCV] Select signed and unsigned bitfield extracts for XTHeadBb
The XTHeadBb extension hab both signed and unsigned bitfield
extraction instructions (TH.EXT and TH.EXTU, respectively) which have
previously only been supported for sign extension on byte, halfword,
and word-boundaries.

This adds the infrastructure to use TH.EXT and TH.EXTU for arbitrary
bitfield extraction.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144229
2023-02-17 21:46:26 +01:00
Philipp Tomsich
16a66af0a0 Revert "[RISCV] Add vendor-defined XTheadMemPair (two-GPR Memory Operations) extension"
This reverts commit d2918544a7fc4b5443879fe12f32a712e6dfe325.
2023-02-17 19:45:55 +01:00
Manolis Tsamis
6774ba8411 [RISCV] xtheadmac: fix commutativity issue for the in/out register
The instructions in the XTHeadMac extension (multiply accumulate
instructions) were marked as commutative but because the destination
register was also an input (accumulate) register and was connected to
the destination register with a register allocator constraint, all
three operands (instead of two) were incorrectly considered
commutative. To fix that an appropriate fixCommutedOpIndices call was
added for these instructions in findCommutedOpIndices

New test functions have been added to test the correct behaviour in
xtheadmac.ll.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D144278
2023-02-17 19:45:22 +01:00