51 Commits

Author SHA1 Message Date
Craig Topper
ce37a7131f
[RISCV] Add integer RISCVISD::SELECT_CC to canCreateUndefOrPoison and isGuaranteedNotToBeUndefOrPoison. (#84693)
Integer RISCVISD::SELECT_CC doesn't create poison. If none of the,
operands are poison, the result is not poison.

This allows ISD::FREEZE to be hoisted above RISCVISD::SELECT_CC.
2024-03-25 11:10:58 -07:00
Craig Topper
6b270358c7
[SelectionDAG] Allow FREEZE to be hoisted before FP SETCC. (#84358)
No nans/infs in SelectionDAG is complicated. Hopefully I've captured
all of the cases. I've only applied to ConsiderFlags to the SDNodeFlags
since those are the only ones that will be droped by hoisting. The
condition code and TargetOptions would still be in effect.
    
Recovers some regression from #84232.
2024-03-08 17:21:21 -08:00
Craig Topper
909ab0e0d1
[RISCV] Insert a freeze before converting select to AND/OR. (#84232)
Select blocks poison, but AND/OR do not. We need to insert a freeze
to block poison propagation.

This creates suboptimal codegen which I will try to fix with other
patches. I'm prioritizing the correctness fix since we have 2 bug reports.

Fixes #84200 and #84350
2024-03-07 15:03:51 -08:00
Fangrui Song
eabaee0c59
[RISCV] Omit "@plt" in assembly output "call foo@plt" (#72467)
R_RISCV_CALL/R_RISCV_CALL_PLT distinction is not necessary and
R_RISCV_CALL has been deprecated. Since https://reviews.llvm.org/D132530
`call foo` assembles to R_RISCV_CALL_PLT. The `@plt` suffix is not
useful and can be removed now (matching AArch64 and PowerPC).

GNU assembler assembles `call foo` to RISCV_CALL_PLT since 2022-09
(70f35d72ef04cd23771875c1661c9975044a749c).

Without this patch, unconditionally changing MO_CALL to MO_PLT could
create `jump .L1@plt, a0`, which is invalid in LLVM integrated assembler
and GNU assembler.
2024-01-07 12:09:44 -08:00
Jay Foad
7b3bbd83c0 Revert "[CodeGen] Really renumber slot indexes before register allocation (#67038)"
This reverts commit 2501ae58e3bb9a70d279a56d7b3a0ed70a8a852c.

Reverted due to various buildbot failures.
2023-10-09 12:31:32 +01:00
Jay Foad
2501ae58e3
[CodeGen] Really renumber slot indexes before register allocation (#67038)
PR #66334 tried to renumber slot indexes before register allocation, but
the numbering was still affected by list entries for instructions which
had been erased. Fix this to make the register allocator's live range
length heuristics even less dependent on the history of how instructions
have been added to and removed from SlotIndexes's maps.
2023-10-09 11:44:41 +01:00
Philip Reames
8624075105
[RISCV] Strip W suffix from ADDIW (#68425)
The motivation of this change is simply to reduce test duplication. As
can be seen in the (massive) test delta, we have many tests whose output
differ only due to the use of addi on rv32 vs addiw on rv64 when the
high bits are don't care.

As an aside, we don't need to worry about the non-zero immediate
restriction on the compressed variants because we're not directly
forming the compressed variants. If we happen to get a zero immediate
for the ADDI, then either a later optimization will strip the useless
instruction or the encoder is responsible for not compressing the
instruction.
2023-10-06 10:28:01 -07:00
Shao-Ce SUN
fe558efe71 [RISCV][CodeGen] Support Zfinx codegen
This patch was split from D122918 . Co-Author: @liaolucy @realqhc

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D148874
2023-05-03 00:13:38 +08:00
Craig Topper
df017ba9d3 [TargetLowering] Don't use ISD::SELECT_CC in expandFP_TO_INT_SAT.
This function gets called for vectors and ISD::SELECT_CC was never
intended to support vectors. Some updates were made to support
it when this function started getting used for vectors.

Overall, using separate ISD::SETCC and ISD::SELECT looks like an
improvement even for scalar.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D149481
2023-04-29 10:23:08 -07:00
Craig Topper
7b0c41841e [RISCV] Move compressible registers to the beginning of the FP allocation order.
We don't have very many compressible FP instructions, just load and store.
These instruction require the FP register to be f8-f15.

This patch changes the FP allocation order to prioritize f10-f15 first.
These are also the FP argument registers. So I allocated them in reverse
order starting at f15 to avoid taking the first argument registers.
This appears to match gcc allocation order.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D146488
2023-03-27 17:29:28 -07:00
LiaoChunyu
fc9730376c [RISCV]Optimize (riscvisd::select_cc x, 0, ne, x, 1)
This patch reduces the number of unpredictable branches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D146117
2023-03-16 10:56:26 +08:00
Han-Kuan Chen
d02b9869b2 [RISCV] Don't use constantpool for floating-point value if the value can be easily constructed by integer sequence and a floating-point move.
In addition, this commit does the following combine

vfmv.v.f + fmv.[dhw].x -> vmv.v.x
vfmv.s.f + fmv.[dhw].x -> vmv.s.x
vfmerge.vfm + fmv.[dhw].x -> vmerge.vxm

Differential Revision: https://reviews.llvm.org/D142953
2023-02-03 22:42:08 -08:00
Nikita Popov
1456b68686 [RISCV] Convert some tests to opaque pointers (NFC) 2022-12-19 13:01:08 +01:00
Nitin John Raj
d741a31a39 [RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes
We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We should limit the recursion to SelectionDAG::MaxRecursionDepth levels.

We need to add a Depth argument, all existing callers should pass 0 to the Depth. The new recursive calls should increment it by 1. At the top of the function we should give up and return false if Depth >= SelectionDAG::MaxRecursionDepth.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139462
2022-12-14 15:15:30 -08:00
Craig Topper
e00e20a055 [RISCV] Add ADDW/AND/OR/XOR/SUB/SUBW to getRegAllocHints.
These instructions requires both register operands to be compressible
so I've only applied the hint if we already have a GPRC physical register
assigned for the other register operand.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139079
2022-12-01 11:09:38 -08:00
Craig Topper
a2b5b584a5 [RISCV] Use register allocation hints to improve use of compressed instructions.
Compressed instructions usually require one of the source registers
to also be the source register. The register allocator doesn't have
that bias on its own.

This patch adds register allocation hints to introduce this bias.
I've started with ADDI, ADDIW, and SLLI. These all have a 5-bit
field for the register. If the source and dest register are the
same they are guaranteed to compress as long as the immediate is
also 6 bits.

This code was inspired by similar code from the SystemZ target.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D138242
2022-11-25 08:39:44 -08:00
LiaoChunyu
7b970290c0 [RISCV] Optimize SELECT_CC when the true value of select is Constant
(select (setcc lhs, rhs, CC), constant, falsev) -> (select (setcc lhs, rhs, InverseCC), falsev, constant)

This patch removes unnecessary copies

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D129757
2022-10-18 09:24:17 +08:00
Craig Topper
e68b0d5875 [RISCV] Match (select C, -1, X)->(or -C, X) during lowerSelect
Same with (select C, X, -1), (select C, 0, X), and (select C, X, 0).

There's a DAGCombine after we turn the select into select_cc, but
that may introduce a setcc that didn't previously exist. We could
add more DAGCombines to remove the extra setcc, but this seemed lower
effort.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D135833
2022-10-13 09:06:12 -07:00
Philip Reames
1c41d0cb62 [RISCV] Use branchless form for selects with 0 in either arm
Continuing the theme of adding branchless lowerings for simple selects, this time handle the 0 arm case. This is very common for various umin idioms, etc..

Differential Revision: https://reviews.llvm.org/D135600
2022-10-12 13:51:52 -07:00
Philip Reames
79f0413e5e [RISCV] Use branchless form for selects with -1 in either arm
We can lower these as an or with the negative of the condition value. This appears to result in significantly less branch-y code on multiple common idioms (as seen in tests).

Differential Revision: https://reviews.llvm.org/D135316
2022-10-06 15:18:43 -07:00
Craig Topper
5224bae613 [RISCV] Fix a bug in i32 FP_TO_UINT_SAT lowering on RV64.
We use the saturating behavior of fcvt.wu.h/s/d but forgot to
take into account that fcvt.wu will sign extend the saturated
result. According to computeKnownBits a promoted FP_TO_UINT_SAT
is expected to zero extend the saturated value.

In many case the upper bits aren't be demanded so this wouldn't
be an issue. But if we computeKnownBits caused an AND to be removed
it would be a bug.

This patch inserts an AND during to zero the upper bits.

Unfortunately, this pessimizes code if we aren't able to tell if
the upper bits are demanded. To fix that we could custom type
promote the FP_TO_UINT_SAT with SEXT_INREG after it, but I'll
leave that for future work.

I haven't found a failure from this, I was revisiting the code to
add vector support and spotted it.

Differential Revision: https://reviews.llvm.org/D133746
2022-09-13 08:41:32 -07:00
Craig Topper
b14b0b5213 [RISCV] Add test cases with result of fp_to_s/uint_sat sign/zero-extended from i32 to i64. NFC
I believe the result for fp_to_uint_sat is incorrect for this case.
2022-09-12 20:27:25 -07:00
Craig Topper
450edb0b37 [RISCV] Explicitly select second operand of branch condition to X0.
At least based on the lit tests, the coalescer sometimes fails to
propagate the copy from X0 into the branch instruction. This patch
does it manually during isel. The majority of the changes are from
the select patterns.

Some of the changes are just register allocation changes. Only
the Select change affects the whether a b*z instruction is generated
in the tests. I changed the branch pattern for consistency.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D130809
2022-08-01 11:16:48 -07:00
Shao-Ce SUN
84bacb18c6 [RISCV] Use check-prefixes to reduce check lines
Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D125083
2022-06-06 15:59:15 +08:00
Craig Topper
1d8bbe3d25 [RISCV] Implement a basic version of AArch64RedundantCopyElimination pass.
Using AArch64's original implementation for reference, this patch
implements a pass to remove unneeded copies of X0. This pass runs
after register allocation and looks to see if a register is implied
to be 0 by a branch in the predecessor basic block.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D118160
2022-02-04 10:43:46 -08:00
wangpc
8def89b5dc [RISCV] Set CostPerUse to 1 iff RVC is enabled
After D86836, we can define multiple cost values for
different cost models. So here we set CostPerUse to
1 iff RVC is enabled to avoid potential impact on RA.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D117741
2022-01-21 14:44:26 +08:00
Craig Topper
b271184f07 [RISCV] Use FP ABI on some of the FP tests to reduce the number of CHECK lines. NFC
These tests are interested in the FP instructions being used, not
the conversions needed to pass the arguments/returns in GPRs.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D116869
2022-01-10 09:08:29 -08:00
Craig Topper
a500f7f48f [SelectionDAG] Add FP_TO_UINT_SAT/FP_TO_SINT_SAT to computeKnownBits/computeNumSignBits.
These nodes should saturate to their saturating VT. We can use this
information to know the bits past the VT are all zeros or all sign bits.

I think we might only have test coverage for the unsigned case. I'll
verify and add tests.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D116870
2022-01-09 17:48:05 -08:00
Craig Topper
6a10bc7056 [RISCV] Add i8/i16 fptosi/fptoui and fptosi_sat/fptoui_sat tests. NFC
Use signext/zeroext return attributes to show unnecessary ands or
shifts in the saturating tests.
2022-01-08 14:01:31 -08:00
Craig Topper
0f9f17869f [RISCV] Add nounwind to remove some cfi directives from test CHECKs. NFC 2022-01-08 12:27:46 -08:00
Craig Topper
683cbc12b3 [RISCV] Remove stale comments from tests. NFC
The tests no longer generate the instructions that are mentioned
in the comments.
2021-12-18 13:36:03 -08:00
Hsiangkai Wang
137d3474ca [RISCV] Reverse the order of loading/storing callee-saved registers.
Currently, we restore the return address register as the last restoring
instruction in the epilog. The next instruction is `ret` usually. It is
a use of return address register. In some microarchitectures, there is
load-to-use data hazard. To avoid the load-to-use data hazard, we could
separate the load instruction from its use as far as possible. In this
patch, we reverse the order of restoring callee-saved registers to
increase the distance of `load ra` and `ret` in the epilog.

Differential Revision: https://reviews.llvm.org/D113967
2021-11-22 23:02:11 +08:00
wangpc
af0ecfccae [RISCV] Generate pseudo instruction li
Add an alias of `addi [x], zero, imm` to generate pseudo
instruction li, which makes assembly mush more readable.
For existed tests, users can update them by running script
`llvm/utils/update_llc_test_checks.py`.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D112692
2021-11-22 14:01:37 +08:00
Craig Topper
eb44f3fc58 [RISCV] Add rv32i/rv64i command lines to some floating point tests. NFC
This improves our coverage of soft float libcalls lowering.

Remove most of the test cases from rv64i-single-softfloat.ll. They
were duplicated in the test files that now test softflow. Only
a couple test cases for constrained FP remain. Those should be
removed when we start supporting constrained FP.

This is follow up from D113528.
2021-11-11 10:56:27 -08:00
Craig Topper
16ba77d19c [RISCV] Remove stale FIXMEs from float-convert.ll and double-convert.ll. NFC 2021-09-22 14:25:40 -07:00
Craig Topper
f0a422f935 [RISCV] Add fcvt.s.w(u)/fcvt.d.w(u)/fcvt.h.w(u) to hasAllNBitUsers
These instructions only read the lower 32 bits of their input.
2021-09-22 14:24:26 -07:00
Craig Topper
c7e78150f7 [RISCV] Add test cases showing failure to use ADDIW before fcvt.s.w/fcvt.d.w/fcvt.h.w. NFC
By not using ADDIW we can cause both an ADDIW and ADDI to be emitted
when the add has multiple users.

These instructions needed be added to the list of instructions that
only use the lower 32 bits of input.

I've also added tests for the wu versions, but I'm having trouble
showing bad codegen from it.
2021-09-22 14:24:26 -07:00
Craig Topper
d4ee84ceee [RISCV] Support FP_TO_S/UINT_SAT for i32 and i64.
The fcvt fp to integer instructions saturate if their input is
infinity or out of range, but the instructions produce a maximum
integer for nan instead of 0 required for the ISD opcodes.

This means we can use the instructions to do the saturating
conversion, but we'll need to fix up the nan case at the end.

We can probably improve the i8 and i16 default codegen as well,
but I'll leave that for a follow up.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D107230
2021-08-07 16:06:00 -07:00
Craig Topper
c63dbd8501 [RISCV] Custom lower (i32 (fptoui/fptosi X)).
I stumbled onto a case where our (sext_inreg (assertzexti32 (fptoui X)), i32)
isel pattern can cause an fcvt.wu and fcvt.lu to be emitted if
the assertzexti32 has an additional user. If we add a one use check
it would just cause a fcvt.lu followed by a sext.w when only need
a fcvt.wu to satisfy both users.

To mitigate this I've added custom isel and new ISD opcodes for
fcvt.wu. This allows us to keep know it started life as a conversion
to i32 without needing to match multiple nodes. ComputeNumSignBits
has been taught that this new nodes produces 33 sign bits. To
prevent regressions when we need to zero extend the result of an
(i32 (fptoui X)), I've added a DAG combine to convert it to an
(i64 (fptoui X)) before type legalization. In most cases this would
happen in InstCombine, but a zero_extend can be created for function
returns or arguments.

To keep everything consistent I've added new nodes for fptosi as well.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D106346
2021-07-24 10:50:43 -07:00
Craig Topper
4f1270a61e [RISCV] Add test cases to show an issue with our fcvt.wu isel patterns on RV64.
The pattern we match is (sext_inreg (assertzexti32 (fp_to_uint)), i32). If
the assertzexti32 has an additional user we'll end up emitting
an fcvt.wu and an fcvt.lu.

This can happen if the original fp_to_uint before type legalization
has one user that causes a sext_inreg to be emitted and one that
doesn't.
2021-07-19 22:58:42 -07:00
Craig Topper
420bd5ee8e [RISCV] Use ComputeNumSignBits/MaskedValueIsZero in RISCVDAGToDAGISel::selectSExti32/selectZExti32.
This helps us select W instructions in more cases. Most of the
affected tests have had the sign_extend_inreg or AND folded into
sextload/zextload.

Differential Revision: https://reviews.llvm.org/D104079
2021-06-10 19:06:45 -07:00
Craig Topper
b35a842581 [RISCV] Add test cases that show failure to use some W instructions if they are proceeded by a load. NFC
The loads end up becoming sextload/zextload which prevent our
isel patterns from finding the sign_extend_inreg or AND instruction
we need.

The easiest way to fix this is to use computeKnownBits or
ComputeNumSignBits in our isel matching to catch this.
2021-06-10 16:55:49 -07:00
Craig Topper
5185b52988 [RISCV] Fix crash with fptosi.sat/fptoui.sat intrinsics on RV64. Add test cases.
Add PromoteIntOp_FP_TO_XINT_SAT to type legalize the bit width
operand from i32 to i64 for RV64.

Add test cases for the saturating intrinsics for half/float/double
and i32/i64. CodeGen is definitely not optimal. We can probably
make use of the native behavior of fcvt instructions in many cases.

Fixes PR50083
2021-04-22 15:18:15 -07:00
Craig Topper
12d0753aca [RISCV] Use bitsLE instead of strict == MVT::i32 in assertsexti32 and assertzexti32.
The patterns that use this really want to know if the operand has at
least 32 sign/zero bits.

This increases opportunities to use W instructions when the original
source used i8/i16. Not sure how much this matters for performance,
but it makes i8/i16 code more consistent with i32.
2021-01-24 13:58:14 -08:00
Craig Topper
60ebf6408e [RISCV] Add test cases for missed opportunities to use fcvt.*.w(u) instructions on RV64 when input is known to be extended from i8/i16. 2021-01-24 13:48:29 -08:00
Michael Munday
e28b6a60bc [RISCV][NFC] Regenerate RISCV CodeGen tests
Regenerated using:

./llvm/utils/update_llc_test_checks.py -u llvm/test/CodeGen/RISCV/*.ll

This has added comments to spill-related instructions and added @plt to
some symbols.

Differential Revision: https://reviews.llvm.org/D92841
2020-12-09 19:42:49 +00:00
Luis Marques
3d0fbafd0b [RISCV] Switch to the Machine Scheduler
Most of the test changes are trivial instruction reorderings and differing
register allocations, without any obvious performance impact.

Differential Revision: https://reviews.llvm.org/D66973

llvm-svn: 372106
2019-09-17 11:15:35 +00:00
Luis Marques
2d550d19b3 Revert Patch from Phabricator
This reverts r372092 (git commit e38695a0255c9e7b53639f349f8101bae1ce5c04)

llvm-svn: 372104
2019-09-17 10:52:09 +00:00
Luis Marques
e38695a025 Patch from Phabricator
llvm-svn: 372092
2019-09-17 09:43:08 +00:00
Alex Bradbury
d834d8301d [RISCV] Add RV64F codegen support
This requires a little extra work due tothe fact i32 is not a legal type. When
call lowering happens post-legalisation (e.g. when an intrinsic was inserted
during legalisation). A bitcast from f32 to i32 can't be introduced. This is
similar to the challenges with RV32D. To handle this, we introduce
target-specific DAG nodes that perform bitcast+anyext for f32->i64 and
trunc+bitcast for i64->f32.

Differential Revision: https://reviews.llvm.org/D53235

llvm-svn: 352807
2019-01-31 22:48:38 +00:00