265 Commits

Author SHA1 Message Date
Paul Kirth
03a61d34eb
[RISCV] Support TLSDESC in the RISC-V backend (#66915)
This patch adds basic TLSDESC support in the RISC-V backend.

Specifically, we add new relocation types for TLSDESC, as prescribed in 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a
new pseudo instruction to simplify code generation.

This patch does not try to optimize the local dynamic case, which can be
improved in separate patches. 

Linker side changes will also be handled separately.

The current implementation is only enabled when passing the new
`-enable-tlsdesc` codegen flag.
2024-01-23 16:16:07 -08:00
Anatoly Trosinenko
10bd69a4f7
[MachineOutliner] Refactor iterating over Candidate's instructions (#78972)
Make Candidate's front() and back() functions return references to
MachineInstr and introduce begin() and end() returning iterators, the
same way it is usually done in other container-like classes.

This makes it possible to iterate over the instructions contained in
Candidate the same way one can iterate over MachineBasicBlock (note that
begin() and end() return bundled iterators, just like MachineBasicBlock
does, but no instr_begin() and instr_end() are defined yet).
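
A minimal standalone sketch of the container-like interface described above. The names `SketchInstr`, `SketchCandidate` and `countOpcode` are hypothetical stand-ins, not the real MachineInstr or outliner Candidate classes, and a plain `std::vector` iterator replaces the bundled iterators mentioned in the message:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for MachineInstr.
struct SketchInstr {
  std::string Opcode;
};

// Hypothetical stand-in for a Candidate: a window of Len instructions
// starting at index Start inside a block.
class SketchCandidate {
  std::vector<SketchInstr> &Block;
  unsigned Start;
  unsigned Len;

public:
  using iterator = std::vector<SketchInstr>::iterator;

  SketchCandidate(std::vector<SketchInstr> &B, unsigned S, unsigned L)
      : Block(B), Start(S), Len(L) {
    assert(Len > 0 && Start + Len <= Block.size() && "window out of range");
  }

  // begin()/end() make the class usable in range-based for loops.
  iterator begin() { return Block.begin() + Start; }
  iterator end() { return begin() + Len; }

  // front()/back() return references rather than iterators.
  SketchInstr &front() { return *begin(); }
  SketchInstr &back() { return *(end() - 1); }
};

// Count the instructions in the candidate that have the given opcode.
unsigned countOpcode(SketchCandidate &C, const std::string &Opcode) {
  unsigned N = 0;
  for (SketchInstr &MI : C)
    if (MI.Opcode == Opcode)
      ++N;
  return N;
}
```

With `begin()` and `end()` in place, callers can use a range-based for loop instead of stepping manually from `front()` to `back()`.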
2024-01-23 17:21:40 +03:00
Simon Pilgrim
a369619694 Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. 2024-01-23 11:30:06 +00:00
Simeon K
297b77036e
[RISCV] Fix stack size computation when M extension disabled (#78602)
Ensure that getVLENFactoredAmount does not fail when the scale amount
requires the use of a non-trivial multiplication but the M extension is
not enabled. In such a case, perform the multiplication using shifts and
adds.
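
As a rough illustration of the fallback described above, the following standalone sketch multiplies by a factor using only shifts and adds. The function name `mulByShiftsAndAdds` is hypothetical; the actual patch emits machine instructions in getVLENFactoredAmount and may decompose the factor differently:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: compute Value * Factor without a multiply
// instruction by adding a shifted copy of Value for every set bit of Factor.
uint64_t mulByShiftsAndAdds(uint64_t Value, uint64_t Factor) {
  uint64_t Result = 0;
  for (unsigned Bit = 0; Factor != 0; ++Bit, Factor >>= 1) {
    if (Factor & 1)
      Result += Value << Bit; // one shift and one add per set bit
  }
  return Result;
}
```

For a factor such as 24 (binary 11000) this needs two shifts and two adds, which is why the approach only pays off when no hardware multiplier is available.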
2024-01-22 23:10:25 -08:00
Alex Bradbury
bc90b91885 Revert "[RISCV] Implement RISCVInstrInfo::getConstValDefinedInReg"
This reverts commit 4b7d997aaed7a2399d5e73fc3adfaaa6a3d35d1f.

A miscompile was reported
<https://github.com/llvm/llvm-project/pull/77610#issuecomment-1896193835>.
Reverting so it can be investigated.
2024-01-17 19:27:36 +00:00
Alex Bradbury
57d517c257
[RISCV] Implement RISCVInstrInfo::getConstValDefinedInReg (#77610)
This helper function handles common cases where we can determine a
constant value is being defined in a register. Although it looks like
codegen changes are possible due to this being called in
PeepholeOptimizer, my main motivation is to use this in
describeLoadedValue.
2024-01-16 07:14:41 +00:00
Craig Topper
c9da4dc77f
[RISCV] Refactor GPRF64 register class to make it usable for Zacas. (#77408)
-Rename to GPRPair.
-Rename registers to be named like X10_X11 instead of X10_PD. Except X0
 which is now X0_Pair since it is not paired with X1.
-Use unknown size and offset for the subreg indices. This might
 be a functional change, but does not affect any lit tests.
2024-01-09 09:21:27 -08:00
Jim Lin
96c4f1034c
[RISCV] Add support predicating for ANDN/ORN/XNOR with short-forward-branch-opt. (#77077)
ANDN/ORN/XNOR are like other ALU instructions, so CPUs that support the
short forward branch optimization should be able to predicate them.
2024-01-09 11:12:44 +08:00
Craig Topper
faa326de97
[RISCV] Add branch+c.mv macrofusion for sifive-p450. (#76169)
sifive-p450 supports a very restricted version of the short forward
branch optimization from the sifive-7-series.

For sifive-p450, a branch over a single c.mv can be macrofused as a
conditional move operation. Due to encoding restrictions on c.mv, we
can't conditionally move from X0. That would require c.li instead.
2024-01-08 15:23:26 -08:00
Fangrui Song
360996ac5a
[RISCV] Merge machine operand flag MO_PLT into MO_CALL (#77253)
Since #72467, `@plt` in assembly output "call foo@plt" is omitted. We
can trivially merge MO_PLT and MO_CALL without any functional change to
assembly/relocatable file output.

Earlier architectures use different call relocation types depending on
whether a PLT is potentially needed: R_386_PLT32/R_386_PC32,
R_68K_PLT32/R_68K_PC32, R_SPARC_WDISP30/R_SPARC_WPLT30. However, as the
PLT property is per-symbol instead of per-call-site and linkers can
optimize out a PLT, the distinction has been confusing.

Arm made good names R_ARM_CALL/R_AARCH64_CALL. Let's use MO_CALL instead
of MO_PLT.

As follow-ups, we can merge fixup_riscv_call/fixup_riscv_call_plt and
VK_RISCV_CALL/VK_RISCV_CALL_PLT.
2024-01-07 12:43:39 -08:00
Craig Topper
0ebe97115d Revert "[RISCV] Refactor subreg indices. (#77173)"
This reverts commit b5de136ef3fd63c6a6aabaea16792e47be1eeeff.

Based on post-commit feedback, I need to do some other work before
this makes sense.
2024-01-06 18:51:15 -08:00
Craig Topper
b5de136ef3
[RISCV] Refactor subreg indices. (#77173)
-Rename sub_32_hi to sub_gpr_odd
-Add dedicated sub_gpr_even.
-Rename sub_32 and sub_16 to sub_fpr32 and sub_fpr16.
-Remove start offset from sub_gpr_odd. AArch64 doesn't use non-zero offset for GPR
tuples so I don't think we need to.

This is preparation for a RV64 GPRPair for Zacas.
2024-01-06 11:42:53 -08:00
Alex Bradbury
02c2bf8c05
[RISCV] Change heuristic used for load clustering (#75341)
Split out from #73789, so as to leave that PR just for flipping load
clustering to on by default. Clusters if the operations are within a
cache line of each other (as AMDGPU does in shouldScheduleLoadsNear).
X86 does something similar, but does `((Offset2 - Offset1) / 8 > 64)`.
I'm not sure if that's intentionally set to 512 bytes or if the division
is in error.

Adopts the suggestion from @wangpc-pp to query the cache line size and
use it if available.

We also cap the maximum cluster size to limit the potential register
pressure impact (which may lead to additional spills).
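
A minimal sketch of the kind of heuristic described above, not the code from the patch. The function names, the 64-byte default cache line size and the cap of 4 clustered operations are all assumptions made for illustration. The second function reproduces the X86 expression quoted in the message so its effective threshold can be checked:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical heuristic: cluster two memory operations when the cluster
// is still small and their offsets are less than one cache line apart.
// CacheLineSize = 64 and MaxClusterSize = 4 are illustrative defaults.
bool shouldClusterSketch(int64_t Offset1, int64_t Offset2,
                         unsigned ClusterSize, unsigned CacheLineSize = 64,
                         unsigned MaxClusterSize = 4) {
  if (ClusterSize > MaxClusterSize)
    return false; // cap cluster size to limit register pressure
  int64_t Distance =
      Offset1 > Offset2 ? Offset1 - Offset2 : Offset2 - Offset1;
  return Distance < static_cast<int64_t>(CacheLineSize);
}

// The X86 expression quoted in the message: true means "too far apart".
bool x86TooFarSketch(int64_t Offset1, int64_t Offset2) {
  return (Offset2 - Offset1) / 8 > 64;
}
```

Because of the integer division, the quoted X86 expression only becomes true once the byte distance reaches 520, which matches the "roughly 512 bytes" reading in the message.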
2024-01-02 16:28:24 +00:00
Alex Bradbury
b717365216
[MachineScheduler][NFCI] Add Offset and OffsetIsScalable args to shouldClusterMemOps (#73778)
These are picked up from getMemOperandsWithOffsetWidth but weren't then
being passed through to shouldClusterMemOps, which forces backends to
collect the information again if they want to use the kind of heuristics
typically used for the similar shouldScheduleLoadsNear function (e.g.
checking the offset is within 1 cache line).

This patch just adds the parameters, but doesn't attempt to use them.
There is potential to use them in the current PPC and AArch64
shouldClusterMemOps implementation, and I intend to use the offset in
the heuristic for RISC-V. I've left these for future patches in the
interest of being as incremental as possible.

As noted in the review and in an inline FIXME, an ElementCount-style abstraction may later be used to condense these two parameters to one argument. ElementCount isn't quite suitable as it doesn't support negative offsets.
2023-12-06 15:30:48 +00:00
Alex Bradbury
d6fbd96e5e
[RISCV] Support FrameIndex operands in getMemOperandsWithOffsetWidth / getMemOperandWithOffsetWidth (#73802)
I noted AArch64 happily accepts a FrameIndex operand as well as a
register. This doesn't cause any changes outside of my C++ unit test in
the current in-tree state, but it will cause additional test changes if
#73789 is rebased on top of it.

Note that the returned Offset doesn't seem at all as meaningful if you
have a FrameIndex base, though the approach taken here follows AArch64
(see D54847). This change won't harm the approach taken in
shouldClusterMemOps because memOpsHaveSameBasePtr will only return true
if the FrameIndex operand is the same for both operations.
2023-12-05 21:26:56 +00:00
Alex Bradbury
85c9c16895
[RISCV] Support load clustering in the MachineScheduler (off by default) (#73754)
This adds minimal support for load clustering, but disables it by
default. The intent is to iterate on the precise heuristic and the
question of turning this on by default in a separate PR. Although
previous discussion indicates hope that the MachineScheduler would
replace most uses of the SelectionDAG scheduler, it does seem most
targets aren't using MachineScheduler load clustering right now:
PPC+AArch64 seem to just use it to help with paired load/store formation
and although AMDGPU uses it for general clustering it also implements
ShouldScheduleLoadsNear for the SelectionDAG scheduler's clustering.
2023-11-29 10:01:55 +00:00
Alex Bradbury
9c5003cc0c
[RISCV] Implement RISCVInstrInfo::getMemOperandsWithOffsetWidth (#73681)
This hook is called by the default implementation of
getMemOperandWithOffset and by the load/store clustering code in the
MachineScheduler though this isn't enabled by default and is not yet
enabled for RISC-V. Only return true for queries on scalar loads/stores
for now (this is a conservative starting point, and vector load/store
can be handled in a follow-on patch).
2023-11-29 04:48:43 +00:00
Craig Topper
a845061935
[AArch64] Use the same fast math preservation for MachineCombiner reassociation as X86/PowerPC/RISCV. (#72820)
Don't blindly copy the original flags from the pre-reassociated
instructions.
This copied the integer poison flags which are not safe to preserve
after reassociation.

For the FP flags, I think we should only keep the intersection of
the flags. Override setSpecialOperandAttr to do this.

Fixes #72777.
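
A small sketch of the flag handling described above: keep only the flags common to both original instructions and drop the integer poison flags. The enum values and the function name are hypothetical and do not correspond to the real MachineInstr flag encoding:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits, for illustration only.
enum SketchFlags : uint32_t {
  FmReassoc = 1u << 0,
  FmNoNans = 1u << 1,
  FmContract = 1u << 2,
  NoSignedWrap = 1u << 3,
  NoUnsignedWrap = 1u << 4,
  IsExact = 1u << 5,
};

constexpr uint32_t IntegerPoisonFlags = NoSignedWrap | NoUnsignedWrap | IsExact;

// Flags for an instruction created by reassociating two originals:
// intersect the flags, then strip nsw/nuw/exact, which are not safe to
// preserve once the operands have been regrouped.
uint32_t flagsAfterReassociation(uint32_t Flags1, uint32_t Flags2) {
  uint32_t Common = Flags1 & Flags2;
  return Common & ~IntegerPoisonFlags;
}
```

Taking the intersection is the conservative choice: a fast-math flag survives only if both original instructions carried it.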
2023-11-22 14:17:45 -08:00
Craig Topper
9ae04a77d1 [RISCV] Don't set nsw/nuw/exact flag after MachineCombiner reassociation.
This matches what PowerPC and X86 do.
2023-11-19 11:08:28 -08:00
Alex Bradbury
7f28e8ced7
[RISCV] Implement RISCVInstrInfo::isAddImmediate (#72356)
This hook is called by the target-independent implementation of
TargetInstrInfo::describeLoadedValue. I've opted to test it via a C++
unit test, which although fiddly to set up seems the right way to test a
function with such clear intended semantics (rather than testing the
impact indirectly).

isAddImmediate will never recognise ADDIW as an add immediate which I
_think_ is conservatively correct, as the caller may not understand its
semantics vs ADDI.
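
To illustrate why treating ADDIW as a plain add immediate could mislead a caller, here is a standalone model of the two RV64 instructions. The function names are hypothetical; the semantics follow the RISC-V base ISA, where ADDIW sign-extends the low 32 bits of the sum:

```cpp
#include <cassert>
#include <cstdint>

// Model of ADDI on RV64: a full 64-bit wrapping add.
int64_t addiModel(int64_t Rs1, int64_t Imm) {
  return static_cast<int64_t>(static_cast<uint64_t>(Rs1) +
                              static_cast<uint64_t>(Imm));
}

// Model of ADDIW on RV64: add, keep the low 32 bits of the sum, then
// sign-extend that 32-bit value to 64 bits.
int64_t addiwModel(int64_t Rs1, int64_t Imm) {
  uint32_t Low = static_cast<uint32_t>(static_cast<uint64_t>(Rs1) +
                                       static_cast<uint64_t>(Imm));
  return static_cast<int64_t>(static_cast<int32_t>(Low));
}
```

The two agree whenever the sum fits in a signed 32-bit value, but diverge once it does not, so "register plus immediate" is not a faithful description of ADDIW in general.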

Note that although the doc comment for isAddImmediate specifies its
behaviour solely in terms of physical registers, none of the current
in-tree implementations (including this one) bail out on virtual
registers (see #72357).
2023-11-16 14:43:31 +00:00
Alex Bradbury
ac378ac493 [RISCV][NFC] Rewrite doc comment for RISCVInstrInfo::getMemOperandWithOffsetWidth
Attempt to clarify the expected behaviour.
2023-11-15 14:50:37 +00:00
Craig Topper
e0e0891d74 [RISCV][GISel] Select G_BRCOND and G_ICMP together when possible.
This allows us to fold the G_ICMP operands into the conditional branch.

This reuses the helper function we have for folding a G_ICMP into
G_SELECT.
2023-11-12 15:53:23 -08:00
Wang Pengcheng
e179b125fb
[RISCV][NFC] Pass MCSubtargetInfo instead of FeatureBitset in RISCVMatInt (#71770)
The use of `hasFeature` is more descriptive and the callers of
`RISCVMatInt` have no need to call `getFeatureBits()` any more.
2023-11-09 15:15:23 +08:00
Min-Yih Hsu
1e39575a98
[RISCV] CSE by swapping conditional branches (#71111)
DAGCombiner, as well as InstCombine, tend to canonicalize GE/LE into
GT/LT, namely:
```
X >= C --> X > (C - 1)
```
which sometimes generates off-by-one constants that could have been CSE'd
with surrounding constants.
Instead of changing such canonicalization, this patch tries to swap
those branch conditions post-isel, in the hope of resurfacing more
constant CSE opportunities. More specifically, it performs the following
optimization:

For two constants C0 and C1 from
```
li Y, C0
li Z, C1
```
To remove the redundant `li Y, C0`,
 1. if C1 = C0 + 1 we can turn: 
    (a) blt Y, X -> bge X, Z
    (b) bge Y, X -> blt X, Z
 2. if C1 = C0 - 1 we can turn: 
    (a) blt X, Y -> bge Z, X
    (b) bge X, Y -> blt Z, X

This optimization will be done by PeepholeOptimizer through
RISCVInstrInfo::optimizeCondBranch.
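
The four rewrites listed above can be sanity-checked with a small standalone model of signed `blt` and `bge`. The helper names are hypothetical, and the sketch assumes C0 + 1 and C0 - 1 do not overflow:

```cpp
#include <cassert>
#include <cstdint>

// Models of the signed RISC-V branch conditions: true means "branch taken".
bool blt(int64_t Rs1, int64_t Rs2) { return Rs1 < Rs2; }
bool bge(int64_t Rs1, int64_t Rs2) { return Rs1 >= Rs2; }

// Returns true if every rewrite makes the same branch decision as the
// original for this X, where Y holds C0 and Z holds C0 + 1 or C0 - 1.
bool rewritesAgree(int64_t X, int64_t C0) {
  assert(C0 > INT64_MIN && C0 < INT64_MAX && "C0 +/- 1 must not overflow");
  const int64_t Y = C0;
  const int64_t ZPlus = C0 + 1;  // case 1: C1 = C0 + 1
  const int64_t ZMinus = C0 - 1; // case 2: C1 = C0 - 1
  return blt(Y, X) == bge(X, ZPlus) &&  // 1(a)
         bge(Y, X) == blt(X, ZPlus) &&  // 1(b)
         blt(X, Y) == bge(ZMinus, X) && // 2(a)
         bge(X, Y) == blt(ZMinus, X);   // 2(b)
}
```

Each rewrite relies on the integer identity that `C0 < X` is the same as `X >= C0 + 1`, and `X < C0` is the same as `C0 - 1 >= X`.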
2023-11-03 09:03:52 -07:00
Craig Topper
284d136c4a
[RISCV] Teach copyPhysReg to allow copies between GPR<->FPR32/FPR64 (#70525)
This is needed because GISel emits copies instead of bitcasts like
SelectionDAG.
2023-10-30 09:58:51 -07:00
Wang Pengcheng
a316f14fdd
[RISCV][NFC] Move getRVVMCOpcode to RISCVInstrInfo (#70637)
To simplify more code.
2023-10-30 19:03:04 +08:00
Craig Topper
8363996894
[RISCV] Reduce the number of parameters to copyPhysRegVector. NFC (#70502)
The Lmul and SubRegIdx can be derived from the opcode.

Make NF default to 1.
2023-10-27 13:36:44 -07:00
Craig Topper
b679ec86e3 [RISCV] Use RISCVInstrInfo::movImm to implement most of RISCVPostRAExpandPseudo::expandMovImm (#70389) 2023-10-27 13:35:56 -07:00
Craig Topper
035c154f4f [RISCV] Refactor RISCVPostRAExpandPseudo::expandMovImm and RISCVInstrInfo::movImm to prepare for merging.
Fix small bug where RISCVPostRAExpandPseudo::expandMovImm set the
kill flag on X0.
2023-10-27 13:35:56 -07:00
Craig Topper
c18e78cfe3
[RISCV] Add copyPhysRegVector to extract common vector code out of copyPhysReg. (#70497)
Call this method directly from each vector case with the correct
arguments. This allows us to treat each type of copy as its own
special case and not pass variables to a common merge point. This
is similar to how AArch64 is structured.
    
I think I can reduce the number of operands to this new method, but
I'll do that as a follow up.
2023-10-27 12:43:56 -07:00
Craig Topper
124613c8ed
[RISCV] Separate FPR and VR copyPhysReg implementation. (#70492)
This duplicates the BuildMI from the 3 FP copy types, but separates them
from the VR code and gets rid of the IsScalableVector flag.

I need to add FPR<->GPR copies for GISel which needs another variation
of BuildMI. So it seems cleaner to start enumerating each case as its
own if.
2023-10-27 12:25:42 -07:00
Craig Topper
116eb323b1
[RISCV] Correct copyPhysReg for GPRF64. (#70419)
GPRF64 represents a pair of registers. We were only copying the even
part. We need to copy the odd part too.
2023-10-26 23:54:46 -07:00
Jim Lin
c249e27786 [RISCV] Add missing break in switch-case in convertToThreeAddress function. NFC. 2023-10-25 17:51:17 +08:00
Wang Pengcheng
69ade08b4a
[RISCV][NFC] Fix comments in foldMemoryOperandImpl (#70033)
I think the TODO is stale now.
2023-10-25 11:13:42 +08:00
Sacha Coppey
776889bc1c [RISCV] Add Stackmap/Statepoint/Patchpoint support without targets
This patch adds stackmap support for RISC-V without targets (i.e. the nop patchable forms).

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D123496
2023-10-11 09:18:55 +05:30
Craig Topper
45636ecf2c
[RISCV] Add sink-and-fold support for RISC-V. (#67602)
This uses the recently introduced sink-and-fold support in MachineSink.
https://reviews.llvm.org/D152828
    
This enables folding ADDI into load/store addresses.
    
Enabling by default will be a separate PR.
2023-10-07 10:38:35 -07:00
Luke Lau
e577e7025d
[RISCV] Move vector pseudo hasAllNBitUsers switch into RISCVInstrInfo. NFC (#67593)
The handling for vector pseudos in hasAllNBitUsers is duplicated across
RISCVISelDAGToDAG and RISCVOptWInstrs. This deduplicates it between the
two, with the common denominator between the two call sites being the
opcode and SEW: We need to handle extracting these separately since one
operates at the SelectionDAG level and the other at the MachineInstr
level.
2023-10-03 12:24:11 +01:00
Craig Topper
62f5636838 [RISCV] Don't set KILL flag on X0 in RISCVInstrInfo::movImm.
Extracted from #67159.
2023-09-25 13:40:08 -07:00
Craig Topper
bbe3ee061f
[RISCV] Add more instructions for the short forward branch optimization. (#66789)
This adds the shifts and the immediate forms of the instructions that
were already supported.

There are still more instructions that can be predicated, but this is
the rest of what we had in our downstream.
2023-09-19 10:21:39 -07:00
liqin.weng
700042cd88 [RISCV] Remove debug location to spill/reload instructions
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).

Referred to https://reviews.llvm.org/rG3e081703c349dd00b8ef6991c2d15964915dd8f4

Reviewed By: asb, kito-cheng, benshi001

Differential Revision: https://reviews.llvm.org/D129173
2023-09-09 16:39:28 +08:00
Craig Topper
e4b2f2d4a6 [RISCV][GISel] Legalize G_PHI and G_BRCOND.
Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D157818
2023-08-14 10:21:58 -07:00
Craig Topper
da56750f82 [RISCV] Change naming of vector pseudos with scalar FP operand.
We need a pseudo for each scalar FP register class. Previously
we distinguished the pseudos by naming them with F16, F32, F64, or
BF16 in place of the F in the normal instruction name.

Because these strings can appear in other parts of the name we had
to do things like matching "_VBF16" to "_VF".

This patch replaces the F16, F32, F64 strings with FPR16, FPR32, and
FPR64. We also use FPR16 for BF16 since that is the scalar register
class for bf16.

Since the FPR16/32/64 string does not appear anywhere else in the pseudo
names, we can use this to simplify the string replacements. This
also allows us to simplify some BF16 related code.

Reviewed By: wangpc

Differential Revision: https://reviews.llvm.org/D157749
2023-08-12 11:20:47 -07:00
Alex Bradbury
667602793b [RISCV] Implement support for bf16 select when zfbfmin is enabled
These test cases previously caused an error. RISCVInstrInfo::copyPhysReg also needed a tweak in order to account for copying bf16 values in FPR16 registers.

Differential Revision: https://reviews.llvm.org/D156883
2023-08-02 20:04:30 +01:00
Craig Topper
f3b4c266e8 [RISCV] Adjust the Zfhmin handling in RISCVInstrInfo::copyPhysReg.
Instead of checking '!Zfh && Zfhmin' first, handle Zfh. Then assert
that the other case is F+Zfhmin. The F+Zfhmin check will need to be
relaxed for bfloat16 support. As it was written before there would
be no error to catch that. Instead it would just silently create
fsgnj.h instructions.
2023-07-16 20:20:59 -07:00
eopXD
00093667b1 [2/8][RISCV] Add rounding mode control variant for vfwadd, vfwsub
Depends on D154628

For the cover letter of the patch-set, please check out D154628.

This is the 2nd patch of the patch-set.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D154629
2023-07-13 00:42:00 -07:00
Craig Topper
1aecb0e000 [RISCV] Clear kill flags when forming FMA instructions in MachineCombiner.
If the operands to the mul have other uses we may be extending their
live range past a kill flag.

Reviewed By: asb, asi-sc

Differential Revision: https://reviews.llvm.org/D155046
2023-07-12 08:03:45 -07:00
Philip Reames
5cd41dc62d [RISCV] Remove legacy TA/TU pseudo distinction for binary instructions
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295.

This change handles most of the binary pseudos. I excluded pseudos which have _TIED variants, and those that produce mask results. Both are a bit different in functionality, and deserve their own change and review. As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand.

As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.

Differential Revision: https://reviews.llvm.org/D154245
2023-07-11 10:21:42 -07:00
Philip Reames
92b5a3405d [RISCV] Remove legacy TA/TU pseudo distinction for unary instructions
This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. In D153155, we started removing the legacy distinction between unsuffixed (TA) and _TU pseudos. This patch continues that effort for the unary instruction families.

The change consists of a few interacting pieces:
* Adding a vector policy operand to VPseudoUnaryNoMaskTU.
* Then using VPseudoUnaryNoMaskTU for all cases where VPseudoUnaryNoMask was previously used and deleting the unsuffixed form.
* Then renaming VPseudoUnaryNoMaskTU to VPseudoUnaryNoMask, and adjusting the RISCVMaskedPseudo table to use the combined pseudo.
* Fixing up two places in C++ code which manually construct VMV_V_* instructions.

Normally, I'd try to factor this into a couple of changes, but in this case, the table structure is tied to naming and thus we can't really separate the otherwise NFC bits.

As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions.

Differential Revision: https://reviews.llvm.org/D153899
2023-06-29 07:34:14 -07:00
Philip Reames
c6b56cec8b [RISCV] Check that SEW and policy operands are immediates in verifier
This converts a crash (due to an assertion inside getImm) into a verifier failure. This is much easier to debug when you have malformed instructions.
2023-06-26 11:45:17 -07:00
Craig Topper
b105b3266f [RISCV] Properly handle partial writes in isConvertibleToVMV_V_V.
We were only checking for the previous instructions to write exactly
the register or a super register. We ignored writes to a subregister
and continued searching for the producing instruction. We need to
abort instead.

There's another check inside the if body to abort if the registers
don't match exactly. So we just need to check for overlap so we
enter the if body.

Reviewed By: fakepaper56

Differential Revision: https://reviews.llvm.org/D153490
2023-06-25 23:08:47 -07:00