75 Commits

Author SHA1 Message Date
Piyou Chen
82b179ca66
[RISCV][VLOPT] Consider EMUL if it is unknown in EMULAndEEWAreEqual (#139670)
Fix https://github.com/llvm/llvm-project/issues/139288
2025-05-14 18:03:35 +08:00
Michael Maitland
f8416fcfec
[RISCV][VLOPT] Look through PHI instructions (#132236)
Similar to what we do for copies. We may reduce one of the PHI operands
and not the other, and that's perfectly okay.
2025-03-24 09:26:09 -04:00
Luke Lau
f4f7c71c55
[RISCV][VLOPT] Move mayReadPastVL check into getMinimumVLForUser. NFC (#127972)
checkUsers currently does two things, a) work out the minimum VL read by
every user and b) check that the operand info of the MI and users match.

getMinimumVLForUser handles most of a), with the exception of the check
for instructions that read past VL, e.g. vrgather, which is still in
checkUsers.

This moves it into getMinimumVLForUser to keep all that logic in one
place and simplifies an upcoming patch.
2025-02-20 20:17:18 +08:00
Luke Lau
44feae8695 [RISCV][VLOPT] Mark some methods + arguments as const. NFC 2025-02-20 17:37:27 +08:00
Luke Lau
c58011dc65
[RISCV][VLOPT] Peek through copies in checkUsers (#127656)
Currently if a user of an instruction isn't a vector pseudo we bail. For
simple non-subreg virtual COPYs, we can peek through their uses by using
a worklist.

This is extracted from a loop in TSVC2 (s273) that contains a fcmp +
select, which produces a copy that doesn't seem to be coalesced away.
2025-02-20 12:01:06 +08:00
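The worklist idea described in c58011dc65 can be illustrated with a minimal, standalone sketch. It is not the pass's real code: `Inst`, `IsCopy`, and `collectRealUsers` are hypothetical names, and instructions are modelled as indices into a vector rather than as MachineInstrs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: each instruction lists the indices of its users.
struct Inst {
  bool IsCopy;                    // true for a simple full-register COPY
  std::vector<std::size_t> Users; // instructions that read this result
};

// Collect every non-copy user of Def, looking through chains of copies
// with a worklist instead of bailing at the first COPY.
std::vector<std::size_t> collectRealUsers(const std::vector<Inst> &Insts,
                                          std::size_t Def) {
  assert(Def < Insts.size());
  std::vector<std::size_t> Worklist(Insts[Def].Users.begin(),
                                    Insts[Def].Users.end());
  std::vector<bool> Seen(Insts.size(), false);
  std::vector<std::size_t> Result;
  while (!Worklist.empty()) {
    std::size_t U = Worklist.back();
    Worklist.pop_back();
    assert(U < Insts.size());
    if (Seen[U])
      continue;
    Seen[U] = true;
    if (Insts[U].IsCopy) {
      // Peek through the copy: its users are effectively users of Def.
      Worklist.insert(Worklist.end(), Insts[U].Users.begin(),
                      Insts[U].Users.end());
      continue;
    }
    Result.push_back(U);
  }
  std::sort(Result.begin(), Result.end()); // deterministic order
  return Result;
}
```

With a def whose users are a copy and a vector op, the copy's own users are reported alongside the direct vector user.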
LiqinWeng
fb394451ca
[RISCV][VLOPT] Add vfsqrt/vfrsqrt7 instruction to isSupportInstr (#127462) 2025-02-19 14:11:16 +08:00
Craig Topper
0cc532b79e
[RISCV] Move the RISCVII namespaced enums into RISCVVType namespace in RISCVTargetParser.h. NFC (#127585)
The VLMUL and policy enums originally lived in RISCVBaseInfo.h in the
backend which is where everything else in the RISCVII namespace is
defined.

RISCVTargetParser.h is used by much more of the compiler and it
doesn't really make sense to have 2 different namespaces exposed.
These enums are both associated with VTYPE so using the RISCVVType
namespace seems like a good home for them.
2025-02-18 08:27:25 -08:00
Luke Lau
36530414e3
[RISCV][VLOPT] Add support for Vector Fixed-Point Arithmetic Instructions (#126483)
This patch adds the remaining support for fixed-point arithmetic
instructions (we previously had support for averaging adds and
subtracts).

For saturating adds/subs/multiplies/clips, we can't change `vl` if
`vxsat` is used, since changing `vl` may change its value. So this patch
checks to see if it's dead before considering it a candidate.
2025-02-10 23:43:16 +08:00
Luke Lau
af2a228e0b
[RISCV][VLOPT] Fix passthru operand info for mixed-width instructions (#126504)
After #124066 we started allowing users that are passthrus. However for
widening/narrowing instructions we were returning the wrong operand info
for passthru operands since it originally assumed the operand would
never be a passthru. This fixes it by handling it in IsMODef.
2025-02-10 21:30:05 +08:00
Luke Lau
771f6b9f43
[RISCV][VLOPT] Add support for Widening Floating-Point Fused Multiply-Add Instructions (#126485)
We already had getOperandInfo support, so this marks the instructions as
supported in isCandidate. It also adds support for vfwmaccbf16.v{v,f}
from zvfbfwma
2025-02-10 19:55:22 +08:00
Luke Lau
19a41358ff
[RISCV][VLOPT] Add support for Single-Width Floating-Point Fused Multiply-Add Instructions (#125652)
These instructions have EEW=SEW for all operands.
2025-02-05 10:09:20 +08:00
Alex Bradbury
52c116218b [RISCV][VLOPT] Clear DemandedVLs for each invocation of runOnMachineFunction
I was running into failed assertions of `isCandidate(UserMI)` in
`getMinimumVLForUser`, but only occurring with
`-enable-machine-outliner=never`. I believe this is a red herring, and
it just so happens the memory allocation pattern on my machine exposed
the bug with that flag.

DemandedVLs is never cleared, which means it accumulates more
MachineInstr pointer keys over time, and it's possible that when e.g.
running on function 'b', a MachineInstr pointer points to the same
memory location used for a candidate in 'a'. This causes the assertion
to fail.

Comment left on #124530 with more information.
2025-02-02 18:05:13 +00:00
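A minimal sketch of the fix in 52c116218b, under the assumption that the pass object is reused across functions and keeps a pointer-keyed map as a member. `VLOptimizerSketch`, `Inst`, and `runOnFunction` are hypothetical stand-ins, not LLVM's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

struct Inst {
  unsigned VL;
};

// Hypothetical pass object that survives across functions.
class VLOptimizerSketch {
  // Keyed by instruction address. Entries from an earlier function are
  // stale once that function's instructions are freed, and a later
  // instruction may be allocated at the same address.
  std::unordered_map<const Inst *, unsigned> DemandedVLs;

public:
  // Returns the number of tracked entries after processing the function.
  std::size_t runOnFunction(const std::vector<Inst> &Insts) {
    DemandedVLs.clear(); // the fix: drop state from the previous function
    for (const Inst &MI : Insts)
      DemandedVLs[&MI] = MI.VL;
    return DemandedVLs.size();
  }
};
```

Without the `clear()` call, the second invocation would still see keys from the first, which is the accumulation the commit message describes.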
Craig Topper
bd95b57ef0
[RISCV][VLOpt] Move OperandInfo into anonymous namespace. Move getEMULEqualsEEWDivSEWTimesLMUL out of RISCVVType namespace. NFC (#125138)
We don't want OperandInfo to be visible outside of this translation
unit.

getEMULEqualsEEWDivSEWTimesLMUL is local to this file and declared
static. There's no reason to put it in a namespace.
2025-01-31 09:24:57 -08:00
Luke Lau
eb7e19998d
[RISCV][VLOPT] Allow users that are passthrus if tail elements aren't demanded (#124066)
The motivation for this is to allow reducing the vl when a user is a
ternary pseudo, where the third operand is tied and also acts as a
passthru.

When checking the users of an instruction, we currently bail if the user
is used as a passthru because all of its elements past vl will be used
for the tail.

We can allow passthru users if we know the tail of their result isn't
used, which we will have computed beforehand after #124530

It's worth noting that this is all irrespective of the tail policy,
because tail agnostic still ends up using the passthru.

I've checked that SPEC CPU 2017 + llvm-test-suite pass with this (on
qemu with rvv_ta_all_1s=true)

Fixes #123760
2025-01-30 23:45:24 +08:00
Luke Lau
8675cd3fac
[RISCV][VLOPT] Compute demanded VLs up front (#124530)
This replaces the worklist by instead computing what VL is demanded by
each instruction's users first, which is done via checkUsers.

The demanded VLs are stored in a DenseMap, and then we can just do a
single forward pass of tryReduceVL where we check if a candidate's
demanded VL is less than its VLOp.

This means the pass should now be linear in complexity, and allows us to
relax the restriction on tied operands more easily, as in #124066.
2025-01-29 12:39:38 +08:00
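The two-phase structure described in 8675cd3fac can be sketched roughly as follows. This is a simplified model, not the pass itself: it assumes a single block in which users always appear after their defs, treats VLs as plain integers, and lets an instruction with no tracked users (e.g. a store) demand its own VL. `Inst`, `computeDemandedVLs`, and `reduceVLs` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Inst {
  unsigned VL;                    // current VL operand
  std::vector<std::size_t> Users; // indices of later instructions reading it
};

// Phase 1: walk in reverse so every user is visited before its def, and
// record the largest VL any user needs from each instruction.
std::vector<unsigned> computeDemandedVLs(const std::vector<Inst> &Insts) {
  std::vector<unsigned> Demanded(Insts.size(), 0);
  for (std::size_t I = Insts.size(); I-- > 0;) {
    const Inst &MI = Insts[I];
    if (MI.Users.empty()) {
      Demanded[I] = MI.VL; // assumption: leaves demand their own VL
      continue;
    }
    unsigned Max = 0;
    for (std::size_t U : MI.Users) {
      assert(U > I && U < Insts.size());
      // A user reads no more than its own VL, and no more than what is
      // demanded of it in turn.
      Max = std::max(Max, std::min(Insts[U].VL, Demanded[U]));
    }
    Demanded[I] = Max;
  }
  return Demanded;
}

// Phase 2: a single forward pass that shrinks any VL larger than demanded.
// Returns how many instructions were changed.
unsigned reduceVLs(std::vector<Inst> &Insts) {
  std::vector<unsigned> Demanded = computeDemandedVLs(Insts);
  unsigned NumReduced = 0;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    if (Demanded[I] < Insts[I].VL) {
      Insts[I].VL = Demanded[I];
      ++NumReduced;
    }
  }
  return NumReduced;
}
```

Each instruction is visited once per phase and each def-use edge once, which is where the linear bound comes from in this model.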
Luke Lau
ff271d04a2
[RISCV][VLOPT] Fix assertion failure across blocks (#124734)
Whilst adding a cross-block test, I encountered an assertion failure in
the second pass where we check the instruction popped off the worklist
is a candidate.

The leaf instruction %c in this case will be added to the worklist when
its VL is VLMAX, but during the first pass it will have its VL reduced
to 1.

Then in the second pass when it's processed via the worklist, isCandidate
will no longer be true due to its VL == 1.

This fixes it by moving the VL == 1 check to tryReduceVL, keeping it
alongside the other VL check for bailing out early as an optimisation.
2025-01-29 11:00:50 +08:00
Luke Lau
c8d3ccfa16 [RISCV] Use llvm::reverse instead of make_range(rbegin, rend). NFC 2025-01-28 16:08:29 +08:00
Luke Lau
cb6f021af2
[RISCV][VLOPT] Remove unnecessary passthru restriction (#124549)
We currently check for passthrus in two places, on the instruction to
reduce in isCandidate, and on the users in checkUsers.

We cannot reduce the VL if an instruction has a user that's a passthru,
because the user will read elements past VL in the tail.

However it's fine to reduce an instruction if it itself contains a
non-undef passthru. Since the VL can only be reduced, not increased, the
previous tail will always remain the same.
2025-01-27 23:54:32 +08:00
Michael Maitland
bf258dbd57
[RISCV][VLOPT] support fp sign injection instructions (#124195) 2025-01-23 16:50:35 -05:00
Michael Maitland
f402e06e7d
[RISCV][VLOPT] Add vector fp min/max instructions to isSupportedInstr (#124196) 2025-01-23 16:47:14 -05:00
Luke Lau
ba3e6f0f0f
[RISCV][VLOPT] Remove dead passthru check in getOperandLog2EEW. NFC (#123911)
We already bail if the user is tied in checkUsers, which is true for all
passthrus. Remove the check in getOperandLog2EEW so that it only worries
about computing the OperandInfo, and leaves the passthru correctness to
checkUsers.
2025-01-23 10:17:39 +08:00
Philip Reames
27ccc99c4f
[RISCV][VLOpt] Minor worklist invariant cleanup [NFC] (#123989)
In retrospect, this probably should have been rolled into #123973. It
seemed more involved when I first decided to split. :)
2025-01-22 14:42:52 -08:00
Michael Maitland
1687aa2a99
[RISCV][VLOPT] Don't reduce if the VL is the same as CommonVL (#123878)
This fixes the slowdown in #123862.
2025-01-22 13:49:54 -05:00
Philip Reames
589593254e
[RISCV][VLOpt] Reorganize visit order and worklist management (#123973)
This implements a suggestion by Craig in PR #123878. We can move the
worklist management out of the per-instruction work and do it once at
the end of scanning all the instructions. This should reduce repeat
visitation of the same instruction when no changes can be made.

Note that this does not remove the inherent O(N^2) in the algorithm.
We're still potentially visiting every user of every def.

I also included a guard for unreachable blocks since that had been
mentioned as a possible cause. It seems we've ruled that out, but
guarding for this case is still a good idea.
2025-01-22 10:42:15 -08:00
Luke Lau
437e1a70ca
[RISCV][VLOPT] Handle tied pseudos in getOperandInfo (#123170)
For .wv widening instructions when checking if the operand is vs1 or
vs2, we take into account whether or not it has a passthru. For tied
pseudos though their passthru is the vs2, and we weren't taking this
into account.
2025-01-16 23:00:13 +08:00
Michael Maitland
e44f03dd4e
[RISCV][VLOPT] Add floating point widening and narrowing bf16 convert support (#122353)
We already have getOperandInfo tests that cover this instruction.
2025-01-13 15:38:03 -05:00
Craig Topper
41e4018f9c
[RISCV][VLOPT] Simplify code by removing extra temporary variables. NFC (#122333)
Just do the conditional operator in the return statement.
2025-01-09 18:05:41 -08:00
Michael Maitland
d0373dbe7c
[RISCV][VLOPT] Add vadc to isSupportedInstr (#122345) 2025-01-09 19:44:40 -05:00
Michael Maitland
04e54cc19f
[RISCV][VLOPT] Add Vector Single-Width Averaging Add and Subtract to isSupportedInstr (#122351) 2025-01-09 19:39:12 -05:00
Michael Maitland
328c3a843f
[RISCV][VLOPT] Add vmerge to isSupportedInstr (#122340) 2025-01-09 16:10:40 -05:00
Craig Topper
b16777afb0
[RISCV] Return MILog2SEW for mask instructions getOperandLog2EEW. NFC (#122332)
The SEW operand for these instructions should have a value of 0. This
matches what was done for vcpop/vfirst.
2025-01-09 11:36:09 -08:00
Michael Maitland
5f70fea79f [RISCV][VLOPT] Add Vector Floating-Point Compare Instructions to getSupportedInstr 2025-01-09 10:50:32 -08:00
Michael Maitland
b419edeec3 [RISCV][VLOPT] Add widening floating point multiply to isSupportedInstr 2025-01-09 10:50:32 -08:00
Michael Maitland
a484fa1d0a [RISCV][VLOPT] Add floating point multiply divide instructions to getSupportedInstr 2025-01-09 10:50:32 -08:00
Michael Maitland
8beb9d393d [RISCV][VLOPT] Add vector widening floating point add subtract instructions to isSupportedInstr 2025-01-09 10:50:31 -08:00
Michael Maitland
c036a9a2c2 [RISCV][VLOPT] Add vector single width floating point add subtract instructions to isSupportedInstr 2025-01-09 10:50:31 -08:00
Michael Maitland
d5145715f7
[RISCV][VLOPT] Add vfirst and vcpop to getOperandInfo (#122295) 2025-01-09 13:31:02 -05:00
Michael Maitland
550841f839
[RISCV][VLOPT] Add fp-reductions to getOperandInfo (#122151) 2025-01-09 09:43:26 -05:00
Michael Maitland
f77a7dd875
[RISCV][VLOPT] Add getOperandInfo for integer and floating point widening reductions (#122176) 2025-01-09 09:35:06 -05:00
Philip Reames
0b4fca5b75
[RISCV][VLOpt] Remove State field from OperandInfo [nfc] (#122160)
We can just use a std::optional to wrap the operand info instead. The
state field is confusing as we have a "partially known" state where EEW
is known and EMUL is nullopt, but it's still "Known".
2025-01-08 12:37:28 -08:00
Philip Reames
983a957768
[RISCV][VLOpt] Consolidate EMUL=SEW/EEW*LMUL logic [NFC] (#122021)
All but one of the cases in tree today have EMUL=EEW/SEW*LMUL. Repeating
this each time is verbose and introduces opportunity for error. (For
instance, the comment associated with vwmul.vv was out of sync with the
code for same.)

Introduce getOperandLog2EEW and move most complexity to it. Then
introduce getOperandInfo as a wrapper around previous, and special case
the one case which requires it.

---------

Co-authored-by: Luke Lau <luke_lau@icloud.com>
2025-01-08 10:58:37 -08:00
Michael Maitland
e93181bf13
[RISCV][VLOPT] Add vector fp-conversion instruction to isSupportedInstr (#122033)
When these instructions are marked nofpexcept, we can optimize them.
There are some added toggles in the output, likely because other
nofpexcept fp instructions are not part of isSupportedInstr yet. We may
want to avoid marking an instruction as isSupported in the future if any
of its FP users are missing nofpexcept to avoid added toggles. However,
we seem to get some GPRs back as a result of this change, which may
outweigh the cost of avoiding extra toggles.

The plan is to follow this patch up with added support for more FP
instructions in the same way. The instructions in this patch are a
natural starting point because they allow us to test with integer
instructions which have good support already.
2025-01-08 13:30:40 -05:00
Michael Maitland
b253a80f54
[RISCV][VLOPT] Add mask load to isSupported and getOperandInfo (#122030)
Add mask store to getOperandInfo since it has the same behavior.
2025-01-07 22:07:57 -05:00
Philip Reames
4c4364869c
[RISCV][VLOpt] Kill all uses of and remove twoTimesVLMUL [NFC] (#122003)
Case analysis:
* EEW=SEW*2, getEMULEqualsEEWDivSEWTimesLMUL(EEW) returns 2 x VLMUL
* EEW=SEW, getEMULEqualsEEWDivSEWTimesLMUL(EEW) returns VLMUL
2025-01-07 15:14:45 -08:00
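The case analysis in 4c4364869c follows from the RVV relation EMUL = (EEW / SEW) * LMUL. A small sketch of that arithmetic is shown here. `Fraction` and `emulFor` are hypothetical, and this is not the signature of getEMULEqualsEEWDivSEWTimesLMUL. LMUL is modelled as a numerator/denominator pair so that fractional values such as mf2 are representable.

```cpp
#include <cassert>
#include <numeric>
#include <utility>

// Hypothetical representation: (numerator, denominator), e.g. {1, 2} = mf2.
using Fraction = std::pair<unsigned, unsigned>;

// EMUL = (EEW / SEW) * LMUL, reduced to lowest terms.
Fraction emulFor(unsigned EEW, unsigned SEW, Fraction LMUL) {
  assert(EEW != 0 && SEW != 0 && LMUL.first != 0 && LMUL.second != 0);
  unsigned Num = EEW * LMUL.first;
  unsigned Den = SEW * LMUL.second;
  unsigned G = std::gcd(Num, Den);
  return {Num / G, Den / G};
}
```

With EEW = 2*SEW the result is twice LMUL, and with EEW = SEW it is LMUL itself, matching the two cases listed in the commit message.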
Michael Maitland
142787d368
[RISCV][VLOPT] Add support for checkUsers when UserMI is a Single-Width Integer Reduction (#120345)
Reductions are weird because for some operands, they are vector
registers but only read the first lane. For these operands, we do not
need to check to make sure the EEW and EMUL ratios match. The EEWs,
however, do need to match.
2025-01-07 17:56:07 -05:00
Michael Maitland
36e4176f1d
[RISCV][VLOPT] Add strided, unit strided, and indexed loads to isSupported (#121705)
Add to getOperandInfo too since that is needed to reduce the VL.
2025-01-07 17:45:06 -05:00
Michael Maitland
8b577043b1
[RISCV][VLOPT] Add vmv.x.s and vfmv.f.s to isVectorOpUsedAsScalarOp (#121588) 2025-01-05 11:19:45 -05:00
Michael Maitland
b48e5f0ff3
[RISCV][VLOPT] Add Vector FP instructions to getOperandInfo (#121609)
Although we cannot reduce the VL of these instructions (i.e. add to
isSupported) we can add them to getOperandInfo to enable optimization
where the FP vector instructions are users. Most of the instructions are
covered by existing tests, and I added tests for the narrowing
conversions because I was a little unsure whether the dest or the source
was 2*SEW and 2*LMUL.
2025-01-05 11:19:08 -05:00
Michael Maitland
50a457d9e8
[RISCV][VLOPT] Add getOperandInfo for saturating signed multiply (#120351)
These instructions are covered by the existing tests. We don't add them to
isSupported because of VXSAT. This decision was made in #120358.
2024-12-30 09:00:27 -05:00
Michael Maitland
3710050566
[RISCV][VLOPT] Set CommonVL as the largest of the users (#120349)
Prior to this patch, we required that all users had the same VL in order
to optimize. But as the FIXME said, we can use the largest VL to
optimize, as long as we can determine what the largest is. This patch
implements the FIXME.
2024-12-19 13:22:31 -05:00
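The "largest VL, as long as we can determine what the largest is" condition from 3710050566 can be sketched as below. This is a simplified model with hypothetical names (`VLOperand`, `isKnownLE`, `largestUserVL`): a VL is either an immediate or a register, two immediates compare by value, and two registers are only comparable when they are the same register.

```cpp
#include <cassert>
#include <optional>
#include <vector>

struct VLOperand {
  bool IsImm;     // true: Value is an immediate; false: a register number
  unsigned Value;
};

// True only when A <= B can be proven in this simplified model.
bool isKnownLE(const VLOperand &A, const VLOperand &B) {
  if (A.IsImm && B.IsImm)
    return A.Value <= B.Value;
  // Same register on both sides; an immediate and a register are incomparable.
  return A.IsImm == B.IsImm && A.Value == B.Value;
}

// Returns the largest user VL, or std::nullopt when some pair of users
// cannot be ordered (in which case no reduction should be attempted).
std::optional<VLOperand> largestUserVL(const std::vector<VLOperand> &UserVLs) {
  if (UserVLs.empty())
    return std::nullopt;
  VLOperand Common = UserVLs.front();
  for (const VLOperand &VL : UserVLs) {
    if (isKnownLE(VL, Common))
      continue;
    if (isKnownLE(Common, VL)) {
      Common = VL;
      continue;
    }
    return std::nullopt; // ordering unknown
  }
  return Common;
}
```

Before this change the equivalent check would have required every user VL to be identical; here differing immediates are accepted and the maximum is used.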