54 Commits

Author SHA1 Message Date
Sacha Coppey
d2f8ba7d6d
[RISCV][NFC] Add generateMCInstSeq in RISCVMatInt (#84462)
This allows to avoid duplicating the code handling the instructions
outputted by `generateInstSeq` when emitting `MCInst`s.
2024-03-23 01:08:13 +08:00
Wang Pengcheng
e179b125fb
[RISCV][NFC] Pass MCSubtargetInfo instead of FeatureBitset in RISCVMatInt (#71770)
The use of `hasFeature` is more descriptive and the callers of
`RISCVMatInt` have no need to call `getFeatureBits()` any more.
2023-11-09 15:15:23 +08:00
Craig Topper
3c0990c188
[RISCV] Generalize the (ADD (SLLI X, 32), X) special case in constant materialization. (#66931)
We don't have to limit ourselves to a shift amount of 32. We can support
other shift amounts that make the upper 32 bits line up.
2023-10-02 13:03:06 -07:00
Craig Topper
c75e3ea4bd
[RISCV] Improve constant materialization by using a sequence that end… (#66943)
…s with 2 addis in some cases.

If the lower 13 bits are something like 0x17ff, we can first materialize
it as 0x1800 followed by an addi to subtract a small offset. This might
be cheaper to materialize since the constant ending in 0x1800 can use a
simm12 immediate for its final addi.
2023-09-27 21:48:41 -07:00
Craig Topper
cbd4596168 Recommmit "[RISCV] Improve contant materialization to end with 'not' if the cons… (#66950)"
With MC test updates.

Original commit message

We can invert the value and treat it as if it had leading zeroes.
2023-09-20 18:51:51 -07:00
Craig Topper
ea064ba6a2 Revert "[RISCV] Improve contant materialization to end with 'not' if the cons… (#66950)"
This reverts commit a8b8e9476451e125e81bd24fbde6605246c59a0e.

Forgot to update MC tests.
2023-09-20 17:05:00 -07:00
Craig Topper
a8b8e94764
[RISCV] Improve contant materialization to end with 'not' if the cons… (#66950)
…tant has leading ones.

We can invert the value and treat it as if it had leading zeroes.
2023-09-20 16:52:51 -07:00
Craig Topper
b41e0fb44f [RISCV] Prevent overflowing the small size of InstSeq in generateInstSeq.
The small size is 8 which is the worst case of the core recursive
algorithm.

The special cases use the core algorithm and append additonal
instructions. We were pushing the extra instructions before checking
the profitability. This could lead to 9 and maybe 10 instructions
in the sequence which overflows the small size.

This patch does the profitability check before inserting the
extra instructions so that we don't create 9 or 10 insruction
sequences.

Alternative we could bump the small size to 9 or 10, but then
we're pushing things that are never going be used.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D152308
2023-06-12 15:29:03 -07:00
Craig Topper
bb10612587 [RISCV] Use PACK in RISCVMatInt for constants that have the same lower and upper 32 bits.
This requires Zbkb.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D152293
2023-06-06 13:30:33 -07:00
Craig Topper
4f5f38bdab [RISCV] Add early out to generateInstSeq when the initial sequence is 1 or 2 instructions.
This avoids checking the size of the sequence repeatedly for each
special case. Especially on RV32 where none of the special cases
apply.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D152300
2023-06-06 13:23:17 -07:00
Craig Topper
03bc33c809 Revert "[RISCV] Minor readability improvement to RISCVMatInt. NFC"
This reverts commit 1ebe06017df607d4fc140f6b166e35cd32fc5f16.

I've been informed the old way was documented in the psABI.
2023-06-05 23:22:28 -07:00
Craig Topper
1ebe06017d [RISCV] Minor readability improvement to RISCVMatInt. NFC
When splitting a simm32 into LUI+ADDI(W). Subtract Lo12 from Val
to calculate Hi20. This replaces the old method of adding 0x800 to
Val. This change makes the math the reverse of how the LUI+ADDI(W)
create the immediate.
2023-06-05 23:07:35 -07:00
Craig Topper
dc5679df71 [RISCV] Rename FeatureExtZc* to FeatureStdExtZc*. NFC
Even for experimental extensions, I think we always include "Std"
in the feature name.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D146997
2023-03-27 13:19:01 -07:00
Kazu Hirata
639b7865a6 [RISCV] Use llvm::rotl (NFC) 2023-02-13 20:16:48 -08:00
Philipp Tomsich
fc02eeb24f [RISCV] Add vendor-defined XTheadBb (basic bit-manipulation) extension
The vendor-defined XTHeadBb (predating the standard Zbb extension)
extension adds some bit-manipulation extensions with somewhat similar
semantics as some of the Zbb instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for XTHeadBb is
available from:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=8254c3d2c94ae5458095ea6c25446ba89134b9da

Depends on D143036

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D143439
2023-02-13 17:02:09 +01:00
Philipp Tomsich
b0c3132226 Revert "[RISCV] Add vendor-defined XTheadBb (basic bit-manipulation) extension"
This reverts commit 19a59099095b3cbc9846e5330de26fca0a44ccbe.
2023-02-08 08:00:34 +01:00
Philipp Tomsich
19a5909909 [RISCV] Add vendor-defined XTheadBb (basic bit-manipulation) extension
The vendor-defined XTHeadBb (predating the standard Zbb extension)
extension adds some bit-manipulation extensions with somewhat similar
semantics as some of the Zbb instructions.

It is supported by the C9xx cores (e.g., found in the wild in the
Allwinner D1) by Alibaba T-Head.

The current (as of this commit) public documentation for XTHeadBb is
available from:
  https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.2.2/xthead-2023-01-30-2.2.2.pdf

Support for these instructions has already landed in GNU Binutils:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=8254c3d2c94ae5458095ea6c25446ba89134b9da

Depends on D143036

Differential Revision: https://reviews.llvm.org/D143439
2023-02-08 07:57:27 +01:00
Kazu Hirata
e078201835 [Target] Use llvm::count{l,r}_{zero,one} (NFC) 2023-01-28 09:23:07 -08:00
Kazu Hirata
b3af04f880 [llvm] Use llvm::countr_zero instead of findFirstSet (NFC)
At each call to findFirstSet in this patch, the argument is known to
be nonzero, so we can safely switch to llvm::countr_zero, which will
become std::countr_zero once C++20 is available.
2023-01-23 23:07:41 -08:00
Kazu Hirata
caa99a01f5 Use llvm::popcount instead of llvm::countPopulation(NFC) 2023-01-22 12:48:51 -08:00
Kazu Hirata
83d56fb17a Drop the ZeroBehavior parameter from countLeadingZeros and the like (NFC)
This patch drops the ZeroBehavior parameter from bit counting
functions like countLeadingZeros.  ZeroBehavior specifies the behavior
when the input to count{Leading,Trailing}Zeros is zero and when the
input to count{Leading,Trailing}Ones is all ones.

ZeroBehavior was first introduced on May 24, 2013 in commit
eb91eac9fb866ab1243366d2e238b9961895612d.  While that patch did not
state the intention, I would guess ZeroBehavior was for performance
reasons.  The x86 machines around that time required a conditional
branch to implement countLeadingZero<uint32_t> that returns the 32 on
zero:

        test    edi, edi
        je      .LBB0_2
        bsr     eax, edi
        xor     eax, 31
.LBB1_2:
        mov     eax, 32

That is, we can remove the conditional branch if we don't care about
the behavior on zero.

IIUC, Intel's Haswell architecture, launched on June 4, 2013,
introduced several bit manipulation instructions, including lzcnt and
tzcnt, which eliminated the need for the conditional branch.

I think it's time to retire ZeroBehavior as its utility is very
limited.  If you care about compilation speed, you should build LLVM
with an appropriate -march= to take advantage of lzcnt and tzcnt.
Even if not, modern host compilers should be able to optimize away
quite a few conditional branches because the input is often known to
be nonzero from dominating conditional branches.

Differential Revision: https://reviews.llvm.org/D141798
2023-01-18 19:58:44 -08:00
Craig Topper
564e09c778 [RISCV] Use bseti for 2048 in RISCVMatInt when Zbs is enabled.
2048 requires an LUI and ADDI instruction due to ADDI using a
signed immediate. It can also be done with C.LI+C.SLLI for better
code size.

With Zbs we can use a single BSETI to have an instruction.

Reorder the checks so that BSETI is checked first, with an extra
qualification to prefer a single LUI or ADDI when possible. I'm
continuing to think about other ways to structure this code, but
this works for now.

Fixes PR59362.
2022-12-07 20:14:22 -08:00
Craig Topper
f2ffdbeb9c [RISCV] Add accessors to RISCVMatInt::Inst.
Make fields private. This helps hide that the Imm field doesn't
store a full int64_t.
2022-12-07 19:02:01 -08:00
Craig Topper
2c52d516da Revert "[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC"
This reverts commit d24915207c631b7cf637081f333b41bc5159c700.

Thinking about this more this probably chewed up 100+ bytes of stack
for each recursive call. So this probably needs more thought. The
code simplification wasn't that much.
2022-12-07 12:59:31 -08:00
Craig Topper
938d0d6d7b [RISCV] Replace uses of hasStdExtC with COrZca.
Except MakeCompressible which will need more work.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139504
2022-12-07 09:34:01 -08:00
Craig Topper
d6cfdf0440 [RISCV] Pass ZB_Undefined to countTrailingZeros/countLeadingZeros. NFC
We know the input is not zero so we can simplify the generated code.
2022-12-06 14:57:28 -08:00
Craig Topper
d24915207c [RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC
We should be able to rely on RVO here.
2022-12-06 14:57:27 -08:00
Craig Topper
1806ce9097 [RISCV] Teach RISCVMatInt to prefer li+slli over lui+addi(w) for compressibility.
With C extension, li with a 6 bit immediate followed by slli is 4 bytes.
The lui+addi(w) sequence is at least 6 bytes.

The two sequences probably have similar execution latency. The exception
being if the target supports lui+addi(w) macrofusion.

Since the execution latency is probably the same I didn't restrict
this to C extension.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D139135
2022-12-06 10:31:17 -08:00
Craig Topper
ce66f4d0a6 [RISCV] Restrict when RISCVMatInt will retry SLLI as a last step. NFC
The main algorithm will already end with a SLLI when there are 12
or more trailing zeros. We only need to retry when there are less
than 12 trailing zeros since the main algorithm will pick an ADDI
or ADDIW at the end for those cases.
2022-12-06 09:25:20 -08:00
Craig Topper
dd3fe52492 [RISCV] Remove some RISCVMatInt early exits.
These were early exiting if we replaced a sequence with a 2 instruction
sequence since that is the best we could do. All the later optimizations
only occur if the sequence is more than 2 instructions so this wasn't a
functional check.

At best it helps the compiler generate better code, but I don't think
that was analyzed when it was added. Remove it to simplify the code.
2022-12-05 16:29:16 -08:00
Craig Topper
47ff3042e7 [RISCV] Use findFirstSet instead of countTrailingZeros. NFC
findFirstSet is a wrapper around countTrailingZeros so they are
equivalent here, but I think findFirstSet more cleary describes
the algorithm here.
2022-12-04 18:00:36 -08:00
Craig Topper
c8c1d7afa9 [RISCV] Use emplace_back to shorten lines in RISCVMatInt. NFC
A few other minor improvements.
2022-12-04 18:00:27 -08:00
jacquesguan
0fe5f03eeb [RISCV][NFC] Use nested namespace definations.
Since we use C++17 now, we could use nested namespace definations to simplify code.

Differential Revision: https://reviews.llvm.org/D131751
2022-08-13 09:56:59 +08:00
Craig Topper
d2ee2c9c8d [RISCV] Add an operand kind to the opcode/imm returned from RISCVMatInt.
Instead of matching opcodes to know the format to emit, use an
enum value that we can get from the RISCVMatInt::Inst class.

Change the consumers to use fully covered switches so that we get
a compiler warning if a new kind is added. With the opcode checks
it was easier to forget to update one of the 3 consumers.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D126317
2022-05-24 14:56:29 -07:00
Craig Topper
5c38373125 [RISCV] Improve constant materialization for cases that can use LUI+ADDI instead of LUI+ADDIW.
It's possible that we have a constant that isn't simm32 so we can't
use LUI+ADDIW, but we can use LUI+ADDI. Because ADDI uses a sign
extended constant, it's possible that after subtracting it out, we
end up with a simm32 that maps to LUI.

This patch detects this case after removing Lo12 and before shifting
the value for SLLI.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D124222
2022-04-29 08:58:32 -07:00
Craig Topper
9534811aa8 [RISCV] Teach generateInstSeqImpl to generate BSETI for single bit cases.
If the immediate has one bit set, but isn't a simm32 we can try
the BSETI instruction from Zbs.
2022-04-21 12:08:34 -07:00
Craig Topper
98b866892d [RISCV] Add special case to constant materialization to remove trailing zeros first.
If there are fewer than 12 trailing zeros, we'll try to use an ADDI
at the end of the sequence. If we strip trailing zeros and end the
sequence with a SLLI we might find a shorter sequence.

Differential Revision: https://reviews.llvm.org/D124148
2022-04-21 09:43:32 -07:00
Craig Topper
186d5c8af5 [RISCV] Make getInstSeqCost handle other Zb* instructions.
We haven't been updating this as Zb* instructions have been used
for immediate materialization. They will hit the default case and
trigger an llvm_unreachable. Instead of trying to list them all,
assume instructions that aren't explicitly listed aren't compressible.

Spotted while looking at integer materialization for other reasons.
I haven't seen a crash from this yet.
2022-04-20 22:08:04 -07:00
Craig Topper
70046438d0 [RISCV] Only try LUI+SH*ADD+ADDI for int materialization if LUI+ADDI+SH*ADD failed.
There's an assert in LUI+SH*ADD+ADDI materialization that makes sure the
lower 12 bits aren't zero since that case should have been handled as
LUI+ADDI+SH*ADD. But nothing prevented the LUI+SH*ADD+ADDI checks from
running after the earlier code handled it.

The sequence would be the same length or longer so it wouldn't replace
the earlier sequence, but the assert happened before that was checked.

The vector holding the sequence also wasn't reset before the second
check so that guaranteed the sequence would never be found to be
shorter.

This patch fixes this by only trying the second expansion when the
earlier fails.

Fixes PR54812.

Reviewed By: benshi001

Differential Revision: https://reviews.llvm.org/D123406
2022-04-09 08:52:15 -07:00
Alex Bradbury
588f121ada [RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend
Where the instruction mnemonic contains a dot, we name the corresponding
instruction in the .td file using a _ in the place of the dot. e.g. LR_W
rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that
convention.
2022-01-28 15:20:37 +00:00
Baoshan Pang
af931a51b9 [RISCV] Materializing constants with 'rori'
Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D116574
2022-01-07 15:39:22 -08:00
Ben Shi
4c3d916c4b [RISCV] Optimize immediate materialisation with SH*ADD
Use LUI+SH*ADD+ADDI to compose specific immediates.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D113568
2021-11-15 23:34:28 +00:00
Ben Shi
97e52e1c35 [RISCV] Optimize immediate materialisation with SLLI.UW in the Zba extension
Simplify "LUI+SLLI+ADDI+SLLI" and "LUI+ADDIW+SLLI+ADDI+SLLI" to
"LUI+ADDIW+SLLIUW" to reduce total instruction amount.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111933
2021-10-27 02:48:38 +00:00
Ben Shi
4fe5ab4b00 [RISCV] Optimize immediate materialisation with SH*ADD
Use SH1ADD/SH2ADD/SH3ADD along with LUI+ADDI to compose int32*3,
int32*5 and int32*9.

Reviewed By: craig.topper, luismarques

Differential Revision: https://reviews.llvm.org/D111484
2021-10-15 06:46:41 +00:00
Ben Shi
7e81526126 [RISCV] Optimize immediate materialisation with BSETI/BCLRI
Opitimize immediate materialisation in the following way if profitable:
1. Use BCLRI for upper 32 bits if the lower 32 bits are negative int32.
2. Use BSETI for upper 32 bits if the lower 32 bits are positive int32.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111508
2021-10-14 04:56:47 +00:00
Ben Shi
481db13fec [RISCV] Optimize immediate materialisation with SLLI.UW
Use LUI+SLLI.UW to compose the upper bits instead of LUI+SLLI.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111705
2021-10-14 02:24:50 +00:00
Ben Shi
787eeb8597 [RISCV] Optimize immediate materialisation with BCLRI
Do the following optimization for immediate materialisation:

1. For values in range 0xffffffff 7fffffff ~ 0xffffffff 00000000, first
   generate the lower 32-bit with Val|0x80000000 (which is expected be an
   int32), then emit (BCLRI r, 31).

2. For values in range 0x80000000 ~ 0xffffffff, first generate the lower
   32-bit with Val&~0x80000000 (which is expected to be an int32), then
   emit (BSETI r, 31).

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D111532
2021-10-13 00:59:23 +00:00
Jim Lin
f29336104d [RISCV] Rename prefix FeatureExt* to FeatureStdExt* for all sub-extension
Rename prefix `FeatureExt*` to `FeatureStdExt*` for all sub-extension for consistency

Reviewed By: HsiangKai, asb

Differential Revision: https://reviews.llvm.org/D108187
2021-09-13 16:24:15 +08:00
Alexander Pivovarov
1104e3258b Fix typo in RISCVMatInt.cpp comments 2021-09-02 18:11:09 -07:00
Craig Topper
81efb82570 [RISCV] Teach RISCVMatInt about cases where it can use LUI+SLLI to replace LUI+ADDI+SLLI for large constants.
If we need to shift left anyway we might be able to take advantage
of LUI implicitly shifting its immediate left by 12 to cover part
of the shift. This allows us to use more bits of the LUI immediate
to avoid an ADDI.

isDesirableToCommuteWithShift now considers compressed instruction
opportunities when deciding if commuting should be allowed.

I believe this is the same or similar to one of the optimizations
from D79492.

Reviewed By: luismarques, arcbbb

Differential Revision: https://reviews.llvm.org/D105417
2021-07-20 09:22:06 -07:00