66 Commits

Author SHA1 Message Date
Craig Topper
5d03235c73
[RISCV] Add -mcpu=sifive-p550. (#122164)
This is the CPU in SiFive's HiFive Premier P550 development board.

Scheduler model will come in a later patch.
2025-01-08 21:02:46 -08:00
Petr Penzin
e934a39e01
[RISC-V] Base scheduling model for tt-ascalon-d8 (#120160)
First part of tt-ascalon-d8 scheduling model, only containing scalar
ops. Scheduling for vector instructions will be added in a follow-up
patch.

---------

Co-authored-by: Anton Blanchard <antonb@tenstorrent.com>
Co-authored-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
2024-12-20 15:30:17 -05:00
Djordje Todorovic
3222060124
Reland "[RISCV] Add scheduling model for mips p8700 CPU" (#120550)
This patch introduces a scheduling model for the MIPS p8700, an
out-of-order RISC-V processor. The model includes pipelines for the
following units:

- 2 Integer Arithmetic/Logical Units (ALU and AL2)
- Multiply/Divide Unit (MDU)
- Branch Unit (CTI)
- Load/Store Unit (LSU)
- Short Floating-Point Pipe (FPUS)
- Long Floating-Point Pipe (FPUL)

For additional details, refer to the official product page:
https://mips.com/products/hardware/p8700/.

Also adds `UnsupportedSchedZfhmin` to handle cases like
`WriteFCvtF16ToF32` that previously caused build failures.
2024-12-19 14:26:43 +01:00
Djordje Todorovic
9fa109a508
Revert "[RISCV] Add scheduling model for mips p8700 CPU" (#120537)
Reverts llvm/llvm-project#119885

llvm-project/llvm/lib/Target/RISCV/RISCVSchedMIPSP8700.td:20:5:
error: Processor does not define resources for WriteFCvtF32ToF16
def MIPSP8700Model : SchedMachineModel {
2024-12-19 10:01:46 +01:00
Djordje Todorovic
0f9257b9ab
[RISCV] Add scheduling model for mips p8700 CPU (#119885)
Depends on #119882.
2024-12-19 09:52:16 +01:00
Pengcheng Wang
9571d2023b
[RISCV] Add tune info for postra scheduling direction (#115864)
The results differ on different platforms so it is really hard to
determine a common default value.
    
Tune info for postra scheduling direction is added so that CPUs can
set their own preferred postra scheduling direction.
2024-12-16 12:18:38 +08:00
Djordje Todorovic
52e9f2c52c
[RISCV] Add MIPS P8700 processor (#119882)
The P8700 is a high-performance processor from MIPS designed to meet the
demands of modern workloads, offering exceptional scalability and
efficiency. It builds on MIPS's established architectural strengths
while introducing enhancements that set it apart. For more details, you
can check out the official product page here:
https://mips.com/products/hardware/p8700/.

Scheduling model will be added in a separate commit/PR.
2024-12-13 20:54:25 +01:00
Pengcheng Wang
35619c791d
[RISCV] Add tune info for mem* expansion (#118439)
So that CPUs can tune these options.
2024-12-06 14:48:37 +08:00
Pengcheng Wang
d36a4c0715
[RISCV] Rename some Feature* to Tune* (#117966)
These features should be tune features.
2024-11-28 15:01:49 +08:00
Felipe Magno de Almeida
e3fdc3aa81
[RISCV] Allow hoisting VXRM writes out of loops speculatively (#110044)
Change the intersect operation of the anticipation analysis to ignore
unknown states when anticipating. This effectively allows VXRM writes to
be performed speculatively, because a VXRM write may now be executed even
on paths where VXRM is not needed.

This change matters because VXRM writes cause pipeline flushes on some
micro-architectures, so it makes sense to allow more aggressive hoisting
even if it causes some degradation on the slow path.

An example is this code:
```
typedef unsigned char uint8_t;
__attribute__ ((noipa))
void foo (uint8_t *dst,  int i_dst_stride,
           uint8_t *src1, int i_src1_stride,
           uint8_t *src2, int i_src2_stride,
           int i_width, int i_height )
{
   for( int y = 0; y < i_height; y++ )
     {
       for( int x = 0; x < i_width; x++ )
         dst[x] = ( src1[x] + src2[x] + 1 ) >> 1;
       dst  += i_dst_stride;
       src1 += i_src1_stride;
       src2 += i_src2_stride;
     }
}
```
With this patch, the VXRM write for the code above is hoisted out of
the outer loop.
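
To illustrate the changed intersect, here is a minimal sketch. The
`VxrmInfo` type and the `intersectStrict`/`intersectAnticipated` names are
hypothetical and simplified, not the code from this patch. The sketch only
shows how ignoring an unknown state on one path lets a known VXRM value on
the other path still be anticipated.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified lattice value: the VXRM setting that a program
// point wants. Names are illustrative and not taken from the patch.
struct VxrmInfo {
  enum class State { Uninitialized, Static, Unknown };
  State state;
  std::uint8_t value; // meaningful only when state == State::Static

  VxrmInfo() : state(State::Uninitialized), value(0) {}
  VxrmInfo(State s, std::uint8_t v) : state(s), value(v) {}

  static VxrmInfo makeStatic(std::uint8_t v) {
    assert(v < 4 && "vxrm is a 2-bit field");
    return VxrmInfo(State::Static, v);
  }
  static VxrmInfo makeUnknown() { return VxrmInfo(State::Unknown, 0); }

  bool operator==(const VxrmInfo &o) const {
    return state == o.state && (state != State::Static || value == o.value);
  }
};

// Strict meet: an Unknown on either side, or two different static values,
// makes the result Unknown.
inline VxrmInfo intersectStrict(const VxrmInfo &a, const VxrmInfo &b) {
  if (b.state == VxrmInfo::State::Uninitialized) return a;
  if (a.state == VxrmInfo::State::Uninitialized) return b;
  if (a == b) return a;
  return VxrmInfo::makeUnknown();
}

// Speculative meet: an Unknown on one side is ignored, so a static request
// on the other side can still be anticipated (and hoisted).
inline VxrmInfo intersectAnticipated(const VxrmInfo &a, const VxrmInfo &b) {
  if (b.state == VxrmInfo::State::Uninitialized) return a;
  if (a.state == VxrmInfo::State::Uninitialized) return b;
  if (a.state == VxrmInfo::State::Unknown) return b;
  if (b.state == VxrmInfo::State::Unknown) return a;
  if (a == b) return a;
  return VxrmInfo::makeUnknown();
}
```

With the strict meet, a path that does not care about VXRM blocks
anticipation; with the speculative meet, only two conflicting static values
do.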
2024-11-27 13:31:39 -08:00
Pengcheng Wang
4da960b898 [RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get this information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
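
As a rough sketch of the idea, a CPU name can be matched against the three
probed IDs with a table lookup. Everything below is hypothetical: the table
entries, the ID values, and the `lookupCpuName`/`cpuIs` helpers are invented
placeholders, and this is not how `__builtin_cpu_is` is actually implemented.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// One row per CPU definition: a name plus the three identification CSRs.
struct CpuEntry {
  const char *name;
  std::uint64_t mvendorid;
  std::uint64_t marchid;
  std::uint64_t mimpid;
};

// Placeholder table. The names and ID values are invented for illustration
// and do not correspond to any real processor.
static const CpuEntry kCpuTable[] = {
    {"example-core-a", 0x1, 0x10, 0x100},
    {"example-core-b", 0x1, 0x20, 0x100},
};

// Returns the matching CPU name, or nullptr if no entry matches all three IDs.
inline const char *lookupCpuName(std::uint64_t mvendorid,
                                 std::uint64_t marchid,
                                 std::uint64_t mimpid) {
  for (const CpuEntry &entry : kCpuTable)
    if (entry.mvendorid == mvendorid && entry.marchid == marchid &&
        entry.mimpid == mimpid)
      return entry.name;
  return nullptr;
}

// True if the probed IDs identify the CPU called `name`.
inline bool cpuIs(const char *name, std::uint64_t mvendorid,
                  std::uint64_t marchid, std::uint64_t mimpid) {
  assert(name != nullptr);
  const char *found = lookupCpuName(mvendorid, marchid, mimpid);
  return found != nullptr && std::strcmp(found, name) == 0;
}
```

In a real implementation the three values would come from the hwprobe
interface rather than being passed in by hand.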
2024-11-22 22:58:54 +08:00
Mikhail Goncharov
d1dae1e861 Revert "[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)" chain
This reverts commit b36fcf4f493ad9d30455e178076d91be99f3a7d8.
This reverts commit c11b6b1b8af7454b35eef342162dc2cddf54b4de.
This reverts commit 775148f2367600f90d28684549865ee9ea2f11be.

multiple bot build breakages, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/8076
2024-11-22 14:09:13 +01:00
Pengcheng Wang
775148f236
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get this information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 19:54:45 +08:00
Petr Penzin
41c86ca714
[RISCV] Add TT-Ascalon-d8 processor (#115100)
Ascalon is an out-of-order CPU core from Tenstorrent. Overview:
https://tenstorrent.com/ip/tt-ascalon

Adding 8-wide version, -mcpu=tt-ascalon-d8. Scheduling model will be
added in a separate PR.

---------

Co-authored-by: Anton Blanchard <antonb@tenstorrent.com>
2024-11-19 14:20:55 -08:00
Luke Lau
5a16ed96c5
[RISCV] Add +unaligned-scalar-mem to spacemit-x60 (#115125)
I can't find any official documentation on this, but from other
discussions[^1] and my own testing the spacemit-x60 seems to support
unaligned scalar loads and stores.

They seem to be performant, and just from a quick test we get a 2.45%
speedup on 500.perlbench_r on the Banana Pi F3[^2].

This would allow it to take advantage of #107548.

[^1]:
https://github.com/llvm/llvm-project/issues/110454#issuecomment-2382199460
[^2]: https://lnt.lukelau.me/db_default/v4/nts/32
2024-11-06 18:49:21 +08:00
Luke Lau
beb12f92c7
[RISCV] Add +optimized-nfN-segment-load-store (#114414)
This is a follow up to #111511, where after benchmarking we learnt that
the Banana Pi F3 has fast segmented loads for not just NF=2, but also
NF=3 and NF=4:
https://github.com/preames/bp3-microarch#vlseg_lmul_x_sew_throughput

This adds tuning features to allow these segment loads and stores to be
costed cheaper and enables it for the spacemit-x60.

It also enables +optimized-nf2-segment-load-store by default in the
generic tuning to maintain the previous behaviour when compiled without
-mcpu or -mtune.
2024-11-04 06:43:58 +08:00
Anton Sidorenko
09fc178180
[RISCV] Add scheduling model for Syntacore SCR7 (#108814)
Syntacore SCR7 is rv64imafdcv_zba_zbb_zbc_zbs_zkn.
Scheduling model for RVV will be added later.
Overview: https://syntacore.com/products/scr7

---------

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
Co-authored-by: Elena Lepilkina <elena.lepilkina@syntacore.com>
2024-09-17 18:52:55 +03:00
Anton Sidorenko
dbdf84388a
[RISCV] Add Syntacore SCR7 processor definition (#108406)
Syntacore SCR7 is a high-performance Linux-capable RISC-V processor
core.
The core has rv64imafdcv_zba_zbb_zbc_zbs_zkn march.
Overview: https://syntacore.com/products/scr7

Scheduling model will be added in a subsequent PR.

---------

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
Co-authored-by: Elena Lepilkina <elena.lepilkina@syntacore.com>
2024-09-16 13:09:37 +03:00
Sam Elliott
9fa2386ff1
[RISCV] Add Hazard3 Core as taped out for RP2350 (#102452)
Luke Wren's Hazard3 is a configurable, open-source 32-bit RISC-V core.
The core's source code and docs are available on github:
https://github.com/wren6991/hazard3

This is the RISC-V core used in the RP2350, a recently announced SoC by
Raspberry Pi (which also contains Arm cores):
https://datasheets.raspberrypi.com/rp2350/rp2350-datasheet.pdf

We have agreed to name this `-mcpu` option `rp2350-hazard3`, and it
reflects exactly the options configured in the RP2350 chips. Notably,
Zbc is not configured, and neither is B, since the `misa.B` bit is not
configured either.
2024-08-21 08:45:45 +01:00
Anton Sidorenko
5ab99bf1a7
[RISCV] Add scheduling model for Syntacore SCR4 and SCR5 (#102909)
Syntacore SCR4 is a microcontroller-class processor core that has much
in common with SCR3, but also supports F and D extensions.
Overview: https://syntacore.com/products/scr4

Syntacore SCR5 is an entry-level Linux-capable 32/64-bit RISC-V
processor core whose scheduling model almost matches that of SCR4.
Overview: https://syntacore.com/products/scr5

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
2024-08-14 11:42:31 +03:00
Anton Sidorenko
02645d66f9
[RISCV] Add Syntacore SCR5 RV32/64 processors definition (#102285)
Syntacore SCR5 is an entry-level Linux-capable 32/64-bit RISC-V
processor core.
Overview: https://syntacore.com/products/scr5

Scheduling model will be added in a subsequent PR.

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
2024-08-09 16:02:27 +03:00
Craig Topper
898d6eb7be
[RISCV] Use RVA22U64Features in the definition of sifive-p450 and sifive-p670. (#102350)
This matches sifive-p470.

RVA22U64Features includes the Zicntr extension which was not present for
these CPUs before. I believe that was a mistake due to weird history of
the Zicntr extension. I've updated the p470 test accordingly since this
was missed there too.
2024-08-07 23:17:42 -07:00
Michael Maitland
0c25f85e5b
[RISCV] Add sifive-p470 processor (#102022)
This is an OOO core that has a vector unit. For more information see
https://www.sifive.com/cores/performance-p450-470.

Use the existing P400 scheduler model. This model is missing accurate
vector scheduling support, but it will be added in a follow up patch.

Other tunings can come in future patches too.
2024-08-07 08:30:42 -04:00
Anton Sidorenko
9884fd33db
[RISCV] Add Syntacore SCR4 RV32/64 processors definition (#101321)
Syntacore SCR4 is a microcontroller-class processor core that has much
in common with SCR3. The most significant difference for compilers is F
and D extensions support. Overview: https://syntacore.com/products/scr4

Two CPUs are added:
  * 'syntacore-scr4-rv32' -- rv32imfdc
  * 'syntacore-scr4-rv64' -- rv64imafdc

Scheduling model will be added in a separate PR.

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
2024-08-05 17:26:05 +03:00
Pengcheng Wang
27b608055f
[RISCV] Increase default tail duplication threshold to 6 at -O3 (#98873)
This is just like AArch64.

Changing the threshold to 6 will increase the code size, but will
also decrease unconditional branches. CPUs with wide fetch/issue units
can benefit from it.

The value 6 may be debatable; alternatively, we could set it to `SchedModel.IssueWidth`.
2024-08-01 12:24:25 +08:00
Craig Topper
9086f9df6b
[RISCV] Remove feature implication from TuneSiFive7. (#100694)
Add all the implied features directly to the SiFive7 CPUs' tuning feature
list instead.

The implication is dangerous because explicitly disabling any of the
implied features through the command line would also clear the SiFive7
feature bit.
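
The hazard can be sketched with a small feature-bit model. The feature
names and helpers below are hypothetical and only mimic the general rule
described above: disabling a feature also clears every feature that
implies it.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical feature indices; none of these names come from the backend.
enum Feature : std::size_t {
  TuneCoreImplying, // CPU tune feature that implies TuneOptA and TuneOptB
  TuneCoreExplicit, // CPU tune feature with no implications
  TuneOptA,
  TuneOptB,
  NumFeatures
};

using FeatureBits = std::bitset<NumFeatures>;

// The set of features directly implied by feature f.
inline FeatureBits impliedBy(std::size_t f) {
  FeatureBits implied;
  if (f == TuneCoreImplying) {
    implied.set(TuneOptA);
    implied.set(TuneOptB);
  }
  return implied;
}

// Enabling a feature also enables everything it implies.
inline void enableFeature(FeatureBits &bits, std::size_t f) {
  bits.set(f);
  const FeatureBits implied = impliedBy(f);
  for (std::size_t i = 0; i < NumFeatures; ++i)
    if (implied.test(i) && !bits.test(i))
      enableFeature(bits, i);
}

// Disabling a feature also disables every enabled feature that implies it.
inline void disableFeature(FeatureBits &bits, std::size_t f) {
  bits.reset(f);
  for (std::size_t i = 0; i < NumFeatures; ++i)
    if (bits.test(i) && impliedBy(i).test(f))
      disableFeature(bits, i);
}

// CPU defined through an implying feature, then the user disables TuneOptA.
inline FeatureBits implicationThenDisableA() {
  FeatureBits bits;
  enableFeature(bits, TuneCoreImplying);
  disableFeature(bits, TuneOptA);
  assert(!bits.test(TuneCoreImplying) && "the CPU feature bit is lost");
  return bits;
}

// CPU defined with an explicit feature list, then the user disables TuneOptA.
inline FeatureBits explicitListThenDisableA() {
  FeatureBits bits;
  enableFeature(bits, TuneCoreExplicit);
  enableFeature(bits, TuneOptA);
  enableFeature(bits, TuneOptB);
  disableFeature(bits, TuneOptA);
  assert(bits.test(TuneCoreExplicit) && "the CPU feature bit survives");
  return bits;
}
```

In the first case the core's own bit disappears together with the disabled
option; listing the features explicitly keeps them independent.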
2024-07-26 09:12:12 -07:00
Philip Reames
b5657d6dc7
[RISCV] Reverse default assumption about performance of vlseN.v vd, (rs1), x0 (#98205)
Some cores implement an optimization for a strided load with an x0
stride, which results in fewer memory operations being performed than
implied by VL, since all addresses are the same. This appears to be true
of only a minority of available implementations.
We know that sifive-x280 does, but sifive-p670 and spacemit-x60 both do
not.

(To be more precise, measurements on the x60 appear to indicate that a
 stride of x0 has similar latency to a non-zero stride, and that both
 are about twice a vleN.v.  I'm taking this to mean the x0
 case is not optimized.)

We had an existing flag by which a processor could opt out of this
assumption but no upstream users. Instead of adding this flag to the
p670 and x60, this patch reverses the default and adds the opt-in flag
only to the x280.
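
For readers unfamiliar with the instruction, a scalar reference model of a
byte-element strided load may help. This is only a sketch: the function
names are hypothetical, and it models the architectural result, not any
particular core's memory pipeline.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a byte-element strided load: element i is read from
// base + i * strideBytes. With a zero stride every element reads base[0].
inline std::vector<std::uint8_t> stridedLoadModel(const std::uint8_t *base,
                                                  std::ptrdiff_t strideBytes,
                                                  std::size_t vl) {
  assert((base != nullptr || vl == 0) && "need memory to read from");
  std::vector<std::uint8_t> result;
  result.reserve(vl);
  for (std::size_t i = 0; i < vl; ++i)
    result.push_back(base[static_cast<std::ptrdiff_t>(i) * strideBytes]);
  return result;
}

// Number of distinct addresses the model touches. A core that optimizes the
// zero-stride case could perform a single access instead of vl accesses.
inline std::size_t distinctAddresses(std::ptrdiff_t strideBytes,
                                     std::size_t vl) {
  if (vl == 0)
    return 0;
  return strideBytes == 0 ? 1 : vl;
}
```

The zero-stride form is effectively a broadcast of one memory element, which
is why its cost relative to a unit-stride load matters for code generation.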
2024-07-10 07:35:56 -07:00
Anton Sidorenko
2d84e0ffef
[RISCV] Add scheduling model for Syntacore SCR3 (#95427)
Syntacore SCR3 is a microcontroller-class processor core. Overview:
https://syntacore.com/products/scr3

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
2024-06-25 11:34:59 +03:00
Anton Sidorenko
d59a4cac5f
[RISCV] Add Syntacore SCR3 processor definition (#95953)
Syntacore SCR3 is a microcontroller-class processor core. Overview:
https://syntacore.com/products/scr3
This PR introduces two CPUs:
  * 'syntacore-scr3-rv32' which is rv32imc
  * 'syntacore-scr3-rv64' which is rv64imac

---------

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
2024-06-21 11:40:10 +03:00
Shao-Ce SUN
aede380210
[RISCV] Add processor definition for SpacemiT-X60 (#94564)
SpacemiT-X60 is an RVV 1.0 core integrated into the SpacemiT-K1, an
8-core SoC, and it is incorporated into the BPi-F3 development board.

Information about the supported extensions was obtained from this
[document](https://developer.spacemit.com/#/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb).

BPi-F3 Datasheet:
https://docs.banana-pi.org/en/BPI-F3/SpacemiT_K1_datasheet
Spacemit-K1 Datasheet:
https://developer.spacemit.com/#/documentation?token=DBd4wvqoqi2fiqkiERTcbEDknBh
2024-06-18 21:56:26 +08:00
Michael Maitland
66b5f16b2f
[RISCV] Do not check PostRAScheduler in enablePostRAScheduler (#92781)
On RISC-V, there are a few ways to control whether the
PostMachineScheduler is enabled. If `-enable-post-misched` is passed or
passed with a value of true, then the PostMachineScheduler is enabled.
If it is passed with a value of false then the PostMachineScheduler is
disabled. If the option is not passed at all, then
`RISCVSubtarget::enablePostRAMachineScheduler` decides whether the pass
should be enabled or not. `TargetSubtargetInfo::enablePostRAScheduler`
and `TargetSubtargetInfo::enablePostRAMachineScheduler`, which check the
SchedModel value, are not called by the RISC-V backend.

`RISCVSubtarget::enablePostRAMachineScheduler` currently checks if the
active scheduler model sets `PostRAScheduler`. If it is set to true by
the scheduler model, then the pass is enabled. If it is not set to true
by the scheduler model, then the value of `UsePostRAScheduler` subtarget
feature is used.

I argue that the RISC-V backend should not use the `PostRAScheduler` field
of the scheduler model to control whether the PostMachineScheduler is
enabled for the following reasons:

1. No other targets use this value to control whether
PostMachineScheduler is enabled. They only use it to check whether the
legacy PostRASchedulerList scheduler is enabled.

2. We can add the `UsePostRAScheduler` feature to the processor
definition in RISCVProcessors.td to tie a processor to whether the pass
should be enabled by default. This makes the feature and the sched model
field redundant.

3. Since these options are redundant, we should prefer the feature,
since we can set `+` and `-` on the feature, but the value of the
scheduler cannot be controlled on the command line.

4. Keeping both options allows us to set the feature and the scheduler
model value to conflicting values. Although the scheduler model value
will win out, it feels awkward to allow it.
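
The decision described above can be sketched as two small functions. The
function and parameter names are hypothetical; the sketch only mirrors the
prose (command-line option first, then scheduler model and/or feature) and
is not the backend's actual code.

```cpp
#include <cassert>

// Tri-state for a command-line option such as `-enable-post-misched`.
enum class CmdLineFlag { NotPassed, False, True };

// Behaviour described as "currently": if the option is not passed, the
// scheduler model's PostRAScheduler field can enable the pass, and the
// subtarget feature is consulted only otherwise.
inline bool postMISchedEnabledBefore(CmdLineFlag flag,
                                     bool schedModelPostRAScheduler,
                                     bool usePostRASchedulerFeature) {
  if (flag != CmdLineFlag::NotPassed)
    return flag == CmdLineFlag::True;
  return schedModelPostRAScheduler || usePostRASchedulerFeature;
}

// Proposed behaviour: only the subtarget feature decides the default.
inline bool postMISchedEnabledAfter(CmdLineFlag flag,
                                    bool usePostRASchedulerFeature) {
  if (flag != CmdLineFlag::NotPassed)
    return flag == CmdLineFlag::True;
  return usePostRASchedulerFeature;
}

// Usage: conflicting settings (model says yes, feature says no).
inline void checkPostRADecision() {
  assert(postMISchedEnabledBefore(CmdLineFlag::NotPassed, true, false));
  assert(!postMISchedEnabledAfter(CmdLineFlag::NotPassed, false));
  // An explicit command-line value overrides everything in both versions.
  assert(!postMISchedEnabledBefore(CmdLineFlag::False, true, true));
  assert(postMISchedEnabledAfter(CmdLineFlag::True, false));
}
```

With the scheduler model field out of the picture, turning the feature off
is enough to change the default.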
2024-05-24 14:31:14 -04:00
Craig Topper
6cebd35772 [RISCV] Remove extra indentation from RISCVProcessors.td. 2024-04-20 19:21:59 -07:00
Craig Topper
f09f99ed32 [RISCV] Add RISCVTuneProcessorModel to 'generic' CPU. NFC
Remove hardcoded GENERIC CPU from RISCVTargetDefEmitter.cpp.
2024-04-19 16:06:54 -07:00
Craig Topper
9067070d91
[RISCV] Re-separate unaligned scalar and vector memory features in the backend. (#88954)
This is largely a revert of commit
e81796671890b59c110f8e41adc7ca26f8484d20.

As #88029 shows, there exists hardware that only supports unaligned
scalar.

I'm leaving how this gets exposed to the clang interface to a future
patch.
2024-04-16 15:40:32 -07:00
Craig Topper
65b0cc610f
[RISCV] Add FeatureStdExtI to all CPUs in RISCVProcessors.td. NFC (#88805)
This is currently being implied in RISCVISAInfo.cpp. Make it explicit.

I'm planning to move all extension information to RISCVFeatures.td and
have tablegen create the tables for RISCVISAInfo.cpp. This requires
making the creation of RISCVTargetParserDef.inc in tablegen independent
of RISCVISAInfo.cpp. So we need an accurate extension list for CPUs in
tablegen.
2024-04-15 21:54:26 -07:00
Michael Maitland
c48d8182f1
[RISCV] Add SiFiveP600Model SchedModel that is used by sifive-p670 (#84962)
This PR includes an initial scheduler model that shows improvement on
multiple workloads over NoSchedModel and SiFive7Model for sifive-p670.
We plan on making significant changes to this model in the future so
that it is more accurate. This patch would close
https://github.com/llvm/llvm-project/pull/80612.
2024-03-18 13:44:21 -04:00
Yingwei Zheng
373d9d7214
[RISCV] Add sched model for XiangShan-NanHu (#70232)
[XiangShan](https://github.com/OpenXiangShan/XiangShan) is an
open-source high-performance RISC-V processor.

This PR adds the schedule model for XiangShan-NanHu, the 2nd Gen core of
the XiangShan processor series.
Overview:
https://xiangshan-doc.readthedocs.io/zh-cn/latest/integration/overview/

It is based on the patch [D122556](https://reviews.llvm.org/D122556) by
@SForeKeeper. The original patch hasn't been updated for a long time and
it is out of sync with the current RTL design.

---------

Co-authored-by: SForeKeeper <zkliu6@gmail.com>
2024-02-12 15:00:54 +08:00
Michael Maitland
f13aac6517
[RISCV] Add TuneNoSinkSplatOperands to sifive-p670 (#79492) 2024-01-26 11:05:04 -05:00
Michael Maitland
63f742c15f
[RISCV] Add sifive-p670 processor (#79015)
This is an OOO core that has a vector unit. For more information see
https://www.sifive.com/cores/performance-p650-670.

Scheduler model and other tuning will come in separate patches.
2024-01-23 21:45:24 -05:00
Craig Topper
904b0901ef
[RISCV] Add FeatureFastUnalignedAccess to sifive-p450. (#79075) 2024-01-22 20:17:36 -08:00
Craig Topper
95c1039eca [RISCV] Add TuneNoDefaultUnroll to sifive-p450. 2024-01-22 13:26:25 -08:00
Craig Topper
80b67eebd2
[RISCV] Add Zic64b, Ziccamoa, Ziccif, Zicclsm, Ziccrse, and Za64rs to sifive-p450. (#79030) 2024-01-22 13:05:47 -08:00
Craig Topper
f2b5a314b2
[RISCV] Add LUI/AUIPC+ADDI fusions to sifive-p450. (#78501) 2024-01-17 16:01:34 -08:00
Craig Topper
847c787269
[RISCV] Add scheduler model for sifive-p450. (#77989)
This is a slightly cleaned up version of what we've been using in our
downstream toolchain.
2024-01-16 08:43:09 -08:00
Craig Topper
faa326de97
[RISCV] Add branch+c.mv macrofusion for sifive-p450. (#76169)
sifive-p450 supports a very restricted version of the short forward
branch optimization from the sifive-7-series.

For sifive-p450, a branch over a single c.mv can be macrofused as a
conditional move operation. Due to encoding restrictions on c.mv, we
can't conditionally move from X0. That would require c.li instead.
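
A small sketch of why the pair behaves like a conditional move. The
function names are hypothetical and registers are modelled as plain values;
the only encoding fact relied on is that c.mv cannot name x0 as its source.

```cpp
#include <cassert>
#include <cstdint>

// c.mv cannot use x0 as its source register, so a move from x0 cannot take
// part in this fusion.
inline bool cMvCanReadFrom(unsigned rs2) {
  assert(rs2 < 32 && "RISC-V has 32 integer registers");
  return rs2 != 0;
}

// Unfused form: a conditional branch that jumps over one register move.
inline std::uint64_t branchOverMove(bool branchTaken, std::uint64_t rdOld,
                                    std::uint64_t rsValue) {
  std::uint64_t rd = rdOld;
  if (!branchTaken)
    rd = rsValue; // the single c.mv that the branch skips when taken
  return rd;
}

// Fused view: the branch and the move act as one conditional move.
inline std::uint64_t fusedConditionalMove(bool branchTaken,
                                          std::uint64_t rdOld,
                                          std::uint64_t rsValue) {
  return branchTaken ? rdOld : rsValue;
}
```

Both functions produce the same destination value for every input, which is
the property the macrofusion relies on.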
2024-01-08 15:23:26 -08:00
Craig Topper
6dc5ba4cca [RISCV] Remove XSfcie extension.
This reverts 0d3eee33f262402562a1ff28106dbb2f59031bdb and
4c37d30e22ae655394c8b3a7e292c06d393b9b44.

XSfcie is not an official SiFive extension name. It stands for
SiFive Custom Instruction Extension, which is mentioned in the S76
manual, but elsewhere the manual says it is not supported
for S76.

LLVM had various instructions and CSRs listed as part of this
extension, but as far as SiFive is concerned, none of them are part
of it. There are no documented extension names for these instructions
and CSRs either externally or internally.

If these are important to LLVM users, I can facilitate creating
extension names for them and have them documented. For now I'm
removing everything.

Unfortunately, these instructions and CSRs are in LLVM 17 so this
is an incompatible change.
2023-12-28 13:54:15 -08:00
Wang Pengcheng
f9c908862a
[RISCV] Split TuneShiftedZExtFusion (#76032)
We split `TuneShiftedZExtFusion` into three fusions to make them
reusable and match the GCC implementation[1].

The zexth/zextw fusions can be reused by XiangShan[2] and other
commercial processors, but shifted zero extension is not so common.

`macro-fusions-veyron-v1.mir` is renamed so that it is no longer tied to a
specific processor.

References:
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637303.html
[2] https://xiangshan-doc.readthedocs.io/zh_CN/latest/frontend/decode
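
As an illustration of the three-way split, the sketch below classifies an
slli/srli pair. The types and names are hypothetical, and the shift amounts
used (48/48 for zexth, 32/32 for zextw, 32 followed by a smaller right shift
for the shifted form) are my understanding of these patterns on RV64 rather
than something stated in this commit message.

```cpp
#include <cassert>

// Hypothetical operand view of an immediate shift instruction.
struct ShiftImm {
  unsigned rd;
  unsigned rs1;
  unsigned shamt;
  ShiftImm(unsigned d, unsigned s, unsigned amount)
      : rd(d), rs1(s), shamt(amount) {}
};

enum class ZExtFusion { None, ZExtH, ZExtW, ShiftedZExtW };

// Classify an (slli, srli) pair. Assumed simplification: the srli must read
// and overwrite the register written by the slli.
inline ZExtFusion classifySlliSrli(const ShiftImm &slli, const ShiftImm &srli) {
  assert(slli.shamt < 64 && srli.shamt < 64 && "RV64 shift amounts are 6 bits");
  if (srli.rs1 != slli.rd || srli.rd != slli.rd)
    return ZExtFusion::None;
  if (slli.shamt == 48 && srli.shamt == 48)
    return ZExtFusion::ZExtH; // zero-extend halfword
  if (slli.shamt == 32 && srli.shamt == 32)
    return ZExtFusion::ZExtW; // zero-extend word
  if (slli.shamt == 32 && srli.shamt < 32)
    return ZExtFusion::ShiftedZExtW; // zero-extend word, then shift left
  return ZExtFusion::None;
}
```

Keeping the three cases separate lets a processor definition opt into only
the pairs its hardware actually fuses.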
2023-12-22 14:37:26 +08:00
wangpc
90f816e61f [RISCV] Rename TuneVeyronFusions to TuneVentanaVeyron
Fusion features are also added to the processor definition.
2023-12-22 14:29:31 +08:00
Craig Topper
b03f0c596a
[RISCV] Add sifive-p450 CPU. (#75760)
This is an out of order core with no vector unit. More information:
https://www.sifive.com/cores/performance-p450-470

Scheduler model and other tuning will come in separate patches.
2023-12-20 09:52:02 -08:00
Mikhail Gudim
29ee66f4a0
[RISCV] Macro-fusion support for veyron-v1 CPU. (#70012)
Support was added for the following fusions:
  auipc-addi, slli-srli, ld-add
Some parts of the code became repetitive, so a small refactoring of the
existing lui-addi fusion was done.
2023-12-11 16:34:13 -05:00