97 Commits

Author SHA1 Message Date
林克
6842cc5562
[RISCV] Add SpacemiT XSMTVDot (SpacemiT Vector Dot Product) extension. (#151706)
The full spec can be found in the spacemit-x60 processor support scope,
Section 2.1.2.2 (Features):

https://developer.spacemit.com/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb#2.1

This patch only supports assembler.
2025-08-18 18:03:17 +08:00
Daniel Henrique Barboza
35bd40d321
[RISCV] add more generic macrofusions (#151140)
These are some macrofusions that are used internally at Ventana in a
not-yet-upstreamed processor. Figured it would be good to contribute
them ahead of the processor to allow the community to also use them in
their own processors, while also alleviating our own downstream upkeep.

The macrofusions being added are, considering load =
lb,lh,lw,ld,lbu,lhu,lwu:

- bfext (slli+srli)
- auipc+load
- lui+load
- add(.uw)+load 
- addi+load
- shXadd(.uw)+load, where X=1,2,3
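
As a rough illustration of why the bfext pair is worth fusing: an slli
followed by an srli on the same register extracts a bit field. The sketch
below models this on 64-bit values; the helper names are hypothetical and
are not taken from the patch.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit register


def slli(x, shamt):
    """Logical left shift, truncated to 64 bits."""
    return (x << shamt) & MASK64


def srli(x, shamt):
    """Logical right shift of a 64-bit value."""
    return (x & MASK64) >> shamt


def bfext(x, lsb, width):
    """Extract `width` bits starting at bit `lsb` using slli followed by srli."""
    left = 64 - (lsb + width)       # push the field up to the top of the register
    return srli(slli(x, left), 64 - width)  # bring it back down, zero-extended
```

For example, `bfext(0xABCD, 4, 8)` isolates bits 4..11 of the input.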
2025-08-06 14:46:34 -04:00
Daniel Henrique Barboza
8e57689c34
[RISCV] add load/store misched/PostRA subtarget features (#149409)
Some processors benefit more from store clustering than load clustering,
and vice-versa, depending on factors that are exclusive to each one
(e.g. macrofusions implemented).

Likewise, certain optimizations benefit more from misched clustering
than postRA clustering. Macrofusions are again an example: in a
processor with store pair macrofusions, like the veyron-v1, it is
observed that misched clustering increases the number of macrofusions
more than postRA clustering. This of course isn't necessarily true for
other processors, but it shows that processors can benefit from more
fine-grained control of clustering mutations, and each one is able to do
it differently.

Add 4 new subtarget features that deprecate the existing
riscv-misched-load-store-clustering and
riscv-postmisched-load-store-clustering options:

- disable-misched-load-clustering and disable-misched-store-clustering:
disable load/store clustering during misched;

- disable-postmisched-load-clustering and
disable-postmisched-store-clustering:
disable load/store clustering during PostRA.

Note that the new subtarget features disable specific stages of the
default clustering settings. The default per se (load and store
clustering for both misched and PostRA) is left untouched.

Disable all clustering but misched-store-clustering for the veyron-v1
processor using the new features.
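
The relationship between the four features and the default can be
sketched as follows. This is only a model of the description above, not
the TableGen or C++ from the patch; the function name is hypothetical, and
the veyron-v1 feature set is inferred from the sentence above rather than
copied from the processor definition.

```python
# Default: load and store clustering in both misched and PostRA.
DEFAULT_CLUSTERING = {
    ("misched", "load"),
    ("misched", "store"),
    ("postmisched", "load"),
    ("postmisched", "store"),
}


def enabled_clustering(features):
    """Return the clustering mutations left enabled for a set of feature names."""
    enabled = set(DEFAULT_CLUSTERING)
    for stage, kind in DEFAULT_CLUSTERING:
        if f"disable-{stage}-{kind}-clustering" in features:
            enabled.discard((stage, kind))
    return enabled


# Assumed veyron-v1 configuration: everything off except misched store clustering.
VEYRON_V1_FEATURES = {
    "disable-misched-load-clustering",
    "disable-postmisched-load-clustering",
    "disable-postmisched-store-clustering",
}
```

With no features set, `enabled_clustering(set())` returns the full default,
which matches the note that the default itself is untouched.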
2025-08-06 09:08:25 -07:00
Luke Lau
7c812ea01a
[RISCV] Avoid vl toggles when lowering vector_splice/experimental_vp_splice and add +vl-dependent-latency tuning feature (#146746)
When vectorizing a loop with a fixed-order recurrence we use a splice,
which gets lowered to a vslidedown and vslideup pair.

However with the way we lower it today we end up with extra vl toggles
in the loop, especially with EVL tail folding, e.g:

    .LBB0_5:                                # %vector.body
                                            # =>This Inner Loop Header: Depth=1
    	sub	a5, a2, a3
    	sh2add	a6, a3, a1
    	zext.w	a7, a4
    	vsetvli	a4, a5, e8, mf2, ta, ma
    	vle32.v	v10, (a6)
    	addi	a7, a7, -1
    	vsetivli	zero, 1, e32, m2, ta, ma
    	vslidedown.vx	v8, v8, a7
    	sh2add	a6, a3, a0
    	vsetvli	zero, a5, e32, m2, ta, ma
    	vslideup.vi	v8, v10, 1
    	vadd.vv	v8, v10, v8
    	add	a3, a3, a4
    	vse32.v	v8, (a6)
    	vmv2r.v	v8, v10
    	bne	a3, a2, .LBB0_5

Because the vslideup overwrites all but UpOffset elements from the
vslidedown, we currently set the vslidedown's AVL to said offset.

But in the vslideup we use either VLMAX or the EVL which causes a
toggle.

This patch increases the AVL of the vslidedown so it matches the
vslideup, even if the extra elements are overwritten, to avoid the toggle.
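
A small model of why this is safe, assuming simplified slide semantics
(tail elements are modelled as keeping the old destination value, which is
one permitted tail-agnostic outcome). The function names are hypothetical
and this is a sketch, not the lowering code:

```python
def vslidedown(vs2, offset, vl, vd):
    """vd[i] = vs2[i + offset] for i < vl; reads past VLMAX yield 0."""
    vlmax = len(vs2)
    out = list(vd)
    for i in range(vl):
        src = i + offset
        out[i] = vs2[src] if src < vlmax else 0
    return out


def vslideup(vs2, offset, vl, vd):
    """vd[i] = vs2[i - offset] for offset <= i < vl; lower elements are kept."""
    out = list(vd)
    for i in range(offset, vl):
        out[i] = vs2[i - offset]
    return out


def splice(prev, cur, down_offset, up_offset, down_avl, vl):
    """Lower a splice as a vslidedown on `prev` followed by a vslideup of `cur`."""
    tmp = vslidedown(prev, down_offset, down_avl, prev)
    return vslideup(cur, up_offset, vl, tmp)


prev = [10, 11, 12, 13, 14, 15, 16, 17]
cur = [20, 21, 22, 23, 24, 25, 26, 27]
old = splice(prev, cur, down_offset=5, up_offset=1, down_avl=1, vl=6)  # AVL = UpOffset
new = splice(prev, cur, down_offset=5, up_offset=1, down_avl=6, vl=6)  # AVL = vslideup's vl
```

In both cases the first `vl` elements of the result are identical, because
every element the longer vslidedown writes beyond UpOffset is overwritten
by the vslideup.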

A new tuning feature +vl-dependent-latency has been added which keeps
the old behaviour for microarchitectures that dynamically dispatch uops
based on vl, e.g. sifive-x280.

+vl-dependent-latency can be reused for the recently proposed Ovlt
optimization directive if/when it's ratified:
https://lists.riscv.org/g/tech-privileged/message/2487

If we wanted to aggressively optimise for vl at the expense of
introducing more toggles we could probably look at doing this in
RISCVVLOptimizer.
2025-07-09 11:09:13 +08:00
UmeshKalappa
032966ff56
[RISCV] Added the MIPS prefetch extensions for MIPS RV64 P8700. (#145647)
The extension is enabled with xmipscbop.

Please refer to the "MIPS RV64 P8700/P8700-F Multiprocessing System
Programmer’s Guide" for more info on the extension:
https://mips.com/wp-content/uploads/2025/06/P8700_Programmers_Reference_Manual_Rev1.84_5-31-2025.pdf
2025-07-03 10:59:10 +02:00
Jim Lin
2f9c97c030
[RISCV] Add Andes AX45MPV processor definition (#145267)
Andes AX45MPV is a 64-bit in-order dual-issue 8-stage pipeline
linux-capable CPU implementing the RV64IMAFDCV ISA extensions. It is
developed by Andes Technology https://www.andestech.com, a RISC-V IP
provider.

The overviews for AX45MPV:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax45mpv/

The scheduling model for the RVV extension will be implemented in a follow-up PR.
2025-06-24 08:57:55 +08:00
Min-Yih Hsu
f40909f605
[RISCV] Add SiFive X390 scheduling model (#143938)
This patch adds the scheduling model for sifive-x390. X390 is a dual
issue in-order CPU. It has two scalar and two vector pipes, with
VLEN=1024 and DLEN=512.

Co-authored-by: Michael Maitland <michaeltmaitland@gmail.com>
2025-06-23 10:06:53 -07:00
Jim Lin
f78819aeef
Revert "Revert "[RISCV] Remove B and Zbc extension from Andes series cpus." (#144402)"
Relanded now that the fix https://github.com/llvm/llvm-project/pull/144848 for the
post-commit CI failure has landed.

This reverts commit f83d09a1f60aee28a8ed9020cd72971ec2885f24.
2025-06-22 17:54:37 +08:00
Aaron Ballman
f83d09a1f6
Revert "[RISCV] Remove B and Zbc extension from Andes series cpus." (#144402)
Reverts llvm/llvm-project#144022

This has been failing postcommit CI for two days:
https://lab.llvm.org/buildbot/#/builders/63
2025-06-16 14:53:15 -04:00
Jim Lin
24c8d900c4
[RISCV] Remove B and Zbc extension from Andes series cpus. (#144022)
The Andes CPU is configurable with optional extensions. The minimal
required extension set does not include `B` and `Zbc` extensions. So we
decided to remove them.
2025-06-15 11:38:04 +08:00
Jim Lin
483d19619c
[RISCV] Add tune features for Andes 45 series cpus (#143899)
Add tune features TuneNoDefaultUnroll, TuneShortForwardBranchOpt and 
TunePostRAScheduler for Andes 45 series cpus.
2025-06-13 14:26:50 +08:00
Jim Lin
2a8c7d3c69
[RISCV] Add support for -mtune=andes-45-series (#142900)
Enables the use of `-mtune=andes-45-series` to generate code optimized
with the Andes 45 series scheduling model and tuning features.
2025-06-06 11:34:19 +08:00
Min-Yih Hsu
feb21e26fa
[RISCV] Add SiFive X390 processor definition (#142517)
X390 is an in-order core designed for AI/ML workload, with VLEN=1024.
https://www.sifive.com/cores/intelligence-x300-series

Scheduling model will be added in a follow-up patch.
2025-06-04 09:25:59 -07:00
Jim Lin
991d754074
[RISCV] Implement base scheduling model for andes 45 series processor. (#141008)
This patch implements scheduling model for IMAFD and Zb extension. The
latency and throughput of all instructions, except load/store, are
measured by llvm-exegesis.

Scheduling model for V and other extensions will be added in a follow-up
patch.
2025-06-04 16:11:43 +08:00
Jim Lin
04211ba727
[RISCV] Add FeatureVendorXAndesPerf to Andes N45/NX45/A45/AX45 (#141007)
Andes N45/NX45/A45/AX45 also support XAndesPerf.
2025-05-22 14:41:58 +08:00
Jim Lin
569b6f6dad
[RISCV] Add Andes A25/AX25 processor definition (#140681)
Andes A25/AX25 are 32/64bit, 5-stage pipeline, linux-capable CPUs that
implement the RV[32|64]IMAFDC_Zba_Zbb_Zbc_Zbs ISA extensions. They are
developed by Andes Technology https://www.andestech.com, a RISC-V IP
provider.

The overviews for A25/AX25:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-a25/
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax25/

Scheduling model will be implemented in a later PR.
2025-05-22 09:22:32 +08:00
Min-Yih Hsu
b92b548168
[RISCV] Add scheduling model for SiFive P800 processors (#139316)
The scheduling model for SiFive P800 series cores. They have 6 integer
pipes, 2 floating point pipes, and 2 vector pipes.

https://chipsandcheese.com/p/hot-chips-2023-sifives-p870-takes-risc-v-further

The tests are meant to have the same coverage as its P600 counterpart.
2025-05-20 09:13:08 -07:00
Mikhail R. Gadelha
4eac576654
[RISCV] Add scheduler definitions for SpacemiT-X60 (#137343)
This patch adds an initial scheduler model for the SpacemiT-X60,
including latency for scalar instructions only.

The scheduler is based on the documented characteristics of the C908,
which the SpacemiT-X60 is believed to be based on, and provides the
expected latency for several instructions. I ran a probe to confirm all
of these values and to get the latency of instructions not provided by
the C908 documentation (e.g., double floating-point instructions).

For load and store instructions, the C908 documentation says the latency
is >= 3 for load and 1 for store. I tried a few combinations of values
until I got the current values of 5 and 3, which yield the best results.

Although the X60 does appear to support multiple issue for at least some
floating point instructions, this model assumes single issue as
increasing it reduces the gains below.

This patch gives a geomean improvement of ~4% on SPEC CPU 2017 for both
rva22u64 and rva22u64_v, with some benchmarks improving up to 18%
(508.namd_r). There were a couple of execution time regressions, but
only in noisy benchmarks (523.xalancbmk_r and 510.parest_r).

* rva22u64: https://lnt.lukelau.me/db_default/v4/nts/507?compare_to=405
(compares a55f7275 to the baseline 8286b804)
* rva22u64_v:
https://lnt.lukelau.me/db_default/v4/nts/474?compare_to=404 (compares
a55f7275 to the baseline 8286b804)

This initial scheduling model is strongly focused on providing
sufficient definitions to provide improved performance for the
SpacemiT-X60. Further incremental gains may be possible through a much
more detailed microarchitectural analysis, but that is left to future
work.

Further scheduling definitions for RVV can be added in a future PR.
2025-05-06 13:30:57 -03:00
Min-Yih Hsu
ca1ebff9de
[RISCV] Add processor definition for SiFive P870 (#137725)
SiFive P870 is a RVA23 compatible high-performance CPU:
https://www.sifive.com/cores/performance-p800

Scheduling model will be added in a follow-up PR.
2025-05-05 18:48:21 -07:00
Min-Yih Hsu
74593f6678
[RISCV][NFC] Remove duplicate extensions from tt-ascalon-d8 CPU (#137865)
Sscofpmf is already in RVA23S64 and Zicsr is in RVA20U64. I also added a
check against Sscofpmf. NFC.
2025-04-29 14:19:51 -07:00
Jim Lin
5981be7692
[RISCV] Add Andes A45/AX45 processor definition (#136832)
Andes A45/AX45 are 32/64bit in-order dual-issue 8-stage pipeline
linux-capable CPUs implementing the RV[32|64]IMAFDC_Zba_Zbb_Zbs ISA
extensions. They are developed by Andes Technology
https://www.andestech.com, a RISC-V IP provider.

The overviews for A45/AX45:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-a45/
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax45/

Scheduling model will be implemented in a later PR.
2025-04-24 09:16:12 +08:00
Jim Lin
832ca744f2
[RISCV] Add Andes N45/NX45 processor definition (#136670)
Andes N45/NX45 are 32/64bit in-order dual-issue 8-stage pipeline CPUs
implementing the RV[32|64]IMAFDC_Zba_Zbb_Zbs ISA
extensions. They are developed by Andes Technology
https://www.andestech.com, a RISC-V IP provider.

The overviews for N45/NX45:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-n45/
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-nx45/

Scheduling model will be implemented in a later PR.
2025-04-23 14:16:23 +08:00
Chyaka
0e3e0bf42c
[RISCV] Add processor definition for XiangShan-KunMingHu-V2R2 (#123193)
XiangShan-KunMingHu is the third generation of the open-source
high-performance RISC-V processor developed by the Beijing Institute of
Open Source Chip (BOSC), and its latest version is V2R2.

The KunMingHu manual is now available at
https://github.com/OpenXiangShan/XiangShan-User-Guide/releases.
It will be updated on the official XiangShan documentation site:
https://docs.xiangshan.cc/zh-cn/latest

You can find the corresponding ISA extension from the XiangShan Github
repository:
https://github.com/OpenXiangShan/XiangShan/blob/master/src/main/scala/xiangshan/Parameters.scala

If you want to track the latest performance data of KunMingHu, please
check XiangShan Biweekly: https://docs.xiangshan.cc/zh-cn/latest/blog

This PR adds the processor definition for KunMingHu V2R2, developed by
the XSCC team https://github.com/orgs/OpenXiangShan/teams/xscc.

The scheduling model for XiangShan-KunMingHu V2R2 will be submitted in a
subsequent PR.

---------

Co-authored-by: Shenglin Tang <tangshenglin@ict.ac.cn>
Co-authored-by: Xu, Zefan <ceba_robot@outlook.com>
Co-authored-by: Tang Haojin <tanghaojin@outlook.com>
2025-04-21 10:06:43 +08:00
Djordje Todorovic
d30a5b41fe
[RISCV] Fix xmipscmov extension name (#135647)
The right name was used in riscv-toolchain-conventions docs.
2025-04-15 23:17:03 +02:00
Petr Penzin
b44fbdee00
[RISCV] Tune flag for fast vrgather.vv (#124664)
Add tune knob for N*Log2(N) vrgather.vv cost.
2025-03-03 16:04:49 -08:00
Pengcheng Wang
7eadc1960d
[RISCV] Add a generic OOO CPU (#120712)
We add a generic out-of-order CPU model here just like what GCC
has done.
    
People may use this model to evaluate some optimizations, and more
importantly, people can use this model as a template to customize
their own CPU models.
    
The design (units, cycles, ...) of this model is random so don't
take it seriously.
2025-02-14 17:35:02 +08:00
Philip Reames
059722da5e
Revert "[RISCV] Default to MicroOpBufferSize = 1 for scheduling purposes (#126608)" and follow up commit.
This reverts commit 9cc8442a2b438962883bbbfd8ff62ad4b1a2b95d.
This reverts commit 859c871184bdfdebb47b5c7ec5e59348e0534e0b.

A performance regression was reported on the original review.  There appears
to have been an unexpected interaction here.  Reverting during investigation.
2025-02-13 09:57:33 -08:00
Pengcheng Wang
9cc8442a2b
[RISCV][NFC] Move GenericModel to standalone file (#127003)
And fix some typos in comments.

In the future, we may add more scheduling info to GenericModel.
2025-02-13 15:26:03 +08:00
Philip Reames
859c871184
[RISCV] Default to MicroOpBufferSize = 1 for scheduling purposes (#126608)
This change introduces a default schedule model for the RISCV target
which leaves everything unchanged except the MicroOpBufferSize. The
default value of this flag in NoSched is 0. Both configurations
represent in-order cores (i.e. no reorder window); the difference
between them comes down to whether heuristics other than latency are
allowed to apply. (Implementation details below.)

I left the processor models which explicitly set MicroOpBufferSize=0
unchanged in this patch, but strongly suspect we should change those
too. Honestly, I think the LLVM wide default for this flag should be
changed, but don't have the energy to manage the updates for all
targets.

Implementation-wise, the effect of this change is that schedule units
which are ready to run *except that* one of their predecessors may not
have completed yet are added to the Available list, not the Pending one.
The result of this is that it becomes possible to choose to schedule a
node before its ready cycle if the heuristics prefer. This is
essentially choosing to insert a resource stall instead of e.g.
increasing register pressure.
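
The queueing decision described above can be modelled roughly as below.
This is a simplified sketch based only on this description: it ignores
resource hazards and any other conditions the real scheduler checks, and
the function name is hypothetical.

```python
def queue_for(ready_cycle, curr_cycle, micro_op_buffer_size):
    """Pick the queue a released schedule unit lands in (simplified model)."""
    is_buffered = micro_op_buffer_size != 0
    # Only an unbuffered model forces a not-yet-ready unit to wait in Pending.
    if not is_buffered and ready_cycle > curr_cycle:
        return "Pending"
    return "Available"
```

With `micro_op_buffer_size=0`, a unit whose ready cycle is in the future
is held in Pending; with `micro_op_buffer_size=1` the same unit becomes
Available, so the heuristics may pick it early.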

Note that I was initially concerned there might be a correctness aspect
(as in some kind of exposed pipeline design), but the generic scheduler
doesn't seem to know how to insert noop instructions. Without that, a
program wouldn't be guaranteed to schedule on an exposed pipeline
depending on the program and schedule model in question.

The effect of this is that we sometimes prefer register pressure in
codegen results. This is mostly churn (or small wins) on scalar because
we have many more registers, but is of major importance on vector -
particularly high LMUL - because we effectively have many fewer
registers and the relative cost of spilling is much higher. This is a
significant improvement on high LMUL code quality for default rva23u
configurations - or any non -mcpu vector configuration for that matter.

Fixes #107532
2025-02-12 12:31:39 -08:00
Djordje Todorovic
0cb7636a46
[RISCV] Add MIPS extensions (#121394)
Adding two extensions for MIPS p8700 CPU:
  1. cmove (conditional move)
  2. lsp (load/store pair)

The official product page here:
https://mips.com/products/hardware/p8700
2025-01-28 08:04:09 +01:00
Craig Topper
ea9993a9a3
[RISCV] Add P550 scheduler model. (#124639)
P550 falls between P450 and P650. It has 1 additional FEX pipe over
P450. Mul and cpop latency are 3 instead of 2.

I've set the MicroOpBufferSize to 96 instead of 56 based on the ROB size
measurement from
https://chipsandcheese.com/p/inside-sifives-p550-microarchitecture I
believe we set this value too low for P450 and P650 and should update
them in a separate PR.
2025-01-27 22:40:05 -08:00
Craig Topper
5d03235c73
[RISCV] Add -mcpu=sifive-p550. (#122164)
This is the CPU in SiFive's HiFive Premier P550 development board.

Scheduler model will come in a later patch.
2025-01-08 21:02:46 -08:00
Petr Penzin
e934a39e01
[RISC-V] Base scheduling model for tt-ascalon-d8 (#120160)
First part of tt-ascalon-d8 scheduling model, only containing scalar
ops. Scheduling for vector instructions will be added in a follow-up
patch.

---------

Co-authored-by: Anton Blanchard <antonb@tenstorrent.com>
Co-authored-by: Pengcheng Wang <wangpengcheng.pp@bytedance.com>
2024-12-20 15:30:17 -05:00
Djordje Todorovic
3222060124
Reland "[RISCV] Add scheduling model for mips p8700 CPU" (#120550)
This patch introduces a scheduling model for the MIPS p8700, an
out-of-order RISC-V processor. The model includes pipelines for the
following units:

- 2 Integer Arithmetic/Logical Units (ALU and AL2)
- Multiply/Divide Unit (MDU)
- Branch Unit (CTI)
- Load/Store Unit (LSU)
- Short Floating-Point Pipe (FPUS)
- Long Floating-Point Pipe (FPUL)

For additional details, refer to the official product page:
https://mips.com/products/hardware/p8700/.

Also adds `UnsupportedSchedZfhmin` to handle cases like
`WriteFCvtF16ToF32` that previously caused build failures.
2024-12-19 14:26:43 +01:00
Djordje Todorovic
9fa109a508
Revert "[RISCV] Add scheduling model for mips p8700 CPU" (#120537)
Reverts llvm/llvm-project#119885

llvm-project/llvm/lib/Target/RISCV/RISCVSchedMIPSP8700.td:20:5:
error: Processor does not define resources for WriteFCvtF32ToF16
def MIPSP8700Model : SchedMachineModel {
2024-12-19 10:01:46 +01:00
Djordje Todorovic
0f9257b9ab
[RISCV] Add scheduling model for mips p8700 CPU (#119885)
Depends on #119882.
2024-12-19 09:52:16 +01:00
Pengcheng Wang
9571d2023b
[RISCV] Add tune info for postra scheduling direction (#115864)
The results differ on different platforms so it is really hard to
determine a common default value.
    
Tune info for postra scheduling direction is added and CPUs can
set their own preferable postra scheduling direction.
2024-12-16 12:18:38 +08:00
Djordje Todorovic
52e9f2c52c
[RISCV] Add MIPS P8700 processor (#119882)
The P8700 is a high-performance processor from MIPS designed to meet the
demands of modern workloads, offering exceptional scalability and
efficiency. It builds on MIPS's established architectural strengths
while introducing enhancements that set it apart. For more details, you
can check out the official product page here:
https://mips.com/products/hardware/p8700/.

Scheduling model will be added in a separate commit/PR.
2024-12-13 20:54:25 +01:00
Pengcheng Wang
35619c791d
[RISCV] Add tune info for mem* expansion (#118439)
So that CPUs can tune these options.
2024-12-06 14:48:37 +08:00
Pengcheng Wang
d36a4c0715
[RISCV] Rename some Feature* to Tune* (#117966)
These features should be tune features.
2024-11-28 15:01:49 +08:00
Felipe Magno de Almeida
e3fdc3aa81
[RISCV] Allow hoisting VXRM writes out of loops speculatively (#110044)
Change the intersect used by the anticipated algorithm to ignore unknown
when anticipating. This effectively allows speculative VXRM writes,
because a VXRM write may now be performed even when there are branches
where VXRM is not needed.

This change matters because VXRM writes cause pipeline flushes in some
microarchitectures, so it makes sense to allow more aggressive hoisting
even if it causes some degradation on the slow path.

An example is this code:
```
typedef unsigned char uint8_t;
__attribute__ ((noipa))
void foo (uint8_t *dst,  int i_dst_stride,
           uint8_t *src1, int i_src1_stride,
           uint8_t *src2, int i_src2_stride,
           int i_width, int i_height )
{
   for( int y = 0; y < i_height; y++ )
     {
       for( int x = 0; x < i_width; x++ )
         dst[x] = ( src1[x] + src2[x] + 1 ) >> 1;
       dst  += i_dst_stride;
       src1 += i_src1_stride;
       src2 += i_src2_stride;
     }
}
```
With this patch, the VXRM write for the code above is hoisted out of
the outer loop.
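
For context on why this loop involves VXRM at all: the expression
`(src1[x] + src2[x] + 1) >> 1` is a rounding average. The sketch below
assumes the inner loop is vectorized with an unsigned averaging add under
the round-to-nearest-up mode (an assumption; the commit message does not
name the instruction) and checks that this rounding matches the C
expression. The helper names are hypothetical.

```python
def roundoff_rnu(v, d):
    """Shift v right by d bits, rounding to nearest with ties rounded up."""
    r = (v >> (d - 1)) & 1 if d > 0 else 0  # the most significant discarded bit
    return (v >> d) + r


def averaging_add_u8(a, b):
    """Unsigned 8-bit averaging add under round-to-nearest-up."""
    return roundoff_rnu(a + b, 1) & 0xFF


def c_expression(a, b):
    """The scalar expression from the example, truncated to uint8_t."""
    return ((a + b + 1) >> 1) & 0xFF
```

Because the rounding mode is the same on every iteration, a single write
before the outer loop is enough, which is what makes the hoist profitable.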
2024-11-27 13:31:39 -08:00
Pengcheng Wang
4da960b898
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get this information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
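
A minimal sketch of how such a lookup could work, assuming the three IDs
have already been probed from the running system. The table entries and
the function name are placeholders for illustration, not real CPU names
or ID values from the patch.

```python
# Hypothetical table: names and (mvendorid, marchid, mimpid) values are placeholders.
CPU_IDS = {
    "example-cpu-a": (0x1, 0x10, 0x100),
    "example-cpu-b": (0x1, 0x20, 0x100),
}


def cpu_is(name, probed):
    """Return True if the probed (mvendorid, marchid, mimpid) triple matches `name`."""
    expected = CPU_IDS.get(name)
    return expected is not None and expected == probed
```

Matching on all three IDs distinguishes cores from the same vendor, as in
the two placeholder entries that differ only in `marchid`.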
2024-11-22 22:58:54 +08:00
Mikhail Goncharov
d1dae1e861
Revert "[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)" chain
This reverts commit b36fcf4f493ad9d30455e178076d91be99f3a7d8.
This reverts commit c11b6b1b8af7454b35eef342162dc2cddf54b4de.
This reverts commit 775148f2367600f90d28684549865ee9ea2f11be.

multiple bot build breakages, e.g. https://lab.llvm.org/buildbot/#/builders/3/builds/8076
2024-11-22 14:09:13 +01:00
Pengcheng Wang
775148f236
[RISCV] Add mvendorid/marchid/mimpid to CPU definitions (#116202)
We can get this information via `sys_riscv_hwprobe`.

This can be used to implement `__builtin_cpu_is`.
2024-11-22 19:54:45 +08:00
Petr Penzin
41c86ca714
[RISCV] Add TT-Ascalon-d8 processor (#115100)
Ascalon is an out-of-order CPU core from Tenstorrent. Overview:
https://tenstorrent.com/ip/tt-ascalon

Adding 8-wide version, -mcpu=tt-ascalon-d8. Scheduling model will be
added in a separate PR.

---------

Co-authored-by: Anton Blanchard <antonb@tenstorrent.com>
2024-11-19 14:20:55 -08:00
Luke Lau
5a16ed96c5
[RISCV] Add +unaligned-scalar-mem to spacemit-x60 (#115125)
I can't find any official documentation on this, but from other
discussions[^1] and my own testing the spacemit-x60 seems to support
unaligned scalar loads and stores.

They seem to be performant, and just from a quick test we get a 2.45%
speedup on 500.perlbench_r on the Banana Pi F3[^2].

This would allow it to take advantage of #107548.

[^1]:
https://github.com/llvm/llvm-project/issues/110454#issuecomment-2382199460
[^2]: https://lnt.lukelau.me/db_default/v4/nts/32
2024-11-06 18:49:21 +08:00
Luke Lau
beb12f92c7
[RISCV] Add +optimized-nfN-segment-load-store (#114414)
This is a follow up to #111511, where after benchmarking we learnt that
the Banana Pi F3 has fast segmented loads for not just NF=2, but also
NF=3 and NF=4:
https://github.com/preames/bp3-microarch#vlseg_lmul_x_sew_throughput

This adds tuning features to allow these segment loads and stores to be
costed cheaper and enables it for the spacemit-x60.

It also enables +optimized-nf2-segment-load-store by default in the
generic tuning to maintain the previous behaviour when compiled without
-mcpu or -mtune.
2024-11-04 06:43:58 +08:00
Anton Sidorenko
09fc178180
[RISCV] Add scheduling model for Syntacore SCR7 (#108814)
Syntacore SCR7 is rv64imafdcv_zba_zbb_zbc_zbs_zkn.
Scheduling model for RVV will be added later.
Overview: https://syntacore.com/products/scr7

---------

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
Co-authored-by: Elena Lepilkina <elena.lepilkina@syntacore.com>
2024-09-17 18:52:55 +03:00
Anton Sidorenko
dbdf84388a
[RISCV] Add Syntacore SCR7 processor definition (#108406)
Syntacore SCR7 is a high-performance Linux-capable RISC-V processor
core.
The core has rv64imafdcv_zba_zbb_zbc_zbs_zkn march.
Overview: https://syntacore.com/products/scr7

Scheduling model will be added in a subsequent PR.

---------

Co-authored-by: Dmitrii Petrov <dmitrii.petrov@syntacore.com>
Co-authored-by: Anton Afanasyev <anton.afanasyev@syntacore.com>
Co-authored-by: Elena Lepilkina <elena.lepilkina@syntacore.com>
2024-09-16 13:09:37 +03:00
Sam Elliott
9fa2386ff1
[RISCV] Add Hazard3 Core as taped out for RP2350 (#102452)
Luke Wren's Hazard3 is a configurable, open-source 32-bit RISC-V core.
The core's source code and docs are available on github:
https://github.com/wren6991/hazard3

This is the RISC-V core used in the RP2350, a recently announced SoC by
Raspberry Pi (which also contains Arm cores):
https://datasheets.raspberrypi.com/rp2350/rp2350-datasheet.pdf

We have agreed to name this `-mcpu` option `rp2350-hazard3`, and it
reflects exactly the options configured in the RP2350 chips. Notably,
Zbc is not configured, and neither is B, because the `misa.B` bit is
not configured either.
2024-08-21 08:45:45 +01:00