This patch updates `overrideSchedPolicy` and `overridePostRASchedPolicy` to take a `SchedRegion` parameter instead of just `NumRegionInstrs`. This provides access to both the instruction range and the parent `MachineBasicBlock`, which enables looking up function-level attributes.
With this change, targets can select the post-RA scheduling direction per function using a function attribute. For example:
```cpp
void overridePostRASchedPolicy(MachineSchedPolicy &Policy,
                               const SchedRegion &Region) const {
  const Function &F = Region.RegionBegin->getMF()->getFunction();
  Attribute Attr = F.getFnAttribute("amdgpu-post-ra-direction");
  ...
}
```
As of 20b5728b7b1ccc4509a316efb270d46cc9526d69, C always enables Zca, so
the check `C || Zca` is equivalent to just checking for `Zca`.
This replaces all uses of `HasStdExtCOrZca` with a new `HasStdExtZca` (with the same assembler description, to avoid changes in error messages), and simplifies the places where C++ code needed to check for either C or Zca.
The Subtarget function is just deprecated for the moment.
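As a rough sketch of the C++ simplification this enables (a hypothetical call site; `emitCompressedSequence` is made up for illustration):
```cpp
// Before: both features had to be checked, since Zca is the common
// subset of C.
if (Subtarget.hasStdExtC() || Subtarget.hasStdExtZca())
  emitCompressedSequence(); // hypothetical helper

// After: C always implies Zca, so a single check covers both.
if (Subtarget.hasStdExtZca())
  emitCompressedSequence();
```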
This adds assembler/disassembler support for XSfmmbase 0.6 and related SiFive matrix multiplication extensions, based on the spec here:
https://www.sifive.com/document-file/xsfmm-matrix-extensions-specification
Functionality-wise, this is the same as the Zvma extension proposal that
SiFive shared with the Attached Matrix Extension Task Group. The
extension names and instruction mnemonics have been changed to use
vendor prefixes.
Note this is a non-conforming extension as the opcodes used here are in
the standard opcode space in OP-V or OP-VE.
---------
Co-authored-by: Brandon Wu <brandon.wu@sifive.com>
This patch adds basic support for `MachinePipeliner` and disables it by default.
The functionality should be OK and all llvm-test-suite tests have
passed.
The results differ on different platforms so it is really hard to
determine a common default value.
Tune info for the post-RA scheduling direction is added and CPUs can set their own preferred post-RA scheduling direction.
Add RISC-V support for XRay. The RV64 implementation has been tested in
both QEMU and in our hardware environment.
Currently this requires the D and C extensions, but since both RV64GC and RVA22/RVA23 are becoming mainstream, I don't think this requirement will be a big problem.
Based on the previous work by @a-poduval:
https://reviews.llvm.org/D117929
---------
Co-authored-by: Ashwin Poduval <ashwin.poduval@gmail.com>
This fixes
https://discourse.llvm.org/t/fixed-register-being-spill-and-restored-in-clang/83058.
We need to do it in `MachineRegisterInfo::getCalleeSavedRegs` instead of `RISCVRegisterInfo::getCalleeSavedRegs` since the MF argument of `TargetRegisterInfo::getCalleeSavedRegs` is `const`, so we can't call `MF->getRegInfo().disableCalleeSavedRegister` there.
So to put it in `MachineRegisterInfo::getCalleeSavedRegs`, we move
`isRegisterReservedByUser` into `TargetSubtargetInfo`.
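A minimal sketch of the idea (assumed shape, not the exact upstream change; `filterUserReservedCSRs` is a made-up name for illustration):
```cpp
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"
using namespace llvm;

// Sketch: drop user-reserved registers from the callee-saved list.
// This is possible here because MachineRegisterInfo has non-const
// access to itself, unlike TargetRegisterInfo::getCalleeSavedRegs.
void filterUserReservedCSRs(MachineFunction &MF) {
  const TargetSubtargetInfo &STI = MF.getSubtarget();
  const TargetRegisterInfo *TRI = STI.getRegisterInfo();
  MachineRegisterInfo &MRI = MF.getRegInfo();
  for (const MCPhysReg *CSR = TRI->getCalleeSavedRegs(&MF); CSR && *CSR; ++CSR)
    if (STI.isRegisterReservedByUser(*CSR))
      MRI.disableCalleeSavedRegister(*CSR);
}
```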
We have been using `PostMachineScheduler` instead of `PostRAScheduler` since #68696.
The hook `getPostRAMutations` is only used by `PostRAScheduler`, so it is actually dead code for RISC-V now.
This is based on other targets like PPC/AArch64 and some experiments.
This PR only enables bidirectional scheduling and register pressure tracking.
Disclaimer: I haven't tested it on many cores; maybe we should turn some of these options into features. I believe downstreams must have tried this before, so feedback is welcome.
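A minimal sketch of what such an override looks like (the policy knobs come from `MachineSchedPolicy`; treat the exact body as an assumption, not the upstream code):
```cpp
// Sketch: enable bidirectional scheduling and register pressure
// tracking in the pre-RA machine scheduler.
void RISCVSubtarget::overrideSchedPolicy(MachineSchedPolicy &Policy,
                                         unsigned NumRegionInstrs) const {
  Policy.OnlyTopDown = false;        // schedule from both ends
  Policy.OnlyBottomUp = false;
  Policy.ShouldTrackPressure = true; // track register pressure
}
```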
If only one of the elements is actually used, then we can legally use a strided load in place of the segment load. Doing so reduces vector register pressure, so if both the segment and strided forms are believed to be lowered element/segment at a time, then prefer the strided load variant.
Note that I've seen the vectorizer emit wide interleave loads to represent a strided load, so this does happen in practice. It doesn't matter much for small LMUL*NF, but at large NF it can start causing problems in register allocation.
Note that this patch only covers the fixed vector formation cases. In
theory, we should do the same patch for scalable, but we can currently
only represent NF2 in scalable IR, and NF2 is assumed to be optimized to
better than segment-at-a-time by default, so there's currently nothing
to do.
The InitUndef pass works around a register allocation issue, where undef
operands can be allocated to the same register as early-clobber result
operands. This may lead to ISA constraint violations, where certain
input and output registers are not allowed to overlap.
Originally this pass was implemented for RISCV, and then extended to ARM
in #77770. I've since removed the target-specific parts of the pass in
#106744 and #107885. This PR reduces the pass to use a single
requiresDisjointEarlyClobberAndUndef() target hook and enables it by
default. The hook is disabled for AMDGPU, because overlapping
early-clobber and undef operands are known to be safe for that target,
and we get significant codegen diffs otherwise.
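A hedged sketch of the hook (the default returns true; `MyTargetSubtarget` is a made-up example target):
```cpp
// Sketch: a target where early-clobber results may safely overlap
// undef input operands can opt out of the InitUndef workaround.
bool MyTargetSubtarget::requiresDisjointEarlyClobberAndUndef() const {
  return false; // AMDGPU-style opt-out
}
```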
The motivating case is the one in arm64-ldxr-stxr.ll, where we were
previously incorrectly allocating a stxp input and output to the same
register.
This is just like AArch64.
Changing the threshold to 6 will increase code size, but will also decrease the number of unconditional branches. CPUs with wide fetch/issue units can benefit from it.
The value 6 may be debatable; we could set it to `SchedModel.IssueWidth`.
This stores a `std::unique_ptr<RISCVRegisterBankInfo>` instead of a `std::unique_ptr<RegisterBankInfo>`, which allows us to return a `RISCVRegisterBankInfo *` from `getRegBankInfo` so we can avoid a `static_cast`.
This does require an additional header file to be included in
RISCVSubtarget.h, but I don't think it's a big deal.
Prior to this commit, we created the GlobalISel objects in the
RISCVSubtarget constructor, even if we are not running GlobalISel. This
patch moves creation of the GlobalISel objects into their getters, which
ensures that we only create these objects if they are actually needed.
This helps since some of the constructors of the GlobalISel objects have
a significant amount of code.
We make the `unique_ptr`s `mutable` since GlobalISel passes only have
access to `const TargetSubtargetInfo` through `MF.getSubtarget()`.
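A minimal sketch of the lazy-init getter pattern described here (names assumed to mirror the RISC-V subtarget; not an exact copy of upstream):
```cpp
// Sketch: construct the GlobalISel object on first use only. The
// member is a mutable std::unique_ptr so a const getter can fill it.
const CallLowering *RISCVSubtarget::getCallLowering() const {
  if (!CallLoweringInfo)
    CallLoweringInfo =
        std::make_unique<RISCVCallLowering>(*getTargetLowering());
  return CallLoweringInfo.get();
}
```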
This patch is tested by the fact that all existing RISC-V GlobalISel
tests remain passing.
On RISC-V, there are a few ways to control whether the PostMachineScheduler is enabled. If `-enable-post-misched` is passed, or passed with a value of true, then the PostMachineScheduler is enabled. If it is passed with a value of false, then the PostMachineScheduler is disabled. If the option is not passed at all, then `RISCVSubtarget::enablePostRAMachineScheduler` decides whether the pass should be enabled or not. `TargetSubtargetInfo::enablePostRAScheduler` and `TargetSubtargetInfo::enablePostRAMachineScheduler`, which check the SchedModel value, are not called by the RISC-V backend.
`RISCVSubtarget::enablePostRAMachineScheduler` currently checks if the
active scheduler model sets `PostRAScheduler`. If it is set to true by
the scheduler model, then the pass is enabled. If it is not set to true
by the scheduler model, then the value of `UsePostRAScheduler` subtarget
feature is used.
I argue that the RISC-V backend should not use `PostRAScheduler` field
of the scheduler model to control whether the PostMachineScheduler is
enabled for the following reasons:
1. No other targets use this value to control whether
PostMachineScheduler is enabled. They only use it to check whether the
legacy PostRASchedulerList scheduler is enabled.
2. We can add the `UsePostRAScheduler` feature to the processor
definition in RISCVProcessors.td to tie a processor to whether the pass
should be enabled by default. This makes the feature and the sched model
field redundant.
3. Since these options are redundant, we should prefer the feature,
since we can set `+` and `-` on the feature, but the value of the
scheduler cannot be controlled on the command line.
4. Keeping both options allows us to set the feature and the scheduler
model value to conflicting values. Although the scheduler model value
will win out, it feels awkward to allow it.
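A hedged sketch of the feature-based check argued for above (assumed shape, not the exact upstream code):
```cpp
// Sketch: ignore the sched model's PostRAScheduler field and key the
// decision off the UsePostRAScheduler subtarget feature alone, which
// can be toggled on the command line with + and - on the feature.
bool RISCVSubtarget::enablePostRAMachineScheduler() const {
  return UsePostRAScheduler;
}
```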
This is the insert_subvector equivalent to #79949, where we can avoid
sliding up by the full LMUL amount if we know the exact subregister the
subvector will be inserted into.
This mirrors the lowerEXTRACT_SUBVECTOR changes in that we handle this
in two parts:
- We handle fixed length subvector types by converting the subvector to
a scalable vector. But unlike EXTRACT_SUBVECTOR, we may need to convert
the vector being inserted into as well.
- Whenever we don't need a vslideup because either the subvector fits
exactly into a vector register group *or* the vector is undef, we need
to emit an insert_subreg ourselves because RISCVISelDAGToDAG::Select
doesn't correctly handle fixed length subvectors yet: see d7a28f7ad
A subvector exactly fits into a vector register group if its size is a known multiple of the size of a vector register, and this adds a new overload of TypeSize::isKnownMultipleOf for scalable-to-scalable comparisons to help reason about this.
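As a hedged illustration of the new overload in use (the scalable-to-scalable form described above; `fitsRegisterGroupExactly` is made up for the example):
```cpp
#include "llvm/Support/TypeSize.h"
using namespace llvm;

// Sketch: a scalable subvector fits a whole register group exactly if
// its size is a known multiple of one vector register's size.
bool fitsRegisterGroupExactly(TypeSize SubVecSize, TypeSize VecRegSize) {
  return SubVecSize.isKnownMultipleOf(VecRegSize);
}
```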
I've left RISCVISelDAGToDAG::Select untouched for now (minus relaxing an
invariant), so that the insert_subvector and extract_subvector code
paths are the same.
We should teach it to properly handle fixed length subvectors in a follow-up patch, so that the "exact subregister" logic is handled in one place instead of being spread across both RISCVISelDAGToDAG.cpp and RISCVISelLowering.cpp.
When using Greedy Register Allocation, there are times where early-clobber values are ignored and assigned the same register. This is illegal behaviour for these instructions. To get around this, using pseudo instructions for early-clobber registers gives them a definition and allows Greedy to assign them to a different register. This then meets the ARM Architecture Reference Manual and matches the defined behaviour.
This patch takes the existing RISC-V patch and makes it target independent, then adds support for the ARM architecture. Doing this will ensure early-clobber constraints are followed when using the ARM architecture. Making the pass target independent will also open up the possibility of adding support for other architectures in the future.
We've now got enough of these in tree that we can see which patterns
appear to be idiomatic. As such, extract a helper for checking
if we know the exact VLEN.
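A minimal sketch of such a helper (assumed shape; the min/max VLEN accessors already exist on `RISCVSubtarget`):
```cpp
// Sketch: the exact VLEN is known only when the lower and upper
// bounds coincide; otherwise callers must reason about a range.
std::optional<unsigned> RISCVSubtarget::getRealVLen() const {
  unsigned VLen = getRealMinVLen();
  if (VLen != getRealMaxVLen())
    return std::nullopt;
  return VLen;
}
```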
We convert the existing macro fusions to TableGen.
Because `Fusion` depends on `Instruction` definitions, which are defined after `RISCVFeatures.td`, we recommend that users add fusion features when defining a new processor.
sifive-p450 supports a very restricted version of the short forward
branch optimization from the sifive-7-series.
For sifive-p450, a branch over a single c.mv can be macrofused as a
conditional move operation. Due to encoding restrictions on c.mv, we
can't conditionally move from X0. That would require c.li instead.
PRs #75576 and #75735 updated some implications in llvm/lib/Support/RISCVISAInfo.cpp, but both of them missed the subtarget feature part.
This patch still preserves the predicates HasStdExtZfhOrZfhmin and HasStdExtZhinxOrZhinxmin, since they make error messages more readable. (Users might not know that zfh implies zfhmin.)
Support was added for the following fusions: auipc-addi, slli-srli, ld-add.
Some parts of the code became repetitive, so a small refactoring of the existing lui-addi fusion was done.
This is what AArch64 did in https://reviews.llvm.org/D20762.
Tests are added to the macro fusion tests, which uncovered a bug where DAG mutations didn't take effect.
There is a lot of information that can be used for tuning, like alignments, cache line size, etc. But we can't make all of it `SubtargetFeature`s because some of it doesn't have an enumerable value, for example, the `PrefetchDistance` used by `LoopDataPrefetch`.
In this patch, a searchable table `RISCVTuneInfoTable` is added,
in which each entry contains the CPU name and all tune information
defined in `RISCVTuneInfo`. Each field of `RISCVTuneInfo` should
have a default value and processor definitions can override the
default value via `let` statements.
We don't need to define a `RISCVTuneInfo` for each processor; the default value (which is for `generic`) will be used if no `RISCVTuneInfo` is defined.
For processors in the same series, a subclass can inherit from
`RISCVTuneInfo` and override the fields. And we can also override
the fields in processor definitions if there are some differences
in the same processor series.
When initializing `RISCVSubtarget`, we will use `TuneCPU` as the key to search the tune info table. So the behavior here is: if we don't specify the tune CPU, we will use the specified `CPU`, which is expected I think.
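A hedged sketch of that lookup (the table accessor name is assumed from the description above; the generated API may differ):
```cpp
// Sketch: key the tune info lookup on TuneCPU, falling back to the
// generic defaults when the CPU has no dedicated entry.
const RISCVTuneInfo *TuneInfo =
    RISCVTuneInfoTable::getRISCVTuneInfo(TuneCPU);
if (!TuneInfo)
  TuneInfo = RISCVTuneInfoTable::getRISCVTuneInfo("generic");
```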
This patch almost undoes 61ab106, in which I added tune features for preferred function/loop alignments. More tune information can be added in the future.
We're already tracking XLen, so we can compute XLenVT from that. Note that XLen itself should probably be driven from IsRV64 (the processor flag), but I'm leaving that to a separate change (with review).
In LLVM, alias analysis during code generation is currently off by default. This patch enables alias analysis for the RISC-V target during code generation by default, which creates more opportunities for performance improvements.
Related test cases were modified.
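A hedged sketch of the hook involved (assumed shape; the real change may also add an option to toggle it):
```cpp
// Sketch: opting the RISC-V target into alias analysis during codegen.
bool RISCVSubtarget::useAA() const { return true; }
```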
Differential Revision: https://reviews.llvm.org/D157250
According to the latest spec, Zvfbfwma requires Zvfbfmin and Zvfbfmin requires Zfbfmin, with FLH/FSH/FMV.H.X/FMV.X.H removed from Zvfbfwma.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D155916
This patch adds codegen support for vectors with the bfloat16 type in the LLVM backend.
With this patch, Zvfbfmin/Zvfbfwma instructions as well as vle16/vse16 can be generated from the newly added bf16 IR intrinsics.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156287
SiFive's x280 CPU has a vector unit that is VLEN/2 bits wide. This means that LMUL=1 operations take 2 cycles to process all VLEN bits.
This patch adds a DLenFactor tuning parameter and applies it to
TuneSiFive7. getLMULCost has been updated to use this factor in
its calculations. I've added an x280 command line to one cost
model test to demonstrate the effect.
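A hedged sketch of how such a factor scales LMUL-based costs (names made up for illustration; the real logic lives in the RISC-V cost model):
```cpp
// Sketch: a VLEN/2-wide datapath needs DLenFactor == 2 passes per
// LMUL=1 operation, so costs scale linearly with the factor.
unsigned getScaledLMULCost(unsigned LMUL, unsigned DLenFactor) {
  return LMUL * DLenFactor;
}
```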
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D152421
This results in improved codegen for half/bf16 libcalls on soft ABIs.
This adds a RISCVSubtarget helper method for determining if a soft FP ABI is being targeted (future bf16-related patches make use of this).
Differential Revision: https://reviews.llvm.org/D151434