Don't call raw_string_ostream::flush(), which is essentially a no-op.
As specified in the docs, raw_string_ostream is always unbuffered.
(See 65b13610a5226b84889b923bae884ba395ad084d for further reference.)
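A minimal usage sketch (the function and string names are illustrative): the underlying std::string is updated on every write, so flush() adds nothing.

```cpp
#include "llvm/Support/raw_ostream.h"
#include <string>

// Illustrative helper: format a value into a std::string through
// raw_string_ostream without ever calling flush().
static std::string formatValue(int V) {
  std::string Buf;
  llvm::raw_string_ostream OS(Buf);
  OS << "value = " << V;
  // raw_string_ostream is always unbuffered, so Buf already holds the
  // formatted text here; OS.flush() would be a no-op.
  return Buf;
}
```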
GNU ld will error when encountering a pcrel_lo whose corresponding
pcrel_hi is in a different section. [1] introduced a check that helps
prevent this issue by disallowing outlining in a few circumstances.
However, we can also hit this same issue when outlining from functions
with prefixes ("hot"/"unlikely"/"unknown" from profile information, for
example) as the outlined function might not have the same prefix,
possibly resulting in a "paired" pcrel_lo and pcrel_hi ending up in
different sections.
To prevent this issue, take a similar approach to [1] and additionally
disallow outlining when we see a pcrel_lo and the function has a section
prefix.
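A rough sketch of the kind of guard this describes, assuming it sits in the RISC-V outlining hooks; the control flow is illustrative, not the exact patch:

```cpp
// Illustrative fragment, not the actual change: while classifying an
// instruction for outlining, reject candidates that use a %pcrel_lo when
// the enclosing function carries a section prefix, since the outlined
// function could land in a different section than the paired %pcrel_hi.
if (MI.getMF()->getFunction().getSectionPrefix()) {
  for (const MachineOperand &MO : MI.operands())
    if (MO.getTargetFlags() == RISCVII::MO_PCREL_LO)
      return outliner::InstrType::Illegal;
}
```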
[1] 96c85f80f0
Fixes #107520
Continuing with #107993 and #108007, this handles the last of the main
rematerializable vector instructions.
There's an extra spill in one of the test cases, but it's likely noise
from the spill weights and isn't an issue in practice.
Even though vmv.v.x has a non-constant scalar operand, we can still
rematerialize it because we have split register allocation between
vectors and scalars.
InlineSpiller will check to make sure that the scalar operand is live at
the point where the rematerialization occurs, so this won't extend any
scalar live ranges. However, this also means we may not be able to
rematerialize in some cases, as shown in @vmv.v.x_needs_extended.
It might be worthwhile teaching InlineSpiller to extend scalar live
ranges in a future patch. I experimented with this locally and it
reduced spills on 531.deepsjeng_r by a further 3%.
This continues the line of work started in #97520, and gives a 2.5%
reduction in the number of spills on SPEC CPU 2017.
Program              regalloc.NumSpills
                         lhs        rhs    diff
605.mcf_s             141.00     141.00    0.0%
505.mcf_r             141.00     141.00    0.0%
519.lbm_r              73.00      73.00    0.0%
619.lbm_s              68.00      68.00    0.0%
631.deepsjeng_s       354.00     353.00   -0.3%
531.deepsjeng_r       354.00     353.00   -0.3%
625.x264_s           1896.00    1886.00   -0.5%
525.x264_r           1896.00    1886.00   -0.5%
508.namd_r           6665.00    6598.00   -1.0%
644.nab_s             761.00     753.00   -1.1%
544.nab_r             761.00     753.00   -1.1%
638.imagick_s        4287.00    4181.00   -2.5%
538.imagick_r        4287.00    4181.00   -2.5%
602.gcc_s           12771.00   12450.00   -2.5%
502.gcc_r           12771.00   12450.00   -2.5%
510.parest_r        43876.00   42740.00   -2.6%
500.perlbench_r      4297.00    4179.00   -2.7%
600.perlbench_s      4297.00    4179.00   -2.7%
526.blender_r       13503.00   13103.00   -3.0%
511.povray_r         2006.00    1937.00   -3.4%
620.omnetpp_s         984.00     946.00   -3.9%
520.omnetpp_r         984.00     946.00   -3.9%
657.xz_s              302.00     289.00   -4.3%
557.xz_r              302.00     289.00   -4.3%
541.leela_r           378.00     356.00   -5.8%
641.leela_s           378.00     356.00   -5.8%
623.xalancbmk_s      1646.00    1548.00   -6.0%
523.xalancbmk_r      1646.00    1548.00   -6.0%
Geomean difference                        -2.5%
I initially held off submitting this patch because it surprisingly
introduced a lot of spills in the test diffs, but after #107290 the
vmv.v.i instructions that caused them are now gone.
The gist is that marking vmv.v.i as spillable decreased its spill
weight, which actually resulted in more m8 registers getting evicted and
spilled during register allocation.
The SPEC results show this isn't an issue in practice though, and I plan
on posting a separate patch to explain this in more detail.
Previously, for vector peepholes that fold based on VL, we checked whether
the VLMAX was the same as a proxy for checking that the EEWs were the
same. This only worked at LMUL >= 1, where the EMULs of the Src output and
the user's input had to be the same because their register classes needed
to match.
At fractional LMULs we would have incorrectly folded something like
this:
%x:vr = PseudoVADD_VV_MF4 $noreg, $noreg, $noreg, 4, 4 /* e16 */, 0
%y:vr = PseudoVMV_V_V_MF8 $noreg, %x, 4, 3 /* e8 */, 0
This models the EEW of the destination operands of vector instructions
with a TSFlag, which is enough to fix the incorrect folding.
There's some overlap with the TargetOverlapConstraintType and
IsRVVWideningReduction. If we model the source operands as well we may
be able to subsume them.
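As a hedged sketch of the resulting check, with `getDestEEW` standing in as a hypothetical accessor for the new TSFlag (the real helper may be named differently):

```cpp
// Hypothetical accessor name; illustrates the idea only. Compare the EEW
// that Src writes with the EEW the user instruction writes instead of
// comparing VLMAX, so the MF4/e16 -> MF8/e8 case above no longer folds.
unsigned SrcEEW  = getDestEEW(Src->getDesc().TSFlags);  // hypothetical helper
unsigned UserEEW = getDestEEW(MI.getDesc().TSFlags);    // hypothetical helper
if (SrcEEW != UserEEW)
  return false;
```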
This patch prepares the NFC groundwork for global outlining using
CGData, which will follow
https://github.com/llvm/llvm-project/pull/90074.
- The `MinRepeats` parameter is now explicitly passed to the
`getOutliningCandidateInfo` function, rather than relying on a default
value of 2. For local outlining, the minimum number of repetitions is
typically 2, but for global outlining (mentioned above), we will
optimistically create a single `Candidate` for each `OutlinedFunction`
if stable hashes match a specific code sequence. This parameter is
adjusted accordingly in global outlining scenarios (see the interface
sketch after this list).
- I have also switched `OutlinedFunction` to be held via `unique_ptr` to
ensure safe and efficient memory management within `FunctionList`,
avoiding unnecessary implicit copies.
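A rough sketch of the resulting interface, paraphrased from the description above rather than copied from the tree (the parameter list is simplified):

```cpp
// Paraphrased/simplified, not the exact upstream signature: MinRepeats is
// now an explicit argument (2 for local outlining, possibly 1 for global
// outlining), and each OutlinedFunction is returned through a unique_ptr
// so FunctionList never makes implicit copies.
std::optional<std::unique_ptr<outliner::OutlinedFunction>>
getOutliningCandidateInfo(std::vector<outliner::Candidate> &RepeatedSequenceLocs,
                          unsigned MinRepeats) const;
```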
This depends on https://github.com/llvm/llvm-project/pull/101461.
This is a patch for
https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753.
The renamable flag is useful during MachineCopyPropagation, but it will be
dropped after lowerCopy in some cases.
This patch introduces extra arguments to pass the renamable flag to
copyPhysReg.
This is just like AArch64.
Changing the threshold to 6 will increase the code size, but will
also decrease unconditional branches. CPUs with wide fetch/issue units
can benefit from it.
The value 6 may be debatable; we could instead set it to `SchedModel.IssueWidth`.
This extension consists of 8 additional 16-bit compressed forms for
existing standard load/store opcodes.
These opcodes are found in some RISC-V microcontrollers from WCH /
Nanjing Qinheng Microelectronics.
As discussed in the Discourse forums, this uses extension and opcode names
that are incompatible with the vendor binary toolchain. The chosen names
instead follow the conventions for other vendor extensions listed on the
"riscv-non-isa" project.
This adds initial support for rematerializing vector instructions,
starting with vid.v since it's simple and has the fewest
operands. It has one passthru operand which we need to check is
undefined. It also has an AVL operand, but it's fine to rematerialize
with it because it's scalar and register allocation is split between
vector and scalar.
RISCVInsertVSETVLI can still happen before vector regalloc if
-riscv-vsetvl-after-rvv-regalloc is false, so this makes sure that we
only rematerialize after regalloc by checking for the implicit uses that
are added.
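A hedged sketch of that guard, assuming it simply looks for the implicit uses that RISCVInsertVSETVLI adds (the helper name is illustrative):

```cpp
// Illustrative helper, not the exact upstream code: once the implicit
// VL/VTYPE uses added by RISCVInsertVSETVLI are present, the pass has
// already run and rematerializing vid.v is safe.
static bool hasVSETVLIImplicitUses(const llvm::MachineInstr &MI) {
  return MI.hasRegisterImplicitUseOperand(llvm::RISCV::VL) &&
         MI.hasRegisterImplicitUseOperand(llvm::RISCV::VTYPE);
}
```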
Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411.
Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact.
After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.
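An illustrative call-site update under this change (the register and variables are arbitrary): callers now state explicitly whether sub- and super-register aliases should be considered.

```cpp
// Illustrative only. Before: MI.definesRegister(RISCV::X10); // TRI defaulted
bool DefsX10Exact   = MI.definesRegister(RISCV::X10, /*TRI=*/nullptr); // exact register only
bool DefsX10Aliases = MI.definesRegister(RISCV::X10, TRI);             // include sub/super regs
```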
This opens the door to reusing reassociation optimizations on
target-specific binary operations with non-standard operand lists.
This is effectively an NFC.
CASE_VFMA_OPCODE_VV and CASE_VFMA_CHANGE_OPCODE_VV need to match up if we
are to avoid "Unexpected opcode" errors, but in CASE_VFMA_CHANGE_OPCODE_VV,
CASE_VFMA_CHANGE_OPCODE_LMULS_MF2 had mistakenly been used instead of
CASE_VFMA_CHANGE_OPCODE_LMULS_MF4.
This PR includes:
* vadd.vv/vand.vv/vor.vv/vxor.vv
* vmseq.vv/vmsne.vv
* vmin.vv/vminu.vv/vmax.vv/vmaxu.vv
* vmul.vv/vmulh.vv/vmulhu.vv
* vwadd.vv/vwaddu.vv
* vwmul.vv/vwmulu
* vwmacc.vv/vwmaccu.vv
* vadc.vvm
There are no test changes; I may add some later.
Fixes part of #64422
Reviewers: michaelmaitland, preames, lukel97, topperc, asb
Reviewed By: topperc, lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/88379
We split target-dependent MachineCombiner patterns out into their
respective target folders.
This makes MachineCombiner much more target-independent.
Reviewers: davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc
Reviewed By: topperc, mshockwave
Pull Request: https://github.com/llvm/llvm-project/pull/87991
This improves a pattern that occurs in 531.deepsjeng_r, reducing the
dynamic instruction count by 0.5%.
This may be possible to improve in SelectionDAG, but given the special
cases around shXadd formation, it's not obvious it can be done in a
robust way without adding multiple special cases.
I've used a GEP with 2 indices because that most closely resembles the
motivating case. Most of the test cases are the simplest GEP case. One
test has a logical right shift on an index which is closer to the
deepsjeng code. This requires special handling in isel to reverse a
DAGCombiner canonicalization that turns a pair of shifts into (srl (and
X, C1), C2).
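A hedged illustration of the kind of source pattern being targeted (not the actual deepsjeng code): a two-index GEP whose outer index comes from a logical right shift.

```cpp
// Illustrative only: the shifted row index combined with the GEP scaling is
// canonicalized by DAGCombiner into a shift-and-mask form such as
// (srl (and X, C1), C2), which isel must now see through to form shXadd.
int lookup(const int table[][16], unsigned x, unsigned col) {
  return table[x >> 3][col];
}
```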
This restructures the code to make it more obvious that most of
getVLENFactoredAmount is just a generic multiply by an immediate, and
prepares for a couple of upcoming enhancements to this code.
Note that I plan to switch mulImm to early return, but decided I'd do
that as a separate commit to keep this diff readable.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
The TSFlags field of RegisterClass was introduced by
https://reviews.llvm.org/D108767.
This adds a base class for all RISC-V RegisterClasses, stores
IsVRegClass/VLMul/NF in TSFlags, and adds helpers to retrieve them.
This reduces some lines of code, and I think there will be more uses.
Reviewers: preames, topperc
Reviewed By: topperc
Pull Request: https://github.com/llvm/llvm-project/pull/84894
When the encodings of register tuples are aligned, we can use a copy
with a larger LMUL to reduce the number of copies.
Reviewers: preames, topperc, lukel97
Reviewed By: topperc, lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/84455
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
is not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
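A small usage sketch under the new interface (variable names are illustrative): code that needs a raw byte count must check for a known size first instead of assuming a uint64_t.

```cpp
// Illustrative only: MMO->getSize() now returns a LocationSize rather than
// a uint64_t, so an unknown size has to be handled explicitly.
llvm::LocationSize Size = MMO->getSize();
if (!Size.hasValue())
  return false;                     // unknown size: stay conservative
uint64_t Bytes = Size.getValue();   // known, fixed size in bytes
```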
This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to the unsigned it currently
uses. The advantages on its own are not super high if
getMemOperandsWithOffsetWidth usually uses known sizes, but if the
values can come from an MMO it can help be more accurate in case they
are Unknown (and in the future, scalable).
Instead of initializing the accumulator to 0, initialize it on first
assignment with a mv from the register that holds VLENB << ShiftAmount.
Fix a missing kill flag on the final Add.
I have no real interest in this case, just an easy optimization I
noticed.
This was trying to rewrite a branch that uses X0 to a branch that uses a
register produced by LI of 1 or -1. Using X0 is free so there is no
reason to rewrite it. Doing so would just extend the live range of the
LI register increasing register pressure.
In practice this might not have triggered often because we were calling
MRI.hasOneUse on X0. I'm not sure what that returns for a physical
register.
If it isn't virtual, we may extend the live range of the physical
register past where it is valid, for example across a call.
Found while trying to enable -riscv-enable-sink-fold which enables some
copy propagation in machine sink that led to ADDIs with physical
register destinations.
When a branch target is too far away we need to emit an indirect branch.
We scavenge a register for this since we don't know we need one until
after register allocation.
Jumps using X1 and X5 as the source are hints to the hardware to pop the
return-address stack. We should avoid using them for jumps that
aren't a return or a tail call.