Fixes #82659
There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered issues caused by the TRI parameter being omitted, as shown in issue #82411.
Following @RKSimon's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, moving the TRI parameter forward in the parameter list and making it required. All call sites have been updated accordingly to ensure there is no additional impact.
After this, callers of these functions must explicitly decide whether to pass the `TargetRegisterInfo` or just a `nullptr`.
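For illustration, a hypothetical call site after this change (the helper and register choice are mine, not from the patch):

```cpp
// Sketch only. Before the patch, TRI was a defaulted trailing
// parameter, so it was easy to omit it and silently skip alias-aware
// matching (the root cause of issue #82411).
bool clobbersA0(const MachineInstr &MI, const TargetRegisterInfo *TRI) {
  // TRI is now required: pass a real TargetRegisterInfo to include
  // aliases/subregisters in the query, or nullptr to match the exact
  // register only.
  return MI.modifiesRegister(RISCV::X10, TRI);
}
```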
This opens the door to reusing reassociation optimizations on
target-specific binary operations with non-standard operand lists.
This is effectively an NFC.
CASE_VFMA_OPCODE_VV and CASE_VFMA_CHANGE_OPCODE_VV need to match up if
we are to avoid "Unexpected opcode" errors, but in CASE_VFMA_CHANGE_OPCODE_VV,
CASE_VFMA_CHANGE_OPCODE_LMULS_MF2 had mistakenly been used instead of
CASE_VFMA_CHANGE_OPCODE_LMULS_MF4.
This PR includes:
* vadd.vv/vand.vv/vor.vv/vxor.vv
* vmseq.vv/vmsne.vv
* vmin.vv/vminu.vv/vmax.vv/vmaxu.vv
* vmul.vv/vmulh.vv/vmulhu.vv
* vwadd.vv/vwaddu.vv
* vwmul.vv/vwmulu.vv
* vwmacc.vv/vwmaccu.vv
* vadc.vvm
There are no test changes; I may add some later.
Fixes part of #64422
Reviewers: michaelmaitland, preames, lukel97, topperc, asb
Reviewed By: topperc, lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/88379
We split target-dependent MachineCombiner patterns into their
respective target folders.
This makes MachineCombiner much more target-independent.
Reviewers:
davemgreen, asavonic, rotateright, RKSimon, lukel97, LuoYuanke, topperc, mshockwave, asi-sc
Reviewed By: topperc, mshockwave
Pull Request: https://github.com/llvm/llvm-project/pull/87991
This improves a pattern that occurs in 531.deepsjeng_r, reducing the
dynamic instruction count by 0.5%.
This may be possible to improve in SelectionDAG, but given the special
cases around shXadd formation, it's not obvious it can be done in a
robust way without adding multiple special cases.
I've used a GEP with 2 indices because that most closely resembles the
motivating case. Most of the test cases are the simplest GEP case. One
test has a logical right shift on an index which is closer to the
deepsjeng code. This requires special handling in isel to reverse a
DAGCombiner canonicalization that turns a pair of shifts into (srl (and
X, C1), C2).
This restructures the code to make it more obvious that most of
getVLENFactoredAmount is just a generic multiply by an immediate, and
to prepare for a couple of upcoming enhancements to this code.
Note that I plan to switch mulImm to early return, but decided I'd do
that as a separate commit to keep this diff readable.
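As a rough sketch, the intended shape after the restructure (signatures simplified; only `mulImm` is named by this commit, the rest of the body is illustrative):

```cpp
// Illustrative only: getVLENFactoredAmount reads VLENB once, and
// everything after that is a generic "multiply a register by a
// constant", which is what the factored-out mulImm helper performs.
void RISCVInstrInfo::getVLENFactoredAmount(MachineBasicBlock &MBB,
                                           MachineBasicBlock::iterator II,
                                           const DebugLoc &DL,
                                           Register DestReg,
                                           int64_t Amount) const {
  // DestReg = VLENB (the vector register length in bytes).
  BuildMI(MBB, II, DL, get(RISCV::PseudoReadVLENB), DestReg);
  // Scale by the number of vector registers; mulImm handles the
  // power-of-two shifts, MUL, or a shift-and-add fallback.
  mulImm(MBB, II, DL, DestReg, Amount / 8);
}
```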
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
This TSFlags mechanism was introduced by https://reviews.llvm.org/D108767.
A base class for all RISC-V register classes is added, and we store
IsVRegClass/VLMul/NF in TSFlags, with helpers to retrieve them.
This reduces some lines of code, and I think there will be more uses.
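A sketch of what the helpers look like (the field layout and exact names are approximations, not the exact patch):

```cpp
// Illustrative: register class properties are packed into TSFlags and
// decoded with small static helpers.
namespace RISCVRI {
enum : uint64_t {
  IsVRegClassShift = 0, // 1 bit: class holds RVV registers
  VLMulShift = 1,       // 3 bits: encoded LMUL of the class
  NFShift = 4,          // 3 bits: NF (tuple size) minus one
};

static inline bool isVRegClass(uint64_t TSFlags) {
  return (TSFlags >> IsVRegClassShift) & 1;
}
static inline unsigned getLMul(uint64_t TSFlags) {
  return (TSFlags >> VLMulShift) & 7;
}
static inline unsigned getNF(uint64_t TSFlags) {
  return ((TSFlags >> NFShift) & 7) + 1;
}
} // namespace RISCVRI
```

Usage then becomes e.g. `RISCVRI::isVRegClass(RC->TSFlags)` instead of comparing against a hardcoded list of vector register classes.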
Reviewers: preames, topperc
Reviewed By: topperc
Pull Request: https://github.com/llvm/llvm-project/pull/84894
When the encodings of register tuples are aligned, we can use a copy
with a larger LMUL to reduce the number of copies.
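For illustration, the kind of check this enables (simplified; the real selection logic in copyPhysReg handles more tuple sizes and LMULs):

```cpp
// Illustrative: if the first registers of both tuples have even
// encodings (e.g. v10_v11 starts at v10), one vmv2r.v can replace two
// vmv1r.v copies.
static unsigned pickTupleCopyOpcode(const TargetRegisterInfo &TRI,
                                    Register DstFirst, Register SrcFirst) {
  if (TRI.getEncodingValue(DstFirst) % 2 == 0 &&
      TRI.getEncodingValue(SrcFirst) % 2 == 0)
    return RISCV::VMV2R_V; // one whole-register move at LMUL=2
  return RISCV::VMV1R_V;   // otherwise copy register by register
}
```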
Reviewers: preames, topperc, lukel97
Reviewed By: topperc, lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/84455
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.
This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future it should allow us
to add proper scalable vector support, but none of that is included in
this patch. It should mostly be an NFC.
Global ISel is still expected to use the underlying LLT as it needs, and
is not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
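For illustration, the caller-side pattern this introduces (a sketch; the helper name is mine):

```cpp
// Illustrative: callers must now check hasValue() before using the
// numeric size, instead of treating ~UINT64_C(0) as a huge known size.
static bool hasKnownSizeAtMost(const MachineMemOperand &MMO,
                               uint64_t Limit) {
  LocationSize Size = MMO.getSize(); // now LocationSize, not uint64_t
  // Unknown sizes (LocationSize::beforeOrAfter) fail hasValue().
  return Size.hasValue() && Size.getValue() <= Limit;
}
```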
This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to the unsigned it currently
uses. The advantages on its own are not super high if
getMemOperandsWithOffsetWidth usually uses known sizes, but if the
values can come from an MMO it can help be more accurate in case they
are Unknown (and in the future, scalable).
Instead of initializing the accumulator to 0, initialize it on the
first assignment with a mv from the register that holds
VLENB << ShiftAmount. Also fix a missing kill flag on the final add.
I have no real interest in this case, just an easy optimization I
noticed.
This was trying to rewrite a branch that uses X0 to a branch that uses a
register produced by LI of 1 or -1. Using X0 is free so there is no
reason to rewrite it. Doing so would just extend the live range of the
LI register increasing register pressure.
In practice this might not have triggered often because we were calling
MRI.hasOneUse on X0. I'm not sure what that returns for a physical
register.
If it isn't virtual, we may extend the live range of the physical
register past where it is valid, for example across a call.
Found while trying to enable -riscv-enable-sink-fold which enables some
copy propagation in machine sink that led to ADDIs with physical
register destinations.
When a branch target is too far away, we need to emit an indirect branch.
We scavenge a register for this since we don't know we need one until
after register allocation.
Jumps using X1 and X5 as the source are hints to the hardware to pop the
return-address stack. We should avoid using them for jumps that
aren't a return or tail call.
This patch adds basic TLSDESC support in the RISC-V backend.
Specifically, we add new relocation types for TLSDESC, as prescribed in
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373, and add a
new pseudo instruction to simplify code generation.
This patch does not try to optimize the local dynamic case, which can be
improved in separate patches.
Linker side changes will also be handled separately.
The current implementation is only enabled when passing the new
`-enable-tlsdesc` codegen flag.
Make Candidate's front() and back() functions return references to
MachineInstr and introduce begin() and end() returning iterators, the
same way it is usually done in other container-like classes.
This makes it possible to iterate over the instructions contained in
Candidate the same way one can iterate over MachineBasicBlock (note that
begin() and end() return bundled iterators, just like MachineBasicBlock
does, but no instr_begin() and instr_end() are defined yet).
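For example, a sketch of the new usage (the counting helper is made up):

```cpp
// Illustrative: with begin()/end(), an outliner::Candidate can be
// traversed like a MachineBasicBlock (bundled iterators, as noted).
static unsigned countCalls(outliner::Candidate &C) {
  unsigned NumCalls = 0;
  for (MachineInstr &MI : C) // front()/back() likewise now give
    if (MI.isCall())         // MachineInstr& rather than iterators
      ++NumCalls;
  return NumCalls;
}
```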
Ensure that getVLENFactoredAmount does not fail when the scale amount
requires the use of a non-trivial multiplication but the M extension is
not enabled. In such a case, perform the multiplication using shifts and
adds.
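A minimal sketch of the decomposition in plain C++ (the real code emits SLLI/ADD machine instructions rather than computing a value; the function name is mine):

```cpp
#include <cstdint>

// Multiply X by a constant using only shifts and adds, as done when
// the M extension (and thus MUL) is unavailable. Conceptually, one
// shift plus one add is needed per set bit of the constant:
// e.g. x * 10 = (x << 1) + (x << 3).
uint64_t mulByConstant(uint64_t X, uint64_t Amount) {
  uint64_t Acc = 0;
  for (unsigned Bit = 0; Bit < 64; ++Bit)
    if (Amount & (1ULL << Bit))
      Acc += X << Bit;
  return Acc;
}
```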
This helper function handles common cases where we can determine a
constant value is being defined in a register. Although it looks like
codegen changes are possible due to this being called in
PeepholeOptimizer, my main motivation is to use this in
describeLoadedValue.
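Assuming the helper in question is `getConstValDefinedInReg`, a minimal sketch of the obvious RISC-V case (the in-tree version handles more opcodes):

```cpp
// Illustrative: ADDI rd, x0, imm materializes a known constant in rd.
static bool getConstValDefinedInRegSketch(const MachineInstr &MI,
                                          Register Reg, int64_t &ImmVal) {
  if (MI.getOpcode() == RISCV::ADDI && MI.getOperand(0).getReg() == Reg &&
      MI.getOperand(1).isReg() && MI.getOperand(1).getReg() == RISCV::X0 &&
      MI.getOperand(2).isImm()) {
    ImmVal = MI.getOperand(2).getImm();
    return true;
  }
  return false;
}
```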
-Rename to GPRPair.
-Rename registers to be named like X10_X11 instead of X10_PD, except
X0, which is now X0_Pair since it is not paired with X1.
-Use unknown size and offset for the subreg indices. This might
be a functional change, but does not affect any lit tests.
sifive-p450 supports a very restricted version of the short forward
branch optimization from the sifive-7-series.
For sifive-p450, a branch over a single c.mv can be macrofused as a
conditional move operation. Due to encoding restrictions on c.mv, we
can't conditionally move from X0. That would require c.li instead.
Since #72467, `@plt` in assembly output "call foo@plt" is omitted. We
can trivially merge MO_PLT and MO_CALL without any functional change to
assembly/relocatable file output.
Earlier architectures use different call relocation types depending on
whether a PLT is potentially needed: R_386_PLT32/R_386_PC32,
R_68K_PLT32/R_68K_PC32, R_SPARC_WDISP30/R_SPARC_WPLT30. However, as the PLT property is
per-symbol instead of per-call-site and linkers can optimize out a PLT,
the distinction has been confusing.
Arm picked good names with R_ARM_CALL/R_AARCH64_CALL. Let's use MO_CALL instead
of MO_PLT.
As follow-ups, we can merge fixup_riscv_call/fixup_riscv_call_plt and
VK_RISCV_CALL/VK_RISCV_CALL_PLT.
-Rename sub_32_hi to sub_gpr_odd.
-Add dedicated sub_gpr_even.
-Rename sub_32 and sub_16 to sub_fpr32 and sub_fpr16.
-Remove start offset from sub_gpr_odd. AArch64 doesn't use a non-zero
offset for GPR tuples, so I don't think we need to either.
This is preparation for a RV64 GPRPair for Zacas.
Split out from #73789, so as to leave that PR just for flipping load
clustering to on by default. Clusters if the operations are within a
cache line of each other (as AMDGPU does in shouldScheduleLoadsNear).
X86 does something similar, but does `((Offset2 - Offset1) / 8 > 64)`.
I'm not sure if that's intentionally set to 512 bytes or if the division
is in error.
Adopts the suggestion from @wangpc-pp to query the cache line size and
use it if available.
We also cap the maximum cluster size to limit the potential register
pressure impact (which may lead to additional spills).
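Putting that together, a sketch of the heuristic (the constants are illustrative, not the committed values):

```cpp
#include <cstdlib>

// Illustrative: cluster two memory ops when their offsets fall within
// one cache line and the cluster has not grown past the cap.
static bool shouldClusterSketch(int64_t Offset1, int64_t Offset2,
                                unsigned ClusterSize,
                                unsigned CacheLineSize) {
  if (CacheLineSize == 0)
    CacheLineSize = 64; // assumed fallback if the subtarget reports none
  const unsigned MaxClusterSize = 4; // assumed register-pressure cap
  return ClusterSize <= MaxClusterSize &&
         std::abs(Offset1 - Offset2) < int64_t(CacheLineSize);
}
```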
These are picked up from getMemOperandsWithOffsetWidth but weren't then
being passed through to shouldClusterMemOps, which forces backends to
collect the information again if they want to use the kind of heuristics
typically used for the similar shouldScheduleLoadsNear function (e.g.
checking the offset is within 1 cache line).
This patch just adds the parameters, but doesn't attempt to use them.
There is potential to use them in the current PPC and AArch64
shouldClusterMemOps implementation, and I intend to use the offset in
the heuristic for RISC-V. I've left these for future patches in the
interest of being as incremental as possible.
As noted in the review and in an inline FIXME, an ElementCount-style abstraction may later be used to condense these two parameters to one argument. ElementCount isn't quite suitable as it doesn't support negative offsets.
I noted AArch64 happily accepts a FrameIndex operand as well as a
register. This doesn't cause any changes outside of my C++ unit test for
the current in-tree state, but it will cause additional test
changes if #73789 is rebased on top of it.
Note that the returned Offset doesn't seem nearly as meaningful if you
have a FrameIndex base, though the approach taken here follows AArch64
(see D54847). This change won't harm the approach taken in
shouldClusterMemOps because memOpsHaveSameBasePtr will only return true
if the FrameIndex operand is the same for both operations.
This adds minimal support for load clustering, but disables it by
default. The intent is to iterate on the precise heuristic and the
question of turning this on by default in a separate PR. Although
previous discussion indicates hope that the MachineScheduler would
replace most uses of the SelectionDAG scheduler, it does seem most
targets aren't using MachineScheduler load clustering right now:
PPC+AArch64 seem to just use it to help with paired load/store formation
and although AMDGPU uses it for general clustering it also implements
ShouldScheduleLoadsNear for the SelectionDAG scheduler's clustering.
This hook is called by the default implementation of
getMemOperandWithOffset and by the load/store clustering code in the
MachineScheduler though this isn't enabled by default and is not yet
enabled for RISC-V. Only return true for queries on scalar loads/stores
for now (this is a conservative starting point, and vector load/store
can be handled in a follow-on patch).
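For illustration, the rough shape of such a conservative implementation (names and details are approximate, and Width is shown with the pre-LocationSize type for simplicity):

```cpp
// Illustrative: handle only scalar loads/stores of the reg+imm form
// "op rd, imm(rs1)"; anything else (e.g. vector memory ops) is
// rejected so callers fall back to conservative behavior.
static bool getMemOperandWithOffsetWidthSketch(
    const MachineInstr &LdSt, const MachineOperand *&BaseOp,
    int64_t &Offset, unsigned &Width, const TargetRegisterInfo *TRI) {
  if (!LdSt.mayLoadOrStore() || LdSt.getNumExplicitOperands() != 3)
    return false;
  if (!LdSt.getOperand(1).isReg() || !LdSt.getOperand(2).isImm())
    return false;
  // A memory operand is needed to know the access size.
  if (!LdSt.hasOneMemOperand())
    return false;
  Width = (*LdSt.memoperands_begin())->getSize();
  BaseOp = &LdSt.getOperand(1);
  Offset = LdSt.getOperand(2).getImm();
  return true;
}
```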
Don't blindly copy the original flags from the pre-reassociated
instructions.
This copied the integer poison flags which are not safe to preserve
after reassociation.
For the FP flags, I think we should only keep the intersection of
the flags. Override setSpecialOperandAttr to do this.
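A sketch of the intersection approach (simplified; the real override lives on RISCVInstrInfo):

```cpp
// Illustrative: instead of copying one instruction's flags wholesale,
// keep only the flags common to both original instructions.
void setSpecialOperandAttrSketch(MachineInstr &OldMI1, MachineInstr &OldMI2,
                                 MachineInstr &NewMI1, MachineInstr &NewMI2) {
  uint32_t IntersectedFlags = OldMI1.getFlags() & OldMI2.getFlags();
  NewMI1.setFlags(IntersectedFlags);
  NewMI2.setFlags(IntersectedFlags);
}
```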
Fixes #72777.
This hook is called by the target-independent implementation of
TargetInstrInfo::describeLoadedValue. I've opted to test it via a C++
unit test, which although fiddly to set up seems the right way to test a
function with such clear intended semantics (rather than testing the
impact indirectly).
isAddImmediate will never recognise ADDIW as an add immediate, which I
_think_ is conservatively correct, as the caller may not understand its
semantics vs ADDI.
Note that although the doc comment for isAddImmediate specifies its
behaviour solely in terms of physical registers, none of the current
in-tree implementations (including this one) bail out on virtual
registers (see #72357).
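For context, a minimal sketch of the shape of such an implementation (simplified and approximate):

```cpp
#include <optional>

// Illustrative: only plain ADDI is reported as an add-immediate.
// ADDIW is deliberately skipped because its sign-extension semantics
// differ from a plain add, which a generic caller may not understand.
static std::optional<RegImmPair>
isAddImmediateSketch(const MachineInstr &MI, Register Reg) {
  if (MI.getOpcode() == RISCV::ADDI && MI.getOperand(0).isReg() &&
      MI.getOperand(0).getReg() == Reg && MI.getOperand(1).isReg() &&
      MI.getOperand(2).isImm())
    return RegImmPair(MI.getOperand(1).getReg(),
                      MI.getOperand(2).getImm());
  return std::nullopt;
}
```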