These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
There's a long comment explaining this approach in RISCVInstrInfoXqci.td
This change also fixes some problems when fixups can be resolved for `qc.e.li` and `qc.li`.
Instead of allowing a parsed MCInst to have either a uimm10 or a simm10,
always render as simm10. This avoids a mismatch between parsed MCInst
and disassembled MCInst when a uimm10 value is used.
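For illustration, the two operand types only disagree on how the same
10-bit encoding gets printed; a minimal standalone sketch of the
arithmetic (the helper name is made up, this is not the MCInst printing
code):

```cpp
#include <cstdint>
#include <cstdio>

// Interpret the low 10 bits of an encoding as a signed value, mirroring
// an simm10 rendering.
static int64_t signExtend10(uint64_t Encoding) {
  uint64_t Low = Encoding & 0x3FF;             // keep the 10-bit field
  return (Low & 0x200) ? int64_t(Low) - 0x400  // sign bit set: negative
                       : int64_t(Low);
}

int main() {
  // 0x3FF is a valid uimm10 (1023) but renders as -1 when treated as
  // simm10; always choosing the signed rendering keeps the round trip
  // between parsing and disassembly stable.
  std::printf("%lld\n", static_cast<long long>(signExtend10(0x3FF))); // -1
  std::printf("%lld\n", static_cast<long long>(signExtend10(0x1FF))); // 511
  return 0;
}
```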
We have some 48-bit instructions in the `Xqci` spec that currently
cannot be compressed to their 32-bit variants due to the constraint in
`CompressInstEmitter` on destination instruction operands not being
allowed to mismatch with the DAG operands.
For example, the `QC_E_ADDI` instruction can be compressed to the `ADDI`
instruction when the immediate fits in a signed 12-bit range, but this is
currently not possible since the `QC_E_ADDI` instruction has `GPRNoX0`
register operands while the `ADDI` instruction has `GPR` register
operands, leading to an operand type validation error.
I think we can remove the check that only source instruction operands
can mismatch with the corresponding DAG operands and rely on the fact
that we check if the DAG register operand type is a subclass of the
instruction register operand type.
We don't have any tests that show why this AddedComplexity is needed.
ImmLeafs are automatically ranked higher than register operands so there
is no ambiguity with the base ISA here.
There have been discussions on splitting RISCVISelLowering.cpp. I think
the InterleavedAccess-related TLI hooks would be some of the low-hanging
fruit, as they're relatively isolated and X86 is already doing it.
NFC.
Previously we just assumed that no instruction that needed to be moved
would have an implicit def, but vnclip pseudos will.
We can still try to move them, but we need to check that no instructions
in between read or write the physical register.
Fixes #147986
These were removed in #147830 because they ignored that these
instructions operate on bytes. This patch adds them back with tests,
including a test for the byte boundary issue.
I separated out the commits to show the bad optimization we get if we
don't round Bits up to the nearest byte.
This reverts commit aee21c368b41cd5f7765a31b9dbe77f2bffadd4e.
As noted in
<https://github.com/llvm/llvm-project/pull/146855#issuecomment-3061784904>,
this causes compile errors for several RVV configurations:
fatal error: error in backend: SmallVector unable to grow. Requested capacity (4294967296) is larger than maximum value for size type (4294967295)
If there are multiple mask producers followed by multiple masked
consumers, a move (vmv* v0, vx) may be generated to save the mask.
By moving the mask's producer after the mask's use, the spill can be
eliminated and the move can be removed.
Add basic isel patterns for the multiply-accumulate QC.MULIADD
instruction.
While most cases work with just the TD file pattern, there are a few
cases which need to be handled in ISelLowering depending on the immediate
we are multiplying by:
- imm + 1, imm - 1, 1 - imm, -1 - imm is a power of 2 --> these become
  slli and add/sub
- immediate is 2^n - 2^m --> this becomes (add/sub (shl X, C1), (shl X, C2))
- imm - 2, imm - 4, imm - 6 is a power of 2 --> these use shxadd when
  zba is enabled
For the above conditions, the patch does not decompose the mul if Xqciac
is present. There could be cases where this is not beneficial, which I
plan to address in follow-up patches.
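As a rough sanity check of the arithmetic behind these decompositions
(the constants below are illustrative only, not taken from the patch):

```cpp
#include <cassert>
#include <cstdint>

int main() {
  int64_t X = 12345;

  // imm + 1 is a power of 2 (imm = 7): x*7 == (x << 3) - x --> slli + sub.
  assert(X * 7 == (X << 3) - X);
  // imm - 1 is a power of 2 (imm = 9): x*9 == (x << 3) + x --> slli + add.
  assert(X * 9 == (X << 3) + X);
  // 1 - imm is a power of 2 (imm = -7): x*-7 == x - (x << 3).
  assert(X * -7 == X - (X << 3));
  // imm == 2^n - 2^m (imm = 24 = 32 - 8): x*24 == (x << 5) - (x << 3).
  assert(X * 24 == (X << 5) - (X << 3));
  // imm - 2 is a power of 2 (imm = 10 = 8 + 2): x*10 == (x << 3) + (x << 1),
  // i.e. an slli followed by a shift-and-add when Zba is available.
  assert(X * 10 == (X << 3) + (X << 1));
  return 0;
}
```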
Lower it just like the vector [l]lrint, using vfcvt, with the right
rounding mode. Updating costs to account for this custom-lowering is
left to a companion patch.
Currently we have slightly different costing for the vp and non-vp
version of the rounding intrinsics.
We can delete this code and use the generic BasicTTIImpl code for the vp
intrinsics which falls back to the non-vp versions.
I'm not sure if the zvfh costing is correct; this should probably be
fixed in a follow-up patch. At the moment the non-vp cost is more
important since it is what the loop vectorizer will use.
Move the costing to the generic implementation in BasicTTIImpl since it
just falls back to the non-vp costing.
Also pass through the OperandValueInfo if using value-based costing, but
I don't believe this affects the result for any in-tree target
currently.
There will be more schedule definitions for vendor extensions, and we
would need to add these `UnsupportedSchedXXX` definitions to existing
models every time we add new schedule definitions.
In practice, each vendor will barely implement other vendors'
extensions, so we can package these definitions into one.
Add an LLVMContext parameter to getOptimalMemOpType and
findOptimalMemOpLowering so that we can use EVT::getVectorVT to generate
an EVT in getOptimalMemOpType.
Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
These instructions operate on bytes, so we need to round the demanded
bits up to the nearest byte, which we aren't doing. I think we forgot to
update this when we changed from hasAllWUsers to hasNBitUsers.
We don't have any test case for these instructions, so remove them until
we can put together a test.
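For reference, the eventual fix is just the usual round-up-to-a-multiple
arithmetic; a minimal sketch under that assumption (the helper name is
hypothetical, not the in-tree code):

```cpp
#include <cassert>

// Round a demanded-bits count up to the next multiple of 8, since these
// instructions consume whole bytes at a time.
static unsigned roundBitsUpToByte(unsigned Bits) {
  return (Bits + 7) & ~7u;
}

int main() {
  assert(roundBitsUpToByte(1) == 8);
  assert(roundBitsUpToByte(8) == 8);
  assert(roundBitsUpToByte(9) == 16);   // crosses a byte boundary
  assert(roundBitsUpToByte(17) == 24);
  return 0;
}
```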
As noted in post commit review, the API change here was not required.
I'd apparently confused myself when teasing apart patches from my
development branch.
For the fixed vector cases, we already support this, but the
deinterleave intrinsic cases (primarily used by scalable vectors) didn't.
Supporting it requires plumbing through the Factor separately from the
extracts, as there can now be fewer extracts than the Factor. Note that
the fixed vector path handles this slightly differently - it uses the
shuffle and indices scheme to achieve the same thing.
XSfvqmaccdod/qoq and XSfvfwmaccqqq are SiFive's small-size matrix
multiplication extensions. This patch adds scheduling info for their
instructions along with six new SchedReadWrites.
This patch adds scheduling data for the XSfvfnrclipxfqf instructions,
which narrow/clip FP32 data to INT8 according to the value range
specified by a scalar register. Three new SchedReadWrites are
introduced.
Extend lowerVectorXRINT to also do a FP_EXTEND_VL when the source
element type is [b]f16, and wire up this custom-promote. Updating the
cost-model to not give these an invalid cost is left to a companion
patch.
A disjoint OR can be converted to an XOR, and an XOR+NOT is an XNOR.
Idea taken from #147279.
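For concreteness, the identity this relies on (the disjoint flag asserts
that the operands share no set bits):

```cpp
#include <cassert>
#include <cstdint>

int main() {
  // Disjoint operands: no bit is set in both a and b.
  uint32_t a = 0x00F0, b = 0x0F00;
  assert((a & b) == 0);

  // For disjoint operands, OR and XOR produce the same result...
  assert((a | b) == (a ^ b));

  // ...so not(or disjoint a, b) == not(xor a, b), i.e. an xnor.
  assert(~(a | b) == ~(a ^ b));
  return 0;
}
```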
I changed the existing xnor pattern to have the not on the outside
instead of the inside. These are equivalent for xor since xor is
associative. Tablegen was already generating multiple variants
of the isel pattern using associativity.
There are some issues here. The disjoint flag isn't preserved
through type legalization. I was hoping we could recover it
manually for the masked merge cases, but that doesn't work either.
All of these are guaranteed to be MemSDNodes; the only intrinsics that
aren't are vlm and vsm. We should add those to
RISCVTargetLowering::getTgtMemIntrinsic to fix that.
When vectorizing a loop with a fixed-order recurrence we use a splice,
which gets lowered to a vslidedown and vslideup pair.
However, with the way we lower it today, we end up with extra vl toggles
in the loop, especially with EVL tail folding, e.g.:
.LBB0_5: # %vector.body
# =>This Inner Loop Header: Depth=1
sub a5, a2, a3
sh2add a6, a3, a1
zext.w a7, a4
vsetvli a4, a5, e8, mf2, ta, ma
vle32.v v10, (a6)
addi a7, a7, -1
vsetivli zero, 1, e32, m2, ta, ma
vslidedown.vx v8, v8, a7
sh2add a6, a3, a0
vsetvli zero, a5, e32, m2, ta, ma
vslideup.vi v8, v10, 1
vadd.vv v8, v10, v8
add a3, a3, a4
vse32.v v8, (a6)
vmv2r.v v8, v10
bne a3, a2, .LBB0_5
Because the vslideup overwrites all but UpOffset elements from the
vslidedown, we currently set the vslidedown's AVL to said offset.
But in the vslideup we use either VLMAX or the EVL, which causes a
toggle.
This patch increases the AVL of the vslidedown so it matches the
vslideup, even though the extra elements are overwritten, to avoid the
toggle.
A new tuning feature +vl-dependent-latency has been added which keeps
the old behaviour for microarchitectures that dynamically dispatch uops
based on vl, e.g. sifive-x280.
+vl-dependent-latency can be reused for the recently proposed Ovlt
optimization directive if/when it's ratified:
https://lists.riscv.org/g/tech-privileged/message/2487
If we wanted to aggressively optimise for vl at the expense of
introducing more toggles we could probably look at doing this in
RISCVVLOptimizer.