This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
```
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};
```
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
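For illustration, a minimal before/after of the mechanical change (`Node` is a placeholder pointee type):
```
#include "llvm/ADT/SmallPtrSet.h"

struct Node; // any pointee type

// Before (relies on the SmallSet -> SmallPtrSet redirection):
//   llvm::SmallSet<Node *, 8> Seen;
// After (names the pointer-set type directly):
llvm::SmallPtrSet<Node *, 8> Seen;
```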
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
pli.h and pli.w both accept signed immediates, so pli.b should too. But
unlike those instructions, pli.b doesn't do any extension, so it's OK to
accept an unsigned immediate as well.
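A minimal sketch of the resulting immediate check, assuming the operand predicate simply accepts either 8-bit encoding (the helper name is illustrative):
```
#include "llvm/Support/MathExtras.h"
using namespace llvm;

// pli.b performs no extension, so both signed and unsigned 8-bit
// immediates can be accepted.
bool isValidPLIBImm(int64_t Imm) {
  return isInt<8>(Imm) || isUInt<8>(Imm);
}
```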
Currently we have a switch statement that checks if a vector instruction
may read elements past VL. However, it currently doesn't account for
instructions in vendor extensions.
Handling all possible vendor instructions will result in quite a lot of
opcodes being added, so I've created a new TSFlag that we can declare in
TableGen, and added it to the existing instruction definitions.
I've tried to be as conservative as possible here: all SiFive vendor vector
instructions should be covered by the flag, as well as all of
XRivosVizip, and ri.vextract from XRivosVisni.
For now this should be NFC because coincidentally, these instructions
aren't handled in getOperandInfo, so RISCVVLOptimizer should currently
avoid touching them despite them being liberally handled in
getMinimumVLForUser.
However in an upcoming patch we'll need to also bail in
getMinimumVLForUser, so this prepares for it.
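A rough sketch of how such a TSFlags query might look (the flag name and bit position are illustrative, not the actual encoding):
```
#include "llvm/MC/MCInstrDesc.h"
#include <cstdint>

// Illustrative flag; the real bit is declared in TableGen alongside the
// other RISC-V TSFlags.
enum : uint64_t { ReadsPastVLMask = 1ULL << 0 };

// Replaces the opcode switch: any instruction tagged in TableGen is
// caught, including vendor instructions.
bool readsPastVL(const llvm::MCInstrDesc &Desc) {
  return Desc.TSFlags & ReadsPastVLMask;
}
```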
What most code wants to know is the direction, and we have to decode the
opcode to figure that out. Instead, pass the direction around as a bool
and convert it to an opcode when we create the merge instruction.
If we're moving the second copy before another instruction that reads
the copied register, we need to clear the kill flag on the combined
move.
Fixes #153598.
This patch adds CodeGen support for qc.insbi and qc.insb instructions
defined in the Qualcomm uC Xqcibm extension. qc.insbi and qc.insb
insert bits into the destination register from an immediate and a
register operand, respectively.
A sequence of `xor`, `and` & `xor`, under the appropriate conditions,
is converted to `qc.insbi` or `qc.insb`, with the choice between the two
depending on the immediate's value.
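In C terms, the transform recognizes the classic masked bit-insert idiom; a hedged sketch (field names are illustrative):
```
#include <cstdint>

// What qc.insb computes, roughly: insert the low Width bits of Src into
// Rd starting at bit Shamt.
uint32_t insb(uint32_t Rd, uint32_t Src, unsigned Width, unsigned Shamt) {
  uint32_t Mask = ((Width == 32 ? 0u : (1u << Width)) - 1u) << Shamt;
  return (Rd & ~Mask) | ((Src << Shamt) & Mask);
}

// The matched source idiom: ((x ^ y) & Mask) ^ y selects bits from x
// where Mask is set and from y elsewhere, i.e. it inserts x's bits into y.
uint32_t xorAndXor(uint32_t x, uint32_t y, uint32_t Mask) {
  return ((x ^ y) & Mask) ^ y;
}
```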
These instructions are the shift by immediate and saturate by immediate
instructions from the top half of page 9 of
https://jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf
I've also improved the CHECK lines in the invalid tests to check the
line and column numbers of the diagnostics.
Co-authored-by: realqhc <caiqihan021@hotmail.com>
Follow-up PR to #153071, adding the remaining zvbb instructions
(VBREV8_V and VREV8_V), plus the zvbc instructions (VCLMUL_VV,
VCLMUL_VX, VCLMULH_VV, VCLMULH_VX).
Godbolt example: https://godbolt.org/z/ThdfP475a
In the example, a single-element vse is used to store the reduction
result instead of a scalar store ([this optimization was introduced by
this patch](https://reviews.llvm.org/D109482)). However, vmv.x.s can't
be eliminated here because it has other uses (e.g. CopyToReg), so it
seems more profitable to use the scalar store: we already have the store
value in a scalar register, and we can save one vsetvli, which is likely
to be required for the single-element vse. The proposed solution is to
do this transform only if vmv.x.s has a single use (the store
instruction).
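A hedged sketch of the guard (SelectionDAG API shape; names are illustrative):
```
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Only prefer the single-element vse when the extracted scalar
// (vmv.x.s) feeds the store and nothing else.
bool shouldUseVectorStore(SDValue ExtractedScalar) {
  return ExtractedScalar.hasOneUse(); // the store is the sole user
}
```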
The check should be about unsigned 16-bit immediates, not signed ones.
This is not a bug per se, as the old codegen was correct for the
uint16_max case; it just didn't end up using `qc.e.bgeui`, which we
would prefer it did.
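In LLVM terms, the fix amounts to something like (illustrative sketch):
```
#include "llvm/Support/MathExtras.h"
using namespace llvm;

// The qc.e.bgeui immediate is unsigned, so the range test must be
// isUInt<16>, not isInt<16>.
bool isValidBranchImm(int64_t C) { return isUInt<16>(C); }
```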
This PR adds support for the following instructions to the RISC-V
VLOptimizer: vandn.vx, vandn.vv, vbrev.v, vclz.v, vcpop.v, vctz.v,
vror.vi, vror.vx, vror.vv, vrol.vx, vrol.vv.
Span-dependent instructions on RISC-V interact in a complex manner with
linker relaxation. The span-dependent assembler algorithm implemented in
LLVM has to start with the smallest version of an instruction and then
only make it larger, so we compress instructions before emitting them to
the streamer.
When the instruction is streamed, the information that the instruction
(or rather, the fixup on the instruction) is linker relaxable must be
accurate, even though the assembler relaxation process may transform a
not-linker-relaxable instruction/fixup into one that is linker
relaxable, for instance `c.jal` becoming `qc.e.jal`, or `bne` getting
turned into `beq; jal` (the `jal` is linker relaxable).
In order for this to work, the following things have to happen:
- Any instruction/fixup which might be relaxed to a linker-relaxable
instruction/fixup, gets marked as `RelaxCandidate = true` in
RISCVMCCodeEmitter.
- In RISCVAsmBackend, when emitting the `R_RISCV_RELAX` relocation, we
have to check that the relocation/fixup kind is one that may need a
relax relocation, as well as that it is marked as linker relaxable (the
latter will not be set if relaxation is disabled); see the sketch after
this list.
- Linker Relaxable instructions streamed to a Relaxable fragment need to
mark the fragment and its section as linker relaxable.
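A hedged sketch of the second point above (the helper names are illustrative, not the actual RISCVAsmBackend API):
```
#include "llvm/MC/MCFixup.h"
using namespace llvm;

// Illustrative helpers, not the actual API.
bool kindMayNeedRelaxRelocation(MCFixupKind Kind);
bool isMarkedLinkerRelaxable(const MCFixup &Fixup);

// Emit R_RISCV_RELAX only when both conditions hold; the second is
// false whenever relaxation is disabled.
bool shouldEmitRelaxRelocation(const MCFixup &Fixup) {
  return kindMayNeedRelaxRelocation(Fixup.getKind()) &&
         isMarkedLinkerRelaxable(Fixup);
}
```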
I also added more debug output for Sections/Fixups which are marked
Linker Relaxable.
This results in more relocations when these PC-relative fixups cross an
instruction with a fixup that is resolved as not linker-relaxable but
caused the fragment to be marked linker relaxable at streaming time
(i.e. `c.j`).
Fixes: #150071
If we have a floating point vector and no zve32f/zve64f/zve64d, we can
end up with an invalid type-legalization cost from
getTypeLegalizationCost.
Previously this triggered an assertion that the type must have been
legalized if the "legal" type is a vector, but in this case, when it's
not possible to legalize, the original type is spat back out.
This fixes it by just checking that the legalization cost is valid.
We don't have much testing for zve64x, so we may have other places in
the cost model with this issue.
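A hedged sketch of the fix (context elided; LT comes from getTypeLegalizationCost, which returns a {cost, legalized MVT} pair):
```
std::pair<InstructionCost, MVT> LT = getTypeLegalizationCost(Ty);
// The cost is Invalid when the type cannot be legalized, e.g. an FP
// vector without zve32f/zve64f/zve64d; bail out instead of asserting.
if (!LT.first.isValid())
  return InstructionCost::getInvalid();
```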
Fixes #153008
Similar to the PACKH+PACK pattern for RV32. We can end up with the
shift left by 32 needed by our PACK pattern hidden behind an OR that
packs two halfwords.
It is common to have ABI requirements for illegal types: For example,
two i64 argument parts that originally came from an fp128 argument may
have a different call ABI than ones that came from an i128 argument.
The current calling convention lowering does not provide access to this
information, so backends come up with various hacks to support it (like
additional pre-analysis cached in CCState, or bypassing the default
logic entirely).
This PR adds the original IR type to InputArg/OutputArg and passes it
down to CCAssignFn. It is not actually used anywhere yet, this just does
the mechanical changes to thread through the new argument.
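A hedged sketch of the threaded-through signature (the parameter name and position are illustrative):
```
#include "llvm/CodeGen/CallingConvLower.h"  // CCValAssign, CCState, MVT
#include "llvm/CodeGen/TargetCallingConv.h" // ISD::ArgFlagsTy
#include "llvm/IR/Type.h"
using namespace llvm;

// The original IR type now travels alongside the legalized MVTs, so a
// CCAssignFn can distinguish e.g. fp128 halves from i128 halves.
typedef bool CCAssignFnWithOrigTy(unsigned ValNo, MVT ValVT, MVT LocVT,
                                  CCValAssign::LocInfo LocInfo,
                                  ISD::ArgFlagsTy ArgFlags, Type *OrigTy,
                                  CCState &State);
```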
If the upper 32 bits are demanded, we might have a sext_inreg in
the pattern on the byte shifted by 24. We can also match this case
since packw sign extends from bit 31.
This PR adds hardware-measured latencies for all instructions defined in
Section 12 of the RVV specification: "Vector Fixed-Point Arithmetic
Instructions" to the SpacemiT-X60 scheduling model.
We have been tracking the performance of EVL tail folding in the loop
vectorizer on RISC-V for a while now, and after much hard work from
various contributors we think it should be generally profitable to
enable by default now.
With tail folding there is a 21% improvement on 525.x264_r on SPEC CPU
2017 on the BPI-F3 (-march=rva22u64_v -O3 -flto), as well as a 30%
geomean codesize reduction on SPEC and TSVC, with no significant
regressions detected.
Now that we are early into the LLVM 22.x development cycle it seems like
a good time to enable it to catch any issues. There are still more EVL
related items of work being tracked in #123069, which should continue to
improve performance.
Each section now tracks the index of the first linker-relaxable
fragment, enabling two changes:
* Delete redundant ALIGN relocations before the first linker-relaxable
instruction in a section. The primary example is the offset 0
R_RISCV_ALIGN relocation for a text section aligned by 4.
* For alignments larger than the NOP size after the first
linker-relaxable instruction, ALIGN relocations are now generated, even in
norelax regions. This fixes issue #150159.
The new test llvm/test/MC/RISCV/Relocations/align-after-relax.s
verifies the required ALIGN in a norelax region following
linker-relaxable instructions.
By using a fragment index within the subsection (which is less than or
equal to the section's index), the implementation may generate redundant
ALIGN relocations in lower-numbered subsections before the first
linker-relaxable instruction.
align-option-relax.s demonstrates the ALIGN optimization.
Add an initial `call` to a few tests to prevent the ALIGN optimization.
---
When the alignment exceeds 2, we insert $alignment-2 bytes of NOPs, even
in non-RVC code. This enables non-RVC code following RVC code to handle
a 2-byte adjustment without requiring additional state in MCSection
or AsmParser.
```
.globl _start
_start:
// GNU ld can relax this to 6505 lui a0, 0x1
// LLD hasn't implemented this transformation.
lui a0, %hi(foo)
.option push
.option norelax
.option norvc
// Now we generate R_RISCV_ALIGN with addend 2, even if this is a norvc region.
.balign 4
b0:
.word 0x3a393837
.option pop
foo:
```
Pull Request: https://github.com/llvm/llvm-project/pull/150816
This implements very basic support for RISC-V mapping symbols in
llvm-objdump, sharing the implementation with how Arm/AArch64/CSKY
implement this feature.
This only supports the `$x` (instruction) and `$d` (data) mapping
symbols for RISC-V, and not the version of `$x` which includes an
architecture string suffix.
This PR fixes the issue that caused UB in PR #151472.
The issue was a shl call taking a negative shift amount (posDiff). The
result was never used, but TableGen would perform the calculation
anyway. The fix was to replace the shl call with multiplications by
constants.
Original PR description:
This patch improves the helper classes in the SpacemiT-X60 vector
scheduling model and will be used in follow-up PRs:
There are now two functions to map LMUL to values:
* ConstValueUntilLMULThenDoubleBase: returns BaseValue for LMUL values
before startLMUL, Value for startLMUL, then doubles Value for each
subsequent LMUL. Useful for cases where fractional LMULs have constant
cycles, and integer LMULs double as they increase.
* GetLMULValue: takes an ordered list of LMUL cycles and LMUL and
returns the corresponding cycle. Useful for cases we can't easily cover
with ConstValueUntilLMULThenDoubleBase.
This PR also adds some useful simplified versions of
ConstValueUntilLMULThenDoubleBase, e.g. ConstValueUntilLMULThenDouble
(when BaseValue == Value) or ConstOneUntilMF4ThenDouble (when cycles
start to double after MF2).
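A C++ analogue of the first helper, to make the shape concrete (hedged; the real helpers are TableGen classes):
```
// LMUL index order assumed: MF8=0, MF4=1, MF2=2, M1=3, M2=4, M4=5, M8=6.
// Below StartLMUL the cost is BaseValue; at StartLMUL it is Value, and
// it doubles for each larger LMUL.
unsigned constValueUntilLMULThenDoubleBase(unsigned LMULIdx,
                                           unsigned StartLMUL,
                                           unsigned BaseValue,
                                           unsigned Value) {
  if (LMULIdx < StartLMUL)
    return BaseValue;
  return Value << (LMULIdx - StartLMUL);
}
```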
The information whether a specific argument is vararg or fixed is
currently stored separately from all the other argument information in
ArgFlags. This means that it is not accessible from CCAssign, and
backends have developed all kinds of workarounds for how they can access
it after all.
Move this information to ArgFlags to make it directly available in all
relevant places.
I've opted to invert this and store it as IsVarArg, as I think that both
makes the meaning more obvious and provides for a better default (which
is IsVarArg=false).
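A hedged sketch of what this enables in calling-convention code (the accessor name follows the description above and is illustrative):
```
#include "llvm/CodeGen/TargetCallingConv.h"
using namespace llvm;

// With the flag stored in ArgFlags, a CCAssignFn can branch on it
// directly instead of relying on pre-analysis stashed in CCState.
bool needsVarArgLowering(ISD::ArgFlagsTy ArgFlags) {
  return ArgFlags.isVarArg();
}
```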
For RV32 we don't need the byte shifted by 24 to be zero extended
since the extended bits are shifted out.
For RV64, we don't need the byte shifted by 24 to be zero extended
if the upper 32 bits of the result aren't demanded.
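In C terms, a hedged illustration of why the extension is irrelevant on RV32:
```
#include <cstdint>

// Bits of D above bit 7 land above bit 31 after the shift, so in a
// 32-bit result they are discarded regardless of how D was extended.
uint32_t pack4(uint8_t A, uint8_t B, uint8_t C, uint32_t D) {
  return A | ((uint32_t)B << 8) | ((uint32_t)C << 16) | (D << 24);
}
```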
These are some macrofusions that are used internally at Ventana in a
yet-to-be-upstreamed processor. We figured it would be good to
contribute them ahead of the processor, allowing the community to also
use them in their own processors, while also alleviating our own
downstream upkeep.
The macrofusions being added are, considering load =
lb,lh,lw,ld,lbu,lhu,lwu:
- bfext (slli+srli)
- auipc+load
- lui+load
- add(.uw)+load
- addi+load
- shXadd(.uw)+load, where X=1,2,3
This is similar to an existing pattern from RV32 with the
simplification proposed by #152045. Instead of pack we need to use
packw and we need to know that the upper 32 bits are being ignored since
packw sign extends from bit 31.
The use of allBinOpWUsers prevents tablegen from automatically
reassociating the pattern so we need to do it manually. Tablegen is
still able to commute operands though.
Some processors benefit more from store clustering than load clustering,
and vice-versa, depending on factors that are exclusive to each one
(e.g. macrofusions implemented).
Likewise, certain optimizations benefit more from misched clustering
than from postRA clustering. Macrofusions are again an example: in a
processor with store-pair macrofusions, like the veyron-v1, it is
observed that misched clustering increases the number of macrofusions
more than postRA clustering does. This of course isn't necessarily true
for other processors, but it shows that processors can benefit from more
fine-grained control of clustering mutations, and each one is able to do
it differently.
Add 4 new subtarget features that deprecate the existing
riscv-misched-load-store-clustering and
riscv-postmisched-load-store-clustering options:
- disable-misched-load-clustering and disable-misched-store-clustering:
disable load/store clustering during misched;
- disable-postmisched-load-clustering and
disable-postmisched-store-clustering:
disable load/store clustering during PostRA.
Note that the new subtarget features disable specific stages of the
default clustering settings. The default per se (load and store
clustering for both misched and PostRA) is left untouched.
Disable all clustering but misched-store-clustering for the veyron-v1
processor using the new features.
This pattern previously checked a specific variant of 4 bytes being
packed that is generated by unaligned load expansion.
Our individual PACK patterns don't handle that particular case because a
DAG combine turns (or (or A, (shl B, 8)), (shl (or C, (shl D, 8)), 16))
into (or (or A, (shl B, 8)), (or (shl C, 16), (shl D, 24))). After this,
the outer OR doesn't have a shl operand so we needed a pattern that
looks through 2 layers of OR.
To match this pattern we don't need to look at the (or A, (shl B, 8))
part since that part wasn't affected by the DAG combine and can be
matched to PACKH by itself. It's enough to make sure that part of the
pattern has zeros in the upper 16 bits.
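In C terms, a hedged illustration of the split:
```
#include <cstdint>

// (A | B << 8) is matchable to PACKH on its own; the outer OR only
// needs the guarantee that this half has zeros in its upper 16 bits.
uint32_t combine(uint8_t A, uint8_t B, uint32_t Rest) {
  uint32_t Lo = (uint32_t)A | ((uint32_t)B << 8); // PACKH(A, B)
  return Lo | Rest; // Rest carries (C << 16) | (D << 24)
}
```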
This allows tablegen to automatically generate more permutations of this pattern.
The associative variant expansion is limited to 3 children.