Lower it just like the vector [l]lrint, using vfcvt, with the right
rounding mode. Updating costs to account for this custom-lowering is
left to a companion patch.
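A minimal sketch of the lowering, assuming the existing lowerVectorXRINT shape (node name taken from the in-tree RISCVISD namespace; operand order approximate). Since lrint/llrint round according to the current environment, the plain vfcvt form that reads the dynamic frm is the right fit:
```
// Hedged sketch: convert the FP source to integer using the dynamic
// rounding mode (frm), matching lrint/llrint semantics.
SDValue Res = DAG.getNode(RISCVISD::VFCVT_X_F_VL, DL, DstContainerVT,
                          Src, Mask, VL);
```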
Add an LLVMContext parameter to getOptimalMemOpType and
findOptimalMemOpLowering so that getOptimalMemOpType can use
EVT::getVectorVT to build vector EVTs.
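For illustration, a hedged sketch of what the extra parameter enables (hook signature approximate; MinVectorMemOpSize is a hypothetical threshold, not a real flag):
```
// With an LLVMContext in hand, the target hook can return a vector EVT
// even when no simple MVT fits the requested size.
EVT RISCVTargetLowering::getOptimalMemOpType(
    LLVMContext &Context, const MemOp &Op,
    const AttributeList &FuncAttributes) const {
  if (Op.size() >= MinVectorMemOpSize) // hypothetical threshold
    return EVT::getVectorVT(Context, MVT::i8, Op.size());
  return MVT::Other;
}
```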
Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
As noted in post-commit review, the API change here was not required.
I'd apparently confused myself when teasing apart patches from my
development branch.
For the fixed vector cases, we already support this, but the
deinterleave intrinsic cases (primarily used by scalable vectors) didn't.
Supporting it requires plumbing through the Factor separately from the
extracts, as there can now be fewer extracts than the Factor. Note that
the fixed vector path handles this slightly differently - it uses the
shuffle and indices scheme to achieve the same thing.
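A hypothetical sketch of the plumbing (DeinterleaveValues, useSegmentLoadField, and the surrounding shape are illustrative names, not the actual API):
```
// The factor is passed alongside the extracted values because dead
// results leave null slots, so it can no longer be inferred from the
// number of live extracts.
void lowerDeinterleaveSketch(ArrayRef<Value *> DeinterleaveValues,
                             unsigned Factor) {
  assert(DeinterleaveValues.size() == Factor && "one slot per field");
  for (unsigned I = 0; I != Factor; ++I)
    if (Value *V = DeinterleaveValues[I]) // dead results are null
      useSegmentLoadField(V, I);          // hypothetical helper
}
```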
Extend lowerVectorXRINT to also emit an FP_EXTEND_VL when the source
element type is [b]f16, and wire up this custom promotion. Updating the
cost-model to not give these an invalid cost is left to a companion
patch.
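A minimal sketch of the promotion step, assuming it sits inside lowerVectorXRINT (names approximate):
```
// Widen (b)f16 sources to f32 first; the existing f32/f64 conversion
// path then takes over.
if (SrcEltVT == MVT::f16 || SrcEltVT == MVT::bf16) {
  MVT WideVT = SrcContainerVT.changeVectorElementType(MVT::f32);
  Src = DAG.getNode(RISCVISD::FP_EXTEND_VL, DL, WideVT, Src, Mask, VL);
}
```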
When vectorizing a loop with a fixed-order recurrence we use a splice,
which gets lowered to a vslidedown and vslideup pair.
However, the way we lower it today ends up with extra vl toggles in the
loop, especially with EVL tail folding, e.g.:
```
.LBB0_5: # %vector.body
# =>This Inner Loop Header: Depth=1
sub a5, a2, a3
sh2add a6, a3, a1
zext.w a7, a4
vsetvli a4, a5, e8, mf2, ta, ma
vle32.v v10, (a6)
addi a7, a7, -1
vsetivli zero, 1, e32, m2, ta, ma
vslidedown.vx v8, v8, a7
sh2add a6, a3, a0
vsetvli zero, a5, e32, m2, ta, ma
vslideup.vi v8, v10, 1
vadd.vv v8, v10, v8
add a3, a3, a4
vse32.v v8, (a6)
vmv2r.v v8, v10
bne a3, a2, .LBB0_5
```
Because the vslideup overwrites all but UpOffset elements from the
vslidedown, we currently set the vslidedown's AVL to that offset.
But the vslideup uses either VLMAX or the EVL, which causes a toggle.
This patch increases the AVL of the vslidedown so it matches the
vslideup's, even though the extra elements are immediately overwritten,
to avoid the toggle.
A new tuning feature +vl-dependent-latency has been added which keeps
the old behaviour for microarchitectures that dynamically dispatch uops
based on vl, e.g. sifive-x280.
+vl-dependent-latency can be reused for the recently proposed Ovlt
optimization directive if/when it's ratified:
https://lists.riscv.org/g/tech-privileged/message/2487
If we wanted to aggressively optimise for vl at the expense of
introducing more toggles we could probably look at doing this in
RISCVVLOptimizer.
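A hedged sketch of the resulting AVL choice (the subtarget helper is named after the new feature; exact spelling may differ):
```
// Keep the minimal AVL only when latency actually depends on vl;
// otherwise match the vslideup's AVL to avoid a vsetvli toggle.
SDValue DownAVL = Subtarget.hasVLDependentLatency()
                      ? DAG.getConstant(UpOffset, DL, XLenVT)
                      : EVL;
```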
XAndesVPackFPH can actually be used independently, without requiring
Zvfhmin. Therefore, we remove the implicitly required Zvfhmin extension
from XAndesVPackFPH and make it imply only the F extension, which is
sufficient.
Always try to fold freeze(op(...)) -> op(freeze(),freeze(),freeze(),...).
This patch proposes dropping the opt-in list of opcodes that are allowed to push a freeze through the op to freeze all of its operands, propagating freezes up through the tree towards the roots.
I'm struggling to find a strong reason for this limit apart from the DAG freeze handling having been immature for so long; as we've improved coverage in canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison, the regressions no longer look severe.
Hopefully this will help some of the regression issues in #143102 etc.
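The combine itself is small; a minimal sketch of its shape, built on the existing SelectionDAG helpers (hedged, not the verbatim DAGCombiner code):
```
// Freeze every maybe-poison operand of N0, then rebuild N0 so that
// freeze(N0) can be replaced with the refrozen node.
SmallVector<SDValue, 4> Ops;
for (SDValue Op : N0->op_values())
  Ops.push_back(DAG.isGuaranteedNotToBeUndefOrPoison(Op)
                    ? Op
                    : DAG.getFreeze(Op));
return DAG.getNode(N0.getOpcode(), SDLoc(N0), N0->getVTList(), Ops);
```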
Make the fixed-vector lowering of ISD::[L]LRINT use the custom-lowering
routine, lowerVectorXRINT, and fix issues in lowerVectorXRINT related to
this new functionality.
Based on the comments and tests, we only want to call
EmitLoweredCascadedSelect on selects of FP registers.
Every time we add a new branch-with-immediate opcode, we've had to
exclude it here.
This patch switches to checking that the comparison operands are both
registers so branch on immediate is automatically excluded.
This wasn't scalable and made the RISCVCC enum effectively just
a different way of spelling the branch opcodes.
This patch reduces RISCVCC back down to 6 enum values. The primary user
is select pseudoinstructions, which now share the same encoding across
all vendor extensions. The select opcode and condition code are used to
determine the branch opcode when expanding the pseudo.
The Cond SmallVector returned by analyzeBranch now contains the branch
opcode instead of the RISCVCC. reverseBranchCondition now works directly
on opcodes. getOppositeBranchCondition is also retained.
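For reference, a hedged sketch of the expansion-time mapping (the six condition codes are the ones this patch keeps; the dispatch below is illustrative):
```
// Map the condition code stored in the select pseudo to a base-ISA
// branch opcode; vendor select pseudos would pick their own branches.
static unsigned getBranchOpcodeForCC(RISCVCC::CondCode CC) {
  switch (CC) {
  case RISCVCC::COND_EQ:  return RISCV::BEQ;
  case RISCVCC::COND_NE:  return RISCV::BNE;
  case RISCVCC::COND_LT:  return RISCV::BLT;
  case RISCVCC::COND_GE:  return RISCV::BGE;
  case RISCVCC::COND_LTU: return RISCV::BLTU;
  case RISCVCC::COND_GEU: return RISCV::BGEU;
  default:
    llvm_unreachable("unexpected condition code");
  }
}
```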
Stacked on #145622
As of 20b5728b7b1ccc4509a316efb270d46cc9526d69, C always enables Zca, so
the check `C || Zca` is equivalent to just checking for `Zca`.
This replaces any uses of `HasStdExtCOrZca` with a new `HasStdExtZca`
(with the same assembler description, to avoid changes in error
messages), and simplifies everywhere where C++ needed to check for
either C or Zca.
The Subtarget function is just deprecated for the moment.
Reland with the proper co-author attribution.
Original message:
We need to pass the operand of LLA to GetSupportedConstantPool.
This replaces #142292, with the test from there added as a pre-commit
for both medlow and pic.
Co-authored-by: Carl Nettelblad carl.nettelblad@rapidity-space.com
I happened to notice that when legalizing get.active.lane.mask with
large vectors we were materializing via constant pool instead of just
shifting by a constant.
We should probably be doing a full cost comparison of the different
lowering strategies as opposed to our current ad hoc heuristics, but the
few cases this regresses seem pretty minor. (Given the reduction in
vsetvli toggles, they might not be regressions at all.)
---------
Co-authored-by: Craig Topper <craig.topper@sifive.com>
We can convert non-power-of-2 types into extended value types
and then they will be widened.
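For illustration (hedged; Context is an LLVMContext assumed in scope): a non-power-of-2 load size simply becomes a non-simple integer EVT, which type legalization widens for us.
```
// e.g. a 3-byte chunk: i24 is a valid extended value type with no MVT,
// and the legalizer widens i24 operations to i32.
EVT PartVT = EVT::getIntegerVT(Context, 24);
assert(!PartVT.isSimple() && "i24 has no simple MVT; it will be widened");
```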
Reviewers: lukel97
Reviewed By: lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/114971
Put one copy on RISCVTargetLowering as a static function so that both
locations can use it, and rename the method to getM1VT for slightly
improved readability.
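A minimal sketch of the helper, mirroring what the two existing copies computed:
```
// getM1VT: the scalable container type with the same element type whose
// known minimum size is one vector register (LMUL = 1).
static MVT getM1VT(MVT VT) {
  unsigned EltSizeInBits = VT.getVectorElementType().getSizeInBits();
  assert(EltSizeInBits <= 64 && "unexpected element size");
  return MVT::getScalableVectorVT(VT.getVectorElementType(),
                                  RISCV::RVVBitsPerBlock / EltSizeInBits);
}
```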
See #143580 for the MR with the test commit.
Performs the following transformations:
(select c, c1, t) -> (add (czero_nez (t - c1), c), c1)
(select c, t, c1) -> (add (czero_eqz (t - c1), c), c1)
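A scalar model of why the first transform is sound (illustrative C++, not the backend code; the czero_eqz case is symmetric):
```
#include <cstdint>

// czero.nez rd, rs1, rs2:  rd = (rs2 != 0) ? 0 : rs1
// Models (select c, c1, t) -> (add (czero_nez (t - c1), c), c1).
int64_t select_via_czero_nez(int64_t c, int64_t c1, int64_t t) {
  int64_t diff = t - c1;
  int64_t z = (c != 0) ? 0 : diff; // czero.nez
  return z + c1;                   // c != 0 -> c1; c == 0 -> t
}
```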
This patch adds the support of generating vector instructions for
`memcmp`. This implementation is inspired by X86's.
We convert integer comparisons (eq/ne only) into vector comparisons
and do a vector reduction to get the result.
The range of supported load sizes is (XLEN, VLEN * LMUL8] and
non-power-of-2 types are not supported.
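A scalar model of the eq/ne expansion (illustrative; the backend emits vector loads, a lane-wise compare, and a reduction instead):
```
#include <cstddef>
#include <cstdint>

// Combine lane-wise differences, then reduce to a single bit.
bool memcmp_eq_model(const uint8_t *A, const uint8_t *B, size_t N) {
  uint8_t Acc = 0;
  for (size_t I = 0; I < N; ++I)
    Acc |= A[I] ^ B[I]; // any differing byte makes Acc non-zero
  return Acc == 0;
}
```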
Fixes #143294.
Reviewers: lukel97, asb, preames, topperc, dtcxzyw
Reviewed By: topperc, lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/114517
This involves a codegen regression at the moment due to the issue
described in 443cdd0b, but this aligns the lowering paths for this case
and makes it less likely future bugs go undetected.
We have recently added the partial_reduce_smla and partial_reduce_umla
nodes to represent Acc += ext(a) * ext(b) where the two extends have to
have the same source type and the same extend kind.
For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions which
correspond to the existing nodes, but we also have vqdotsu, which
represents the case where the two extends are sign and zero respectively
(i.e. not the same kind of extend).
This patch adds a partial_reduce_sumla node which has sign extension for
A, and zero extension for B. The addition is somewhat mechanical.
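A scalar model of the new node's semantics (illustrative):
```
#include <cstdint>

// partial_reduce_sumla: Acc += sext(A[i]) * zext(B[i]); with zvqdotq
// this maps to vqdotsu (A sign-extended, B zero-extended).
int32_t sumla_model(int32_t Acc, const int8_t A[4], const uint8_t B[4]) {
  for (int I = 0; I < 4; ++I)
    Acc += int32_t(A[I]) * int32_t(B[I]); // sext * zext
  return Acc;
}
```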
Trampoline will use an alternative sequence when branch CFI is on.
The stack of the test is organized as follows:
```
56 $ra
44 $a0 f
36 $a1 p
32 00038067 jalr t2
28 010e3e03 ld t3, 16(t3)
24 018e3383 ld t2, 24(t3)
20 00000e17 auipc t3, 0
sp+16 00000023 lpad 0
```