llvm-project

Author	SHA1	Message	Date
Paul Kirth	ec8b9ca47d	Revert "[clang][DebugInfo] Add virtuality call-site target information in DWARF. (#167666)" (#182343 ) This reverts commit 418ba6e8ae2cde7924388142b8ab90c636d2c21f. The commit caused an ICE due to hitting unreachable in llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:1307. Fixes #182337	2026-02-19 12:19:11 -08:00
Carlos Alberto Enciso	418ba6e8ae	[clang][DebugInfo] Add virtuality call-site target information in DWARF. (#167666 ) Given the test case: struct CBase { virtual void foo(); }; void bar(CBase *Base) { Base->foo(); } and using '-emit-call-site-info' with llc, the following DWARF is produced for the indirect call 'Base->foo()': 1$: DW_TAG_structure_type "CBase" ... 2$: DW_TAG_subprogram "foo" ... 3$: DW_TAG_subprogram "bar" ... 4$: DW_TAG_call_site ... We add DW_AT_LLVM_virtual_call_origin to existing call-site information, linking indirect calls to the function-declaration they correspond to. 4$: DW_TAG_call_site ... DW_AT_LLVM_virtual_call_origin (2$ "_ZN5CBase3fooEv") The new attribute DW_AT_LLVM_virtual_call_origin helps to address the ambiguity to any consumer due to the usage of DW_AT_call_origin. The functionality is available to all supported debuggers.	2026-02-19 14:48:59 +00:00
Craig Topper	2cb342c733	[RISCV] Add combines to form WSUBAU on RV32 with P. (#181604 )	2026-02-17 15:32:47 -08:00
Craig Topper	7fd56a0d74	[RISCV] Calculate max call frame size in RISCVTargetLowering::finalizeLowering. (#181302 ) I want to enable the frame pointer when the call frame size is too large to access emergency spill slots. To do that I need to know the call frame size early enough to reserve FP. The code here is copied from AArch64. ARM does the same. I did not check other targets. Splitting this off separately because it stops us from unnecessarily reserving the base pointer in some RVV tests. That appears to be due to this check (!hasReservedCallFrame(MF) && (!MFI.isMaxCallFrameSizeComputed() || MFI.getMaxCallFrameSize() != 0))) && By calculating early, !MFI.isMaxCallFrameSizeComputed() is no longer true and the size is zero.	2026-02-13 20:32:48 -08:00
Craig Topper	75cc975c2c	[RISCV] Combine ADDD(lo, hi, x, 0) -> WADDAU(lo, hi, x, 0). Combine WADDAU (WADDAU lo, hi, x, 0), y, 0 -> WADDAU lo, hi, x, y (#181396 ) WADDAU is rd += zext(rs1) + zext(rs2). If we only have one 32-bit input, we can force rs2 to 0 to avoid zeroing the upper part of a register pair to use ADDD. Unfortunately, WADDAU clobbers rd, so it might need a GPRPair copy if we need the old value of rd. We might need to look into that in the future. Maybe convertToThreeAddress could turn it back into ADDD+WADDU or ADDD+LI. Assisted-by: claude	2026-02-13 13:39:57 -08:00
Craig Topper	a809d6409f	[RISCV] Remove RISCVISD::WMACC*. Match during isel. NFC (#181197 ) I think we may want to be able to fold ADDD nodes independent of the MUL in some cases. For example turning NSRAI into NSRARI. If we fold ADDD into WMACC we would need to be able to extract it again. Keeping the nodes separate avoids this. Code change was assisted by AI.	2026-02-12 22:06:01 -08:00
Craig Topper	664663cbbf	[RISCV] Improve 2*XLEN SHL legalization with P extension. (#181056 ) For an i64 shift by a constant < 32 on RV32, we can use NSRLI with 32-ShAmt to calculate the high half of the result. For non-constant shifts, we can use SLX and some bit tricks to avoid branches. I wanted to use the target independent code from TargetLowering, but it currently produces worse code. Assisted-by: claude	2026-02-11 23:32:02 -08:00
Craig Topper	db588931c5	[RISCV] Use NSRL/NSRA for legalizing i64 shifts with P extension on RV32. (#181040 ) If the shift amount might be in the range [0, 31], we can use NSRL/NSRA to shift the i64 value to compute the lower 32 bits of the result. If the shift amount is >= 32, the high half of the result is all zeros or sign bits. Otherwise it is a srl/sra of the high bits. I've handled the constant case in ReplaceNodeResults but deferred the non-constant case to lowerShiftRightParts. This function is not called for constants. This gives the opportunity for DAGCombine to optimize the SRL_PARTS/SRA_PARTS if the shift amount can be proven to be >= 32 or < 32. Sequences were also discussed on the P extension mailing list here https://lists.riscv.org/g/tech-p-ext/message/861 Assisted-by: claude	2026-02-11 22:37:47 -08:00
Folkert de Vries	6a81656f7d	[RISCV] improve `musttail` support (#170547 ) Basically https://github.com/llvm/llvm-project/pull/168506 but for riscv, so to be clear the hard work here is @heiher 's. I figured we may as well get some extra eyeballs on this from riscv too. Previously the riscv backend could not handle `musttail` calls with more arguments than fit in registers, or any explicit `byval` or `sret` parameters/return values. Those have now been implemented. This is part of my push to get more LLVM backends to support `byval` and `sret` parameters so that rust can stabilize guaranteed tail call support. See also: - https://github.com/llvm/llvm-project/pull/168956 - https://github.com/rust-lang/rust/issues/148748 --------- Co-authored-by: WANG Rui <wangrui@loongson.cn>	2026-02-11 17:27:51 +01:00
Pengcheng Wang	e84659b71b	[RISCV][CodeGen] Combine vwaddu+vabd(u) to vwabda(u) Note that we only support SEW=8/16 for `vwabda(u)`. Reviewers: topperc, lukel97, preames Reviewed By: topperc, lukel97 Pull Request: https://github.com/llvm/llvm-project/pull/180162	2026-02-11 18:53:29 +08:00
Luke Lau	cd2761f7ab	[RISCV] Remove vp.reverse mask check in performVP_REVERSECombine (#180724 ) Similar to #180706, the masked off lanes in vp.reverse are poison so can be replaced with anything. Because of this, we should be able to fold a masked vp.reverse(vp.load) into a vp.strided.load stride=-1 even when the mask isn't all ones.	2026-02-11 09:13:42 +00:00
Luke Lau	ffe446e734	[RISCV] Relax reversed mask's mask requirement in reverse to strided load/store combine (#180706 ) We have combines for vp.reverse(vp.load) -> vp.strided.load stride=-1 and vp.store(vp.reverse) -> vp.strided.store stride=-1. If the load or store is masked, the mask needs to be also a vp.reverse with the same EVL. However we also have the requirement that the mask's vp.reverse is unmasked (has an all-ones mask). vp.reverse's mask only sets masked off lanes to poison, and doesn't affect the permutation of elements. So given those lanes are poison, I believe the combine is valid for any mask, not just all ones. This is split off from another patch I plan on posting to generalize those combines to vector.splice+vector.reverse patterns, as part of #172961	2026-02-11 16:43:02 +08:00
Craig Topper	31e1bcfd09	[RISCV] Add basic scalar support for MERGE, MVM, and MVMN from P extension (#180677 ) These are 3 variations of the same operation with a different operand tied to the destination register. We need to pick the one that minimizes the number of mvs. To do this we take the approach used by AArch64 to select between BIT, BIF, and BSL, which are the same operations. We define a pseudo with no tied constraint and expand it after register allocation based on where the destination register ended up. If the destination register is none of the operands, we'll insert a mv. I've replaced RISCVISD::MVM with RISCVISD::MERGE and updated the operand order accordingly. I find the MERGE name easier to read so I've made it the canonical name. Ideally we could use commuteInstructionImpl and the TwoAddressInstructionPass to select the opcode before register allocation. That only works if you can commute exactly 2 operands and maybe change the opcode in the MI representation of any of the forms to get to either of the other 2 forms. That is not possible. We'd need to define 3 more pseudoinstructions with different permutations. With the current approach it might be possible that we insert a mv not because all of the operand registers were needed by later instructions, but because the register allocator needed to put the result in a different register. It's possible a different allocation for other instructions might have avoided the mv. I wrote the patch based on AArch64, but the tests were generated by AI.	2026-02-10 13:39:34 -08:00
Craig Topper	f33ea53451	[RISCV] Remove redundant czero in multi-word comparisons (#180485 ) When comparing multi-word integers with Zicond, we generate: (or (czero_eqz (lo1 < lo2), (hi1 == hi2)), (czero_nez (hi1 < hi2), (hi1 == hi2))) The czero_nez is redundant because when hi1 == hi2 is true, hi1 < hi2 is already 0. This patch adds a DAG combine to recognize: czero_nez (setcc X, Y, CC), (setcc X, Y, eq) -> (setcc X, Y, CC) when CC is a strict inequality (lt, gt, ult, ugt). This saves one instruction in 128-bit comparisons on RV64 with Zicond. Note the czero_nez becomes a czero.eqz in the final assembly because the seteq is replaced by an xor that produces 0 when the values are equal. Part of #179584 Assisted-by: claude	2026-02-09 21:48:14 -08:00
Ryan Buchner	d69ccf3b34	[RISCV] Combine shuffle of shuffles to a single shuffle (#178095 ) Compressing to a single shuffle doesn't remove any information and the backend can better apply specific optimizations to a single shuffle. Addresses #176218. --------- Co-authored-by: Luke Lau <luke_lau@igalia.com>	2026-02-09 14:48:31 -08:00
Craig Topper	e6a72a1d42	[RISCV] Combine ADDD+WMULSU to WMACCSU (#180454 ) Extend the existing combineADDDToWMACC DAG combine to also match RISCVISD::WMULSU and produce RISCVISD::WMACCSU. This is similar to how ADDD+UMUL_LOHI is combined to WMACCU and ADDD+SMUL_LOHI is combined to WMACC. This patch was generated by AI, but I reviewed it.	2026-02-09 08:51:27 -08:00
Pengcheng Wang	972e73b812	[RISCV][CodeGen] Lower `ISD::ABS` to Zvabd instructions We add pseudos/patterns for `vabs.v` instruction and handle the lowering in `RISCVTargetLowering::lowerABS`. Reviewers: topperc, 4vtomat, mshockwave, preames, lukel97, tclin914 Reviewed By: mshockwave Pull Request: https://github.com/llvm/llvm-project/pull/180142	2026-02-09 15:21:25 +08:00
Pengcheng Wang	e992593341	[RISCV][CodeGen] Lower `abds`/`abdu` to `Zvabd` instructions We directly lower `ISD::ABDS`/`ISD::ABDU` to `Zvabd` instructions. Note that we only support SEW=8/16 for `vabd.vv`/`vabdu.vv`. Reviewers: mshockwave, lukel97, topperc, preames, tclin914, 4vtomat Reviewed By: lukel97, topperc Pull Request: https://github.com/llvm/llvm-project/pull/180141	2026-02-09 15:12:22 +08:00
Craig Topper	769b734c02	[RISCV] Combine ADDD with UMUL_LOHI/SMUL_LOHI into WMACCU/WMACC (#180383 ) Combine the pattern: ADDD(addlo, addhi, UMUL_LOHI(x, y).0, UMUL_LOHI(x, y).1) into: WMACCU(x, y, addlo, addhi) And similarly for SMUL_LOHI -> WMACC. This patch was written with AI, but I reviewed it carefully.	2026-02-08 13:39:32 -08:00
Craig Topper	5c826f5172	[RISCV] Emit MULHU/MULHS/UMUL_LOHI/SMUL_LOHI from our custom XLen*2 expansion. (#180379 ) We already do all the checks necessary in order to prioritize MULHU/MULHS/UMUL_LOHI/SMUL_LOHI over MULHSU/WMULSU. We might as well just emit the nodes instead of letting generic type legalization redo the checks. This is slightly different than the default legalization because we don't have access to ExpandInteger so we have to emit TRUNCATES and BUILD_PAIR. Not sure if this will result in any differences in practice.	2026-02-08 13:39:15 -08:00
Craig Topper	a563e6bb7e	[RISCV] Add support for forming WMULSU during type legalization. (#180331 ) Add a DAG combine to turn it into MULHSU if the lower half result is unused.	2026-02-08 12:38:56 -08:00
Craig Topper	370764c8cb	[RISCV] Use addd/subd for i64 add/sub for RV32+P. (#180129 ) Add RISCVISD opcodes and custom type legalize to them.	2026-02-06 12:42:11 -08:00
Brandon Wu	d99f1cdd66	[RISCV][llvm] Support INSERT_VECTOR_ELT codegen for P extension (#179471 ) Add custom lowering for INSERT_VECTOR_ELT on P extension vector types using the MVM instruction. TODO: Handle <4 x i8> on RV64 which is constructed to extract_vector_elt + build_vector instead of insert_vector_elt.	2026-02-06 14:12:18 +08:00
Craig Topper	22c5c2583d	[RISCV] Reorder the operands for RISCVISD::PPAIRE_DB. NFC (#180111 ) Order the operands so that the low and high part of the rs1 regpair are first, followed by the low and high part of the rs2 regpair. Also change the type to use v4i8 for the result so that it's only shuffling elements, not combining elements into a larger element. I'm planning to add ADDD and SUBD opcodes that will be defined with the same operand order, allowing RISCVISelDAGToDAG.cpp code to be shared.	2026-02-05 21:35:47 -08:00
Craig Topper	1ad20b9428	[RISCV] Rename RISCVISD::PPACK_DH->PPAIRE_DB. NFC (#180089 ) The instruction was renamed, but we hadn't renamed the ISD opcode.	2026-02-05 17:35:12 -08:00
Craig Topper	313d9ac1cf	[RISCV] Add wmul(u) codegen for RV32+P (#180032 ) mulh tests are to make sure we continue to use mulh when only the upper half is used.	2026-02-05 17:34:25 -08:00
Craig Topper	6c37aa8ffd	[RISCV] Remove P from RISCVISD::PASUB(U)/PMULHSU/PMULHR(U)/PMULHRSU. NFC (#180064 ) There's a good chance we'll want to use these for scalar too. Drop vector type from SDTypeProfile. Remove PMULHSU since we already have RISCVISD::MULHSU for scalars in the base ISA.	2026-02-05 17:33:35 -08:00
Jameson Nash	d762cc2f03	[GlobalISel] Add SVE support for alloca (#178976 ) Complementary to the same handling code in SelectionDAG: `f3d81d4110/llvm/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp (L160-L165)` `f3d81d4110/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (L4613-L4623)` Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-05 14:00:34 -05:00
Luke Lau	9ed7ba87c4	[RISCV] Remove redundant vand.vi with fptoi to i1 (#179876 ) If the source of an fptoi doesn't fit in the destination type, the result is poison. For i1 destinations, this means the result needs to be 0 or 1/-1, so we can just compare the result to 0 directly instead of truncating. The VP lowering for fpto*i already does this.	2026-02-06 00:06:32 +08:00
Craig Topper	fc56916a5d	[RISCV] Correct lowering of ISD::SETGE/SETULE/SETLE/SETUGE in lowerVPSetCCMaskOp. (#179801 ) XOR should be OR to match the comment. Found while reviewing #179622 which deletes this function. I would like to commit this first so we have a correct baseline for reviewing that patch.	2026-02-04 20:25:13 -08:00
Luke Lau	3794b83ae5	[RISCV] Don't emit VP_SETCC in combineVectorSizedSetCCEquality. NFC (#179479 ) This is part of the work to remove trivial VP intrinsics. In the combineVectorSizedSetCCEquality combine, used for the compares that ExpandMemcmp generates, we currently emit a VP_SETCC. We can just emit a regular SETCC and let RISCVVLOptimizer take care of reducing the VL.	2026-02-04 06:59:27 +00:00
Craig Topper	42b1beb3f0	[RISCV] Default all ISD opcodes to Expand for P extension. (#179396 ) Legal is the default for most opcodes, but we don't yet support all of them. Override the ones that we support back to Legal.	2026-02-02 22:59:32 -08:00
Nicolai Hähnle	6f0b873f1c	[CodeGen] Refactor targets to override the new getTgtMemIntrinsic overload (NFC) (#175844 ) This is a fairly mechanical change. Instead of returning true/false, we either keep the Infos vector empty or push one entry.	2026-02-02 17:40:02 -08:00
Francesco Petrogalli	c6086dd550	[RISC-V][Mach-O] Add codegen support for Mach-O object format. (#178263 ) This commit enables code generation for RISC-V targeting Mach-O: - Implement RISCVMachOTargetObjectFile::getNameWithPrefix method to handle Mach-O symbol naming requirements. - Use shouldAssumeDSOLocal() in RISCVTargetLowering::lowerGlobalAddress instead of isDSOLocal() for proper Mach-O semantics in global address lowering. Note that this is a NFC for RISCV when targeting ELF. - Add comprehensive tests for various relocation types (direct globals, GOT-based addressing, static vs PIC models). - Test function calls, tail calls, and various symbol reference patterns including addends and subtractions. This patch is based on code originally written by Tim Northover.	2026-02-02 14:11:27 -08:00
Craig Topper	80cbd1d696	[RISCV] Support ISD::CLMUL/CLMULH for i64 scalable vectors with Zvbc. (#178340 ) We also get some i32->i64 promotion for CLMULH. The DAGCombiner change is to prevent an infinite loop from that. Test file was rewritten to cover all types and split between clmul and clmulh. I added a couple masked tests to show that VectorPeephole works. The test outputs were already large so I didn't want to add more than a couple.	2026-01-29 13:17:03 -08:00
Craig Topper	f37bf0ce65	Revert "[RISCV] Support RISCV BitInt larger than 128 (#175515 )" (#178311 ) This reverts commit e3156c531da5aa4ec604605ed4e19638879d773c. We need to resolve a crash on trunk and LLVM 22. Reverting makes it easier to backport. Fixes #176637.	2026-01-29 07:16:14 -08:00
Craig Topper	c8b1ff90f3	[RISCV] Hoist a duplicate setOperationAction to a common place. NFC (#178364 )	2026-01-27 22:54:49 -08:00
Craig Topper	05e2ee9664	[RISCV] Replace riscv.clmul intrinsic with llvm.clmul (#178092 ) I did not replace riscv.clmulh/clmulr since those require a multiple instruction pattern match. I wanted to ensure that -O0 will select the correct instructions without relying on combines.	2026-01-26 21:12:48 -08:00
Sudharsan Veeravalli	3ed48305ab	[RISCV] Run combineOrToBitfieldInsert after DAG legalize (#177830 ) Not combining `OR` into `QC.INSB(I)` before DAG legalization helps known bits analysis to simplify the code if possible.	2026-01-26 15:43:00 +05:30
Craig Topper	5c35af8f1e	[RISCV] Replace RISCVISD::CLMUL* with ISD::CLMUL. (#177386 ) This patch does the minimum to remove RISCVISD::CLMUL. It does not remove existing intrinsics. There's some missed optimizations for i32 CLMULH/CLMULR on RV64, but those may be generic issues. I've put the test cases in the existing files so it's more obvious what the missed optimizations are by comparing within the file.	2026-01-22 09:39:44 -08:00
Craig Topper	73a309e20e	[RISCV] Add ZZZ_ to some inline assembly vector register classes to sort them after VR/VRNoV0 in regclass enum. (#177087 ) This prevents getCommonSubClass from finding them before VR/VRNoV0. Fixes a crash reported post-commit in #171231. getCommonSubClass returned one of these classes, but it doesn't have the same VTs as VR/VRNoV0 leading to an assertion failure. The subregister-undef-early-clobber.mir still ends up finding these register classes in the InitUndef pass.	2026-01-21 21:23:06 -08:00
Brandon Wu	72915ea145	[RISCV][llvm] Support setcc codegen for zvfbfa (#176866 )	2026-01-21 07:25:37 +00:00
Brandon Wu	d23c3a5ea7	[RISCV][llvm] Support strict fadd/fsub/fmul/fma codegen for zvfbfa (#176719 ) This is the same as the normal version. Stacked on: https://github.com/llvm/llvm-project/pull/176716	2026-01-21 14:40:01 +08:00
Matt Arsenault	aa57ee958d	CodeGen: Use LibcallLoweringInfo for stack protector insertion (#176829 ) Thread LibcallLoweringInfo into the TargetLowering hooks used by the stack protector passes.	2026-01-20 12:37:31 +01:00
Brandon Wu	1887fca885	[RISCV][llvm] Handle sub-register vector shifts for P-extension (#176109 ) For sub-register-width vectors (v2i16, v4i8) on RV64 with the P extension, the type legalizer widens them to legal types (v4i16, v8i8) before they are unrolled, which leads to redundant computation for the high part of the register. The correct way to handle this is similar to widening div/rem, where the high part is padded with undef. Stacked on: https://github.com/llvm/llvm-project/pull/176093	2026-01-19 05:22:50 +00:00
Brandon Wu	2a8a694b50	[RISCV][llvm] Handle calling convention for P extension fixed vectors (#176093 ) P extension packed SIMD types are passed in GPRs. For types larger than XLen (e.g. v8i8 on RV32), they are split and passed via the 2XLen mechanism, similar to i64 on RV32. FIXME: Need to figure out the mechanism when P and V are enabled at the same time. Stacked on: https://github.com/llvm/llvm-project/pull/176193	2026-01-19 12:09:27 +08:00
Craig Topper	1621e007db	[RISCV] Remove unnecessary EVT->MVT->EVT conversions. NFC (#176214 ) We don't need to use getSimpleValueType if we're just passing to getNode.	2026-01-15 15:48:18 -08:00
Akshay Deodhar	3860147a7f	[NFC][TargetLowering] Make shouldExpandAtomicRMWInIR and shouldExpandAtomicCmpXchgInIR take a const Instruction pointer (#176073 ) Splits out change from https://github.com/llvm/llvm-project/pull/176015 Changes shouldExpandAtomicRMWInIR to take a constant argument: This is to allow some other TargetLowering constant-argument functions to call it. This change touches several backends. An alternative solution exists, but to me, this seems the "right" way.	2026-01-15 14:22:57 -08:00
Brandon Wu	546ba870f7	[RISCV][llvm] Refactor unpackFromMemLoc to use convertLocVTToValVT. NFC (#175969 ) Simplify unpackFromMemLoc to use convertLocVTToValVT for handling LocInfo conversions, making it consistent with unpackFromRegLoc.	2026-01-16 02:50:01 +08:00
Brandon Wu	c7e4350cdc	[RISCV][llvm] Support select codegen for P extension (#175741 ) This is a scalar condition with fixed-vector true/false values, so we can handle it the same as scalars.	2026-01-14 14:05:45 +08:00
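
Note on f33ea53451 ("Remove redundant czero in multi-word comparisons"): the claim that the czero_nez is redundant can be checked by brute force. The sketch below is a plain Python model, not LLVM code. The helper names and the 3-bit word size are assumptions chosen to keep the search small; the czero_eqz/czero_nez definitions follow the Zicond semantics (the result is zeroed when the condition is zero or nonzero, respectively).

```python
from itertools import product

WORD_BITS = 3  # assumption: tiny words so every input can be enumerated


def czero_eqz(value, cond):
    """Model of czero.eqz: 0 when cond == 0, otherwise value."""
    return 0 if cond == 0 else value


def czero_nez(value, cond):
    """Model of czero.nez: 0 when cond != 0, otherwise value."""
    return 0 if cond != 0 else value


def ult_original(lo1, hi1, lo2, hi2):
    """Two-word unsigned less-than as described in the commit message."""
    hi_eq = int(hi1 == hi2)
    return czero_eqz(int(lo1 < lo2), hi_eq) | czero_nez(int(hi1 < hi2), hi_eq)


def ult_combined(lo1, hi1, lo2, hi2):
    """Same comparison with the czero_nez dropped."""
    hi_eq = int(hi1 == hi2)
    return czero_eqz(int(lo1 < lo2), hi_eq) | int(hi1 < hi2)


def check_all():
    """Compare both forms against a direct double-width comparison."""
    words = range(1 << WORD_BITS)
    for lo1, hi1, lo2, hi2 in product(words, repeat=4):
        a = (hi1 << WORD_BITS) | lo1
        b = (hi2 << WORD_BITS) | lo2
        expected = int(a < b)
        if ult_original(lo1, hi1, lo2, hi2) != expected:
            return False
        if ult_combined(lo1, hi1, lo2, hi2) != expected:
            return False
    return True
```

Calling `check_all()` enumerates every pair of two-word values. The two forms agree because hi1 < hi2 can only be true when hi1 != hi2, which is exactly the case where czero_nez passes its input through unchanged.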
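
Note on db588931c5 ("Use NSRL/NSRA for legalizing i64 shifts with P extension on RV32"): the split described there can be sketched for the logical right shift as follows. This is an illustrative Python model only. `narrowing_srl` is a hypothetical stand-in for "shift the 64-bit pair and keep the low 32 bits" and is not claimed to match the exact NSRL instruction definition.

```python
MASK32 = 0xFFFFFFFF


def narrowing_srl(lo, hi, shamt):
    """Hypothetical model: low 32 bits of the pair (hi:lo) shifted right."""
    if not 0 <= shamt < 32:
        raise ValueError("model only covers shift amounts in [0, 31]")
    return (((hi << 32) | lo) >> shamt) & MASK32


def srl64_parts(lo, hi, shamt):
    """Logical right shift of a 64-bit value held as two 32-bit halves."""
    if shamt < 32:
        # Low half comes from the narrowing shift, high half is a plain srl.
        return narrowing_srl(lo, hi, shamt), hi >> shamt
    # Shift amount >= 32: the high half of the result is all zeros.
    return hi >> (shamt - 32), 0


def reference_srl64(lo, hi, shamt):
    """Direct 64-bit shift used as the reference result."""
    value = ((hi << 32) | lo) >> shamt
    return value & MASK32, value >> 32


def check_samples():
    """Compare the split form with the reference for all shift amounts."""
    samples = [0, 1, 0x7FFFFFFF, 0x80000000, 0xDEADBEEF, MASK32]
    return all(
        srl64_parts(lo, hi, sh) == reference_srl64(lo, hi, sh)
        for lo in samples
        for hi in samples
        for sh in range(64)
    )
```

The arithmetic variant would follow the same shape, with the high half filled with sign bits instead of zeros when the shift amount is 32 or more.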


2281 Commits