The motivation for this is to allow us to match strided accesses that
are emitted from the loop vectorizer with EVL tail folding (see #122232).
In these loops the step isn't loop-invariant and is instead based on
@llvm.experimental.get.vector.length.
We can relax this restriction as long as we make sure to construct the
updates after the definition inside the loop, instead of in the preheader.
I presume the restriction was previously added so that the step would
dominate the insertion point in the preheader. I can't think of a reason
why it wouldn't be safe to calculate it in the loop otherwise.
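For illustration, here is a rough sketch (not taken from the patch; the names
and constants are made up) of the shape of an EVL tail-folded loop where the
step depends on @llvm.experimental.get.vector.length and so isn't available in
the preheader:

```llvm
declare i32 @llvm.experimental.get.vector.length.i64(i64, i32 immarg, i1 immarg)

define void @evl_loop(ptr %p, i64 %n) {
entry:
  br label %loop

loop:
  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
  %remaining = sub i64 %n, %iv
  ; The EVL (and hence any step derived from it) is only defined here,
  ; inside the loop, so updates based on it must be constructed after
  ; this point rather than in the preheader.
  %evl = call i32 @llvm.experimental.get.vector.length.i64(i64 %remaining, i32 2, i1 true)
  %evl.zext = zext i32 %evl to i64
  ; ... strided access using a step derived from %evl ...
  %iv.next = add i64 %iv, %evl.zext
  %done = icmp uge i64 %iv.next, %n
  br i1 %done, label %exit, label %loop

exit:
  ret void
}
```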
I have a particular user downstream who likes to write shuffles in terms
of unions involving _BitInt(128) types. This isn't completely crazy
because there's a bunch of code in the wild which was written with SSE
in mind, so 128 bits is a common data fragment size.
The problem is that generic lowering scalarizes this to ELEN, and we end
up with really terrible extract/insert sequences if the i128 shuffle is
between other (non-i128) operations.
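As a rough illustration (this exact IR isn't from the patch), the kind of
shuffle in question looks something like:

```llvm
; A shuffle over i128 elements, e.g. from code manipulating unions of
; _BitInt(128) values. Today generic lowering scalarizes the i128
; elements down to ELEN-sized pieces, giving long extract/insert chains.
define <2 x i128> @swap(<2 x i128> %v) {
  %s = shufflevector <2 x i128> %v, <2 x i128> poison, <2 x i32> <i32 1, i32 0>
  ret <2 x i128> %s
}
```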
I explored trying to do this via generic lowering infrastructure, and
frankly got lost. Doing this as target-specific DAG lowering is a bit ugly -
really, there's nothing hugely target-specific here - but oh well. If
reviewers prefer, I could probably phrase this as a generic DAG combine,
but I'm not sure that's hugely better. If reviewers have a strong
preference on how to handle this, let me know, but I may need a bit of
help.
A couple notes:
* The argument passing weirdness is due to a missing combine to turn a
build_vector of adjacent i64 loads back into a vector load. I'm a bit
surprised we don't get that, but the isel output clearly has the
build_vector at i64.
* I plan to revisit the splat case in another patch. That's a relatively
common pattern, and the fact that I have to scalarize it to avoid an
infinite loop is non-ideal.
For .wv widening instructions, when checking whether the operand is vs1
or vs2 we take into account whether or not the instruction has a
passthru. For tied pseudos, though, the passthru is the vs2, and we
weren't taking this into account.
We were previously checking a combination of the vector policy op and
the opcode to determine if we needed to skip copying the passthru from
a masked pseudo to an unmasked pseudo.
However, we can just do this by checking
RISCVII::isFirstDefTiedToFirstUse, which is a proxy for whether or not
a pseudo has a passthru operand.
This should hopefully remove the need for the changes in #123106.
This adds support for lowering llvm.vp.{gather,scatter}s to
experimental.vp.strided.{load,store}.
This will help us handle strided accesses with EVL tail folding that are
emitted from the loop vectorizer, but note that it's still not enough.
We will also need to handle the vector step not being loop-invariant
(i.e. produced by @llvm.experimental.get.vector.length) in a future patch.
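As a sketch of the idea (the types, names, and stride constant below are
illustrative, not taken from the patch), a vp.gather over a strided pointer
vector:

```llvm
declare <vscale x 2 x i64> @llvm.stepvector.nxv2i64()
declare <vscale x 2 x i32> @llvm.vp.gather.nxv2i32.nxv2p0(<vscale x 2 x ptr>, <vscale x 2 x i1>, i32)
declare <vscale x 2 x i32> @llvm.experimental.vp.strided.load.nxv2i32.p0.i64(ptr, i64, <vscale x 2 x i1>, i32)

define <vscale x 2 x i32> @gather(ptr %base, <vscale x 2 x i1> %m, i32 %evl) {
  ; Pointer vector with a constant 8-byte stride between lanes.
  %step = call <vscale x 2 x i64> @llvm.stepvector.nxv2i64()
  %ptrs = getelementptr i64, ptr %base, <vscale x 2 x i64> %step
  %v = call <vscale x 2 x i32> @llvm.vp.gather.nxv2i32.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %m, i32 %evl)
  ret <vscale x 2 x i32> %v
}

; ... which can instead be lowered to, roughly:
;   %v = call <vscale x 2 x i32> @llvm.experimental.vp.strided.load.nxv2i32.p0.i64(
;            ptr %base, i64 8, <vscale x 2 x i1> %m, i32 %evl)
```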
This patch adds usage of processShuffleMasks in codegen, in
lowerShuffleViaVRegSplitting. This function is already used for X86
shuffle cost estimation and in DAGTypeLegalizer::SplitVecRes_VECTOR_SHUFFLE;
this unifies the code.
Reviewers: topperc, wangpc-pp, lukel97, preames
Reviewed By: preames
Pull Request: https://github.com/llvm/llvm-project/pull/121765
This extension adds eleven instructions to accelerate interrupt
servicing.
The current spec can be found at:
https://github.com/quic/riscv-unified-db/releases/latest
This patch adds assembler-only support.
---------
Co-authored-by: Harsh Chandel <hchandel@qti.qualcomm.com>
Use the probe loop structure to allocate vector stack objects as well.
We add the pseudo instruction RISCV::PROBED_STACKALLOC_RVV to
differentiate it from the normal loop.
This takes inspiration from AArch64 which does the same thing to assist
with zip/trn/etc. Doing this recursion unconditionally when the mask
allows is slightly questionable, but seems to work out okay in practice.
As a bit of context, it's helpful to realize that we have existing logic
in both DAGCombine and InstCombine which mutates the element width of
shuffles in an analogous manner. However, that code has two restrictions
which prevent it from handling the motivating cases here. First, it only
triggers if there is a bitcast involving a different element type.
Second, the matcher used considers a partially undef wide element to be
a non-match. I considered trying to relax those assumptions, but the
information loss for undef in mid-level opt seemed more likely to open a
can of worms than I wanted.
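As a concrete (illustrative, not from the patch) example of the element-width
rewrite being discussed: a shuffle that moves narrow elements around in
adjacent pairs can be re-expressed with half as many wider elements:

```llvm
; The pairwise i8 shuffle below is equivalent to bitcasting to <4 x i16>
; and shuffling with the mask <1, 0, 3, 2>.
define <8 x i8> @pairwise_swap(<8 x i8> %v) {
  %s = shufflevector <8 x i8> %v, <8 x i8> poison,
       <8 x i32> <i32 2, i32 3, i32 0, i32 1, i32 6, i32 7, i32 4, i32 5>
  ret <8 x i8> %s
}
```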
Many instructions assign all or a subset of Inst{6-2} to Imm{4-0}. Make
this the default. Subsets of Inst{6-2} can be overridden as needed by
derived classes/records, as we already do with Inst{12} in a few places.
I added some CodeGen test cases related to reduce. To maintain
consistency, I also added cases for instructions like
`vector.reduce.or`.
For cases where the `v1i1` type generates `VFIRST`, please refer to:
https://reviews.llvm.org/D139512.
Some masked pseudos like PseudoVCPOP_M_B8_MASK don't have a passthru,
but in the masked->unmasked peephole we assumed the masked pseudo always
had one.
This checks for a passthru first and fixes #122245.
Every call should have a regmask operand to indicate what registers are
preserved or clobbered by the call. VirtRegRewriter uses this to tell
MachineRegisterInfo what registers are clobbered by a function. If the
mask isn't present, the registers potentially clobbered by a tail-called
function aren't counted. I have checked ARM, AArch64, and X86, and they
all have a regmask operand on their tail calls.
I believe this fixes an issue I'm seeing with IPRA.
We can just use a std::optional to wrap the operand info instead. The
state field is confusing as we have a "partially known" state where EEW
is known and EMUL is nullopt, but it's still "Known".
All but one of the cases in tree today have EMUL = (EEW/SEW) * LMUL.
Repeating this each time is verbose and introduces opportunity for error.
(For instance, the comment associated with vwmul.vv was out of sync with
the code for the same.)
Introduce getOperandLog2EEW and move most complexity to it. Then
introduce getOperandInfo as a wrapper around it, and special-case the
one case which requires it.
---------
Co-authored-by: Luke Lau <luke_lau@icloud.com>
When these instructions are marked nofpexcept, we can optimize them.
There are some added toggles in the output, likely because other
nofpexcept FP instructions are not part of isSupportedInstr yet. In the
future we may want to avoid marking an instruction as supported in
isSupportedInstr if any of its FP users are missing nofpexcept, to avoid
the added toggles. However, we seem to get some GPRs back as a result of
this change, which may outweigh the cost of the extra toggles.
The plan is to follow this patch up with added support for more FP
instructions in the same way. The instructions in this patch are a
natural starting point because they allow us to test with integer
instructions which have good support already.
Custom lowering for s32 G_ADD/SUB to help match SelectionDAG better.
Specifically, on RV64 an s32 add is produced as an add plus a sext of
the output; this allows a couple of patterns to sign extend with fewer
instructions, and allows the generation of addiw, subw, and negw to
reduce the instructions required to load values.
Log2_ceil_i32 in rvzbb.ll shows a more obvious improvement case.
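For a trivial (illustrative) example of the kind of pattern this should
improve:

```llvm
; On RV64, an i32 negate like this should now select to something like
; negw rather than a sub followed by a separate sign extend.
define signext i32 @neg(i32 signext %x) {
  %r = sub i32 0, %x
  ret i32 %r
}
```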
Reductions are weird because some of their operands are vector registers
but only the first lane is read. For these operands, we do not need to
check that the EEW and EMUL ratios match; the EEWs, however, do need to
match.
Also tested with Ubuntu on SiFive's HiFive Premier P550 board.
Curiously, latency is reported as ~1.5 for basic scalar arithmetic,
scalar mul is ~3.5, and div is ~36.5. This is 0.5 cycles higher than I
expect.