llvm-project

Author	SHA1	Message	Date
Craig Topper	7e15303062	[RISCV] Simplify scalable vector case in lowerVectorMaskExt. Since we have SPLAT_VECTOR_PARTS these days, I don't think we need to go through extra lengths to avoid introducing an illegal scalar type. We can just call getConstant using the scalable vector type and let it create either a SPLAT_VECTOR or a SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck, rogfer01 Differential Revision: https://reviews.llvm.org/D121645	2022-03-17 09:43:13 -07:00
Jessica Clarke	659363c0cc	[RISCV] Ensure PseudoLA* can be hoisted Since we mark the pseudos as mayLoad but do not provide any MMOs, isSafeToMove conservatively returns false, stopping MachineLICM from hoisting the instructions. PseudoLA_TLS_GD does not actually expand to a load, so stop marking that as mayLoad to allow it to be hoisted, and for the others make sure to add MMOs during lowering to indicate they're GOT loads and thus can be freely moved. Fixes https://github.com/llvm/llvm-project/issues/54372 Reviewed By: MaskRay, arichardson Differential Revision: https://reviews.llvm.org/D121654	2022-03-16 18:45:36 +00:00
Craig Topper	06c5d74090	[RISCV] Remove lowerSPLAT_VECTOR This code handles fixed vector SPLAT_VECTOR, but is never called in any tests. We only form fixed vector splat vectors for vXi64 on RV32 as part of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS. So the Custom handling for SPLAT_VECTOR is never needed. This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that DAGCombine will create it, but there's no need for Custom handler. It will still be type legalized to SPLAT_VECTOR_PARTS. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D121673	2022-03-15 08:22:13 -07:00
Craig Topper	eeb3bfd74a	[RISCV] Merge ReplaceNodeResults code for SHFL and GREV/GORC. NFC	2022-03-13 18:42:26 -07:00
Lehua Ding	1648852c98	[RISCV][RVV] Fix vslide1up/down intrinsics overflow bug for SEW=64 on RV32 Reviewed By: craig.topper, kito-cheng Differential Revision: https://reviews.llvm.org/D120899	2022-03-13 18:06:09 +08:00
Craig Topper	fd4d584d6b	[RISCV] Add DAGCombine to fold (bitreverse (bswap X)) to brev8 with Zbkb. If the type is less than XLenVT, type legalization will turn this into (srl (bitreverse (bswap (srl (bswap X), C))), C). We can't completely recover from these shifts. They introduce zeros into the upper bits of the result and we can't easily tell if they are needed. By doing a DAG combine early, we avoid introducing these shifts.	2022-03-12 16:39:39 -08:00
Craig Topper	43f668b98e	[RISCV] Move GORCIW/GREVIW formation to isel patterns. Type legalize narrow RISCVISD::GREV/GORC with constant to a larger type without switching to W. Detect sext_inreg+gorci/grevi with a uimm5 immediate during isel to emit GREVIW/GORCIW. This allows us to better propagate known bits information through extended bits after type legalization. It will also simplify a change I'm considering for BREV8 with Zbkb. A future patch will add computeKnownBits support for GORC. A further improvement here would be to use hasAllWUsers and doPeepholeSExtW like we do for SLLIW, but I don't think we have the test coverage for that yet.	2022-03-11 18:02:47 -08:00
Craig Topper	337d49da84	[RISCV] Fix typo in comment. NFC	2022-03-10 22:00:18 -08:00
Craig Topper	1f3a8d58a6	[RISCV] Use ZERO_EXTEND instead of ANY_EXTEND when promoting i32 RISCVISD::SHFL. NFC We know the shift amount is a constant with bit 31 clear. anyext of constant will be either zext or sext which will produce the same result here. But we really shouldn't rely on that. It would be valid to put a random number in the upper bits. Our isel patterns expect the upper bits to be 0 so we should ask for it explicitly.	2022-03-10 20:57:04 -08:00
Craig Topper	9ce6b1ca86	[RISCV] Remove performANY_EXTENDCombine. This doesn't appear to be needed any more. I did some inspecting of the gcc torture suite and SPEC2006 with this removed and didn't find any meaningful changes. I think we're more aggressive about forming ADDIW now using sign_extend_inreg during type legalization and hasAllWUsers in isel. This probably helps catch the cases this helped with before.	2022-03-10 11:29:31 -08:00
Luke	0803dba7dd	[RISCV] Add fixed-length vector instrinsics for segment load Inspired by reviews.llvm.org/D107790. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119834	2022-03-10 16:23:40 +08:00
Craig Topper	d53707508a	[RISCV] Remove RISCVISD::VLE_VL/VSE_VL. Use intrinsics instead. Similar to what we do for other loads/stores, use the intrinsic version that we already have custom isel for. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D121166	2022-03-09 22:44:28 -08:00
Craig Topper	845bfcede1	[RISCV] Rename 'SplatOperand' to 'ScalarOperand'. NFC vslide1up/down have this flag set, but the value isn't a splat. Rename for clarity. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D121037	2022-03-07 11:28:32 -08:00
Craig Topper	bd5f124716	[RISCV] Add SimplifyDemandedBits support for FSR/FSL/FSRW/FSLW.	2022-03-05 21:26:51 -08:00
Craig Topper	232f57319d	[RISCV] Move vslide1up/down intrinsics into lowerVectorIntrinsicSplats. NFC Rename to lowerVectorIntrinsicScalars. This allows us to share the code that checks if the scalar needs to be type legalized.	2022-03-04 18:21:53 -08:00
Craig Topper	3d4e83f17d	[RISCV] With Zbb, fold (sext_inreg (abs X)) -> (max X, (negw X)) With Zbb, abs is expanded to (max X, neg) by default. If X has 33 or more sign bits, we can expand it a little early using negw instead of neg to save a sext_inreg. If X started as a 32 bit value, type legalization would have inserted a sext before the abs so X having 33 sign bits should always be true. Note: I've used ISD::FREEZE here since we increase the number of uses. Our default expansion for ABS doesn't do that, but I think that's a bug. We can't do this with custom type legalization because ISD::FREEZE doesn't propagate sign bits so later DAG combine won't expand be able to see optmize it. Alives2 https://alive2.llvm.org/ce/z/Gx3RNe Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D120597	2022-03-03 15:42:29 -08:00
Craig Topper	6cb42cd666	[RISCV] More correctly ignore Zfinx register classes in getRegForInlineAsmConstraint. Until Zfinx is supported in CodeGen we need to convert all Zfinx register classes to GPR. Remove the zfinx-types.ll test which didn't test anything meaningful since -mattr=zfinx isn't implemented completely in llc. Follow up to D93298.	2022-03-02 11:22:46 -08:00
Craig Topper	a1f8349d77	[RISCV] Don't combine ROTR ((GREV x, 24), 16)->(GREV x, 8) on RV64. This miscompile was introduced in D119527. This was a special pattern for rotate+bswap on RV32. It doesn't work for RV64 since the rotate needs to be half the bitwidth. The equivalent pattern for RV64 is ROTR ((GREV x, 56), 32) so match that instead. This could be generalized further as noted in the new FIXME. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D120686	2022-03-02 09:47:06 -08:00
Shao-Ce SUN	0e38b29543	[RISCV] add the MC layer support of Zfinx extension This patch added the MC layer support of Zfinx extension. Authored-by: StephenFan Co-Authored-by: Shao-Ce Sun Reviewed By: asb Differential Revision: https://reviews.llvm.org/D93298	2022-03-02 14:25:19 +08:00
Craig Topper	b9d6e8c441	[RISCV] Lower VECTOR_SPLICE to RVV instructions. This lowers VECTOR_SPLICE of scalable vectors to a slidedown follow by a slideup. Fixed vectors are encouraged to use shufflevector instruction. The equivalent patch for fixed vectors is D119039. I've used a tail agnostic slidedown and limited the VL to only the elements that will not be overwritten by the slideup. The slideup uses VLMax for its VL. It unfortunately uses tail undisturbed policy but it isn't required as there is no tail. We just need the merge operand to carry the bits for the lower portion of the result. Care was taken to ensure that either the slideup or slidedown will be able to use a .vi instruction when the immediate is small. Which one uses the immediate depends on the sign of the immediate. Reviewed By: frasercrmck, ABataev Differential Revision: https://reviews.llvm.org/D119303	2022-03-01 10:10:13 -08:00
Craig Topper	e83db8c001	[RISCV] Only enable combineROTR_ROTL_RORW_ROLW with Zbp. I think the immediate values we check for on the GREV nodes already protect this, but better to be explicit.	2022-02-28 12:47:36 -08:00
Craig Topper	b083157b7b	[RISCV] Don't call combineROTR_ROTL_RORW_ROLW for SLLW/SRLW/SRAW nodes. NFC I think the function does the correct thing internally, but it's confusing to read.	2022-02-28 11:05:10 -08:00
Craig Topper	f46890711f	[RISCV] Custom type legalize i32 ISD::ABS on RV64 without Zbb. Default type legalization will create sext_inreg+abs, but we may not be able to remove the sext_inreg. Instead this patch expands abs during type legalization to Y = sraiw X, 31; subw(xor X, Y), Y) which doesn't require the input to be sign extended. This gives a big improvement for some neg-abs tests where the abs is used more than the the neg. Previously the abs was expanded a different way before and after type legalization. Now they are expanded in a similar way enabling more CSE. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120636	2022-02-28 09:30:27 -08:00
Chenbing Zheng	b20e80aa59	[RISCV] DAG Combine vcpop and vfirst with VL=0 to li imm vcpop and vfirst are still useful when VL=0. vcpop equivalents to li 0 and vfirst equivalents to li -1, since no mask elements are active. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120302	2022-02-25 14:44:25 +08:00
Craig Topper	a975ca97c3	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118974	2022-02-24 09:19:01 -08:00
Shao-Ce SUN	78b5f0fb05	[NFC][RISCV] Reuse ISD::NodeType in float extension Reviewed By: asb Differential Revision: https://reviews.llvm.org/D120412	2022-02-24 19:57:55 +08:00
Craig Topper	5b7ac107b1	[RISCV] Use SelectionDAG::getFreeze to simplify some code. NFC	2022-02-23 21:13:01 -08:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
Alex Bradbury	c5bcfb983e	[RISCV] Avoid infinite loop between DAGCombiner::visitMUL and RISCVISelLowering::transformAddImmMulImm See https://github.com/llvm/llvm-project/issues/53831 for a full discussion. The basic issue is that DAGCombiner::visitMUL and RISCVISelLowering;:transformAddImmMullImm get stuck in a loop, as the current checks in transformAddImmMulImm aren't sufficient to avoid all cases where DAGCombiner::isMulAddWithConstProfitable might trigger a transformation. This patch makes transformAddImmMulImm bail out if C0 (the constant used for multiplication) has more than one use. Differential Revision: https://reviews.llvm.org/D120332	2022-02-23 11:05:46 +00:00
Zakk Chen	f7dfc5d1af	[RISCV] Optimize tail agnostic vmv.s.x which don't need to select tail value. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120250	2022-02-21 14:53:37 -08:00
Craig Topper	90d240553d	[RISCV] Teach shouldSinkOperands to sink splat operands of vp.fma intrinsics. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120167	2022-02-21 11:52:59 -08:00
Craig Topper	bbee9e77f3	[RISCV] Match shufflevector corresponding to slideup. This generalizes isElementRotate to work when there's only a single slide needed. I've removed matchShuffleAsSlideDown which is now redundant. Reviewed By: frasercrmck, khchen Differential Revision: https://reviews.llvm.org/D119759	2022-02-17 08:19:10 -08:00
Zakk Chen	eeb7754f68	[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics. Add the passthru operand for VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119688	2022-02-17 06:38:14 -08:00
Craig Topper	cfbbcc544c	[RISCV] Improve lowering of SHL_PARTS/SRL_PARTS/SRA_PARTS. Part of the shift lowering creates a (sub XLEN-1, ShAmt). When this value is used we know that ShAmt is [0..XLEN-1]. Since XLEN is a power of 2 we can replace the sub with an xor. This allows us to use XORI instead of LI+SUB. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D119411	2022-02-16 09:22:11 -08:00
Zakk Chen	b784719904	[RISCV] Add the passthru operand for RVV nomask binary intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Add passthru operand for VSLIDE1UP_VL and VSLIDE1DOWN_VL to support i64 scalar in rv32. The masked VSLIDE1 would only emit mask undisturbed policy regardless of giving mask agnostic policy until InsertVSETVLI supports mask agnostic. Reviewed by: craig.topper, rogfer01 Differential Revision: https://reviews.llvm.org/D117989	2022-02-15 18:36:18 -08:00
Craig Topper	ab6e02dded	[RISCV] Match vwmulsu_vx with scalar splat input. This is a more generic version of D119110 that uses MaskedValueIsZero to do the matching and SimplifyDemandedBits to remove any unneeded AND instructions. Tests were taken from D119110. Reviewed By: Chenbing.Zheng Differential Revision: https://reviews.llvm.org/D119622	2022-02-15 08:45:21 -08:00
Craig Topper	478c237e21	[RISCV] Fix incorrect extend type in vwmulsu combine. While matching widening multiply, if we matched an extend from i8->i32, i16->i64 or i8->i64, we need to reintroduce a narrower extend. If we're matching a vwmulsu we need to use a sext for op0 and a zext for op1. This bug exists in LLVM 14 and will need to be backported. Differential Revision: https://reviews.llvm.org/D119618	2022-02-12 12:47:20 -08:00
Chenbing.Zheng	9e975e558b	[RISCV][NFC] Move some combine patterns to DAG combine. Move some combine patterns to DAG combine，and it dealt with fixme left in RISCVInstrInfoZb.td. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119527	2022-02-12 02:52:21 +00:00
Craig Topper	b0e77d5e48	[RISCV] Lower the shufflevector equivalent of vector.splice We can lower a vector splice to a vslidedown and a vslideup. The majority of the matching code here came from X86's code for matching PALIGNR and VPALIGND/Q. The slidedown and slideup lowering don't really require it to be concatenation, but it happened to be an interesting pattern with existing analysis code I could use. This helps with cases where the scalar loop optimizer forwarded a load result from a previous loop iteration. For example, this happens if the loop uses x[i] and x[i+1] on the same iteration. The scalar optimizer will forward x[i+1] load from the previous loop to satisfy x[i] on this loop. When this get vectorized it results in one element of a vector being forwarded from the previous loop to be concatenated with elements loaded on this iteration. Whether that's more efficient than doing a shifted loaded or reloading the single scalar and using vslide1up is an interesting question. But that's not something the backend can help with. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119039	2022-02-10 09:39:35 -08:00
Craig Topper	b861ddf365	[RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering. The VLMaxSentinel is represented as TargetConstant, but that's included in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long as possible, use the X0 register during lowering and only convert to VLMaxSentinel during isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118845	2022-02-10 09:28:44 -08:00
Craig Topper	279b3b8179	[RISCV][VP] Lower VP_FMA to RVV instructions. We already had FMA_VL node, but we didn't have masked patterns. I have not added the fneg variations. I'll do those after I add llvm.vp.fneg. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119196	2022-02-09 11:33:12 -08:00
Craig Topper	63e711549c	[RISCV] Lower VP_FNEG to RVV instructions Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119269	2022-02-09 10:56:39 -08:00
Fraser Cormack	62c4ac764b	[RISCV] Optimize splats of extracted vector elements This patch adds an optimization to splat-like operations where the splatted value is extracted from a identically-sized vector. On RVV we can splat that via vrgather.vx/vrgather.vi without dropping to scalar beforehand. We do have a similar VECTOR_SHUFFLE-specific optimization but that only works on fixed-length vector types and for those with a constant splat lane. This patch extends this optimization to make it work on scalable-vector types and on unknown extract indices. It is performed during fixed-vector BUILD_VECTOR lowering and during a new DAGCombine on SPLAT_VECTOR for scalable vectors. Reviewed By: craig.topper, khchen Differential Revision: https://reviews.llvm.org/D118456	2022-02-08 10:35:25 +00:00
wangpc	c53d99c37d	[RISCV] Split f64 undef into two i32 undefs So that no store instruction will be generated. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D118222	2022-02-08 13:42:15 +08:00
Craig Topper	c1cef111a3	Revert "[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)." This reverts commit 673d68cd923a9daa5065b929453bf4a4b8d39650. This hadn't been reviewed yet.	2022-02-05 12:51:01 -08:00
Craig Topper	673d68cd92	[RISCV] Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X). Add a new ISD opcode to represent the sign extending behavior of vmv.x.h. Keep the previous anyext opcode to allow the existing (fmv_x_anyexth (fmv_h_x X)) combine to keep working without needing to generate a sign extend. For fmv.x.w we are able to match the sext_inreg in an isel pattern, but a 16-bit sext_inreg is lowered to a shift pair before isel. This seemed like a larger match than we should do in isel. Differential Revision: https://reviews.llvm.org/D118974	2022-02-05 12:42:12 -08:00
Craig Topper	234e54bdd8	[RISCV] Add more types of shuffles isShuffleMaskLegal. Add the vslidedown and interleave patterns that I recently implemented. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118952	2022-02-04 09:13:13 -08:00
Craig Topper	c83905a308	[RISCV] Add inline expansion for vector fround. This avoids a crash for scalable vectors and or scalarization for fixed vectors. The algorithm is different enough that I don't think it makes sense to merge with ceil/floor/trunc. Algorithm is adapted from gcc's X86 SSE2 output. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D117247	2022-02-04 09:12:09 -08:00
Craig Topper	2349fb0312	[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL. SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it just assumed VLMax instead of carrying a VL operand. Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td has been swapped to avoid moving riscv_vmv_v_x_vl into RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to RISCVInstrInfoVVLPatterns.td Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118841	2022-02-03 08:30:25 -08:00
Craig Topper	abc6716038	[RISCV] Remove unused variables. NFC	2022-02-02 19:23:16 -08:00

1 2 3 4 5 ...

620 Commits