llvm-project

Author	SHA1	Message	Date
Philip Reames	13a74d6cc8	[RISCV] Fix crash when legalizing mgather/scatter on rv32 This is a fix for a subset of legalization problems around 64 bit indices on rv32 targets. For RV32+V, we were using the wrong mask type for the manual truncation lowering for fixed length vectors. Instead, just use the generic TRUNCATE node, and let it be lowered as needed. Note that legalization is still broken for rv32+zve32. That appears to be a different issue.	2023-09-18 08:36:23 -07:00
Craig Topper	e6a007f6b5	[RISCV] Use getConstantOperandVal. NFC	2023-09-17 10:53:50 -07:00
Craig Topper	d76e96b627	[RISCV] Reuse existing XLenVT variable. NFC	2023-09-17 10:46:53 -07:00
Vettel	ddae50d1e6	[RISCV] Combine trunc (sra sext (x), zext (y)) to sra (x, smin (y, scalarsizeinbits(y) - 1)) (#65728) For RVV, if we want to perform an i8 or i16 element-wise vector arithmetic right shift in the source C/C++ program, the value to be shifted is first sign extended to i32, and the shift amount is zero extended to i32 to perform the vsra.vv instruction, followed by a truncate to get the final result. Such a pattern is later expanded to a series of "vsetvli" and "vnsrl" instructions, because the RVV spec only supports a 2 * SEW -> SEW truncate. But for vectors, the shift amount can also be determined by smin (Y, ScalarSizeInBits(Y) - 1). Also, for the vsra instruction, we only care about the low lg2(SEW) bits as the shift amount. - Alive2: https://alive2.llvm.org/ce/z/u3-Zdr - C++ Test cases : https://gcc.godbolt.org/z/q1qE7fbha	2023-09-17 17:11:28 +08:00
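The fold described in ddae50d1e6 can be checked exhaustively for i8 elements. The following is only an illustrative sketch in Python, not LLVM code: the helper names are hypothetical, and shift amounts are restricted to 0..31 on the assumption that a larger shift of the widened i32 value is undefined.

```python
def trunc_i8(v):
    """Keep the low 8 bits of v and reinterpret them as a signed i8."""
    v &= 0xFF
    return v - 0x100 if v & 0x80 else v


def wide_form(x, y):
    # trunc(sra(sext(x), zext(y))): a Python int already behaves like a
    # sign-extended x, and >> on ints is an arithmetic shift.
    return trunc_i8(x >> y)


def narrow_form(x, y):
    # sra(x, smin(y, 7)): clamp the shift amount to element width - 1.
    return x >> min(y, 7)


def check_all():
    """Compare both forms for every i8 value and every shift in 0..31."""
    return all(
        wide_form(x, y) == narrow_form(x, y)
        for x in range(-128, 128)
        for y in range(32)
    )
```

Shifting an i8 by 7 already leaves only copies of the sign bit, so any larger in-range shift of the widened value truncates to the same result.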
Yingwei Zheng	e042ff7eef	[SDAG][RISCV] Avoid expanding is-power-of-2 pattern on riscv32/64 with zbb This patch adjusts the legality check for riscv to use `cpop/cpopw` since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb. Clang vs gcc: https://godbolt.org/z/rc3s4hjPh Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156390	2023-09-17 02:56:09 +08:00
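As a rough illustration of the is-power-of-2 pattern mentioned in e042ff7eef, the sketch below compares a population-count test with the expanded bit-trick form. It is plain Python with hypothetical function names and says nothing about how SelectionDAG represents either form.

```python
def is_pow2_via_popcount(x):
    # A power of two has exactly one bit set (the cpop/cpopw style check).
    return bin(x).count("1") == 1


def is_pow2_expanded(x):
    # Expanded form: non-zero, and clearing the lowest set bit leaves zero.
    return x != 0 and (x & (x - 1)) == 0
```

Both predicates agree on every non-negative input, which is why a target with a cheap population count can keep the shorter form.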
Yingwei Zheng	b423e1f05d	[SDAG][RISCV] Avoid neg instructions when lowering atomic_load_sub with a constant rhs This patch avoids creating (sub x0, rhs) when lowering atomic_load_sub with a constant rhs. Comparison with GCC: https://godbolt.org/z/c5zPdP7j4 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158673	2023-09-16 17:09:41 +08:00
Philip Reames	c663401f69	[RISCV] Prefer vrgatherei16 for shuffles (#66291) If the data type is larger than e16, and it requires more than an LMUL1 register class, prefer the use of vrgatherei16. This has three major benefits: 1) Less work needed to evaluate the constant for e.g. vid sequences. Remember that arithmetic generally scales linearly with LMUL. 2) Less register pressure. In particular, the source and indices registers can overlap so using a smaller index can significantly help at m8. 3) Smaller constants. We've got a bunch of tricks for materializing small constants, and if needed, can use an EEW=16 load.	2023-09-15 15:57:23 -07:00
Philip Reames	ff2622b5ac	[RISCV] Optimize gather/scatter to unit-stride memop + shuffle (#66279) If we have a gather or a scatter whose index describes a permutation of the lanes, we can lower this as a shuffle + a unit strided memory operation. For RISCV, this replaces an indexed load/store with a unit strided memory operation and a vrgather (at worst). I did not bother to implement the vp.scatter and vp.gather variants of these transforms because they'd only be legal when EVL was VLMAX. Given that, they should have been transformed to the non-vp variants anyways. I haven't checked to see if they actually are.	2023-09-15 15:54:32 -07:00
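The equivalence behind ff2622b5ac can be sketched in a few lines of Python. This is a toy model under stated assumptions: memory is a flat list, indices are element indices rather than byte offsets, and all function names are hypothetical.

```python
def gather(memory, base, indices):
    # Indexed load: one element per lane, at base + index.
    return [memory[base + i] for i in indices]


def unit_stride_load(memory, base, n):
    # Contiguous load of n elements starting at base.
    return memory[base:base + n]


def shuffle(vec, mask):
    # Lane i of the result takes lane mask[i] of the source.
    return [vec[m] for m in mask]


def is_permutation(indices):
    # True when the indices name each lane 0..n-1 exactly once.
    return sorted(indices) == list(range(len(indices)))
```

When `is_permutation(indices)` holds, `gather(memory, base, indices)` equals `shuffle(unit_stride_load(memory, base, len(indices)), indices)`.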
Philip Reames	37aa07ad31	[RISCV] Move narrowIndex to be a DAG combine over target independent nodes In D154687, we added a transform to narrow indexed load/store indices of the form (shl (zext), C). We can move this into a generic transform over the target independent nodes instead, and pick up the fixed vector cases with no additional work required. This is an alternative to D158163. Performing this transform points out that we weren't eliminating zero_extends via the generic DAG combine. Adjust the (existing) callbacks so that we do. This change removes the existing transform on the target specific intrinsic nodes. If anyone has a use case this impacts, please speak up. Note: Reviewed as part of a stack of changes in PR# 66405.	2023-09-15 15:02:14 -07:00
Philip Reames	2ff9175af7	[RISCV] Normalize gather/scatter addressing to UNSIGNED_SCALAR If the index type is greater than or equal to XLEN, then signed and unsigned are the same. Canonicalize towards unsigned to simplify an upcoming transform. Note: Reviewed as part of a stack of changes in PR# 66405.	2023-09-15 14:56:33 -07:00
Philip Reames	09a5aac514	[TLI] Add extend as explicit parameter to shouldRemoveExtendFromGSIndex [nfc] Note: Reviewed as part of a stack of changes in PR# 66405.	2023-09-15 14:48:02 -07:00
Philip Reames	52b33ff760	[RISCV] Avoid toggling VL for hidden splat case in constant buildvector lowering We have the analogous case in the single insert path. The reasoning here is that if the original VL fits in LMUL1, we'd prefer to clobber a few extra dead lanes than to force two VL toggles. VTYPE toggles are generally cheaper than VL toggles.	2023-09-15 12:33:21 -07:00
Nick Desaulniers	86735a4353	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264 ) reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit ee643b706be2b6bef9980b25cc9cc988dab94bb5. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll	2023-09-13 13:31:24 -07:00
Reid Kleckner	ee643b706b	Revert "[InlineAsm] wrap ConstraintCode in enum class NFC (#66003 )" This reverts commit 2ca4d136124d151216aac77a0403dcb5c5835bcd. Also revert the followup, "[InlineAsm] fix botched merge conflict resolution" This reverts commit 8b9bf3a9f715ee5dce96eb1194441850c3663da1. There were SystemZ and Mips build errors, too many to fix forward.	2023-09-13 09:58:02 -07:00
Nick Desaulniers	2ca4d13612	[InlineAsm] wrap ConstraintCode in enum class NFC (#66003 ) Similar to commit 2fad6e69851e ("[InlineAsm] wrap Kind in enum class NFC") Fix the TODOs added in commit 93bd428742f9 ("[InlineAsm] refactor InlineAsm class NFC (#65649)")	2023-09-13 08:48:09 -07:00
Philip Reames	17b071db6a	[RISCV] Rework gather/scatter DAG combine structure [NFC] Instead of switching on type before and after common code, use a helper function. This matches the style of DAGCombine.cpp more closely, and makes porting candidate changes from one place to the other much easier.	2023-09-12 10:57:12 -07:00
Luke Lau	b2f1a1b20b	[RISCV] Move getSmallestVTForIndex so it can be used by lowerINSERT_VECTOR_ELT. NFC	2023-09-12 15:58:19 +01:00
Philip Reames	5352c79398	[RISCV] Add a combine to form masked.load from unit strided load (#65674 ) Add a DAG combine to form a masked.load from a masked_strided_load intrinsic with stride equal to element size. This covers a couple of extra test cases, and allows us to simplify and common some existing code on the concat_vector(load, ...) to strided load transform. This is the first in a mini-patch series to try and generalize our strided load and gather matching to handle more cases, and common up different approaches to the same problems in different places.	2023-09-11 13:01:14 -07:00
Philip Reames	299d710e3d	[RISCV] Lower fixed vectors extract_vector_elt through stack at high LMUL This is the extract side of D159332. The goal is to avoid non-linear costing on patterns where an entire vector is split back into scalars. This is an idiomatic pattern for SLP. Each vslide operation is linear in LMUL on common hardware. (For instance, the sifive-x280 cost model models slides this way.) If we do VL unique extracts, each with a cost linear in LMUL, the overall cost is O(LMUL^2) * VLEN/ETYPE. To avoid the degenerate case, fall back to the stack if we're beyond LMUL2. There's a subtlety here. For this to work, we're relying on an optimization in LegalizeDAG which tries to reuse the stack slot from a previous extract. In practice, this appears to trigger for patterns within a block, but if we ended up with an explode idiom split across multiple blocks, we'd still be in quadratic territory. I don't think that variant is fixable within SDAG. It's tempting to think we can do better than going through the stack, but well, I haven't found it yet if it exists. Here are the results for sifive-x280 on all the variants I wrote (all 16 x i64 with V): output/sifive-x280/linear_decomp_with_slidedown.mca:Total Cycles: 20703 output/sifive-x280/linear_decomp_with_vrgather.mca:Total Cycles: 23903 output/sifive-x280/naive_linear_with_slidedown.mca:Total Cycles: 21604 output/sifive-x280/naive_linear_with_vrgather.mca:Total Cycles: 22804 output/sifive-x280/recursive_decomp_with_slidedown.mca:Total Cycles: 15204 output/sifive-x280/recursive_decomp_with_vrgather.mca:Total Cycles: 18404 output/sifive-x280/stack_by_vreg.mca:Total Cycles: 12104 output/sifive-x280/stack_element_by_element.mca:Total Cycles: 4304 I am deliberately excluding scalable vectors. It functionally works, but frankly, the code quality for an idiomatic explode loop is so terrible either way that it felt better to leave that for future work. Differential Revision: https://reviews.llvm.org/D159375	2023-09-11 10:49:17 -07:00
Luke Lau	e33f3f09b8	[RISCV] Shrink vslidedown when lowering fixed extract_subvector (#65598 ) As noted in https://github.com/llvm/llvm-project/pull/65392#discussion_r1316259471, when lowering an extract of a fixed length vector from another vector, we don't need to perform the vslidedown on the full vector type. Instead we can extract the smallest subregister that contains the subvector to be extracted and perform the vslidedown with a smaller LMUL. E.g, with +Zvl128b: v2i64 = extract_subvector nxv4i64, 2 is currently lowered as vsetivli zero, 2, e64, m4, ta, ma vslidedown.vi v8, v8, 2 This patch shrinks the vslidedown to LMUL=2: vsetivli zero, 2, e64, m2, ta, ma vslidedown.vi v8, v8, 2 Because we know that there's at least 128*2=256 bits in v8 at LMUL=2, and we only need the first 256 bits to extract a v2i64 at index 2. lowerEXTRACT_VECTOR_ELT already has this logic, so this extracts it out and reuses it. I've split this out into a separate PR rather than include it in #65392, with the hope that we'll be able to generalize it later. This patch refactors extract_subvector lowering to lower to extract_subreg directly, and to shortcut whenever the index is 0 when extracting a scalable vector. This doesn't change any of the existing behaviour, but makes an upcoming patch that extends the scalable path slightly easier to read.	2023-09-11 17:25:12 +01:00
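The LMUL arithmetic in the e33f3f09b8 example (128*2=256 bits at LMUL=2) can be reproduced with a small helper. This is a hypothetical sketch, not the logic of `getSmallestVTForIndex`: it assumes only whole power-of-two LMULs and ignores fractional LMUL and the m8 upper bound.

```python
def smallest_lmul(min_vlen_bits, elt_bits, first_idx, num_elts):
    """Smallest power-of-two LMUL whose guaranteed register-group size
    covers elements [first_idx, first_idx + num_elts)."""
    needed_bits = (first_idx + num_elts) * elt_bits
    lmul = 1
    while lmul * min_vlen_bits < needed_bits:
        lmul *= 2
    return lmul
```

For the case in the message, `smallest_lmul(128, 64, 2, 2)` needs the first 256 bits and therefore lands on LMUL=2 instead of the m4 of the source type.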
Luke Lau	b46d7011f2	[RISCV] Refactor extract_subvector lowering slightly. NFC (#65391 ) This patch refactors extract_subvector lowering to lower to extract_subreg directly, and to shortcut whenever the index is 0 when extracting a scalable vector. This doesn't change any of the existing behaviour, but makes an upcoming patch that extends the scalable path slightly easier to read.	2023-09-11 16:48:35 +01:00
Philip Reames	463c9f44dc	[RISCV] Move slide and gather costing to TLI [NFC] (PR #65396 ) As mentioned in TODOs from D159332. This PR doesn't actually common up that copy of the code because doing so is not NFC - due to DLEN. Fixing that will be a future PR.	2023-09-07 18:28:17 -07:00
Philip Reames	b4a99f1cd6	[RISCV] Lower constant build_vectors with few non-sign bits via vsext (#65648) If we have a build_vector such as [i64 0, i64 3, i64 1, i64 2], we instead lower this as vsext([i8 0, i8 3, i8 1, i8 2]). For vectors with 4 or fewer elements, the resulting narrow vector can be generated via scalar materialization. For shuffles which get lowered to vrgathers, constant build_vectors of small constants are idiomatic. As such, this change covers all shuffles with an output type of 4 or fewer elements. I deliberately started narrow here. I think it makes sense to expand this to longer vectors, but we need a more robust profit model on the recursive expansion. It's questionable if we want to do the vsext if we're going to generate a constant pool load for the narrower type anyways. One possibility for future exploration is to allow the narrower VT to be less than 8 bits. We can't use vsext for that, but we could use something analogous to our widening interleave lowering with some extra shifts and ands.	2023-09-07 16:01:16 -07:00
Philip Reames	de34d39b66	[RISCV] Cap build vector cost to avoid quadratic cost at high LMULs Each vslide1down operation is linear in LMUL on common hardware. (For instance, the sifive-x280 cost model models slides this way.) If we do VL unique inserts, each with a cost linear in LMUL, the overall cost is O(VL*LMUL). Since VL is a linear function of LMUL, this means the current lowering is quadratic in both LMUL and VL. To avoid the degenerate case, fall back to the stack if the cost is more than a fixed (linear) threshold. For context, here are the sifive-x280 llvm-mca results for the current lowering and stack based lowering for each LMUL (using e64). Assumes code was compiled for V (i.e. zvl128b). buildvector_m1_via_stack.mca:Total Cycles: 1904 buildvector_m2_via_stack.mca:Total Cycles: 2104 buildvector_m4_via_stack.mca:Total Cycles: 2504 buildvector_m8_via_stack.mca:Total Cycles: 3304 buildvector_m1_via_vslide1down.mca:Total Cycles: 804 buildvector_m2_via_vslide1down.mca:Total Cycles: 1604 buildvector_m4_via_vslide1down.mca:Total Cycles: 6400 buildvector_m8_via_vslide1down.mca:Total Cycles: 25599 There are other schemes we could use to cap the cost. The next best is recursive decomposition of the vector into smaller LMULs. That's still quadratic, but with a better constant. However, stack based seems to cost better on all LMULs, so we can just go with the simpler scheme. Arguably, this patch is fixing a regression introduced with my D149667 as before that change, we'd always fall back to the stack, and thus didn't have the non-linearity. Differential Revision: https://reviews.llvm.org/D159332	2023-09-05 09:03:26 -07:00
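The quadratic growth argued in de34d39b66 can be sketched with a toy cost model. Everything below is an assumption for illustration: each slide is charged exactly LMUL abstract units, VLEN is 128 and SEW is 64 as in the quoted measurements, and the function names are hypothetical. It is not the cost model used by the backend.

```python
def vl_for(lmul, vlen_bits=128, sew_bits=64):
    # Number of elements in a register group of the given LMUL.
    return lmul * vlen_bits // sew_bits


def slide_build_cost(lmul):
    # One vslide1down per element, each assumed to cost LMUL units.
    return vl_for(lmul) * lmul


# Cycle counts quoted in the commit message for the vslide1down lowering.
REPORTED_VSLIDE1DOWN_CYCLES = {1: 804, 2: 1604, 4: 6400, 8: 25599}
```

Under this model each doubling of LMUL quadruples the cost (2, 8, 32, 128 units for m1 through m8), which is in line with the roughly 4x steps from m2 to m4 and from m4 to m8 in the reported cycles.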
Luke Lau	6098d7d5f6	[RISCV] Lower shuffles as rotates without zvbb Now that the codegen for the expanded ISD::ROTL sequence has been improved, it's probably profitable to lower a shuffle that's a rotate to the vsll+vsrl+vor sequence to avoid a vrgather where possible, even if we don't have the vror instruction. This patch relaxes the restriction on ISD::ROTL being legal in lowerVECTOR_SHUFFLEAsRotate. It also attempts to do the lowering twice: Once if zvbb is enabled before any of the interleave/deinterleave/vmerge lowerings, and a second time unconditionally just before it falls back to the vrgather. This way it doesn't interfere with any of the above patterns that may be more profitable than the expanded ISD::ROTL sequence. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D159353	2023-09-04 09:35:12 +01:00
Kazu Hirata	e2e68468f5	[RISCV] Use isNullConstant (NFC)	2023-09-04 00:31:38 -07:00
Matt Arsenault	b14e83d1a4	IR: Add llvm.exp10 intrinsic We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10 to fix this asymmetry. AMDGPU already has most of the code for f32 exp10 expansion implemented alongside exp, so the current implementation is duplicating nearly identical effort between the compiler and library which is inconvenient. https://reviews.llvm.org/D157871	2023-09-01 19:45:03 -04:00
Craig Topper	319aba645f	[RISCV] Teach MatInt to use (ADD_UW X, (SLLI X, 32)) to materialize some constants. If the high and low 32 bits are the same, we try to use (ADD X, (SLLI X, 32)) but that only works if bit 31 is clear since the low 32 bits will be sign extended. If we have Zba we can use add.uw to zero the sign extended bits. Reviewed By: reames, wangpc Differential Revision: https://reviews.llvm.org/D159253	2023-08-31 20:24:34 -07:00
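The bit-31 problem described in 319aba645f can be demonstrated with a small register model. The sketch below is illustrative Python with hypothetical helper names; it models 64-bit registers as masked integers and assumes X is the sign-extended low 32 bits of the wanted constant.

```python
MASK64 = (1 << 64) - 1


def sext32(v):
    """64-bit register value holding the sign-extended low 32 bits of v."""
    v &= 0xFFFFFFFF
    return (v - (1 << 32) if v & 0x80000000 else v) & MASK64


def slli(x, sh):
    return (x << sh) & MASK64


def add(a, b):
    return (a + b) & MASK64


def add_uw(rs1, rs2):
    # Zba add.uw: rs2 plus the zero-extended low word of rs1.
    return ((rs1 & 0xFFFFFFFF) + rs2) & MASK64


def materialize_add(lo):
    x = sext32(lo)
    return add(x, slli(x, 32))


def materialize_add_uw(lo):
    x = sext32(lo)
    return add_uw(x, slli(x, 32))
```

With `lo = 0x12345678` both sequences produce `0x1234567812345678`. With `lo = 0x80000001` only the `add_uw` form produces `0x8000000180000001`, because the plain add also adds the sign-extension bits into the upper half.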
Luke Lau	1664eb05d0	[RISCV] Fix crash during i1 vector bitreverse lowering A shuffle of v256i1 with a large enough minimum vlen might make it through type legalization and into lowering. In this case, zvl1024b was enough. The bitreverse shuffle lowering would then try to convert this to a v1i256 type which is invalid (v1i128 exists though, which is why the existing v128i1 tests were fine). This patch checks to make sure that the new type is not only legal but also valid. Reviewed By: craig.topper, reames Differential Revision: https://reviews.llvm.org/D159215	2023-08-31 19:39:08 +01:00
Luke Lau	7b33f60f13	[RISCV] Remove vmv_v_x_vl workaround for constant splat. NFC Now that DAG.getConstant uses splat_vector_parts if needed on RV32, we can use it directly without having to manually lower to a vmv_v_x_vl. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D159287	2023-08-31 19:36:09 +01:00
Philip Reames	3e89aca446	[RISCV] Rename getELEN to getELen [nfc] Let's follow the naming scheme used for DLen, XLen, and FLen.	2023-08-31 11:27:00 -07:00
Craig Topper	d1c3784adf	[RISCV] Prefer ShortForwardBranch over the fully generic Zicond expansion. Short forward branch is shorter than (or (czero.eqz), (czero.nez)). Reviewed By: reames Differential Revision: https://reviews.llvm.org/D159295	2023-08-31 11:07:35 -07:00
Philip Reames	079c968eb9	[RISCV] Form vmv.s.f/x from single element splats via DAG combine This re-implements the special casing we had in lowerScalarSplat as a DAG combine. As can be seen in the tests, this ends up triggering in a bunch more cases. The semantically interesting bit of this change is the use of the implicit truncate semantics for when XLEN > SEW. We'd already been doing this for vmv.v.x, but this change extends e.g. the constant matching to make the same assumption about vmv.s.x. Per my reading of the specification, this should be fine, and if anything, is more obviously true of vmv.s.x than vmv.v.x. Differential Revision: https://reviews.llvm.org/D158874	2023-08-30 12:44:36 -07:00
Philip Reames	fd465f377c	[RISCV] Move vmv_s_x and vfmv_s_f special casing to DAG combine We'd discussed this in the original set of patches months ago, but decided against it. I think we should reverse ourselves here as the code is significantly more readable, and we do pick up cases we'd missed by not calling the appropriate helper routine. Differential Revision: https://reviews.llvm.org/D158854	2023-08-30 12:04:48 -07:00
Luke Lau	976244bb84	[RISCV] Canonicalize vrot{l,r} to vrev8 when lowering shuffle as rotate A rotate of 8 bits of an e16 vector in either direction is equivalent to a byteswap, i.e. vrev8. There is a generic combine on ISD::ROT{L,R} to canonicalize these rotations to byteswaps, but on fixed vectors they are legalized before they have the chance to be combined. This patch teaches the rotate vector_shuffle lowering to emit these rotations as byteswaps to match the scalable vector behaviour. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D158195	2023-08-30 11:01:49 +01:00
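The claim in 976244bb84 that an 8-bit rotate of a 16-bit element in either direction is a byteswap can be checked exhaustively. This is a plain Python sketch with hypothetical helper names, independent of how the backend emits vrev8.

```python
def rotl16(x, n):
    n %= 16
    return ((x << n) | (x >> (16 - n))) & 0xFFFF


def rotr16(x, n):
    n %= 16
    return ((x >> n) | (x << (16 - n))) & 0xFFFF


def bswap16(x):
    # Swap the low and high bytes of a 16-bit value.
    return ((x & 0xFF) << 8) | (x >> 8)


def rotate8_is_bswap():
    """Check every 16-bit value for both rotate directions."""
    return all(
        rotl16(x, 8) == bswap16(x) and rotr16(x, 8) == bswap16(x)
        for x in range(1 << 16)
    )
```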
Luke Lau	a61c4a0ef6	[RISCV][SelectionDAG] Lower shuffles as bitrotates with vror.vi when possible Given a shuffle mask like <3, 0, 1, 2, 7, 4, 5, 6> for v8i8, we can reinterpret it as a shuffle of v2i32 where the two i32s are bit rotated, and lower it as a vror.vi (if legal with zvbb enabled). We also need to make sure that the larger element type is a valid SEW, hence the tests for zve32x. X86 already did this, so I've extracted the logic for it and put it inside ShuffleVectorSDNode so it could be reused by RISC-V. I originally tried to add this as a generic combine in DAGCombiner.cpp, but it ended up causing worse codegen on X86 and PPC. Reviewed By: reames, pengfei Differential Revision: https://reviews.llvm.org/D157417	2023-08-30 11:01:47 +01:00
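The reinterpretation in a61c4a0ef6 can be tried out on concrete bytes. The sketch below is illustrative Python, assuming little-endian element layout and using hypothetical helper names. In this model the mask <3, 0, 1, 2, 7, 4, 5, 6> corresponds to rotating each 32-bit lane left by 8, which is the same as rotating right by 24.

```python
import struct

MASK = [3, 0, 1, 2, 7, 4, 5, 6]


def shuffle(vec, mask):
    # Lane i of the result takes lane mask[i] of the source.
    return [vec[m] for m in mask]


def rotl32(x, n):
    n %= 32
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def shuffle_via_rotate(bytes8):
    # Reinterpret eight i8 lanes as two little-endian i32 lanes,
    # rotate each lane, then view the result as bytes again.
    lanes = struct.unpack("<2I", bytes(bytes8))
    rotated = [rotl32(lane, 8) for lane in lanes]
    return list(struct.pack("<2I", *rotated))
```

For any eight byte values `src`, `shuffle(src, MASK)` and `shuffle_via_rotate(src)` give the same lanes.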
Craig Topper	7b5cf52f32	[RISCV] Improve splatPartsI64WithVL for fixed vector constants where Hi and Lo are the same and the VL is constant. If doubling the VL will fit in a vsetivli, use it. It will be cheap to change and cheap to change back. This improves codegen from D158896. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D158896	2023-08-29 09:27:48 -07:00
Craig Topper	398c855457	[RISCV] Improve splatPartsI64WithVL for vlmax scalable vector constants where Hi and Lo are the same. We can use a 32-bit splat and bitcast to i64 vector. This only handles the case where we are using vlmax so that the new vl is cheap to compute. This could be generalized to double the VL. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D158879	2023-08-25 14:15:41 -07:00
Craig Topper	4184bafa9b	[RISCV] Refactor lowerSPLAT_VECTOR_PARTS to use splatPartsI64WithVL for scalable vectors. There was quite a bit of duplication between splatPartsI64WithVL and the scalable vector handling in lowerSPLAT_VECTOR_PARTS, but scalable vector had one additional case. Move that case to splatPartsI64WithVL which improves some fixed vector tests. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D158876	2023-08-25 14:15:40 -07:00
LiaoChunyu	1b12427c01	[VP][RISCV] Add vp.is.fpclass and RISC-V support There is no vp.fpclass after FCLASS_VL(D151176), try to support vp.fpclass. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D152993	2023-08-25 15:40:55 +08:00
Luke Lau	e772c0ecd8	[RISCV] Use vmv.v.x if Hi bits are undef when lowering splat_vector_parts When lowering a splat_vector_parts, if the hi bits are undefined then we can splat the lo bits without having to check if it's going to be sign extended or not, because those bits will be undefined anyway. I've handled it for both fixed and scalable vectors, but there's no diff on the scalable vror tests, since the hi bits aren't combined away to undef in SimplifyDemanded for scalable vectors. I'm not sure why that is. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158625	2023-08-24 12:19:09 +01:00
Luke Lau	06d3ee9603	[RISCV] Fix wrong operand being used for VL in shift combine At some point a merge operand was added to the binary vl ops, so this combine was using the mask for the VL. This causes a crash when trying to select the vmv_v_x_vl, which showed up locally when messing about with selectVSplat, but thankfully in ToT the vmv_v_x_vl gets pattern matched away into the .vx and .vi operands every time, so there's no noticeable change. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158634	2023-08-23 17:44:21 +01:00
Jianjian GUAN	879e801a91	[RISCV] Apply promotion for f16 vector ops when only have zvfhmin For most fp16 vector ops, we can promote them to fp32 vector ops when zvfhmin is enabled but zvfh is not. But for nxv32f16, we need to split it first since nxv32f32 is not a valid MVT. Reviewed By: michaelmaitland Differential Revision: https://reviews.llvm.org/D153848	2023-08-23 16:49:20 +08:00
Jianjian GUAN	759903568f	[RISCV] Add Zvfhmin extension support for llvm RISCV backend This patch supports Zvfhmin for RISCV codegen. Reviewed By: michaelmaitland Differential Revision: https://reviews.llvm.org/D151414	2023-08-23 16:47:47 +08:00
Philip Reames	c3b48ec6ff	[RISCV] Match strided loads with reversed indexing sequences This extends the concat_vector of loads to strided_load transform to handle reversed index pattern. The previous code expected indexing of the form (a0, a1+S, a2+S,...). However, we can also see indexing of the form (a1+S, a2+S, a3+S, .., aS). This form is a strided load starting at address aN + S(n-1) with stride -S. Note that this is also fixing what looks to be a bug in the memory location reasoning for forward strided case. A strided load with negative stride access eltsize bytes past base ptr, and then bytes before* base ptr. (That is, the range should extend from before base ptr to after base ptr.) Differential Revision: https://reviews.llvm.org/D157886	2023-08-22 07:59:49 -07:00
Philip Reames	ecb855a5a8	[RISCV] Reduce LMUL for vector extracts If we have a known (or bounded) index which definitely fits in a smaller LMUL register group size, we can reduce the LMUL of the slide and extract instructions. This loosens constraints on register allocation, and allows the hardware to do less work, at the potential cost of some additional VTYPE toggles. In practice, we appear (after prior patches) to do a decent job of eliminating the additional VTYPE toggles in most cases. Differential Revision: https://reviews.llvm.org/D158460	2023-08-22 07:36:17 -07:00
Craig Topper	b441fd60b2	[RISCV] Separate hasRoundModeOpNum into separate VXRM and FRM functions. Preparation for developing a new rounding mode insertion algorithm that is going to be different between them since VXRM doesn't need to be save/restored. This also unifies the FRM handling in RISCVISelLowering.cpp between scalar and vector. Fixes outdated comments in RISCVAsmPrinter and sorts the predicate function by the reverse order of the operands being skipped. Reviewed By: eopXD Differential Revision: https://reviews.llvm.org/D158326	2023-08-21 10:00:23 -07:00
Craig Topper	078eb4bd85	[RISCV] Fix a UBSAN failure for passing INT64_MIN to std::abs. clang recently started checking for INT64_MIN being passed to 64-bit std::abs. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D158304	2023-08-18 12:47:52 -07:00
Craig Topper	42dad521e3	[RISCV] Add RISCVII::getRoundModeOpNum to reduce code duplication. NFC	2023-08-16 12:00:02 -07:00
wangpc	ac00cca3d9	[RISCV] Fix assertion when passing f64 vectors via integer registers The vector arguments are split but assignments won't be pending. Fixes #64645 Reviewed By: asb Differential Revision: https://reviews.llvm.org/D157847	2023-08-15 12:11:08 +08:00

1264 Commits