llvm-project

Author	SHA1	Message	Date
Zakk Chen	10fd2822b7	[RISCV] Add policy operand for masked compare and vmsbf/vmsif/vmsof IR intrinsics. Those operations are updated under a tail agnostic policy, but they could have mask agnostic or undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D120228	2022-03-22 07:47:21 -07:00
Craig Topper	2e10671ec7	[RISCV] Improve detection of when to skip (and (srl x, c2) c1) -> (srli (slli x, c3-c2), c3) isel. We have a special case to skip this transform if c1 is 0xffffffff and x is sext_inreg in order to use sraiw+zext.w. But we were only checking that we have a sext_inreg opcode, not how many bits are being sign extended. This commit adds a check that it is a sext_inreg from i32 so we know for sure that an sraiw can be created.	2022-03-16 14:54:34 -07:00
Zakk Chen	3be907621f	[RISCV] Fix incorrect optimization for masked vmsgeu.vi with 0 immediate. vmsgeu.vi with 0 is always true, but in the masked form with the mask undisturbed policy, we still need to keep the inactive elements, which come from maskedoff. We could return mask directly if it's mask agnostic policy in the future. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121080	2022-03-06 19:22:35 -08:00
Zakk Chen	33b61c5678	[RISCV] Fix incorrect codegen introduced by D119688. We should not emit a tail agnostic vlse for a tail undisturbed vmv.s.x In D119688: - if (IsScalarMove && !Node->getOperand(0).isUndef()) + bool HasPassthruOperand = Node->getOpcode() != ISD::SPLAT_VECTOR; + if (HasPassthruOperand && !IsScalarMove && !Node->getOperand(0).isUndef()) break; The IsScalarMove check in the if statement had been changed. Differential Revision: https://reviews.llvm.org/D120963	2022-03-05 06:10:26 -08:00
Lian Wang	db85cd729a	[RISCV] Add FMV_W_X and FMV_H_X instructions to hasAllNBitUsers Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120699	2022-03-01 08:13:59 +00:00
Chenbing Zheng	7f811ce127	[RISCV] Optimize (sext.w, srli) to sraiw with Zba. In this patch, we add a narrower exclusion for zeroext (srl x) -> srli (slli x), so that it provides an opportunity for the selection of sraiw. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120467	2022-02-28 10:34:35 +08:00
Haocong.Lu	865fe131f8	[RISCV] Fix a mistake in PostprocessISelDAG With the condition N->use_empty(), the root node of DAG always misses peephole optimization. So a dummy node is needed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D119934	2022-02-25 12:38:31 +00:00
Craig Topper	954fe404ab	[RISCV] Fix incorrect MemOperand copy converting splat+load to vlse. Due to an incorrect copy/paste from load intrinsic handling we checked if the splat node was a MemSDNode which of course it isn't. Instead get the MemOperand from the LoadSDNode for the source of the splat. This enables LICM to see the load is loop invariant and hoist it out of the loop. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D120014	2022-02-17 08:15:50 -08:00
Zakk Chen	eeb7754f68	[RISCV] Add the passthru operand for vmv.vv/vmv.vx/vfmv.vf IR intrinsics. Add the passthru operand for VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D119688	2022-02-17 06:38:14 -08:00
Fangrui Song	8eb750189c	[RISCV] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds	2022-02-10 20:10:12 -08:00
Craig Topper	b861ddf365	[RISCV] Move the creation of VLMaxSentinel to isel. Use X0 during lowering. The VLMaxSentinel is represented as TargetConstant, but that's included in isa<ConstantSDNode>. To keep constant VLs and VLMax separate as long as possible, use the X0 register during lowering and only convert to VLMaxSentinel during isel. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118845	2022-02-10 09:28:44 -08:00
Fraser Cormack	fd43d99c93	[RISCV] Pre-process FP SPLAT_VECTOR to RISCVISD::VFMV_V_F_VL This patch builds on top of D119197 to canonicalize floating-point SPLAT_VECTOR as RISCVISD::VFMV_V_F_VL as a pre-process ISel step. This primarily benefits scalable-vector VP code, where our VP patterns only match VFMV_V_F_VL to reduce the burden on our ISel patterns, but where at the same time, scalable-vector code doesn't custom-legalize SPLAT_VECTOR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117670	2022-02-10 09:56:00 +00:00
Craig Topper	c45c1b130b	[RISCV] Teach RISCVDAGToDAGISel::selectShiftMask to replace sub from constant with neg. If the shift amount is (sub C, X) where C is 0 modulo the size of the shift, we can replace it with neg or negw. Something similar is done for AArch64 and X86. Reviewed By: khchen Differential Revision: https://reviews.llvm.org/D119089	2022-02-09 12:33:01 -08:00
Craig Topper	e305b1de7e	[RISCV] Pre-process integer ISD::SPLAT_VECTOR to RISCVISD::VMV_V_X_VL before isel. This allows us to remove some isel patterns that exist for both operations, saving nearly 3000 bytes from the isel table. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D119197	2022-02-09 08:10:21 -08:00
Fraser Cormack	6449bea508	[RISCV] Select unmasked RVV pseudos in a DAG post-process This patch drops TableGen patterns matching all-ones masked RVV pseudos in the case where there are fallback patterns matching the generic masked forms to "_MASK" pseudos. This optimization is now performed with a SelectionDAG post-processing step which peephole-optimizes these same pseudos with all-ones masks and swaps them out to their unmasked pseudos. This cuts our generated ISel table down by around ~5% (~110kB) in lieu of a far smaller auto-generated table to help with the peephole. This only targets our custom RISCVISD::*_VL binary operator nodes, which use the one form for both masked and unmasked variants. A similar approach could be used for our intrinsics but we'd need to do some work, e.g., to represent unmasked intrinsics as true-masked intrinsics at the IR or ISel level. At a rough estimate, this could save us a further 9% on the size of our ISel table for the binary intrinsic patterns alone. There is no observable impact on our tests. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D118810	2022-02-09 07:50:15 +00:00
Craig Topper	5f35009996	[RISCV] Remove a ComputeNumSignBits call from an isel special case. Only isel (and (srl (sexti32 Y), c2), c1) -> (srliw (sraiw Y, 31), c3 - 32) when there is a sext_inreg present. Don't bother checking for Y having 32 sign bits.	2022-02-04 23:26:53 -08:00
Craig Topper	d752ea9a72	[RISCV] Remove exclusions for zext.h/zext.w from our (and (srl X, C1), C2) selection code. This code tries to replace the pattern with a pair of shifts, but we were excluding if the And could be a zext.h or zext.w. The SLLI/SRL pair is more compressible and doesn't come with much down side. We do regress one test case in rv64i-exhaustive-w-insts.ll but we can probably add a narrower exclusion for that case.	2022-02-04 17:10:48 -08:00
Craig Topper	2349fb0312	[RISCV] Remove RISCVISD::SPLAT_VECTOR_I64 in favor of RISCVISD::VMV_V_X_VL. SPLAT_VECTOR_I64 has the same semantics as RISCVISD::VMV_V_X_VL, it just assumed VLMax instead of carrying a VL operand. Include order of RISCVInstrInfoVSDPatterns.td and RISCVInstrInfoVVLPatterns.td has been swapped to avoid moving riscv_vmv_v_x_vl into RISCVInstrInfoVSDPatterns.td and to allow moving other "_vl" SDNodes back to RISCVInstrInfoVVLPatterns.td Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D118841	2022-02-03 08:30:25 -08:00
Craig Topper	f1720abb54	[RISCV] Cleanup some places that assumed VLMaxSentinel and -1 constant mean the same thing. NFCI VLMaxSentinel happens to be represented as -1 TargetConstant. A user provided -1 would be an ISD::Constant. We shouldn't assume that they are the same thing. I'm still not entirely convinced that we should be treating -1 from the user as VLMAX. Also fix one place that failed to use XLenVT for the VLMaxSentinel, using MVT::i64 in code that only executes on RV32.	2022-02-02 12:23:12 -08:00
Alex Bradbury	588f121ada	[RISCV][NFC] Make Zb* instruction naming match the convention used elsewhere in the RISC-V backend Where the instruction mnemonic contains a dot, we name the corresponding instruction in the .td file using a _ in the place of the dot. e.g. LR_W rather than LRW. This commit updates RISCVInstrInfoZb.td to follow that convention.	2022-01-28 15:20:37 +00:00
Zakk Chen	9273378b85	[RISCV] Add the passthru operand for RVV nomask load intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Co-Authored-by: Hsiangkai Wang <Hsiangkai@gmail.com> Reviewers: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D117647	2022-01-25 17:31:36 -08:00
Fraser Cormack	d42678b453	[RISCV] Add side-effect-free vsetvli intrinsics This patch introduces new intrinsics that enable the use of vsetvli in contexts where only the returned vector length is of interest. The pre-existing intrinsics are marked with side-effects, which prevents even trivial optimizations on/across them. These intrinsics are intended to be used in situations where the vector length is fed in turn to RVV intrinsics or to vector-predication intrinsics during loop vectorization, for example. Those codegen paths ensure that instructions are generated with their own implicit vsetvli, so the vector length and vtype can be relied upon to be correct. No corresponding C builtins are planned at this stage, though that is a possibility for the future if the need arises. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D117910	2022-01-24 13:52:08 +00:00
Chenbing.Zheng	9ea772ff81	[RISCV] Block vmsgeu.vi with 0 immediate in Isel For vmsgeu.vi with 0, we know this is always true. So we can replace it with vmset.m (unmasked) or vmset.m+vmand.mm (masked). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D116584	2022-01-11 03:04:44 +00:00
jacquesguan	d0554ae4cf	[RISCV] Select vl op to X0 when it is equal to ~0. Currently the backend selects a ~0 vl to a register and a load instruction; we could use X0 to replace it. Differential Revision: https://reviews.llvm.org/D116798	2022-01-11 10:56:25 +08:00
jacquesguan	b607cd3928	[RISCV] Use vmv.s.x to build one element splat vector. When we want to create a splat vector in which only the first element is initialized, we could use vmv.s.x or vfmv.s.f to build it. Differential Revision: https://reviews.llvm.org/D116277	2022-01-11 10:21:18 +08:00
Craig Topper	b645bcd98a	[RISCV] Generalize (srl (and X, 0xffff), C) -> (srli (slli X, (XLen-16), (XLen-16) + C) optimization. This can be generalized to (srl (and X, C2), C) -> (srli (slli X, (XLen-C3), (XLen-C3) + C). Where C2 is a mask with C3 trailing ones. This can avoid constant materialization for C2. This is beneficial even when C2 can be selected to ANDI because the SLLI can become C.SLLI, but C.ANDI cannot cover all the immediates of ANDI. This also enables CSE in some cases of i8 sdiv by constant codegen.	2022-01-09 23:37:10 -08:00
Craig Topper	296e8cae5c	[RISCV] Isel (sra (sext_inreg X, i16), C) -> (srai (slli X, (XLen-16), (XLen-16) + C). Similar for (sra (sext_inreg X, i8), C). With Zbb, sext_inreg of i8 and i16 are legal for sext.b and sext.h. This transform makes the Zbb codegen the same as without Zbb. The shifts are more compressible. This also exposes an opportunity for CSE with another slli in the i16 sdiv by constant codegen.	2022-01-09 21:23:43 -08:00
jacquesguan	6b8362eb8d	[RISCV] Disable EEW=64 for index values when XLEN=32. Disable EEW=64 for vector index load/store when XLEN=32. Differential Revision: https://reviews.llvm.org/D106518	2022-01-10 10:51:27 +08:00
Craig Topper	2dd52f840b	[RISCV] Fold (srl (and X, 0xffff), C)->(srli (slli X, (XLen-16), (XLen-16) + C) even with Zbb/Zbp. We can use zext.h with Zbb, but srli/slli may offer more opportunities for compression.	2022-01-09 18:42:03 -08:00
Craig Topper	fd992aac19	[RISCV] Use macros to reduce repetitive switch cases. NFC These 3 switches map LMUL enum to instruction names. These follow a regular pattern. Use a macro to reduce the number of source code lines. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D116631	2022-01-05 09:00:48 -08:00
wangpc	41454ab256	[RISCV] Use constant pool for large integers For large integers (for example, magic numbers generated by TargetLowering::BuildSDIV when dividing by constant), we may need about 4~8 instructions to build them. At the same time, it just takes two instructions to load constants (with extra cycles to access memory), so it may be profitable to put these integers into constant pool. Reviewed By: asb, craig.topper Differential Revision: https://reviews.llvm.org/D114950	2021-12-31 14:48:48 +08:00
Craig Topper	015ff729cb	[RISCV] Add a few more instructions to hasAllNBitUsers.	2021-12-29 09:17:47 -08:00
Craig Topper	7598ac5ec5	[RISCV] Convert (splat_vector (load)) to vlse with 0 stride. We already do this for splat nodes that carry a VL, but not for splats that use VLMAX. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D115483	2021-12-14 09:14:03 -08:00
Craig Topper	6f7de819b9	[RISCV] Use MULHU for more division by constant cases. D113805 improved handling of i32 divu/remu on RV64. The basic idea from that can be extended to (mul (and X, C2), C1) where C2 is any mask constant. We can replace the and with an SLLI by shifting by the number of leading zeros in C2 if we also shift C1 left by XLen - lzcnt(C1) bits. This will give the full product XLen additional trailing zeros, putting the result in the output of MULHU. If we can't use ANDI, ZEXT.H, or ZEXT.W, this will avoid materializing C2 in a register. The downside is it may take 1 additional instruction to create C1. But since that's not on the critical path, it can hopefully be interleaved with other operations. The previous tablegen pattern is replaced by custom isel code. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D115310	2021-12-09 09:10:14 -08:00
Zakk Chen	0649dfebba	[RISCV] Rename some assembler mnemonic and intrinsic functions for RVV 1.0. Rename vpopc/vmandnot/vmornot to vcpop/vmandn/vmorn assembler mnemonic. Reviewed By: frasercrmck, jrtc27, craig.topper Differential Revision: https://reviews.llvm.org/D111062	2021-11-04 10:08:01 -07:00
Craig Topper	1387483e72	[RISCV] Replace most uses of RISCVSubtarget::hasStdExtV. NFCI Add new hasVInstructions() which is currently equivalent. Replace vector uses of hasStdExtZfh/F/D with new vector specific versions. The vector spec no longer requires that the vectors implement the same types as scalar. It only requires that the scalar type is the maximum size the vectors can support. This is currently implemented using the scalar rule we were using before. Add new hasVInstructionsI64() and begin using it to qualify code that requires i64 vector elements. This is all NFC for now, but we can start using this to better implement D112408 which introduces the Zve extensions. Reviewed By: frasercrmck, eopXD Differential Revision: https://reviews.llvm.org/D112496	2021-10-27 19:33:48 -07:00
Ben Shi	4fe5ab4b00	[RISCV] Optimize immediate materialisation with SHADD Use SH1ADD/SH2ADD/SH3ADD along with LUI+ADDI to compose int32*3, int32*5 and int32*9. Reviewed By: craig.topper, luismarques Differential Revision: https://reviews.llvm.org/D111484	2021-10-15 06:46:41 +00:00
Hsiangkai Wang	80a6456306	[RISCV] Update to vlm.v and vsm.v according to v1.0-rc1. vle1.v -> vlm.v vse1.v -> vsm.v Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106044	2021-10-05 21:49:54 +08:00
Craig Topper	715cf6ffb9	[RISCV] Add another isel optimization for (and (shl X, c2), c1). Where c1 is a shifted mask with 32-c2 leading zeros and c3 trailing zeros and c3>c2. We can select it as (slli (srliw X, c3-c2), c3).	2021-09-24 15:10:25 -07:00
Hsiangkai Wang	7d39a8a921	[RISCV] (1/2) Add the tail policy argument to builtins/intrinsics. Add the tail policy argument to LLVM IR intrinsics. There are two policies for tail elements. Tail agnostic means users do not care about the values in the tail elements and tail undisturbed means the values in the tail elements need to be kept after the operation. In order to let users control the tail policy, we add an additional argument at the end of the argument list. For unmasked operations, we have no maskedoff and the tail policy is always tail agnostic. If users want to keep tail elements under unmasked operations, they could use all one mask in the masked operations to do it. So, we only add the additional argument for masked operations for most cases. There are exceptions listed below. In this patch, we do not handle the following cases to reduce the complexity of the patch. There could be two separate patches for them. * Use dest argument to control tail policy vmerge.vvm/vmerge.vxm/vmerge.vim (add _t builtins with additional dest argument) vfmerge.vfm (add _t builtins with additional dest argument) vmv.v.v (add _t builtins with additional dest argument) vmv.v.x (add _t builtins with additional dest argument) vmv.v.i (add _t builtins with additional dest argument) vfmv.v.f (add _t builtins with additional dest argument) vadc.vvm/vadc.vxm/vadc.vim (add _t builtins with additional dest argument) vsbc.vvm/vsbc.vxm (add _t builtins with additional dest argument) * Always has tail argument for masked/unmasked intrinsics Vector Single-Width Integer Multiply-Add Instructions (add _t and _mt builtins) Vector Widening Integer Multiply-Add Instructions (add _t and _mt builtins) Vector Single-Width Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins) Vector Widening Floating-Point Fused Multiply-Add Instructions (add _t and _mt builtins) Vector Reduction Operations (add _t and _mt builtins) Vector Slideup Instructions (add _t and _mt builtins) Vector Slidedown Instructions (add _t and _mt builtins) Discussion: https://github.com/riscv/rvv-intrinsic-doc/pull/101 Differential Revision: https://reviews.llvm.org/D105092	2021-09-24 17:09:50 +08:00
Craig Topper	70f50114f3	[RISCV] Add another isel optimization for (and (shl x, c2), c1) Turn (and (shl x, c2), c1) -> (slli (srli x, c3-c2), c3) if c1 is a shifted mask with no leading zeros and c3 trailing zeros where c3 is greater than c2.	2021-09-23 14:18:07 -07:00
Craig Topper	4a69551d66	[RISCV] Add more isel optimizations for (and (shr x, c2), c1). Turn (and (shr x, c2), c1) -> (slli (srli x, c2+c3), c3) if c1 is a shifted mask with c2 leading zeros and c3 trailing zeros. When the leading zeros is C2+32 we can use SRLIW in place of SRLI.	2021-09-23 11:29:04 -07:00
Craig Topper	f0a422f935	[RISCV] Add fcvt.s.w(u)/fcvt.d.w(u)/fcvt.h.w(u) to hasAllNBitUsers These instructions only read the lower 32 bits of their input.	2021-09-22 14:24:26 -07:00
Craig Topper	73e5b9ea90	[RISCV] Select (srl (sext_inreg X, i32), uimm5) to SRAIW if only lower 32 bits are used. SimplifyDemandedBits can turn srl into sra if the bits being shifted in aren't demanded. This patch can recover the original sra in some cases. I've renamed the tablegen class for detecting W users since the "overflowing operator" term I originally borrowed from Operator.h does not include srl. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D109162	2021-09-16 11:03:35 -07:00
Craig Topper	9af8f1b18e	[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode. Soft deprecate isNullValue/isAllOnesValue and update in tree callers. This matches the changes to the APInt interface from D109483. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D109535	2021-09-09 13:28:30 -07:00
Craig Topper	b5fd6b46f5	[RISCV] Teach instruction selection to elide sext.w in some cases. If a sext_inreg is up for isel, and all its users are W instructions, we can skip emitting the sext_inreg. This is helpful if the producing instruction can't become a W instruction. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D108966	2021-09-02 07:54:34 -07:00
Craig Topper	e4e69ba4d1	[RISCV] Split PseudoVSETVLI into 2 instructions to allow different register classes for rs1. X0 has special meaning for vsetvli, we need to make sure we never create a vsetvli that uses it by accident. This could happen if the register coalescer coalesces a copy from X0 into this instruction. This patch splits the instruction so that we can have GPRNoX0 register class to use for the cases where we don't want the source to be X0. The verifier won't let us explicitly use X0 on a GPRNoX0 operand so we need a separate pseudo for those cases. I don't currently have a failing example for this. There was a failure in D107957, but the coalescable copy from that example should have been optimized away much earlier so I've fixed that. This is not a complete fix. We still need to prevent the same possible issue on the AVL operand of all of the vector instruction pseudos. I don't want to make two versions of all of those so we need to find a different solution for those. I have an idea I'm going to try. Differential Revision: https://reviews.llvm.org/D109110	2021-09-02 07:45:31 -07:00
Craig Topper	3f9b37ccb1	[RISCV] Remove sext_inreg+add/sub/mul/shl isel patterns. Let the sext_inreg be selected to sext.w. Remove unneeded sext.w during PostProcessISelDAG. This gives opportunities for some other isel patterns to match like the ADDIPair or matching mul with immediate to shXadd. This becomes possible after D107658 started selecting W instructions based on users. The sext.w will be considered a W user so isel will often select a W instruction for the sext.w input and we can just remove the sext.w. Otherwise we can combine the sext.w with a ADD/SUB/MUL/SLLI to create a new W instruction in parallel to the original instruction. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D107708	2021-08-18 11:07:11 -07:00
Craig Topper	20e6265873	[RISCV] Improve constant materialization for stores of i16 or i32 negative constants. DAGCombiner::visitStore can clear the upper bits of constants used by stores. This prevents them from being recognized as sign extended negative values, making them more expensive to materialize. This patch uses the hasAllNBitUsers method from D107658 to make a negative constant if none of the users care about the upper bits. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D108052	2021-08-18 10:25:12 -07:00
Craig Topper	d9ba1a9c5c	[RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are used. We normally select these when the root node is a sext_inreg, but SimplifyDemandedBits can sometimes bypass the sext_inreg for some users. This can create a situation where sext_inreg+add/sub/mul/shl is selected to a W instruction, and then the add/sub/mul/shl is separately selected to a non-W instruction with the same inputs. This patch tries to detect when it would still be ok to use a W instruction without the sext_inreg by checking the direct users. This can allow the W instruction to CSE with one created for a sext_inreg+add/sub/mul/shl. To minimize complexity and cost of checking, we make no attempt to determine if the CSE will happen and just always use a W instruction when we can. Differential Revision: https://reviews.llvm.org/D107658	2021-08-18 10:22:00 -07:00
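
Several of the isel commits above (for example b645bcd98a) replace an AND with a trailing-ones mask followed by a logical right shift with an SLLI/SRLI pair. The sketch below is only an illustration of that arithmetic identity on plain Python integers, assuming XLen = 64 and 0 <= C < C3 <= XLen; it is not code from the LLVM tree, and the names `XLEN`, `and_then_srl`, `slli_then_srli` and `forms_agree` are made up for this example.

```python
XLEN = 64                      # assumed register width (RV64)
WORD_MASK = (1 << XLEN) - 1    # keeps values within XLEN bits


def and_then_srl(x, c3, c):
    """Reference form: (srl (and X, C2), C), where C2 has c3 trailing ones."""
    c2 = (1 << c3) - 1
    return (x & c2) >> c


def slli_then_srli(x, c3, c):
    """Rewritten form: (srli (slli X, XLen-C3), (XLen-C3) + C)."""
    # slli discards any bits shifted past bit XLEN-1, which is what
    # removes everything above the low c3 bits of x.
    shifted = (x << (XLEN - c3)) & WORD_MASK
    return shifted >> ((XLEN - c3) + c)


def forms_agree(x, c3, c):
    """True if both forms give the same result for this input."""
    x &= WORD_MASK
    return and_then_srl(x, c3, c) == slli_then_srli(x, c3, c)
```

For instance, `and_then_srl(0x12345678, 16, 4)` keeps the low 16 bits (`0x5678`) and shifts them right by 4, and `slli_then_srli` reaches the same value without materializing the mask constant.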


227 Commits