llvm-project

Author	SHA1	Message	Date
Craig Topper	7cd725858b	[RISCV] RISCVDAGToDAGISel::selectShiftMask to shift by (sub size-1, X). If the shift amount is (sub C, X) where C is -1 modulo the size of the shift, we can replace the sub with a NOT. We could also use XORI X, size-1, but NOT would work better with c.not from the future Zce extension.	2022-12-29 16:33:18 -08:00
Craig Topper	e50976e569	[RISCV] Teach RISCVDAGToDAGISel::selectShiftMask to bypass adds with constant. If the shift amount is (add X, C) where C is 0 modulo the size of the shift, we can bypass the add. Similar to other targets like AArch64 and X86.	2022-12-29 15:10:36 -08:00
Craig Topper	0e9855c1f2	[RISCV] Add SH1ADD/SH2ADD/SH3ADD to RISCVDAGToDAGISel::hasAllNBitUsers.	2022-12-28 23:38:33 -08:00
Craig Topper	79d6e9c713	[RISCV] Prefer ADDI over ORI if the known bits are disjoint. There is no compressed form of ORI but there is a compressed form for ADDI. This also works for XORI since DAGCombine will turn Xor with disjoint bits in Or. Note: The compressed forms require a simm6 immediate, but I'm doing this for the full simm12 range. Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D140674	2022-12-28 19:59:42 -08:00
Craig Topper	6357b63735	[RISCV] Add RISCV::XORI to RISCVDAGToDAGISel::hasAllNBitUsers.	2022-12-28 15:17:41 -08:00
Craig Topper	cdf09ce7e7	[RISCV] Support SRLI in hasAllNBitUsers. We can recursively look through SRLI if the shift amount is less than the demanded bits. We can reduce the demanded bit count by the shift amount and check the users of the SRLI.	2022-12-28 13:10:52 -08:00
Nick Desaulniers	19a004b468	[llvm][SelectionDAGISel] support -{start|stop}-{before|after}= for remaining targets Follow up to the series: 1. https://reviews.llvm.org/D140161 2. https://reviews.llvm.org/D140349 3. https://reviews.llvm.org/D140331 4. https://reviews.llvm.org/D140323 Completes the work from the previous two for remaining targets. This creates the following named passes that can be run via `llc -{start|stop}-{before|after}`: - arc-isel - arm-isel - avr-isel - bpf-isel - csky-isel - hexagon-isel - lanai-isel - loongarch-isel - m68k-isel - msp430-isel - mips-isel - nvptx-isel - ppc-codegen - riscv-isel - sparc-isel - systemz-isel - ve-isel - wasm-isel - xcore-isel A nice way to write tests for SelectionDAGISel might be to use a RUN: line like: llc -mtriple=<triple> -start-before=<arch>-isel -stop-after=finalize-isel -o - Fixes: https://github.com/llvm/llvm-project/issues/59538 Reviewed By: asb, zixuan-wu Differential Revision: https://reviews.llvm.org/D140364	2022-12-21 13:25:15 -08:00
Craig Topper	c09edce1b3	[SelectionDAG] Give all the target specific subclasses of SelectionDAGISel their own pass ID. Previously we had a shared ID in SelectionDAGISel. AMDGPU has an initializePass function for its subclass of SelectionDAGISel. No other target does. This causes all target specific SelectionDAGISel passes to be known as "amdgpu-isel". I'm not sure what would happen if another target tried to implement an initializePass function too since the ID is already claimed. This patch gives all targets their own ID and passes it down to SelectionDAGISel constructor to MachineFunctionPass's constructor. Unfortunately, I think this causes most targets to lose print-before/after-all support for their SelectionDAGISel pass. And they probably no longer support start/stop-before/after. We can add initializePass functions to fix this as a follow up. NOTE: This was probably also broken if the AMDGPU target isn't compiled in. Step 1 to fixing PR59538. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D140161	2022-12-15 15:48:55 -08:00
Nitin John Raj	d741a31a39	[RISCV][CodeGen][SelectionDAG] Recursively check hasAllNBitUsers for logical machine opcodes We don’t have W versions of AND/OR/XOR/ANDN/ORN/XNOR so we should recursively check their users. We should limit the recursion to SelectionDAG::MaxRecursionDepth levels. We need to add a Depth argument, all existing callers should pass 0 to the Depth. The new recursive calls should increment it by 1. At the top of the function we should give up and return false if Depth >= SelectionDAG::MaxRecursionDepth. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139462	2022-12-14 15:15:30 -08:00
Craig Topper	f2ffdbeb9c	[RISCV] Add accessors to RISCVMatInt::Inst. Make fields private. This helps hide that the Imm field doesn't store a full int64_t.	2022-12-07 19:02:01 -08:00
Craig Topper	9e0f9f1132	[RISCV] Preserve chain output when selecting splat as x0 strided load. We need the vlse node to have a chain output and it should replace the chain output of the original load.	2022-11-29 18:09:55 -08:00
Kazu Hirata	362ca6cbef	[RISCV] Use std::optional in RISCVISelDAGToDAG.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 23:02:26 -08:00
Craig Topper	24810acb62	[RISCV] Add isel patterns to select slli+shXadd.uw. This matches what we get for something like. %0 = shl i32 %x, C %1 = zext i32 %0 to i64 %2 = getelementptr i32, ptr %y, %1 The shift before the zext and the shift implied by the GEP get combined with an AND after them. We need to split it back into 2 shifts so we can fold one into shXadd.uw. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D137886	2022-11-21 09:32:51 -08:00
Craig Topper	3b75979806	[RISCV] Add PACKH/PACKW/PACK to hasAllNBitUsers.	2022-11-13 23:57:52 -08:00
wangpc	c66b69777c	[RISCV] Don't use zero-stride vector load if there's no optimized u-arch For vector strided instructions, as the RVV spec says: > When rs2=x0, then an implementation is allowed, but not required, to > perform fewer memory operations than the number of active elements, and > may perform different numbers of memory operations across different > dynamic executions of the same static instruction. So compiler shouldn't assume that fewer memory operations will be performed when rs2=x0. We add a target feature to specify whether u-arch supports optimized zero-stride vector load. And we do vector splat optimization iff this feature is supported. This feature is enabled by default since most designs implement this optimization. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D137699	2022-11-14 13:51:30 +08:00
Craig Topper	1a8ba9e19f	[RISCV] Improve selection of PACKW. Use hasAllWUsers to check if the upper bits are ignored so we can use PACKW even when no sign_extend_inreg is present before the OR.	2022-11-13 18:37:37 -08:00
Craig Topper	95388f7329	[RISCV] Improve selection of PACK/PACKW for AssertZExt input.	2022-11-13 16:00:45 -08:00
wangpc	8801e60c11	[RISCV][NFC] Remove ISel of SPLAT_VECTOR Since we have converted SPLAT_VECTOR to VMV_V_X_VL or VFMV_V_F_VL in RISCVDAGToDAGISel::PreprocessISelDAG(). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136814	2022-10-27 13:36:25 +08:00
Craig Topper	a54f3347e8	[RISCV] Add shift amount operands of shift, rotate, and Zbs instructions to hasAllNBitUsers.	2022-10-24 22:07:22 -07:00
Craig Topper	223f466f4f	[RISCV] Add ORI to hasAllNBitUsers. If the immediate is negative with sufficient leading ones, then the upper bits of the other operand aren't demanded.	2022-10-24 21:33:17 -07:00
Craig Topper	a41c1f3168	[RISCV] Make selectShiftMask look for negate opportunities after looking through AND. Previously we would only look for an AND or a negate. But it's possible there is a negate after looking through the AND.	2022-10-23 14:23:13 -07:00
Craig Topper	4830fa18aa	[RISCV] Make sure we always call tryShrinkShlLogicImm for ISD::AND during isel. There was an early out that prevented us from calling this for (and (sext_inreg (shl X, C1), i32), C2).	2022-10-22 14:30:13 -07:00
LiaoChunyu	a835b92e6c	[RISCV] Use hasAllWUsers to recover XORI/ORI reference 0fbe71e91f44. Also add testcase for addi. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D135538	2022-10-10 14:16:50 +08:00
Craig Topper	f3e87a63e5	[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes. VMV_V_X_VL nodes should always have a passthru, a splat, and a VL. We were sometimes missing the VL. This went unnoticed because these cases were all selected into the following node to form a .vx or .vi instruction. The ComplexPattern that does this, doesn't check the VL operand. I've added an assert to the ComplexPattern to catch if the operand is missing. @qcolombet spotted some of these in D134703.	2022-10-03 12:21:05 -07:00
Craig Topper	a55cdcae3e	Revert "[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes." This reverts commit 4c03c9f375f326a87065443d649c6568a4b7dd67. Forgot to squash	2022-10-03 12:15:28 -07:00
Craig Topper	4c03c9f375	[RISCV] Add missing VL arguments to the creation of RISCVISD::VMV_V_X_VL nodes. VMV_V_X_VL nodes should always have a passthru, a splat, and a VL. We were sometimes missing the VL. This went unnoticed because these cases were all selected into the following node to form a .vx or .vi instruction. The ComplexPattern that does this, doesn't check the VL operand. I've added an assert to the ComplexPattern to catch if the operand is missing. @qcolombet spotted some of these in D134703.	2022-10-03 12:13:21 -07:00
Craig Topper	9273f860c0	[RISCV] Prevent performCombineVMergeAndVOps from creating cycles in the DAG. If True has a Chain result, the other operands of the vmerge may depend on it through that Chain. We need to ensure it isn't a predecessor of those operands. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D134980	2022-09-30 20:01:45 -07:00
Craig Topper	8b8e18e11f	[RISCV] Replace RISCVISD::GREV/GORC/SHFL/UNSHFL with BREV8/ORC_B/ZIP/UNZIP. With Zbp removed, we no longer need the generalized forms. The computeKnownBitsForTargetNode code brev8/orc.b is still based on the general form with the shift amount forced to 7.	2022-09-21 21:57:59 -07:00
Craig Topper	182aa0cbe0	[RISCV] Remove support for the unratified Zbp extension. This extension does not appear to be on its way to ratification. Still need some follow up to simplify the RISCVISD nodes.	2022-09-21 21:22:42 -07:00
jacquesguan	1cbf44bd50	[RISCV] Support peephole optimization to fold vmerge.vvm that has tail agnostic policy and unmasked intrinsics. This patch supports the tail agnostic part of D130442. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D132923	2022-09-21 10:56:37 +08:00
Anton Sidorenko	3cd503f181	[NFC][RISCV] Move calculations of SDNode policy operand idx to a separate function Since there is no guaranteed correspondence of SDNode and MI operands, we need getters similar to RISCVII::get*OpNum for SDNodes. More uses of getVecPolicyOpIdx will be added in D130895. Reviewed By: craig.topper, arcbbb Differential Revision: https://reviews.llvm.org/D134179	2022-09-20 10:36:47 -07:00
Yeting Kuo	1b56b2b267	[RISCV] Transform VMERGE_VVM_<LMUL>_TU with all ones mask to VADD_VI_<LMUL>_TU. The transformation is beneficial because vmerge.vvm always needs a mask operand but vadd.vi may not. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D133255	2022-09-14 10:01:37 +08:00
Yeting Kuo	5fcb5d7759	[RISCV] Add assertion of hasVecPolicyOp to catch masked intrinsic without policy operand. The original code may have incorrect result if there is a masked instruction without policy operand to make us set its policy to TUMU. The patch adds an assertion to catch the instruction. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D133302	2022-09-13 10:09:49 +08:00
Craig Topper	e25eb61d03	[RISCV] Enable (srl (and X, C2), C) to form SRLIW in more cases. Don't require the AND has one use and don't depend on targetShrinkDemandedConstant turning C2 into 0xffffffff. Instead, check that the constant is 0xffffffff after replacing any bits that will be shifted out with 1s. Another way to fix this might be to prevent SimplifyDemandedBits from destroying the ANDI after type legalization using targetShrinkDemandedBits. That would prevent the CSE that created this mess. targetShrinkDemandedBits is currently only enabled after legalize ops. Quick experiment shows we can't just change when it runs, we would need to try a different heuristic for post type legalization.	2022-08-29 15:52:08 -07:00
Craig Topper	0fbe71e91f	[RISCV] Use hasAllWUsers to recover ANDI. SimplifyDemandedBits can 0 the upper bits and targetShrinkDemandedConstant isn't always able to recover it. At least part of that may be because targetShrinkDemandedConstant only runs in the last DAGCombine. Might be worth seeing what happens if we move it post type legalization.	2022-08-29 14:11:09 -07:00
Yeting Kuo	abf0416328	[RISCV] Merge vmerge.vvm and unmasked intrinsic with VLMAX vector length. The motivation of this patch is to lower the IR pattern (vp.merge mask, (add x, y), false, vl) to (PseudoVADD_VV_<LMUL>_MASK false, x, y, mask, vl). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D131841	2022-08-29 11:44:51 +08:00
Kazu Hirata	2833760c57	[Target] Qualify auto in range-based for loops (NFC)	2022-08-28 17:35:09 -07:00
Craig Topper	5349aa2354	[RISCV] Copy SDNodeFlags in doPeepholeMaskedRVV and doPeepholeMergeVVMFold Especially the NoFPExcept flag for FP. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D132173	2022-08-18 20:42:46 -07:00
jacquesguan	0fe5f03eeb	[RISCV][NFC] Use nested namespace definitions. Since we use C++17 now, we could use nested namespace definitions to simplify code. Differential Revision: https://reviews.llvm.org/D131751	2022-08-13 09:56:59 +08:00
Yeting Kuo	875694089d	[RISCV] Peephole optimization to fold merge.vvm and unmasked intrinsics. The patch uses peephole method to fold merge.vvm and unmasked intrinsics to masked intrinsics. Using peephole instead of tablegen patterns is to avoid large auto-generated code. Note: The patch ignores segment loads since I don't know how to test them. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D130442	2022-08-11 17:58:11 +08:00
Craig Topper	a304d70ee9	[RISCV] Reorder (and/or/xor (shl X, C1), C2) if we can form ANDI/ORI/XORI. InstCombine and DAGCombine prefer to keep shl before binops. This patch teaches isel to convert to (shl (and/or/xor X, C2 >> C1), C1) if (C2 >> C1) is a simm12. The idea was taken from X86's isel code. There's a special case implemented for a sext_inreg between the shift and the binop. Differential Revision: https://reviews.llvm.org/D130610	2022-07-27 17:35:26 -07:00
Craig Topper	31b8939ded	[RISCV] Recognize bexti from (srl (and X, 1<<C), C). This is the form we get for (zext (setne (and X 1<<C))). We only had bexti patterns for the alternative form (and (srl X, C), 1).	2022-07-20 15:03:52 -07:00
Craig Topper	79016f6eef	[RISCV] Refine the heuristics for our custom (mul (and X, C2), C1) isel. Prefer to use SLLI instead of zext.w/zext.h in more cases. SLLI might be better for compression.	2022-07-14 18:24:10 -07:00
Craig Topper	759e5e0096	[RISCV] Remove doPeepholeLoadStoreADDI. All of the cases should be handled by SelectAddrRegImm now. Reviewed By: asb, luismarques Differential Revision: https://reviews.llvm.org/D129451	2022-07-11 10:44:33 -07:00
Craig Topper	907d923a20	[RISCV] Move the custom isel for (add X, imm) into SelectAddrRegImm. This custom isel was used to split the lo12 bits of the imm so that they could be folded into load/store addresses via a post-isel peephole. This patch instead splits the immediate during isel and folds the lo12 removing the need for the post-isel peephole to do anything. After this we'll be able to remove the post-isel peephole. Reviewed By: asb, luismarques Differential Revision: https://reviews.llvm.org/D129450	2022-07-11 10:44:33 -07:00
Craig Topper	5f7641a3be	[RISCV] Modify the custom isel for (add X, imm) used by load/stores. We have custom isel that tries to select the Lo12 bits using a separate ADDI that can later be folded into the load/store address by the post-isel peephole. This patch disables this if the load/store already had a non-zero offset. A non-zero offset implies that CodeGenPrepare split several large offsets used by different loads and stores into a common large offset and multiple small offsets that could be folded. Folding more of the lo12 bits changes this common offset by increasing the small offsets. While this can save an instruction to materialize the common offset, it can also prevent the small offsets from fitting in a compressed load/store instruction. Removing this also simplifies the last piece needed to fold the custom isel for add into SelectAddrRegImm and remove the post-isel peephole.	2022-07-09 22:47:27 -07:00
Craig Topper	9c6a2200e2	[RISCV] Support folding constant addresses in SelectAddrRegImm. We already handled this by folding an ADDI in the post-isel peephole. My goal is to remove that peephole so this adds the functionality to isel.	2022-07-09 13:12:02 -07:00
Craig Topper	088bb8a328	[RISCV] Add more SHXADD patterns where the input is (and (shl/shr X, C2), C1) It might be possible to rewrite (and (shl/shr X, C2), C1) in a way that gives us a left shift that can be folded into a SHXADD.	2022-07-05 16:21:47 -07:00
Craig Topper	a1cd3f49b6	[RISCV] Use a switch statement in PreprocessISelDAG. NFC This should make it easier to add more peepholes in the future.	2022-07-05 12:25:04 -07:00
Craig Topper	c15bcad2f9	[RISCV] Update PreprocessISelDAG to use RemoveDeadNodes. Instead of deleting nodes as we go, delete all dead nodes if a change is made. This allows adding peepholes that might make multiple nodes dead.	2022-07-05 12:25:03 -07:00

270 Commits