llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	46aa3c6c33	[DAG] visitVECTOR_SHUFFLE - MergeInnerShuffle - improve shuffle(shuffle(x,y),shuffle(x,y)) merging MergeInnerShuffle currently attempts to merge shuffle(shuffle(x,y),z) patterns into a single shuffle, using 1 or 2 of the x,y,z ops. However if we already match 2 ops we might be able to handle the third op if its also a shuffle that references one of the previous ops, allowing us to handle some cases like: shuffle(shuffle(x,y),shuffle(x,y)) shuffle(shuffle(shuffle(x,z),y),z) shuffle(shuffle(x,shuffle(x,y)),z) etc. This isn't an exhaustive match and is dependent on the order the candidate ops are encountered - if one of the matched ops was a shuffle that was peek-able we don't go back and try to split that, I haven't found much need for that amount of analysis yet. This is a preliminary patch that will allow us to later improve x86 HADD/HSUB matching - but needs to be reviewed separately as its in generic code and affects existing Thumb2 tests. Differential Revision: https://reviews.llvm.org/D94671	2021-01-15 15:08:31 +00:00
Simon Pilgrim	7c30c05ff7	[DAG] visitVECTOR_SHUFFLE - MergeInnerShuffle - reset shuffle ops and reorder early-out and second op matching. NFCI. I'm hoping to reuse MergeInnerShuffle in some other folds - so ensure the candidate ops/mask are reset at the start of each run. Also, move the second op matching before bailing to make it simpler to try to match other things afterward.	2021-01-14 11:55:20 +00:00
Simon Pilgrim	af8d27a7a8	[DAG] visitVECTOR_SHUFFLE - pull out shuffle merging code into lambda helper. NFCI. Make it easier to reuse in a future patch.	2021-01-14 11:05:19 +00:00
Kazu Hirata	5c1c39e8d8	[llvm] Use *Set::contains (NFC)	2021-01-13 19:14:41 -08:00
Simon Pilgrim	993c488ed2	[DAG] visitVECTOR_SHUFFLE - use all_of to check for all-undef shuffle mask. NFCI.	2021-01-13 17:19:41 +00:00
Juneyoung Lee	25eb7b08ba	[DAGCombiner] Fold BRCOND(FREEZE(COND)) to BRCOND(COND) This patch resolves the suboptimal codegen described in http://llvm.org/pr47873 . When CodeGenPrepare lowers select into a conditional branch, a freeze instruction is inserted. It is then translated to `BRCOND(FREEZE(SETCC))` in SelDag. The `FREEZE` in the middle of `SETCC` and `BRCOND` was causing a suboptimal code generation however. This patch adds `BRCOND(FREEZE(cond))` -> `BRCOND(cond)` fold to DAGCombiner to remove the `FREEZE`. To make this optimization sound, `BRCOND(UNDEF)` simply should nondeterministically jump to the branch or not, rather than raising UB. It wasn't clear what happens when the condition was undef according to the comments in ISDOpcodes.h, however. I updated the comments of `BRCOND` to make it explicit (as well as `BR_CC`, which is also a conditional branch instruction). Note that it diverges from the semantics of `br` instruction in IR, which is explicitly UB. Since the UB semantics was necessary to explain optimizations that use branching conditions, and SelDag doesn't seem to have such optimization, I think this divergence is okay. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D92015	2021-01-13 09:36:52 +09:00
Craig Topper	df74c001fa	[DAGCombiner] Replace static helper function isConstantFPBuildVectorOrConstantFP with the identical version in SelectionDAG. NFC	2021-01-11 23:41:40 -08:00
Joe Ellis	007358239d	[DAGCombiner] Use getVectorElementCount inside visitINSERT_SUBVECTOR This avoids TypeSize-/ElementCount-related warnings. Differential Revision: https://reviews.llvm.org/D92747	2021-01-11 14:15:11 +00:00
QingShan Zhang	7539c75bb4	[DAGCombine] Remove the check for unsafe-fp-math when we are checking the AFN We are checking the unsafe-fp-math for sqrt but not for fpow, which behaves inconsistent. As the direction is to remove this global option, we need to remove the unsafe-fp-math check for sqrt and update the test with afn fast-math flags. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D93891	2021-01-11 02:25:53 +00:00
Craig Topper	4ef91f5871	[DAGCombiner] Don't speculatively create an all ones constant in visitREM that might not be used. This looks to have been done to save some duplicated code under two different if statements, but it ends up being harmful to D94073. This speculative constant can be called on a scalable vector type with i64 element size when i64 scalars aren't legal. The code tries and fails to find a vector type with i32 elements that it can use. So only create the node when we know it will be used.	2021-01-05 12:45:57 -08:00
Cameron McInally	92be640bd7	[FPEnv][AMDGPU] Disable FSUB(-0,X)->FNEG(X) DAGCombine when subnormals are flushed This patch disables the FSUB(-0,X)->FNEG(X) DAG combine when we're flushing subnormals. It requires updating the existing AMDGPU tests to use the fneg IR instruction, in place of the old fsub(-0,X) canonical form, since AMDGPU is the only backend currently checking the DenormalMode flags. Note that this will require follow-up optimizations to make sure the FSUB(-0,X) form is handled appropriately Differential Revision: https://reviews.llvm.org/D93243	2021-01-04 14:44:10 -06:00
Layton Kifer	d29f93bda5	[DAGCombiner] Don't create sexts of deleted xors when they were in-visit replaced Fixes a bug introduced by D91589. When folding `(sext (not i1 x)) -> (add (zext i1 x), -1)`, we try to replace the not first when possible. If we replace the not in-visit, then the now invalidated node will be returned, and subsequently we will return an invalid sext. In cases where the not is replaced in-visit we can simply return SDValue, as the not in the current sext should have already been replaced. Thanks @jgorbe, for finding the below reproducer. The following reduced test case crashes clang when built with `clang -O1 -frounding-math`: ``` template <class> class a { int b() { return c == 0.0 ? 0 : -1; } int c; }; template class a<long>; ``` A debug build of clang produces this "assertion failed" error: ``` clang: /home/jgorbe/code/llvm/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:264: void {anonymous}::DAGCombiner::AddToWorklist(llvm:: SDNode*): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed. ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93274	2020-12-23 16:16:26 -08:00
Layton Kifer	385e9a2a04	[DAGCombiner] Improve shift by select of constant Clean up a TODO, to support folding a shift of a constant by a select of constants, on targets with different shift operand sizes. Reviewed By: RKSimon, lebedev.ri Differential Revision: https://reviews.llvm.org/D90349	2020-12-18 02:21:42 +00:00
Kerry McLaughlin	05edfc5475	[SVE][CodeGen] Add DAG combines for s/zext_masked_gather This patch adds the following DAGCombines, which apply if isVectorLoadExtDesirable() returns true: - fold (and (masked_gather x)) -> (zext_masked_gather x) - fold (sext_inreg (masked_gather x)) -> (sext_masked_gather x) LowerMGATHER has also been updated to fetch the LoadExtType associated with the gather and also use this value to determine the correct masked gather opcode to use. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92230	2020-12-09 11:53:19 +00:00
Kerry McLaughlin	4519ff4b6f	[SVE][CodeGen] Add the ExtensionType flag to MGATHER Adds the ExtensionType flag, which reflects the LoadExtType of a MaskedGatherSDNode. Also updated SelectionDAGDumper::print_details so that details of the gather load (is signed, is scaled & extension type) are printed. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D91084	2020-12-09 11:19:08 +00:00
Huihui Zhang	8e6fc1f97e	[AArch64][SVE] Add lowering for llvm.maxnum\|minnum for scalable type. LLVM intrinsic llvm.maxnum\|minnum is overloaded intrinsic, can be used on any floating-point or vector of floating-point type. This patch extends current infrastructure to support scalable vector type. This patch also fix a warning message of incorrect use of EVT::getVectorNumElements() for scalable type, when DAGCombiner trying to split scalable vector. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92607	2020-12-08 09:35:53 -08:00
Kai Luo	44bd8ea167	[DAGCombine][PowerPC] Simplify nabs by using legal `smin` operation Convert `0 - abs(x)` to `smin (x, -x)` if `smin` is a legal operation. Verification: https://alive2.llvm.org/ce/z/vpquFR Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92637	2020-12-08 03:24:07 +00:00
Simon Pilgrim	b6e847c396	[DAG] Cleanup by folding some single use VT.getScalarSizeInBits() calls into its comparison. NFCI.	2020-12-07 18:23:54 +00:00
Kerry McLaughlin	111f559bbd	[SVE][CodeGen] Call refineIndexType & refineUniformBase from visitMGATHER The refineIndexType & refineUniformBase functions added by D90942 can also be used to improve CodeGen of masked gathers. These changes were split out from D91092 Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D92319	2020-12-07 13:20:19 +00:00
Bing1 Yu	eee30a6dce	[CodeGen] Modify the refineIndexType(...)'s code to fix a bug in D90942. In previous code, when refineIndexType(...) is called and Index is undef, Index.getOperand(0) will raise a assertion fail. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D92548	2020-12-07 08:49:07 +08:00
Layton Kifer	ac522f8700	[DAGCombiner] Fold (sext (not i1 x)) -> (add (zext i1 x), -1) Move fold of (sext (not i1 x)) -> (add (zext i1 x), -1) from X86 to DAGCombiner to improve codegen on other targets. Differential Revision: https://reviews.llvm.org/D91589	2020-12-06 11:52:10 -05:00
Simon Pilgrim	6f4ee6f870	[DAGCombiner] Use const APInt& for getConstantOperandAPInt results. NFCI. Avoid unnecessary instantiation. Noticed while removing unnecessary autos	2020-12-04 09:44:58 +00:00
dfukalov	2ce38b3f03	[NFC] Reduce include files dependency. 1. Removed #include "...AliasAnalysis.h" in other headers and modules. 2. Cleaned up includes in AliasAnalysis.h. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92489	2020-12-03 18:25:05 +03:00
Joe Ellis	78c0ea54a2	[DAGCombine] Fix TypeSize warning in DAGCombine::visitLIFETIME_END Bail out early if we encounter a scalable store. Reviewed By: peterwaller-arm Differential Revision: https://reviews.llvm.org/D92392	2020-12-03 12:12:41 +00:00
Layton Kifer	d7fec38f05	[DAGCombiner][NFC] Replace duplicate implementation flipBoolean with DAG.getLogicalNOT Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D92246	2020-12-01 22:23:04 +03:00
Benjamin Kramer	107e92dff8	[DAG] Remove unused variable. NFC.	2020-12-01 16:29:02 +01:00
Simon Pilgrim	1b209ff9e3	[DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111) Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.	2020-12-01 14:25:29 +00:00
Simon Pilgrim	6dbd0d36a1	[DAG] Move vselect(icmp_ult, -1, add(x,y)) -> uaddsat(x,y) to DAGCombine (PR40111) Move the X86 VSELECT->UADDSAT fold to DAGCombiner - there's nothing target specific about these folds. The SSE42 test diffs are relatively benign - its avoiding an extra constant load in exchange for an extra xor operation - there are extra register moves, which is annoying as all those operations should commute them away. Differential Revision: https://reviews.llvm.org/D91876	2020-12-01 11:56:26 +00:00
QingShan Zhang	4d83aba422	[DAGCombine] Adding a hook to improve the precision of fsqrt if the input is denormal For now, we will hardcode the result as 0.0 if the input is denormal or 0. That will have the impact the precision. As the fsqrt added belong to the cold path of the cmp+branch, it won't impact the performance for normal inputs for PowerPC, but improve the precision if the input is denormal. Reviewed By: Spatel Differential Revision: https://reviews.llvm.org/D80974	2020-11-27 02:10:55 +00:00
QingShan Zhang	9c588f53fc	[DAGCombine] Add hook to allow target specific test for sqrt input PowerPC has instruction ftsqrt/xstsqrtdp etc to do the input test for software square root. LLVM now tests it with smallest normalized value using abs + setcc. We should add hook to target that has test instructions. Reviewed By: Spatel, Chen Zheng, Qiu Chao Fang Differential Revision: https://reviews.llvm.org/D80706	2020-11-25 05:37:15 +00:00
Kai Luo	5931be60b5	[DAGCombine][PowerPC] Convert negated abs to trivial arithmetic ops This patch converts `0 - abs(x)` to `Y = sra (X, size(X)-1); sub (Y, xor (X, Y))` for better codegen. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D91120	2020-11-24 09:43:35 +00:00
Kerry McLaughlin	306c8ab208	[SVE][CodeGen] Improve codegen of scalable masked scatters If the scatter store is able to perform the sign/zero extend of its index, this is folded into the instruction with refineIndexType(). Additionally, refineUniformBase() will return the base pointer and index from an add + splat_vector. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D90942	2020-11-13 11:19:36 +00:00
Francesco Petrogalli	fc2fe6817e	[llvm][AArch64] Simplify (and (sign_extend..) #bitmask). Fold VT = (and (sign_extend NarrowVT to VT) #bitmask) into VT = (zero_extend NarrowVT) With this combine, the test replaces a sign extended load + an unsigned extention with a zero extended load to render one of the operands of the last multiplication. BEFORE \| AFTER f_i16_i32: \| f_i16_i32: .fnstart \| .fnstart ldrsh r0, [r0] \| ldrh r1, [r1] ldrsh r1, [r1] \| ldrsh r0, [r0] smulbb r0, r1, r0 \| smulbb r0, r0, r1 uxth r1, r1 \| mul r0, r0, r1 mul r0, r0, r1 \| bx lr bx lr \| Reviewed By: resistor Differential Revision: https://reviews.llvm.org/D90605	2020-11-09 12:53:36 +00:00
Fraser Cormack	f99580c1e5	[DAGCombine] Fix bug in load scalarization Summary: For vector element types which are not byte-sized, we would generate incorrect scalar offsets and produce incorrect codegen. This optimization could potentially be supported in the future, e.g. by loading in bytes, then shifting and masking out the remaining bits of the vector element. However, without an upstream target to test against it's best to avoid the bad codegen in the simplest possible way. Related to this bug: https://bugs.llvm.org/show_bug.cgi?id=27600 Reviewed by: foad Differential Revision: https://reviews.llvm.org/D78568	2020-11-04 19:02:40 +00:00
Kerry McLaughlin	f2412d372d	[SVE][CodeGen] Lower scalable integer vector reductions This patch uses the existing LowerFixedLengthReductionToSVE function to also lower scalable vector reductions. A separate function has been added to lower VECREDUCE_AND & VECREDUCE_OR operations with predicate types using ptest. Lowering scalable floating-point reductions will be addressed in a follow up patch, for now these will hit the assertion added to expandVecReduce() in TargetLowering. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D89382	2020-11-04 11:38:49 +00:00
Simon Pilgrim	f53d7f55f1	[DAG] Move canFoldInAddressingMode before foldBinOpIntoSelect. NFC. Reduces the diff in D90113.	2020-10-28 12:16:05 +00:00
Peter Waller	5b742a0c10	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in sve-redundant-store.ll that the warning is not triggered. Differential Revision: https://reviews.llvm.org/D89701	2020-10-26 16:37:48 +00:00
Peter Waller	6536d6040f	Revert "[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination" This reverts commit 4604441386dc5fcd3165f4b39f5fa2e2c600f1bc. Reverting because it was not the intended version of the patch, which follows this patch.	2020-10-26 16:37:00 +00:00
Peter Waller	4604441386	[SVE][CodeGen][DAGCombiner] Fix TypeSize warning in redundant store elimination The modified code in visitSTORE was missing a scalable vector check, and still using the now deprecated implicit cast of TypeSize to uint64_t through the overloaded operator. This patch fixes these issues. This brings the logic in line with the comment on the context line immediately above the added precondition. Add a test in Redundantstores.ll that the warning is not triggered.	2020-10-26 16:23:42 +00:00
Qiu Chaofan	1b2fe71ecf	[DAGCombiner] Tighten reasscociation of visitFMA From LangRef, FMF contract should not enable reassociating to form arbitrary contractions. So it should not help rearrange nodes like (fma (fmul x, c1), c2, y) into (fma x, c1*c2, y). Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D89527	2020-10-20 10:13:01 +08:00
Amy Kwan	6a946fd06f	[DAGCombiner][PowerPC] Remove isMulhCheaperThanMulShift TLI hook, Use isOperationLegalOrCustom directly instead. MULH is often expanded on targets. This patch removes the isMulhCheaperThanMulShift hook and uses isOperationLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D80485	2020-10-19 12:23:04 -05:00
David Sherwood	3945b69e81	[SVE][CodeGen] Replace more TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. This patch changes a few functions that were always expecting to work on scalar or fixed width types: 1. DAGCombiner::mergeTruncStores - deals with scalar integers only. 2. DAGCombiner::ReduceLoadWidth - not valid for vectors. 3. DAGCombiner::createBuildVecShuffle - should only be used for fixed width vectors. 4. SelectionDAGLegalize::ExpandFCOPYSIGN and SelectionDAGLegalize::getSignAsIntValue - only work on scalars. Differential Revision: https://reviews.llvm.org/D88562	2020-10-19 08:38:50 +01:00
David Sherwood	35a531fb45	[SVE][CodeGen][NFC] Replace TypeSize comparison operators with their scalar equivalents In certain places in llvm/lib/CodeGen we were relying upon the TypeSize comparison operators when in fact the code was only ever expecting either scalar values or fixed width vectors. I've changed some of these places to use the equivalent scalar operator. Differential Revision: https://reviews.llvm.org/D88482	2020-10-19 08:30:31 +01:00
David Sherwood	f693f915a0	[SVE][CodeGen] Replace uses of TypeSize comparison operators In certain places in the code we can never end up in a situation where we're mixing fixed width and scalable vector types. For example, we can't have truncations and extends that change the lane count. Also, in other places such as GenWidenVectorStores and GenWidenVectorLoads we know from the behaviour of FindMemType that we can never choose a vector type with a different scalable property. In various places I have used EVT::bitsXY functions instead of TypeSize::isKnownXY, where it probably makes sense to keep an assert that scalable properties match. Differential Revision: https://reviews.llvm.org/D88654	2020-10-19 08:08:41 +01:00
Craig Topper	1687a8d83b	[X86][SelectionDAG] Add SADDO_CARRY and SSUBO_CARRY to support multipart signed add/sub overflow legalization. This passes existing X86 test but I'm not sure if it handles all type legalization cases it needs to. Alternative to D89200 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D89222	2020-10-12 23:18:29 -07:00
David Sherwood	c5ba0d33cc	[SVE] Make ElementCount and TypeSize use a new PolySize class I have introduced a new template PolySize class, where the template parameter determines the type of quantity, i.e. for an element count this is just an unsigned value. The ElementCount class is now just a simple derivation of PolySize<unsigned>, whereas TypeSize is more complicated because it still needs to contain the uint64_t cast operator, since there are still many places in the code that rely upon this implicit cast. As such the class also still needs some of it's own operators. I've tried to minimise the amount of code in the base PolySize class, which led to a couple of changes: 1. In some places we were relying on '==' operator comparisons between ElementCounts and the scalar value 1. I didn't put this operator in the new PolySize class, and thought it was actually clearer to use the isScalar() function instead. 2. I removed the isByteSized function and replaced it with calls to isKnownMultipleOf(8). I've also renamed NextPowerOf2 to be coefficientNextPowerOf2 so that it's more consistent with coefficientDivideBy. Differential Revision: https://reviews.llvm.org/D88409	2020-10-12 08:23:38 +01:00
Esme-Yi	e9fd8823ba	[DAGCombiner] Add decomposition patterns for Mul-by-Imm. Summary: This patch is derived from D87384. In this patch we expand the existing decomposition of mul-by-constant to be more general by implementing 2 patterns: ``` mul x, (2^N + 2^M) --> (add (shl x, N), (shl x, M)) mul x, (2^N - 2^M) --> (sub (shl x, N), (shl x, M)) ``` The conversion will be trigged if the multiplier is a big constant that the target can't use a single multiplication instruction to handle. This is controlled by the hook `decomposeMulByConstant`. More over, the conversion benefits from an ILP improvement since the instructions are independent. A case with the sequence like following also gets benefit since a shift instruction is saved. ``` res1 = a 0x8800; res2 = a 0x8080; ``` Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D88201	2020-10-09 08:51:40 +00:00
David Sherwood	4ed47d50ea	[SVE][CodeGen] Fix DAGCombiner::ForwardStoreValueToDirectLoad for scalable vectors In DAGCombiner::ForwardStoreValueToDirectLoad I have fixed up some implicit casts from TypeSize -> uint64_t and replaced calls to getVectorNumElements() with getVectorElementCount(). There are some simple cases of forwarding that we can definitely support for scalable vectors, i.e. when the store and load are both scalable vectors and have the same size. I have added tests for the new code paths here: CodeGen/AArch64/sve-forward-st-to-ld.ll Differential Revision: https://reviews.llvm.org/D87098	2020-10-06 08:04:03 +01:00
Craig Topper	1127662c6d	[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS. getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter when can ensure any calls to getNode/getSetcc during canonicalization will also get the flags. Differential Revision: https://reviews.llvm.org/D88063	2020-10-05 14:55:23 -07:00
Sanjay Patel	2ccbf3dbd5	[SDAG] fold x * 0.0 at node creation time In the motivating case from https://llvm.org/PR47517 we create a node that does not get constant folded before getNegatedExpression is attempted from some other node, and we crash. By moving the fold into SelectionDAG::simplifyFPBinop(), we get the constant fold sooner and avoid the problem.	2020-10-04 11:31:57 -04:00

1 2 3 4 5 ...

2963 Commits