llvm-project

Author	SHA1	Message	Date
Konstantina Mitropoulou	17fc78e7a4	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points. This reverts commit 48fa79a503a7cf380f98b6335fbd349afae1bd86. Reviewed By: brooksmoses Differential Revision: https://reviews.llvm.org/D159240	2023-08-31 11:36:50 -07:00
Luke Lau	3a4ad45a2c	[DAGCombiner] Combine trunc (splat_vector x) -> splat_vector (trunc x) From the discussion in https://reviews.llvm.org/D158853, moving the truncate into the splat helps more splatted scalar operands get selected on RISC-V, and also avoids the need for splat_vector_parts on RV32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159147	2023-08-30 15:22:57 +01:00
Simon Pilgrim	d037445f3a	[DAG] visitSHL - use FoldConstantArithmetic to fold constants in (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) fold Matches what we do in the (shl (mul x, c1), c2) -> (mul x, c1 << c2) fold as well as inside visitShiftByConstant	2023-08-29 18:52:24 +01:00
Konstantina Mitropoulou	48fa79a503	Revert "[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points." This reverts commit 5ec13535235d07eafd64058551bc495f87c283b1.	2023-08-24 20:39:04 -07:00
Konstantina Mitropoulou	5ec1353523	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns with floating points. CMP(A,C)\|\|CMP(B,C) => CMP(MIN/MAX(A,B), C) CMP(A,C)&&CMP(B,C) => CMP(MIN/MAX(A,B), C) If the operands are proven to be non NaN, then the optimization can be applied for all predicates. We can apply the optimization for the following predicates for FMINNUM/FMAXNUM (for quiet and signaling NaNs) and for FMINNUM_IEEE/FMAXNUM_IEEE if we can prove that the operands are not signaling NaNs. - ordered lt/le and \|\| - ordered gt/ge and \|\| - unordered lt/le and && - unordered gt/ge and && Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D155267	2023-08-24 10:48:56 -07:00
Simon Pilgrim	d254014fdb	[DAG] Add willNotOverflowAdd/willNotOverflowSub helper functions. Matches similar instructions on InstCombine	2023-08-24 17:52:54 +01:00
Craig Topper	2ad50f354a	[DAGCombiner][RISCV][AArch64][PowerPC] Restrict foldAndOrOfSETCC from using SMIN/SMAX where and OR/AND would do. This removes some diffs created by D153502. I'm assuming an AND/OR won't be worse than an SMIN/SMAX. For RISC-V at least, AND/OR can be a shorter encoding than SMIN/SMAX. It's weird that we have two different functions responsible for folding logic of setccs, but I'm not ready to try to untangle that. I'm unclear if the PowerPC chang is a regression or not. It looks like it might use more registers, but I don't understand PowerPC register so I'm not sure. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158292	2023-08-23 20:26:23 -07:00
Peter Rong	f58fbfc746	[X86][CodeGen] Add a dag pattern to fix #64323 After recent patch D30189, #64323's error message become a new one. When DAGCombiner was optimizing `(vextract (scalar_to_vector val, 0) -> val`, it didn't consider the possibility that the inserted value type has less bit than the dest type. This patch fixes that. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D158355	2023-08-23 10:50:32 -07:00
Simon Pilgrim	ba818c4019	[DAG] replaceStoreOfInsertLoad - don't fold if the inserted element is implicitly truncated D152276 wasn't handling the case where the inserted element is implicitly truncated into the vector - resulting in a i1 element (implicitly truncated from i8) overwriting 8 bits instead of 1 bit. This patch is intended to be merged into 17.x so I've just disallowed any vector element vs inserted element type mismatch - technically we could be more elegant and permit truncated stores (as long as the store is still byte sized), but the use cases for that are so limited I'd prefer to play it safe for now. Candidate patch for #64655 17.x merge Differential Revision: https://reviews.llvm.org/D158366	2023-08-21 11:22:07 +01:00
Jim Lin	18f5ada244	[DAGCombiner] Don't reduce BUILD_VECTOR to BITCAST before LegalizeTypes if VT is legal. Targets may lose some optimization opportunities for certain vector operation if we reduce BUILD_VECTOR to BITCAST early. And if VT is not legal, reduce BUILD_VECTOR to BITCAST before LegailizeTypes can get benefit. Because type-legalizer often scalarizes illegal type of vectors. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D156645	2023-08-19 12:53:50 +08:00
Philip Reames	92e0c0dc1a	[DAG] Restrict insert_subvector undef, splat_veector, dontcare transform On the extract_subvector side, we already have the restriction. With D158201, we'd start getting unprofitable splat combines unless we add the same one on the extract_subvector side. Differential Revision: https://reviews.llvm.org/D158202	2023-08-18 12:44:09 -07:00
Philip Reames	67b71ad04a	[DAG] Fold insert_subvector undef, (extract_subvector X, 0), 0 with non-matching types We have an existing DAG combine for when an insert/extract subvector pair is entirely a nop, but we hadn't handled the case where the net result was either an insert or an extract (but not both). The transform is restricted to index = 0 to avoid having to adjust indices after the transform. Differential Revision: https://reviews.llvm.org/D158201	2023-08-18 12:28:27 -07:00
Craig Topper	bbbb93eb48	Revert "[DAG] Fold insert_subvector undef, (extract_subvector X, 0), 0 with non-matching types" This reverts commit 770be43f6782dab84d215d01b37396d63a9c2b6e. Forgot to remove from my tree while experimenting.	2023-08-18 12:00:07 -07:00
Craig Topper	770be43f67	[DAG] Fold insert_subvector undef, (extract_subvector X, 0), 0 with non-matching types We have an existing DAG combine for when an insert/extract subvector pair is entirely a nop, but we hadn't handled the case where the net result was either an insert or an extract (but not both). The transform is restricted to index = 0 to avoid having to adjust indices after the transform. Reviews, a couple comments on the test changes: * Mostly RISCV, mostly schedule reordering. * One real regression in splats-with-mixed-vl.ll due to a different overly aggressive combine, fix in a follow up patch. * The test/CodeGen/X86/vector-replicaton-i1-mask.ll diff looked concerning at first, but not the mask size at most 4 i1s. I think the type changes on the mask loads are correct, but would welcome a second opinion with someone more familiar with AVX512 codegen. Differential Revision: https://reviews.llvm.org/D158201	2023-08-18 11:59:18 -07:00
Craig Topper	846fbb06b8	[DAGCombiner][RISCV] Return SDValue(N, 0) instead of SDValue() after 2 calls to CombineTo in visitSTORE. RISC-V found a case where the CombineTo caused N to be CSEd with an existing node and then deleted. The top level DAGCombiner loop was surprised to find a node was deleted, but SDValue() was returned from the visit function. We need to return SDValue(N, 0) to tell the top level loop that a change was made, but the worklist updates were already handled. Fixes #64772. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158208	2023-08-17 15:13:36 -07:00
Simon Pilgrim	b0a77af4f1	[DAG] SimplifyDemandedBits - add sra(shl(x,c1),c1) -> sign_extend_inreg(x) demanded elts fold Move the sra(shl(x,c1),c1) -> sign_extend_inreg(x) fold inside SimplifyDemandedBits so we can recognize hidden splats with DemandedElts masks. Because the c1 shift amount has multiple uses, hidden splats won't get simplified to a splat constant buildvector - meaning the existing fold in DAGCombiner::visitSRA can't fire as it won't see a uniform shift amount. I also needed to add TLI preferSextInRegOfTruncate hook to help keep truncate(sign_extend_inreg(x)) vector patterns on X86 so we can use PACKSS more efficiently. Differential Revision: https://reviews.llvm.org/D157972	2023-08-15 16:32:03 +01:00
Craig Topper	6299650f97	[DAGCombiner] Fold trunc(undef) -> undef. We already do this in getNode, but the undef might appear during another DAGCombine. While here remove code for handling noop truncates. getNode checks the types and won't a noop truncate. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D157910	2023-08-14 13:02:24 -07:00
Philip Reames	b1ada7a1d3	[DAG] Support store merging of vector constant stores (try 2) Original commit didn't handle the case where one of the stores was a truncating store of the build_vector. The existing codepath produced wrong code (which thankfully also failed asserts) instead of guarding against unexpected types. Original commit message follows.. Ran across this when making a change to RISCV memset lowering. Seems very odd that manually merging a store into a vector prevents it from being further merged. Differential Revision: https://reviews.llvm.org/D156349	2023-08-10 08:54:05 -07:00
Philip Reames	0696a531c2	Revert "[DAG] Support store merging of vector constant stores" This reverts commit 660b740e4b3c4b23dfba36940ae0fe2ad41bfedf. Crash reported in the review thread post commit. Reverting while investigating.	2023-08-10 07:58:00 -07:00
Konstantina Mitropoulou	2c5d1b5ab7	[DAGCombiner] Reassociate the operands from (OR (OR(CMP1, CMP2)), CMP3) to (OR (OR(CMP1, CMP3)), CMP2) This happens when CMP1 and CMP3 have the same predicate (or CMP2 and CMP3 have the same predicate). This helps optimizations such as the fololowing one: CMP(A,C)\|\|CMP(B,C) => CMP(MIN/MAX(A,B), C) CMP(A,C)&&CMP(B,C) => CMP(MIN/MAX(A,B), C) Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156215	2023-08-08 20:08:01 -07:00
pvanhout	98ccc70b93	[DAG] Fix crash in replaceStoreOfInsertLoad Idx's type can be different from Ptr's, causing a "Binary operator types must match" assertion failure when emitting the MUL. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156972	2023-08-08 15:15:34 +02:00
Philip Reames	660b740e4b	[DAG] Support store merging of vector constant stores Ran across this when making a change to RISCV memset lowering. Seems very odd that manually merging a store into a vector prevents it from being further merged. Differential Revision: https://reviews.llvm.org/D156349	2023-08-02 14:41:46 -07:00
David Blaikie	4e429fd2a7	Few linter fixes size() > 0 -> !empty indentation mismatched names on parameters in decls/defs const on value return types	2023-07-31 18:52:57 +00:00
Evgenii Kudriashov	c13e310fa7	[DAGCombine] Support truncated constants for fptosi.sat combining Closes https://github.com/llvm/llvm-project/issues/56779 Reviewed By: RKSimon, dmgreen Differential Revision: https://reviews.llvm.org/D152926	2023-07-28 18:54:39 +03:00
Pranav Kant	6f305e0658	[DAGCombiner] Limit graph traversal to cap compile times hasPredecessorHelper method, that is used by DAGCombiner to combine to pre-indexed and post-indexed load/stores, is a major source of slowdown while compiling a large function with MSan enabled on Arm. This patch caps the DFS-graph traversal for this method to 8192 which cuts compile time by 50% (4m -> 2m compile time) at the cost of less overall nodes combined. Here's the summary of pre-index DAG nodes created and time it took to compile the pathological case with different MaxDepth limit: 1. With MaxDepth = 0 (unlimited): 1800, took 4m 2. With MaxDepth = 32k, 560, took 2m31s 3. With MaxDepth = 8k, 139, took 2m. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D154885	2023-07-26 17:29:38 +00:00
Jay Foad	6fcad9cf93	[DAGCombiner] Simplify foldAndOrOfSETCC. NFC. Pull out repeated hasOneUse checks. Simplify some conditions. Reduce indentation. Differential Revision: https://reviews.llvm.org/D156220	2023-07-26 10:22:55 +01:00
Craig Topper	1f5a1b8952	[DAGCombiner] Minor improvements to foldAndOrOfSETCC. NFC Reduce the scope of some variables. Replace an if with an assertion. Reviewed By: kmitropoulou Differential Revision: https://reviews.llvm.org/D156140	2023-07-25 00:20:06 -07:00
WANG Rui	595d5f36f4	[DAGCombine] Canonicalize operands for visitANDLike During the construction of SelectionDAG, there are no explicit canonicalization rules to adjust the order of operands for AND nodes. This may prevent the optimization in DAGCombiner::visitANDLike from being triggered. This patch canonicalizes the operands before matches, which can be observed to improve optimization on the RISC-V target architecture. Canonicalize: ``` and(x, add) -> and(add, x) ``` Signed-off-by: WANG Rui <wangrui@loongson.cn> Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154760	2023-07-24 16:52:04 +08:00
Amaury Séchet	88452508f3	[DAG] Improve carry reconstruction in combineCarryDiamond. The gain is usually suffiscient to go the extra mile and reconstruct a carry in some cases. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154533	2023-07-22 22:49:48 +00:00
Simon Pilgrim	697f60598e	[DAG] hoistLogicOpWithSameOpcodeHands - ensure SIGN_EXTEND_INREG nodes have the same extension value type Fix bug in the check for matching SIGN_EXTEND_INREG types	2023-07-20 10:44:46 +01:00
Simon Pilgrim	98b0f1360d	[DAG] hoistLogicOpWithSameOpcodeHands - add support for SIGN_EXTEND_INREG nodes. This can reuse the existing *_EXTEND node handling (with special handling for the valuetype arg)	2023-07-19 11:56:32 +01:00
Simon Pilgrim	2167ae93c9	[DAG] hoistLogicOpWithSameOpcodeHands - add support for _EXTEND_VECTOR_INREG nodes. This can reuse the existing _EXTEND node handling.	2023-07-19 10:50:23 +01:00
Simon Pilgrim	3ad4f92f83	[DAG] More aggressively (extract_vector_elt (build_vector x, y), c) iff element is zero constant We currently don't extract vector elements from multi-use build vectors unless TLI.aggressivelyPreferBuildVectorSources accepts them, which seems a little extreme for constant build vectors (especially as under some cases ComputeKnownBits will indirectly extract the data for us). This is causing a few regressions in some upcoming SimplifyDemandedBits work I'm looking at, all of which just need to know that the element is zero, so I've tweaked the fold to accept zero elements as well, which will typically fold very easily. Differential Revision: https://reviews.llvm.org/D155582	2023-07-18 17:31:34 +01:00
Konstantina Mitropoulou	4c42ab1199	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns CMP(A,C)\|\|CMP(B,C) => CMP(MIN/MAX(A,B), C) CMP(A,C)&&CMP(B,C) => CMP(MIN/MAX(A,B), C) This first patch handles integer types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153502	2023-07-17 17:13:47 -07:00
Matt Arsenault	296e24cd2e	DAG: Constant fold frexp nodes Special casing the nonfinite exponent value everywhere is kind of annoying.	2023-07-17 17:34:29 -04:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Noah Goldstein	74f0ec5e24	[DAGCombiner] Make it so that `udiv` can be folded with `(select c, NonZero, 1)` This is done by allowing speculation of `udiv` if we can prove the denominator is non-zero. https://alive2.llvm.org/ce/z/VNCt_q Differential Revision: https://reviews.llvm.org/D149198	2023-07-12 17:17:53 -05:00
Ivan Kosarev	15e7749e19	[Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154528	2023-07-12 11:55:19 +01:00
Amaury Séchet	ee2d10cd16	[NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each others.	2023-07-04 14:55:11 +00:00
Simon Pilgrim	4742715eb7	[DAG] Fold (ext (_extend_vector_inreg x)) -> (*_extend_vector_inreg x)	2023-06-30 14:42:49 +01:00
David Green	14f54a594e	[DAG][AArch64] Fold shuffle_vector<4,5,6,7> to extract_subvector During legalization, we can end up with shuffles that are identity masks, so act like extract_subvector, but do not simplify to extract_subvector. This adjusts the profitability heuristic in foldExtractSubvectorFromShuffleVector to allow identity vectors that do not start at element 0. Undef masks elements are excluded as it can be more useful to keep the undef elements. Differential Revision: https://reviews.llvm.org/D153504	2023-06-30 11:13:39 +01:00
Luke Lau	742fb8b5c7	[DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x) If we have a store of a load with no other uses in between it, it's considered dead and is removed. So sometimes when legalizing a fixed length vector store of an insert, we end up producing better code through scalarization than without. An example is the follow below: %a = load <4 x i64>, ptr %x %b = insertelement <4 x i64> %a, i64 %y, i32 2 store <4 x i64> %b, ptr %x If this is scalarized, then DAGCombine successfully removes 3 of the 4 stores which are considered dead, and on RISC-V we get: sd a1, 16(a0) However if we make the vector type legal (-mattr=+v), then we lose the optimisation because we don't scalarize it. This patch attempts to recover the optimisation for vectors by identifying patterns where we store a load with a single insert inbetween, replacing it with a scalar store of the inserted element. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152276	2023-06-28 22:45:04 +01:00
FLZ101	32e4013dd4	[AArch64][SelectionDAG] fix infinite loop caused by legalizing & combining CONCAT_VECTORS Legalizing in `AArch64TargetLowering::LowerCONCAT_VECTORS()` and combining in `DAGCombiner::visitCONCAT_VECTORS()` could cause an infinite loop. This commit fixes that issue by conditionally skipping the combining. Fix https://github.com/llvm/llvm-project/issues/63322 Reviewed By: RKSimon, MaskRay Differential Revision: https://reviews.llvm.org/D153316	2023-06-27 13:57:41 -07:00
Simon Pilgrim	1f006f5fb6	[DAG] mergeTruncStores - early out if we collect more than the maximum number of stores If we have an excessive number of stores in a single chain then the candidate WideVT may exceed the maximum width of an EVT integer type (and will assert) - but since mergeTruncStores doesn't support anything wider than a i64 store we should just early-out if we've collected more than stores than that. Fixes #63306	2023-06-23 16:22:11 +01:00
David Green	589c940eb3	[DAG] Fix and expand fmin/fmax reassociation fold. This call to reassociateReduction is used by both fminnum/fmaxnum and fminimum/fmaximum. In adding support for fminimum/fmaximum we appear to be fixing the use of an incorrect reduction type, which should have only applied to minnum/maxnum. I also believe that it doesn't need nsz and reassoc to perform the reassociation. For float min/max it should always be valid. Differential Revision: https://reviews.llvm.org/D153247	2023-06-23 14:45:14 +01:00
Amaury Séchet	34d8c5b9ce	[DAG] Peek through trunc when combining select into shifts. This fixes a regression in D127115 Depends on D127115 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151916	2023-06-23 00:35:39 +00:00
Simon Pilgrim	43ad2e9c8b	[DAG] Add getExtOrTrunc helper. NFC. Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.	2023-06-20 16:03:18 +01:00
Simon Pilgrim	ff23856c1c	[DAG] Fold (abds x, y) -> (abdu x, y) iff both args are known positive This is a generic DAG combine version of D151055 which recognizes when a signed ABDS can be safely replaced with a unsigned ABDU instruction if it is legal. Alive2: https://alive2.llvm.org/ce/z/pb5BjG Differential Revision: https://reviews.llvm.org/D153328	2023-06-20 15:31:22 +01:00
Jeffrey Byrnes	7972a6e126	[DAGCombiner][NFC] Factor out ByteProvider Differential Revision: https://reviews.llvm.org/D143018 Change-Id: I3dc03787a3382c0c3fe6b869f869c2946f450874	2023-06-19 08:54:34 -07:00
Craig Topper	7163539466	[DAGCombiner] When combining (sext_inreg (zext X), VT) -> (sext X) don't pass along the sext_inreg VT. ISD::SIGN_EXTEND is only supposed to have one operand, but we were creating it with 2 operands. Since we basically never check for extra operands this went unnoticed.	2023-06-15 11:47:42 -07:00

1 2 3 4 5 ...

3640 Commits