llvm-project

Author	SHA1	Message	Date
Pranav Kant	6f305e0658	[DAGCombiner] Limit graph traversal to cap compile times The hasPredecessorHelper method, which DAGCombiner uses to combine to pre-indexed and post-indexed load/stores, is a major source of slowdown while compiling a large function with MSan enabled on Arm. This patch caps the DFS graph traversal for this method at 8192, which cuts compile time by 50% (4m -> 2m) at the cost of fewer nodes combined overall. Here's the summary of pre-index DAG nodes created and the time it took to compile the pathological case with different MaxDepth limits: 1. With MaxDepth = 0 (unlimited): 1800, took 4m 2. With MaxDepth = 32k: 560, took 2m31s 3. With MaxDepth = 8k: 139, took 2m. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D154885	2023-07-26 17:29:38 +00:00
Jay Foad	6fcad9cf93	[DAGCombiner] Simplify foldAndOrOfSETCC. NFC. Pull out repeated hasOneUse checks. Simplify some conditions. Reduce indentation. Differential Revision: https://reviews.llvm.org/D156220	2023-07-26 10:22:55 +01:00
Craig Topper	1f5a1b8952	[DAGCombiner] Minor improvements to foldAndOrOfSETCC. NFC Reduce the scope of some variables. Replace an if with an assertion. Reviewed By: kmitropoulou Differential Revision: https://reviews.llvm.org/D156140	2023-07-25 00:20:06 -07:00
WANG Rui	595d5f36f4	[DAGCombine] Canonicalize operands for visitANDLike During the construction of SelectionDAG, there are no explicit canonicalization rules to adjust the order of operands for AND nodes. This may prevent the optimization in DAGCombiner::visitANDLike from being triggered. This patch canonicalizes the operands before matching, which can be observed to improve optimization on the RISC-V target architecture. Canonicalize: ``` and(x, add) -> and(add, x) ``` Signed-off-by: WANG Rui <wangrui@loongson.cn> Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154760	2023-07-24 16:52:04 +08:00
Amaury Séchet	88452508f3	[DAG] Improve carry reconstruction in combineCarryDiamond. The gain is usually sufficient to go the extra mile and reconstruct a carry in some cases. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D154533	2023-07-22 22:49:48 +00:00
Simon Pilgrim	697f60598e	[DAG] hoistLogicOpWithSameOpcodeHands - ensure SIGN_EXTEND_INREG nodes have the same extension value type Fix bug in the check for matching SIGN_EXTEND_INREG types	2023-07-20 10:44:46 +01:00
Simon Pilgrim	98b0f1360d	[DAG] hoistLogicOpWithSameOpcodeHands - add support for SIGN_EXTEND_INREG nodes. This can reuse the existing *_EXTEND node handling (with special handling for the valuetype arg)	2023-07-19 11:56:32 +01:00
Simon Pilgrim	2167ae93c9	[DAG] hoistLogicOpWithSameOpcodeHands - add support for _EXTEND_VECTOR_INREG nodes. This can reuse the existing _EXTEND node handling.	2023-07-19 10:50:23 +01:00
Simon Pilgrim	3ad4f92f83	[DAG] More aggressively fold (extract_vector_elt (build_vector x, y), c) iff element is zero constant We currently don't extract vector elements from multi-use build vectors unless TLI.aggressivelyPreferBuildVectorSources accepts them, which seems a little extreme for constant build vectors (especially as in some cases ComputeKnownBits will indirectly extract the data for us). This is causing a few regressions in some upcoming SimplifyDemandedBits work I'm looking at, all of which just need to know that the element is zero, so I've tweaked the fold to accept zero elements as well, which will typically fold very easily. Differential Revision: https://reviews.llvm.org/D155582	2023-07-18 17:31:34 +01:00
Konstantina Mitropoulou	4c42ab1199	[DAGCombiner] Change foldAndOrOfSETCC() to optimize and/or patterns CMP(A,C)||CMP(B,C) => CMP(MIN/MAX(A,B), C) CMP(A,C)&&CMP(B,C) => CMP(MIN/MAX(A,B), C) This first patch handles integer types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153502	2023-07-17 17:13:47 -07:00
Matt Arsenault	296e24cd2e	DAG: Constant fold frexp nodes Special casing the nonfinite exponent value everywhere is kind of annoying.	2023-07-17 17:34:29 -04:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Noah Goldstein	74f0ec5e24	[DAGCombiner] Make it so that `udiv` can be folded with `(select c, NonZero, 1)` This is done by allowing speculation of `udiv` if we can prove the denominator is non-zero. https://alive2.llvm.org/ce/z/VNCt_q Differential Revision: https://reviews.llvm.org/D149198	2023-07-12 17:17:53 -05:00
Ivan Kosarev	15e7749e19	[Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154528	2023-07-12 11:55:19 +01:00
Amaury Séchet	ee2d10cd16	[NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each other.	2023-07-04 14:55:11 +00:00
Simon Pilgrim	4742715eb7	[DAG] Fold (ext (_extend_vector_inreg x)) -> (*_extend_vector_inreg x)	2023-06-30 14:42:49 +01:00
David Green	14f54a594e	[DAG][AArch64] Fold shuffle_vector<4,5,6,7> to extract_subvector During legalization, we can end up with shuffles that are identity masks, so act like extract_subvector, but do not simplify to extract_subvector. This adjusts the profitability heuristic in foldExtractSubvectorFromShuffleVector to allow identity vectors that do not start at element 0. Undef mask elements are excluded as it can be more useful to keep the undef elements. Differential Revision: https://reviews.llvm.org/D153504	2023-06-30 11:13:39 +01:00
Luke Lau	742fb8b5c7	[DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x) If we have a store of a load with no other uses in between, it's considered dead and is removed. So sometimes when legalizing a fixed length vector store of an insert, we end up producing better code through scalarization than without. An example is the following: %a = load <4 x i64>, ptr %x %b = insertelement <4 x i64> %a, i64 %y, i32 2 store <4 x i64> %b, ptr %x If this is scalarized, then DAGCombine successfully removes 3 of the 4 stores which are considered dead, and on RISC-V we get: sd a1, 16(a0) However, if we make the vector type legal (-mattr=+v), then we lose the optimisation because we don't scalarize it. This patch attempts to recover the optimisation for vectors by identifying patterns where we store a load with a single insert in between, replacing it with a scalar store of the inserted element. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152276	2023-06-28 22:45:04 +01:00
FLZ101	32e4013dd4	[AArch64][SelectionDAG] fix infinite loop caused by legalizing & combining CONCAT_VECTORS Legalizing in `AArch64TargetLowering::LowerCONCAT_VECTORS()` and combining in `DAGCombiner::visitCONCAT_VECTORS()` could cause an infinite loop. This commit fixes that issue by conditionally skipping the combining. Fix https://github.com/llvm/llvm-project/issues/63322 Reviewed By: RKSimon, MaskRay Differential Revision: https://reviews.llvm.org/D153316	2023-06-27 13:57:41 -07:00
Simon Pilgrim	1f006f5fb6	[DAG] mergeTruncStores - early out if we collect more than the maximum number of stores If we have an excessive number of stores in a single chain then the candidate WideVT may exceed the maximum width of an EVT integer type (and will assert) - but since mergeTruncStores doesn't support anything wider than an i64 store we should just early-out if we've collected more stores than that. Fixes #63306	2023-06-23 16:22:11 +01:00
David Green	589c940eb3	[DAG] Fix and expand fmin/fmax reassociation fold. This call to reassociateReduction is used by both fminnum/fmaxnum and fminimum/fmaximum. In adding support for fminimum/fmaximum we appear to be fixing the use of an incorrect reduction type, which should have only applied to minnum/maxnum. I also believe that it doesn't need nsz and reassoc to perform the reassociation. For float min/max it should always be valid. Differential Revision: https://reviews.llvm.org/D153247	2023-06-23 14:45:14 +01:00
Amaury Séchet	34d8c5b9ce	[DAG] Peek through trunc when combining select into shifts. This fixes a regression in D127115 Depends on D127115 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D151916	2023-06-23 00:35:39 +00:00
Simon Pilgrim	43ad2e9c8b	[DAG] Add getExtOrTrunc helper. NFC. Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.	2023-06-20 16:03:18 +01:00
Simon Pilgrim	ff23856c1c	[DAG] Fold (abds x, y) -> (abdu x, y) iff both args are known positive This is a generic DAG combine version of D151055 which recognizes when a signed ABDS can be safely replaced with an unsigned ABDU instruction if it is legal. Alive2: https://alive2.llvm.org/ce/z/pb5BjG Differential Revision: https://reviews.llvm.org/D153328	2023-06-20 15:31:22 +01:00
Jeffrey Byrnes	7972a6e126	[DAGCombiner][NFC] Factor out ByteProvider Differential Revision: https://reviews.llvm.org/D143018 Change-Id: I3dc03787a3382c0c3fe6b869f869c2946f450874	2023-06-19 08:54:34 -07:00
Craig Topper	7163539466	[DAGCombiner] When combining (sext_inreg (zext X), VT) -> (sext X) don't pass along the sext_inreg VT. ISD::SIGN_EXTEND is only supposed to have one operand, but we were creating it with 2 operands. Since we basically never check for extra operands this went unnoticed.	2023-06-15 11:47:42 -07:00
Amara Emerson	f79b0333fc	[DAGCombiner] Fix crash when trying to replace an indexed store with a narrow store. rdar://108818859 Differential Revision: https://reviews.llvm.org/D152978	2023-06-15 01:54:38 -07:00
Anna Thomas	26bfbec5d2	[Intrinsic] Introduce reduction intrinsics for minimum/maximum This patch introduces the reduction intrinsic for floating point minimum and maximum which has the same semantics (for NaN and signed zero) as llvm.minimum and llvm.maximum. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D152370	2023-06-13 12:29:58 -04:00
David Green	14914fb157	[DAG][NFC] Update comment on min/max reduction fold. As pointed out in D141870, this one was incorrectly referencing and.	2023-06-13 17:09:22 +01:00
Amaury Séchet	a70d5e25f3	[DAGCombine] Make sure combined nodes are added back to the worklist in topological order. Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D127115	2023-06-13 09:14:37 +00:00
Nikita Popov	5c6ff3a602	[DAGCombine] Move setcc of freeze fold to brcond This fold goes against the usual approach of pushing freeze into operands. The idea behind the fold is that if the setcc feeds into a brcond, the freeze can be dropped entirely. Move the fold to brcond, where we can remove the freeze directly. This ensures that there can be no infinite combine loops due to conflicting transforms. Differential Revision: https://reviews.llvm.org/D152544	2023-06-12 12:01:29 +02:00
Yeting Kuo	2fe2a6d4b8	[DAGCombiner] Use generalized pattern matcher in visitFMA to support vp.fma. Note: Some patterns in visitFMA need to be refined to support splat of constant. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D152260	2023-06-08 09:40:21 +08:00
Serge Pavlov	10e7899818	[FPEnv] Get rid of extra moves in fpenv calls If intrinsic `get_fpenv` or `set_fpenv` is lowered to the form where FP environment is represented as a region in memory, extra moves can appear. For example, the code: define void @func_01(ptr %ptr) { %env = call i256 @llvm.get.fpenv.i256() store i256 %env, ptr %ptr ret void } produces the DAG: ch = get_fpenv_mem ch, memory_region val: i256, ch = load ch, memory_region ch = store ch, ptr, val In this case the extra moves can be avoided if `get_fpenv_mem` got a pointer to the memory where the FP environment should finally be placed. This change implements this optimization for this use case. Differential Revision: https://reviews.llvm.org/D150437	2023-06-06 14:54:52 +07:00
Matt Arsenault	a1422bf906	DAG: Reorder conditions	2023-06-05 18:44:17 -04:00
Amaury Séchet	7988725f65	[NFC][DAG] Move isTruncateOf so that it can be used in foldBinOpIntoSelect.	2023-06-05 15:33:59 +00:00
JP Lehr	c9998ec145	Revert "[DAGCombine] Make sure combined nodes are added back to the worklist in topological order." This reverts commit e69fa03ddd85812be3143d79a0359c3e8d43bd45. This patch led to build timeouts on the AMDGPU OpenMP runtime buildbot.	2023-06-05 10:55:58 -04:00
Amaury Séchet	e69fa03ddd	[DAGCombine] Make sure combined nodes are added back to the worklist in topological order. Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D127115	2023-06-05 11:09:18 +00:00
Jay Foad	b7052fa329	[DAGCombiner] Do not fold fadd (fmul x, y), (fmul x, y) -> fma x, y, (fmul x, y) Differential Revision: https://reviews.llvm.org/D151890	2023-06-01 16:32:24 +01:00
David Green	7740216f2e	[DAG] Combine insert(shuffle(load), load, 0) into a single load Given an insert of a scalar load into a vector shuffle with mask u,0,1,2,3,4,5,6 or 1,2,3,4,5,6,7,u (depending on the insert index), it can be more profitable to convert to a single load and avoid the shuffles. This adds a DAG combine for it, providing the new load is still fast. Differential Revision: https://reviews.llvm.org/D151029	2023-05-31 19:48:57 +01:00
Dhruv Chawla	51572c2cd7	[NFC][DAGCombiner]: Only consider nodes with no uses for pruning when forming initial worklist When the worklist is initially being formed, there is no need to consider all nodes for pruning. This is because the first time calling getNextWorklistEntry will only clear those nodes which have no uses, with their operands being added to the worklist. However, when the worklist is created for the first time all nodes are added anyways, so this operation actually ends up adding no nodes. This patch adds a parameter IsCandidateForPruning to AddToWorklist with a default value of true to avoid having to update every call site. Differential Revision: https://reviews.llvm.org/D151416	2023-05-25 19:48:30 +05:30
Amaury Séchet	87bf2bff05	[NFC][DAG] Simplify a giant expression in visitMul.	2023-05-18 18:58:07 +00:00
Philip Reames	0dc0c27989	[TLI] Add IsZero parameter to storeOfVectorConstantIsCheap [nfc] Make the decision to consider zero constant stores cheap target specific. Will be used in an upcoming change for RISCV.	2023-05-17 09:19:01 -07:00
Austin Chang	d069ac035a	[DAGCombiner] Add bswap(logic_op(bswap(x), y)) optimization This is the implementation of D149782 The patch implements a helper function that matches and folds the following cases in the DAGCombiner: 1. `bswap(logic_op(x, bswap(y))) -> logic_op(bswap(x), y)` 2. `bswap(logic_op(bswap(x), y)) -> logic_op(x, bswap(y))` 3. `bswap(logic_op(bswap(x), bswap(y))) -> logic_op(x, y)` in the multiuse case, which still reduces the number of instructions. The helper function accepts SDValue with BSWAP and BITREVERSE opcode. This patch folds the BSWAP cases and leaves the BITREVERSE optimization for the future Reviewed By: RKSimon, goldstein.w.n Differential Revision: https://reviews.llvm.org/D149783	2023-05-16 18:58:07 -05:00
Simon Pilgrim	8f82d8ee76	[DAG] visitSUBSAT - fold subsat(x,y) -> sub(x,y) if it never overflows	2023-05-06 15:55:04 +01:00
Simon Pilgrim	08c1150d4c	[DAG] Add computeOverflowForSignedSub/computeOverflowForUnsignedSub/computeOverflowForSub Match the addition variants (although computeOverflowForUnsignedSub is really just a placeholder), and use this in DAGCombiner::visitSUBO	2023-05-06 15:55:04 +01:00
Simon Pilgrim	3fb067f7ba	[DAG] visitADDSAT - fold saddsat(x,y) -> add(x,y) if it never overflows Extend existing uaddsat(x,y) fold	2023-05-06 14:18:23 +01:00
Simon Pilgrim	7395f6ae78	[DAG] Add computeOverflowForSignedAdd and computeOverflowForAdd wrapper Add basic computeOverflowForSignedAdd helper to recognise that sadd overflow can't occur if both operands have more than one sign bit. Add computeOverflowForAdd wrapper that calls computeOverflowForSignedAdd/computeOverflowForUnsignedAdd depending on the IsSigned argument, and use this in DAGCombiner::visitADDO	2023-05-06 13:33:14 +01:00
Simon Pilgrim	c7fce3f98b	[DAG] Rename computeOverflowKind -> computeOverflowForUnsignedAdd. NFC. Matches the naming convention for the equivalent ValueTracking helpers - further SelectionDAG computeOverflowFor*() helpers will be added soon.	2023-05-05 19:38:54 +01:00
Luo, Yuanke	ae1ca47bb4	[Coverity] Big parameter passed by value.	2023-05-05 09:50:38 +08:00
Craig Topper	fe9f557578	[DAGCombiner][RISCV] Enable reassociation for VP_FMA in visitFADDForFMACombine. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D149911	2023-05-04 17:20:58 -07:00
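
The traversal cap described in 6f305e0658 can be pictured with a small stand-alone sketch. This is not LLVM's code: `Node`, `hasPredecessorBounded` and `maxSteps` are hypothetical names, and the choice to answer "found" when the budget runs out is an assumption of the sketch (it models a caller that skips the combine whenever a dependency might exist).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a DAG node; this is not LLVM's SDNode.
struct Node {
  std::vector<const Node *> operands; // edges toward predecessors
};

// Returns true if `target` is reachable from `start` (inclusive) by following
// operand edges. `maxSteps == 0` means "no limit". Once `maxSteps` nodes have
// been visited, the search gives up and conservatively answers true.
bool hasPredecessorBounded(const Node *start, const Node *target,
                           std::size_t maxSteps) {
  assert(start && target);
  std::unordered_set<const Node *> visited;
  std::vector<const Node *> worklist{start};
  while (!worklist.empty()) {
    if (maxSteps != 0 && visited.size() >= maxSteps)
      return true; // budget exhausted: assume a dependency exists
    const Node *n = worklist.back();
    worklist.pop_back();
    if (!visited.insert(n).second)
      continue; // already seen through another path
    if (n == target)
      return true;
    for (const Node *op : n->operands)
      worklist.push_back(op);
  }
  return false; // whole reachable graph explored without finding target
}
```

With a small budget the function can report a dependency that does not exist, which is the trade the commit message describes: less time spent searching, fewer combines performed.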

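The and/or-of-compares rewrite from 4c42ab1199 rests on an arithmetic identity that is easy to check exhaustively on a small range. The sketch below picks signed less-than as one concrete predicate; which of MIN or MAX applies for other predicates is not spelled out in the log entry, so only this case is shown. `countMismatches` is a hypothetical helper, not part of LLVM.

```cpp
#include <algorithm>
#include <cassert>

// Counts triples (a, b, c) in [lo, hi]^3 for which either rewrite would
// change the result. Both rewrites are for signed less-than:
//   (a < c) || (b < c)  ->  min(a, b) < c
//   (a < c) && (b < c)  ->  max(a, b) < c
long countMismatches(int lo, int hi) {
  assert(lo <= hi);
  long mismatches = 0;
  for (int a = lo; a <= hi; ++a)
    for (int b = lo; b <= hi; ++b)
      for (int c = lo; c <= hi; ++c) {
        bool orOriginal = (a < c) || (b < c);
        bool andOriginal = (a < c) && (b < c);
        if (orOriginal != (std::min(a, b) < c))
          ++mismatches;
        if (andOriginal != (std::max(a, b) < c))
          ++mismatches;
      }
  return mismatches;
}
```

Calling `countMismatches(-128, 127)` walks every triple of 8-bit signed values. The rewritten form uses one compare plus one min/max instead of two compares and a logic op.
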

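The three bswap folds listed in d069ac035a hold because a byte swap only permutes bits, so it distributes over bitwise AND/OR/XOR, and applying it twice is the identity. A minimal sketch that checks the three rewrites on 32-bit values follows; `bswap32`, `LogicOp`, `applyLogic` and `bswapFoldsHold` are hypothetical names for illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Portable 32-bit byte swap.
std::uint32_t bswap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

enum class LogicOp { And, Or, Xor };

std::uint32_t applyLogic(LogicOp op, std::uint32_t x, std::uint32_t y) {
  switch (op) {
  case LogicOp::And: return x & y;
  case LogicOp::Or:  return x | y;
  case LogicOp::Xor: return x ^ y;
  }
  return 0; // not reached for valid LogicOp values
}

// Returns true if all three rewrites from the log entry preserve the value
// for the given operator and operands.
bool bswapFoldsHold(LogicOp op, std::uint32_t x, std::uint32_t y) {
  assert(bswap32(bswap32(x)) == x); // the folds rely on bswap being an involution
  bool one = bswap32(applyLogic(op, x, bswap32(y))) ==
             applyLogic(op, bswap32(x), y);
  bool two = bswap32(applyLogic(op, bswap32(x), y)) ==
             applyLogic(op, x, bswap32(y));
  bool three = bswap32(applyLogic(op, bswap32(x), bswap32(y))) ==
               applyLogic(op, x, y);
  return one && two && three;
}
```

Rewrites 1 and 2 trade two byte swaps for one; rewrite 3 removes the outer swap and both inner ones from the result's computation.
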
3616 Commits