llvm-project

Author	SHA1	Message	Date
John Brawn	dd87127f4e	[DAGCombiner] Eliminate fp casts if we have the right fast math flags (#131345 ) When floating-point operations are legalized to operations of a higher precision (e.g. f16 fadd being legalized to f32 fadd) then we get narrowing then widening operations between each operation. With the appropriate fast math flags (nnan ninf contract) we can eliminate these casts.	2025-04-28 11:21:51 +01:00
Simon Pilgrim	10f6c3e270	[DAG] visitCONCAT_VECTORS - relax legality checks (#137210 ) We can fold combineConcatVectorOfConcatVectors/combineConcatVectorOfExtracts until after vector legalization	2025-04-24 19:08:06 +01:00
Simon Pilgrim	79151244d6	[DAG] narrowExtractedVectorLoad - reuse existing SDLoc. NFC (#136870 )	2025-04-23 16:50:06 +01:00
Nicholas Guy	a1f369e630	[AArch64][SVE] Add dot product lowering for PARTIAL_REDUCE_MLA node (#130933 ) Add lowering in tablegen for PARTIAL_REDUCE_U/SMLA ISD nodes. Only happens when the combine has been performed on the ISD node. Also adds in check to only do the DAG combine when the node can then eventually be lowered, so changes neon tests too. --------- Co-authored-by: James Chesterman <james.chesterman@arm.com>	2025-04-23 13:19:41 +01:00
Simon Pilgrim	a99e055030	[DAG] shouldReduceLoadWidth - add optional<unsigned> byte offset argument (#136723 ) Based off feedback for #129695 - we need to be able to determine the load offset of smaller loads when trying to determine whether a multiple use load should be split (in particular for AVX subvector extractions). This patch adds a std::optional<unsigned> ByteOffset argument to shouldReduceLoadWidth calls for where we know the constant offset to allow targets to make use of it in future patches.	2025-04-23 12:30:27 +01:00
Luke Lau	8204931038	[RISCV] Add disjoint or patterns for vwadd[u].v{v,x} (#136716 ) DAGCombiner::hoistLogicOpWithSameOpcodeHands will hoist (or disjoint (ext a), (ext b)) -> (ext (or disjoint a, b)) So this adds patterns to match vwadd[u].v{v,x} in this case. We have to teach the combine to preserve the disjoint flag.	2025-04-23 15:17:04 +08:00
Craig Topper	f6178cdad0	[SelectionDAG] Pass LoadExtType when ATOMIC_LOAD is created. (#136653 ) Rename one signature of getAtomic to getAtomicLoad and pass LoadExtType. Previously we had to set the extension type after the node was created, but we don't usually modify SDNodes once they are created. It's possible the node already existed and has been CSEd. If that happens, modifying the node may affect the other users. It's therefore safer to add the extension type at creation so that it is part of the CSE information. I don't know of any failures related to the current implementation. I only noticed that it doesn't match how we usually do things.	2025-04-22 09:11:46 -07:00
Iris	2b71269388	[SelectionDAG][X86] Fold `sub(x, mul(divrem(x,y)[0], y))` to `divrem(x, y)[1]` (#136565 ) Closes #51823.	2025-04-22 20:57:03 +08:00
Kazu Hirata	81b4fc2bed	[CodeGen] Construct SmallVector with ArrayRef (NFC) (#135930 ) Note that we can drop the call to reserve because the constructor that takes ArrayRef calls append, which in turn calls reserve.	2025-04-16 08:37:56 -07:00
Alex MacLean	1bfd444628	[DAGCombiner] Fold and/or of NaN SETCC (#135645 ) Fold an AND or OR of two NaN SETCC nodes into a single SETCC where possible. This optimization already exists in InstCombine but adding in here as well can allow for additional folding if more logical operations are exposed.	2025-04-16 06:48:42 -07:00
Craig Topper	8ed397d8e4	[DAGCombiner] Disable narrowExtractedVectorLoad for indexed loads. (#135847 ) The later code does not expect or preserve the index output. Fixes #135821	2025-04-15 14:58:15 -07:00
Piotr Fusik	e100d2bf9a	[DAGCombiner] Fold subtraction if above a constant threshold to `umin` (#135194 ) Like #134235, but with a constant. It's a pattern in Adler-32 checksum calculation in zlib. Example: unsigned adler32_mod(unsigned x) { return x >= 65521u ? x - 65521u : x; } Before, on RISC-V: lui a1, 16 lui a2, 1048560 addiw a1, a1, -16 sltu a1, a1, a0 negw a1, a1 addi a2, a2, 15 and a1, a1, a2 addw a0, a0, a1 Or, with Zicond: lui a1, 16 lui a2, 1048560 addiw a1, a1, -16 sltu a1, a1, a0 addi a2, a2, 15 czero.eqz a1, a2, a1 addw a0, a0, a1 After, with Zbb: lui a1, 1048560 addi a1, a1, 15 addw a1, a0, a1 minu a0, a1, a0	2025-04-11 15:00:40 +02:00
zhijian lin	378ac572ac	Reland "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135056 ) A new ISD::POISON SDNode is introduced to represent the poison value in the IR, replacing the previous use of ISD::UNDEF	2025-04-10 11:29:14 -04:00
Piotr Fusik	807cc3791f	[DAGCombiner] Fold subtraction if above threshold to `umin` (#134235 ) Folds patterns such as: unsigned foo(unsigned x, unsigned y) { return x >= y ? x - y : x; } Before, on RISC-V: sltu a2, a0, a1 addi a2, a2, -1 and a1, a1, a2 subw a0, a0, a1 Or, with Zicond: sltu a2, a0, a1 czero.nez a1, a1, a2 subw a0, a0, a1 After, with Zbb: subw a1, a0, a1 minu a0, a0, a1 Only applies to unsigned comparisons. If `x >= y` then `x - y` is less than or equal `x`. Otherwise, `x - y` wraps and is greater than `x`.	2025-04-10 09:08:08 +02:00
Jakub Kuderski	ef1088f703	Revert "[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR." (#135060 ) Reverts llvm/llvm-project#125883 This PR causes crashes in RISC-V codegen around f16/f64 poison values: https://github.com/llvm/llvm-project/pull/125883#issuecomment-2787048206	2025-04-09 14:40:56 -04:00
zhijian lin	8fddef8483	[SelectionDAG] Introducing a new ISD::POISON SDNode to represent the poison value in the IR. (#125883 ) A new ISD::POISON SDNode is introduced to represent the `poison value` in the IR, replacing the previous use of ISD::UNDEF.	2025-04-07 10:03:05 -04:00
Alex MacLean	ad39049ec4	[DAGCombiner] Attempt to fold 'add' nodes to funnel-shift or rotate (#125612 ) Almost all of the rotate idioms that are valid for an 'or' are also valid when the halves are combined with an 'add'. Further, many of these cases are not handled by common bits tracking meaning that the 'add' is not converted to a 'disjoint or'.	2025-04-04 15:39:24 -07:00
Simon Pilgrim	9b32f3d096	[DAG] visitEXTRACT_SUBVECTOR - don't return early on failure of EXTRACT_SUBVECTOR(INSERT_SUBVECTOR()) -> BITCAST fold (#133695 ) Always allow later folds to try to match as well.	2025-03-31 14:32:43 +01:00
Simon Pilgrim	666faa7fd9	[DAG] visitEXTRACT_SUBVECTOR - accumulate SimplifyDemandedVectorElts demanded elts across all EXTRACT_SUBVECTOR uses (REAPPLIED) (#133401 ) Similar to what is done for visitEXTRACT_VECTOR_ELT - if all uses of a vector are EXTRACT_SUBVECTOR, then determine the accumulated demanded elts across all users and call SimplifyDemandedVectorElts in "AssumeSingleUse" use. Second try after #133130 was reverted by #133331 due to it affecting reverted test files	2025-03-29 17:55:38 +00:00
Walter Lee	5b7fd708fe	Revert "[DAG] visitEXTRACT_SUBVECTOR - accumulate SimplifyDemandedVectorElts demanded elts across all EXTRACT_SUBVECTOR uses" (#133331 ) Reverts llvm/llvm-project#133130 This touches a common file as #133083, which is causing failures	2025-03-27 18:36:38 -04:00
Philip Reames	c90a536bcf	[CodeGen] Simplify code using TypeSize overloads of getMachineMemOperand [nfc] These were added in d584cea. This change runs through existing uses and simplifies where obvious.	2025-03-27 11:47:51 -07:00
Simon Pilgrim	a8575b3ea8	[DAG] visitEXTRACT_SUBVECTOR - accumulate SimplifyDemandedVectorElts demanded elts across all EXTRACT_SUBVECTOR uses (#133130 ) Similar to what is done for visitEXTRACT_VECTOR_ELT - if all uses of a vector are EXTRACT_SUBVECTOR, then determine the accumulated demanded elts across all users and call SimplifyDemandedVectorElts in "AssumeSingleUse" use.	2025-03-27 15:31:06 +00:00
LU-JOHN	2df25a4733	Invalidate range metadata when folding bitcast into load (#133095 )	2025-03-27 14:10:55 +07:00
Pierre van Houtryve	6e3c24fc0a	[DAG] Combine (sext (sext_in_reg x)) to (sext_in_reg (any_extend x)) (#132386 )	2025-03-24 09:31:02 +01:00
Mikhail R. Gadelha	f138e36d52	[SelectionDAG][RISCV] Avoid store merging across function calls (#130430 ) This patch improves DAGCombiner's handling of potential store merges by detecting function calls between loads and stores. When a function call exists in the chain between a load and its corresponding store, we avoid merging these stores if the spilling is unprofitable. We had to implement a hook on TLI, since TTI is unavailable in DAGCombine. Currently, it's only enabled for riscv. This is the DAG equivalent of PR #129258	2025-03-22 10:35:25 -03:00
Matthias Braun	e6382f2111	SelectionDAG: neg (and x, 1) --> SIGN_EXTEND_INREG x, i1 (#131239 ) The pattern ```LLVM %shl = shl i32 %x, 31 %ashr = ashr i32 %shl, 31 ``` would be combined to `SIGN_EXTEND_INREG %x, ValueType:ch:i1` by SelectionDAG. However InstCombine normalizes this pattern to: ```LLVM %and = and i32 %x, 1 %neg = sub i32 0, %and ``` This adds matching code to DAGCombiner to catch this variant as well.	2025-03-14 10:47:56 -07:00
LU-JOHN	95e186cadf	Reland "DAG: Preserve range metadata when load is narrowed" (#128144 ) (#130609 ) Changes: Add guard to ensure truncation is strictly smaller than original size. --------- Signed-off-by: John Lu <John.Lu@amd.com>	2025-03-13 12:47:03 +07:00
sommersun	848ba3854c	[DAG] fold AVGFLOORS to AVGFLOORU for non-negative operand (#84746 ) (#129678 ) Fold ISD::AVGFLOORS to ISD::AVGFLOORU for non-negative operand. Cover test is modified for uhadd with zero extension. Fixes #84746	2025-03-10 13:01:08 +00:00
Paul Walker	a537724069	[LLVM][DAGCombine] Remove combiner-vector-fcopysign-extend-round. (#129878 ) This option was added to improve test coverage for SVE lowering code that is impossible to reach otherwise. Given it is not possible to trigger a bug without it and the generated code is universally worse with it, I figure the option has no value and should be removed.	2025-03-05 15:31:34 +00:00
James Chesterman	e3c8e17b07	Reland "[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083 )" This relands commit 7a06681398a33d53ba6d661777be8b4c1d19acb7.	2025-03-04 11:09:33 +00:00
Kazu Hirata	7a06681398	Revert "[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083 )" This reverts commit 2bef21f24ba932a757a644470358c340f4bcd113. Multiple builtbot failures have been reported: https://github.com/llvm/llvm-project/pull/127083	2025-03-04 01:44:09 -08:00
James Chesterman	2bef21f24b	[DAGCombiner] Add generic DAG combine for ISD::PARTIAL_REDUCE_MLA (#127083 ) Add generic DAG combine for ISD::PARTIAL_REDUCE_U/SMLA nodes. Transforms the DAG from: PARTIAL_REDUCE_MLA(Acc, MUL(EXT(MulOpLHS), EXT(MulOpRHS)), Splat(1)) to PARTIAL_REDUCE_MLA(Acc, MulOpLHS, MulOpRHS).	2025-03-04 09:09:15 +00:00
Huibin Wang	59138a603f	[DAGCombiner] Cleanup MatchFunnelPosNeg by using SDPatternMatch matchers (#129482 ) Fixes issue: https://github.com/llvm/llvm-project/issues/129034	2025-03-03 14:35:38 +07:00
Yingwei Zheng	2709366f75	[DAGCombiner] Don't ignore N2's undef elements in `foldVSelectOfConstants` (#129272 ) Since N2 will be reused in the fold, we cannot skip N2's undef elements if the corresponding element in N1 is well-defined. For example: ``` t2: v4i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0> t24: v4i32 = BUILD_VECTOR undef:i32, undef:i32, Constant:i32<1>, undef:i32 t11: v4i32 = vselect t8, t2, t10 ``` Before this patch, we fold t11 into: ``` t26: v4i32 = sign_extend t8 t27: v4i32 = add t26, t24 ``` The last element of t27 is incorrect. Closes https://github.com/llvm/llvm-project/issues/129181.	2025-03-01 20:21:28 +08:00
Simon Pilgrim	bae41127e2	[DAG] replaceShuffleOfInsert - add support for shuffle_vector(scalar_to_vector(x),y) -> insert_vector_elt(y,x,c) (#127210 ) Begin extending replaceShuffleOfInsert to handle other forms of scalar insertion into a vector. I've limited this to targets that have Custom/Legal ISD::INSERT_VECTOR_ELT handling for now - although we can probably always fold this before LegalOperations.	2025-02-27 08:41:58 +00:00
Daniel Thornburgh	02128342d2	Revert "DAG: Preserve range metadata when load is narrowed" (#128948 ) Reverts llvm/llvm-project#128144 Breaks clang prod x64 build (seen in Fuchsia toolchain)	2025-02-26 14:14:55 -08:00
LU-JOHN	d8bcb53780	DAG: Preserve range metadata when load is narrowed (#128144 ) In DAGCombiner.cpp preserve range metadata when load is narrowed to load LSBs if original range metadata bounds can fit in the narrower type. Utilize preserved range metadata to reduce 64-bit shl to 32-bit shl. --------- Signed-off-by: John Lu <John.Lu@amd.com>	2025-02-26 15:40:49 +07:00
Vikash Gupta	352c48f278	[SelectionDAG] Utilizing target hook convertSelectOfConstantsToMath for SelectwithConstant (#127599 ) The Target hook convertSelectOfConstantsToMath() needs to be used within SimplifySelectCC helper combine function in SelectionDAG Isel, where generic select folding with constants is happening into simple maths op using the condition as it is. It necessarily fixes #121145.	2025-02-25 20:32:24 +05:30
Yingwei Zheng	44d1dbd24c	[X86][DAGCombiner] Skip x87 fp80 values in `combineFMulOrFDivWithIntPow2` (#128618 ) f80 is not a valid IEEE floating-point type. Closes https://github.com/llvm/llvm-project/issues/128528.	2025-02-25 22:03:17 +08:00
Yingwei Zheng	646e4f2eed	[DAGCombiner] visitFREEZE: Early exit when N is deleted (#128161 ) `N` may get merged with existing nodes inside the loop. Early exit when it is deleted to avoid the crash. Alternative solution: use `DAGNodeDeletedListener` to refresh the value of N. Closes https://github.com/llvm/llvm-project/issues/128143.	2025-02-22 12:06:34 +08:00
Piotr Fusik	8b58cb853a	[SelectionDAG][NFC] Refactor duplicate code into SDNode::bitcastToAPInt() (#127503 )	2025-02-20 13:23:00 +07:00
zhijian lin	1ac0db44fd	[NFC] using isUndef() instead of getOpcode() == ISD::UNDEF (#127713 ) [NFC] using isUndef() instead of getOpcode() == ISD::UNDEF	2025-02-19 08:42:38 -05:00
Craig Topper	ef9f0b3c41	[DAGCombiner] Don't peek through truncates of shift amounts in takeInexpensiveLog2. (#126957 ) Shift amounts in SelectionDAG don't have to match the result type of the shift. SelectionDAGBuilder will aggressively truncate shift amounts to the target's preferred type. This may result in a zero extend that existed in IR being removed. If we look through a truncate here, we can't guarantee the upper bits of the truncate input are zero. There may have been a zext that was removed. Unfortunately, this regresses tests where no truncate was involved. The only way I can think to fix this is to add an assertzext when SelectionDAGBuilder truncates a shift amount or remove the early truncation of shift amounts from SelectionDAGBuilder all together. Fixes #126889.	2025-02-17 20:26:05 -08:00
Jim Lin	31bfae35d2	[DAGCombiner] Add hasOneUse checks for folding (not (add X, -1)) to (neg X) (#126667 ) To get more better codegen for AArch with bic, x86 with andn and riscv with andn.	2025-02-12 12:24:29 +08:00
Matt Arsenault	c268a3f093	DAG: Fix extract of load combine with mismatched vector element type Fix the case where the vector element type of the loaded extractelement input does not match the result type of the extract. This fixes a regression reported after c55a7659b38946350315ac4a18d9805deb1f0a54	2025-02-06 22:56:56 +07:00
Matt Arsenault	c55a7659b3	DAG: Move scalarizeExtractedVectorLoad to TargetLowering (#122670 ) SimplifyDemandedVectorElts should be able to use this on loads	2025-02-04 17:37:12 +07:00
Craig Topper	788bbd2ef6	[DAGCombiner] Improve chain handling in fold (fshl ld1, ld0, c) -> (ld0[ofs]) combine. (#124871 ) Happened to notice some odd things related to chains in this code. The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform. The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later. I'm having trouble getting a test for any ordering issue with the current code. areNonVolatileConsecutiveLoads requires the two loads to have the same input chain. Given that, I don't know how to use one of the load chain results without also using the other. If they are both used we don't do the transform because SDNode::hasOneUse will return false for both.	2025-02-03 11:48:41 -08:00
Matt Arsenault	97a1f494a6	DAG: Avoid breaking legal vector_shuffle with multiple uses (#123712 ) Previously this combine would undo AMDGPU's new custom legalization of wide vector shuffles into 2 element pieces. The comment also states that this combine is only done before legalization, but the case with a build_vector source was unconditional. We probably don't want to do this if the multiple uses are full scalarization of the vector, but this seems to work well enough. Scalarizing extracts should have folded out pre-legalize.	2025-01-30 10:55:21 +07:00
Benjamin Maxwell	778138114e	[SDAG] Use BatchAAResults for querying alias analysis (AA) results (#123934 ) Once we get to SelectionDAG the IR should not be changing anymore, so we can use BatchAAResults rather than AAResults to cache AA queries. This should be a NFC change for targets that enable AA during codegen (such as AArch64), but also give a nice compile-time improvement in some cases. See: https://github.com/llvm/llvm-project/pull/123787#issuecomment-2606797041 Note: This follows Nikita's suggestion on #123787.	2025-01-23 09:16:09 +00:00
Matt Arsenault	5e79ae60a6	DAG: Fix vector_shuffle -> splat fold defining undef lanes (#123596 ) For shuffle vector splats with undef lanes in the mask, this was introducing real values. Filter out build_vector results based on the undef elements in the mask. This avoids AMDGPU test regressions in a future change. test/CodeGen/X86/urem-seteq-illegal-types.ll looks worse but I didn't investigate.	2025-01-21 23:55:50 +07:00

1 2 3 4 5 ...

4010 Commits