Rename one signature of getAtomic to getAtomicLoad and pass LoadExtType.
Previously we had to set the extension type after the node was created,
but we don't usually modify SDNodes once they are created. It's possible
the node already existed and has been CSEd. If that happens, modifying
the node may affect the other users. It's therefore safer to add the
extension type at creation so that it is part of the CSE information.
I don't know of any failures related to the current implementation. I
only noticed that it doesn't match how we usually do things.
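A minimal sketch of the before/after (the parameter order of the renamed
overload is assumed here):

```cpp
// Before: create the node, then mutate it - risky if the node was CSEd.
//   SDValue L = DAG.getAtomic(ISD::ATOMIC_LOAD, DL, MemVT, VT, Chain, Ptr, MMO);
//   cast<AtomicSDNode>(L)->setExtensionType(ISD::ZEXTLOAD);
// After: the extension type is part of the creation call, and so part
// of the CSE key.
SDValue L = DAG.getAtomicLoad(ISD::ZEXTLOAD, DL, MemVT, VT, Chain, Ptr, MMO);
```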
If an ATOMIC_LOAD has ZEXTLOAD/SEXTLOAD extension type we should trust
that over getExtendForAtomicOps().
SystemZ is the only target that uses setAtomicLoadExtAction and they
return ANY_EXTEND from getExtendForAtomicOps(). So I'm not sure there's
a way to get a contradiction currently.
Note, type legalization uses getExtendForAtomicOps() when promoting
ATOMIC_LOAD so we may not need to check getExtendForAtomicOps() for
ATOMIC_LOAD. I have not done much investigating of this.
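A sketch of what trusting the node-level extension type looks like on
the known-bits side (paraphrased, not the exact upstream code):

```cpp
// If the ATOMIC_LOAD itself is marked ZEXTLOAD, the bits above the
// memory type are zero, whatever getExtendForAtomicOps() says.
auto *ALoad = cast<AtomicSDNode>(Op.getNode());
if (ALoad->getExtensionType() == ISD::ZEXTLOAD)
  Known.Zero.setBitsFrom(ALoad->getMemoryVT().getScalarSizeInBits());
```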
This PR fixes the issue
https://github.com/llvm/llvm-project/issues/122728
This patch addresses the signed/zero extension of poison by using a
poison value of the extended type instead of a constant zero of the
extended type.
It also propagates to poison in `SDValue
SelectionDAG::getNode(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue
N1, SDValue N2, const SDNodeFlags Flags)` when one of the inputs is
poison.
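A minimal sketch of the propagation rule (opcode and helper names as I
understand them; treat them as assumptions):

```cpp
// In getNode() for a two-operand node: if either input is poison, the
// result is poison of the result type instead of a folded constant.
if (N1.getOpcode() == ISD::POISON || N2.getOpcode() == ISD::POISON)
  return getPOISON(VT);
```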
The patch also reverts the test cases
llvm/test/CodeGen/X86/pr119158.ll
llvm/test/CodeGen/X86/half.ll
which are mentioned in
https://github.com/llvm/llvm-project/pull/125883#discussion_r2021390919
---------
Co-authored-by: Amy Kwan <amy.kwan1@ibm.com>
#135597 didn't correctly fix the issue of binops with an undef element
from only one operand - only reporting the common undef elements could
incorrectly recognise splats where the (binop X, undef) fold might
actually be different - we need to ensure both operands have the same
demanded undefs for certainty.
Fixes #135917
#134602 demonstrated an issue where an AND node always had at least one demanded UNDEF element in either operand, and incorrectly reported this as an all-undef result - despite the other element being 0 (so would correctly fold to 0).
This fix only assumes a binop's splat element is undefined if both operands are undef.
Fixes #134602
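A sketch of the conservative rule both fixes converge on, in the style
of the demanded-undef tracking in isSplatValue:

```cpp
// Only report an element as undef when it is undef in *both* operands;
// an element undef on one side only may still fold to a concrete value,
// e.g. and(undef, 0) -> 0.
UndefElts = UndefLHS & UndefRHS;
```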
KnownBits passed to computeKnownBitsFromRangeMetadata must have the same
bit width as the range metadata bit width. Otherwise the calculated
results will be incorrect.
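A sketch of the invariant (shown for a load carrying !range metadata):

```cpp
// The KnownBits object must be sized to the width of the !range
// metadata's type before querying; a mismatched width gives wrong bits.
unsigned BitWidth = LD->getMemoryVT().getScalarSizeInBits();
KnownBits Known(BitWidth);
computeKnownBitsFromRangeMetadata(*Ranges, Known);
```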
---------
Signed-off-by: John Lu <John.Lu@amd.com>
From #106446, this adds a variant of getVectorIdxTy that returns an LLT.
Many uses only look at the width, so a getVectorIdxWidth was added as
the common base.
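A sketch of the resulting API shape (signatures paraphrased from the
description, so treat the exact forms as assumptions):

```cpp
// Common base: just the index width in bits.
unsigned getVectorIdxWidth(const DataLayout &DL) const;
// Existing EVT query, now derived from the width.
EVT getVectorIdxTy(const DataLayout &DL) const;
// New variant returning an LLT scalar of the same width.
LLT getVectorIdxLLT(const DataLayout &DL) const;
```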
A (two-result) node like `FMODF` or `FFREXP` may be expanded to a
library call with a prototype like `float(float, float*)` -- that is,
it returns one float directly from the call and another via an output
pointer. The first result of the node maps to the value returned by
value and the second result maps to the value returned via the output
pointer.
If only the second result is used after the expansion, we hit an issue
on x87 targets:
```
// Before expansion:
t0, t1 = fmodf x
return t1 // t0 is unused
```
After expansion:
```
ptr = alloca
ch0 = call modf ptr
t0, ch1 = copy_from_reg, ch0 // t0 unused
t1, ch2 = ldr ptr, ch1
return t1
```
So far things are alright, but the DAGCombiner optimizes this to:
```
ptr = alloca
ch0 = call modf ptr
// copy_from_reg optimized out
t1, ch1 = ldr ptr, ch0
return t1
```
On most targets this is fine. The optimized out `copy_from_reg` is
unused and is a NOP. However, x87 uses a floating-point stack, and if
the `copy_from_reg` is optimized out it won't emit a pop needed to
remove the unused result.
The prior solution for this was to attach the chain from the
`copy_from_reg` to the root, which did work, however, the root is not
always available (it's set to null during legalize types). So the
alternate solution in this patch is to replace the `copy_from_reg` with
an `X86ISD::POP_FROM_X87_REG` within the X86 call lowering. This node is
the same as `copy_from_reg` except this node makes it explicit that it
may lower to an x87 FPU stack pop. Optimizations should be more cautious
when handling this node than a normal CopyFromReg to avoid removing a
required FPU stack pop.
```
ptr = alloca
ch0 = call modf ptr
t0, ch1 = pop_from_x87_reg, ch0 // t0 unused
t1, ch2 = ldr ptr, ch1
return t1
```
Using this node ensures a required x87 FPU pop is not removed due to the
DAGCombiner.
This is an alternate solution for #127976.
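For reference, C++ source along these lines reproduces the shape of the
problem, since only the output-pointer result is live:

```cpp
#include <cmath>

float only_integral_part(float x) {
  float ipart;
  (void)std::modf(x, &ipart); // returned fractional part is dead
  return ipart;               // only the output-pointer result is used
}
```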
The hasOneUse check was failing in any case where the load was part of a chain - we should only be checking if the loaded value has one use, and any updates to the chain should be handled by the fold calling shouldReduceLoadWidth.
I've updated the x86 implementation to match, although it has no effect here yet (I'm still looking at how to improve the x86 implementation) as the inner for loop was discarding chain uses anyway.
By using SDValue::hasOneUse instead this patch exposes a missing dependency on the LLVMSelectionDAG library in a lot of tools + unittests, which resulted in having to make SDNode::hasNUsesOfValue inline.
Noticed while fighting the x86 regressions in #122671
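A sketch of the corrected check:

```cpp
// Wrong: SDNode::hasOneUse() counts chain users too, so any load that
// was part of a chain failed the check.
//   if (!Load->hasOneUse()) return false;
// Right: only ask about users of the loaded value (result 0).
if (!SDValue(Load, 0).hasOneUse())
  return false;
```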
Add signed and unsigned PARTIAL_REDUCE_MLA ISD nodes. Add a command
line argument (aarch64-enable-partial-reduce-nodes) that indicates
whether the intrinsic experimental_vector_partial_reduce_add will be
transformed into the new ISD node. Lowering with the new ISD nodes
will, for now, always be done as an expand.
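A sketch of forming the signed flavour (operand order assumed):

```cpp
// Acc: the narrower accumulator (e.g. nxv4i32); A/B: wider inputs
// (e.g. nxv16i8) that are implicitly extended, multiplied, and
// partially reduced into Acc's lanes.
SDValue MLA = DAG.getNode(ISD::PARTIAL_REDUCE_SMLA, DL,
                          Acc.getValueType(), Acc, A, B);
```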
In 5235973ee03aca4148ecabe5eff64da2af1e034e, an ICE was fixed in
getMemsetStringVal where f128 wasn't handled. It was noted at the time
[1] that the code below this also looks suspect, since it assumes the
element type of VT is either an f32 or f64.
This part of getMemsetStringVal relates to memcpy operations where the
source is a copy from a zero constant. The VT in question is determined
by TargetLowering::findOptimalMemOpLowering, which in turn calls a
further TLI hook getOptimalMemOpType.
For AArch64, getOptimalMemOpType returns either a v16i8, f128, i64, i32
or Other. For Other, TargetLowering::findOptimalMemOpLowering will then
pick an integer VT. So on AArch64 at least, I don't believe the suspect
code can be reached.
For other targets, ARM and x86 are the only ones that return a FP vector
type from getOptimalMemOpType. For both targets, the only such type is
v2f64, but given f64 is already handled it should also be fine.
To defend this, I considered adding an assert as mentioned in [1], but
given getConstantFP handles vector types, I figured using this to fully
handle the FP types makes the code simpler and more robust.
For test coverage I added unreachables to both of the branches handling
FP types in this code, but found neither fired with check-llvm across
all targets.
Test coverage was added to llvm/test/CodeGen/AArch64/memcpy-f128.ll in
5235973ee03aca4148ecabe5eff64da2af1e034e to defend ICE on f128, but at
some point it stopped hitting this code.
AArch64TargetLowering::getOptimalMemOpType was updated in
29200611055f49a0d37243caa5f8bba1df9d57a6, so I suspect this is when it
happened, although I haven't verified this. I did find that by
updating the test to disable NEON, getOptimalMemOpType returns an f128
and the branch is once again hit.
For the final branch noted as suspect in [1], as far as I can tell this
has never had any test coverage, so I've added a test to the ARM backend
for this.
Fixes: https://github.com/llvm/llvm-project/issues/20521 [1]
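A sketch of the simplification, since getConstantFP splats zero for
vector VTs and handles f128:

```cpp
// Zero memcpy source: no per-element-type casing needed.
if (VT.isFloatingPoint())
  return DAG.getConstantFP(0.0, dl, VT);
```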
A BUILD_VECTOR can implicitly shrink the bits of its operands if the
operand types are not legal. For example a v8i16 constant BUILD_VECTOR
might be represented as v8i16 BUILD_VECTOR(i32 1, i32 2, ...).
Unfortunately this means that the constants are not accepted by
matchUnaryPredicateImpl, preventing, in this case, funnel shifts from
detecting that all the operands are non-zero. Add a flag to help it
match.
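A sketch of the matcher with the new flag (flag name assumed):

```cpp
// Match "every element is non-zero" even when v8i16 elements are stored
// as implicitly-truncated i32 constants.
bool AllNonZero = ISD::matchUnaryPredicate(
    Amt, [](ConstantSDNode *C) { return !C->isZero(); },
    /*AllowUndefs=*/false, /*AllowTruncation=*/true);
```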
Once we get to SelectionDAG the IR should not be changing anymore, so we
can use BatchAAResults rather than AAResults to cache AA queries.
This should be a NFC change for targets that enable AA during codegen
(such as AArch64), but also give a nice compile-time improvement in some
cases. See:
https://github.com/llvm/llvm-project/pull/123787#issuecomment-2606797041
Note: This follows Nikita's suggestion on #123787.
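A sketch of the pattern, which is valid because the IR is frozen during
selection:

```cpp
// Construct once; repeated queries against the same locations hit the
// internal cache instead of re-running the AA pipeline.
BatchAAResults BatchAA(*AA);
bool MayAlias = BatchAA.alias(MemoryLocation::getAfter(PtrA),
                              MemoryLocation::getAfter(PtrB)) !=
                AliasResult::NoAlias;
```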
This pattern was originally spotted in 429.mcf by @topperc.
We already have a DAGCombiner pattern to turn `(neg (abs x))` into `(min
x, (neg x))`. But in some cases `(neg (max x, (neg x)))` is formed by an
expanded `abs` followed by a `neg` that is generated only after the
`abs` expansion. This patch adds a separate pattern to match cases like
this, as well as its inverse pattern: `(neg (min X, (neg X))) --> (max
X, (neg X))`.
This pattern is applicable to both signed and unsigned min/max.
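A sketch of the fold, signed flavour (`Inner`, `X` and `NegX` stand for
the matched values; the unsigned variant uses UMAX/UMIN):

```cpp
// (neg (smax X, (neg X))) --> (smin X, (neg X))
// (neg (smin X, (neg X))) --> (smax X, (neg X))
unsigned Flipped = Inner.getOpcode() == ISD::SMAX ? ISD::SMIN : ISD::SMAX;
return DAG.getNode(Flipped, DL, VT, X, NegX);
```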
With this change, targets are no longer required to put memory / strict-fp opcodes after special
`ISD::FIRST_TARGET_MEMORY_OPCODE`/`ISD::FIRST_TARGET_STRICTFP_OPCODE` markers.
This will also allow autogenerating `isTargetMemoryOpcode`/`isTargetStrictFPOpcode` (#119709).
Pull Request: https://github.com/llvm/llvm-project/pull/119969
SDNode::use_iterator now returns an SDUse& when dereferenced.
SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses
work on use_iterator. SDNode::user_begin/user_end/users work on
user_iterator.
We can now write range based for loops using SDUse& and SDNode::uses().
I've converted many of these in this patch. I didn't update loops that
have additional variables updated in their for statement.
Some loops use SDNode::use_iterator::getOperandNo() which also prevents
using range based for loops. I plan to move this into SDUse in a follow
up patch.
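For example, the two loop shapes now look like:

```cpp
// Edge-based iteration: each SDUse knows its user (and operand slot).
unsigned NumCopyUses = 0;
for (SDUse &U : N->uses())
  if (U.getUser()->getOpcode() == ISD::CopyToReg)
    ++NumCopyUses;

// User-based iteration when the edge itself doesn't matter.
bool HasStoreUser = false;
for (SDNode *User : N->users())
  HasStoreUser |= User->getOpcode() == ISD::STORE;
```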
This function is most often used in range based loops or algorithms
where the iterator is implicitly dereferenced. The dereference returns
an SDNode * of the user rather than SDUse * so users() is a better name.
I've long been annoyed that we can't write a range based loop over
SDUse when we need getOperandNo. I plan to rename use_iterator to
user_iterator and add a use_iterator that returns SDUse& on dereference.
This will make it more like IR.
This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to
check that it is safe to fold a store into a node that will expand to a
library call that takes output pointers. This requires checking for two
(independent) properties:
1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
   * If it is, the expansion would lead to nested call sequences (which
     is invalid)
2. The node does not appear as a predecessor to the store
   * If it does, attempting to merge the store into the call would
     result in a cycle in the DAG
These two properties are checked as part of the same traversal in
`canFoldStoreIntoLibCallOutputPointers()`.
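A sketch of the two checks (the chain-walking helper is illustrative,
not the real name; isPredecessorOf is the existing SDNode query):

```cpp
// 1. An unmatched CALLSEQ_START on the store's chain means we are
//    inside a call sequence; folding would nest call sequences.
bool InsideCallSeq = chainHasUnmatchedCallSeqStart(StoreNode); // illustrative
// 2. If the libcall node is already a predecessor of the store, merging
//    the store into the call's output pointers would create a cycle.
bool WouldCycle = FPNode->isPredecessorOf(StoreNode);
bool SafeToFold = !InsideCallSeq && !WouldCycle;
```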
EVTs potentially contain a Type * that points into memory owned by an
LLVMContext. Storing them in a function scoped static means they may
outlive the LLVMContext they point to.
This std::set is used to unique single element VT lists containing a
single extended EVT. Single element VT list with a simple EVT are
uniqued by a separate cache indexed by the MVT::SimpleValueType enum. VT
lists with more than one element are uniqued by a FoldingSet owned by
the SelectionDAG object.
This patch moves the single element cache into SelectionDAG so that it
will be destroyed when SelectionDAG is destroyed.
Fixes #88233
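The cache in question, after the move (shape as described above):

```cpp
class SelectionDAG {
  // Uniques single-element VT lists containing an extended EVT. Owned
  // by the DAG so it is destroyed before the LLVMContext it points into.
  std::set<EVT, EVT::compareRawBits> EVTs;
  // ... rest of SelectionDAG ...
};
```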
Assert that the passed value is a valid unsigned integer value for the
specified type.
For signed values getSignedConstant() / getSignedTargetConstant() should
be used instead.
Fix all the places I could find that didn't do this. We were already
mostly correct for FP_ROUND after
9a976f36615dbe15e76c12b22f711b2e597a8e51, but not STRICT_FP_ROUND.
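For example:

```cpp
SDValue A = DAG.getConstant(255, DL, MVT::i8);      // ok: 0xFF fits i8
SDValue B = DAG.getSignedConstant(-1, DL, MVT::i8); // ok: signed form
// DAG.getConstant(-1, DL, MVT::i8) would now trip the assert, since
// the implicit uint64_t -1 does not fit an unsigned i8.
```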
When the chain is not the entry node there is a risk the stores are
within a (CALLSEQ_START, CALLSEQ_END) pair, in which case expanding the
node would lead to nested call sequences.
It should be possible to check for this and allow more cases, but for
now, let's limit this to cases where it's definitely safe.
Fixes #115323
We already support computing known bits for extending loads, but not for
masked loads. For now I've only added support for zero-extends because
that's the only thing currently tested. Even when the passthru value is
poison we still know the top X bits are zero.
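A sketch of the zero-extend case:

```cpp
// For a zero-extending masked load, every lane's bits above the memory
// element width are zero - even lanes taken from a poison passthru.
if (MLoad->getExtensionType() == ISD::ZEXTLOAD)
  Known.Zero.setBitsFrom(MLoad->getMemoryVT().getScalarSizeInBits());
```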
This merges the logic for expanding both FFREXP and FSINCOS into one
method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication
and also allows FFREXP to benefit from the stack slot elimination
implemented for FSINCOS. This method will also be used in future to
implement more multiple-result intrinsics (such as modf and sincospi).
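A sketch of the merged entry point in use (signature paraphrased, so
treat the exact parameters as an assumption):

```cpp
SmallVector<SDValue, 2> Results;
// One path now serves FSINCOS and FFREXP (and, later, modf/sincospi).
if (DAG.expandMultipleResultFPLibCall(RTLIB::FREXP_F32, Node, Results)) {
  // Results[0]: fraction returned by value.
  // Results[1]: exponent returned via the output pointer.
}
```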
This shares most of its code with the scalar sincos expansion. It allows
expanding vector FSINCOS nodes to a library call from the specified
`-vector-library`. The upside of this is it will mean the vectorizer
only needs to handle the sincos intrinsic, which has no memory effects,
and this can handle lowering the intrinsic to a call that takes output
pointers.
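Illustratively, in the same notation as the earlier DAG sketches (the
vector routine name is invented), a v4f32 FSINCOS becomes one call with
two output pointers:

```
ptr_sin = alloca
ptr_cos = alloca
ch0 = call vec_sincosf4 x, ptr_sin, ptr_cos
s, ch1 = ldr ptr_sin, ch0
c, ch2 = ldr ptr_cos, ch1
```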