Handle @llvm.expect.with.probability in SelectionDAGBuilder, FastISel,
and IntrinsicLowering in the same way @llvm.expect is handled, where
the value is passed through as-is. This can be reached if the intrinsic
is used without optimizations, where it would otherwise be properly
transformed out.
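For reference, here is a minimal IR sketch (the value names and the probability are illustrative) of a call that would previously hit the unhandled-intrinsic path at -O0:

```ll
declare i64 @llvm.expect.with.probability.i64(i64, i64, double)

; Without optimizations the call survives to instruction selection,
; which now simply forwards %val through as the result.
%expval = call i64 @llvm.expect.with.probability.i64(i64 %val, i64 1, double 0.75)
```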
Fixes #115411 for SelectionDAG. A similar patch is likely needed for
GlobalISel.
Assert that the passed value is a valid unsigned integer value for the
specified type.
For signed values, getSignedConstant() / getSignedTargetConstant() should
be used instead.
Fix all the places I could find that didn't do this. We were already
mostly correct for FP_ROUND after
9a976f36615dbe15e76c12b22f711b2e597a8e51, but not STRICT_FP_ROUND.
A special case in type legalization wasn't accounting for different
operand numbering between FLDEXP and STRICT_FLDEXP.
AArch64 already asked STRICT_FLDEXP to be promoted, but had no test for
it.
For IR like this:
%icmp = icmp ult <4 x i32> %a, splat (i32 5)
%res = extractelement <4 x i1> %icmp, i32 1
where there is only one use of %icmp, we can take a similar approach
to what we already do for binary ops such as add, sub, etc., and convert
this into
%ext = extractelement <4 x i32> %a, i32 1
%res = icmp ult i32 %ext, 5
For AArch64 targets at least the scalar boolean result will almost
certainly need to be in a GPR anyway, since it will probably be
used by branches for control flow. I've tried to reuse existing code
in scalarizeExtractedBinOp to also work for setcc.
NOTE: The optimisations don't apply for tests such as
extract_icmp_v4i32_splat_rhs in the file
CodeGen/AArch64/extract-vector-cmp.ll
because scalarizeExtractedBinOp only works if one of the input
operands is a constant.
Create signed constants using getSignedConstant(), to avoid future
assertion failures when we disable implicit truncation in getConstant().
This also touches some generic legalization code, which apparently is
only covered by AMDGPU tests.
Currently the function will walk the entire DAG to find other candidates
to perform a post-inc store. This leads to very long compilation times
on large functions. Add a MaxSteps limit to avoid this, in line with
how hasPredecessorHelper is used elsewhere in the code.
Add integer promotion support for VP_LOAD and VP_STORE via legalization of extend
and truncate of each form.
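As a rough illustration (the types here are illustrative; which integer types are legal is target-dependent), consider a vp.load whose element type needs promotion:

```ll
declare <4 x i8> @llvm.vp.load.v4i8.p0(ptr, <4 x i1>, i32)

; <4 x i8> may be an illegal type; promotion widens the loaded lanes
; (an extending VP_LOAD), and the matching VP_STORE truncates them back.
%v = call <4 x i8> @llvm.vp.load.v4i8.p0(ptr %p, <4 x i1> %m, i32 %evl)
```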
Patch commandeered from: https://reviews.llvm.org/D109377
As discussed in #112738, it may be better to have an intrinsic to represent vector element extracts based on mask bits. This intrinsic is for the case of extracting the last active element, if any, or a default value if the mask is all-false.
The target-agnostic SelectionDAG lowering is similar to the IR in #106560.
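A sketch of the intended shape (treat the mangling and argument order here as illustrative; the LangRef entry added with the intrinsic is authoritative):

```ll
; Returns the last element of %data whose mask bit is set, or %passthru
; if %mask is all-false.
%r = call i32 @llvm.experimental.vector.extract.last.active.v4i32(<4 x i32> %data, <4 x i1> %mask, i32 %passthru)
```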
This was overlooked in 7d940432c46be83b8fcb5dbefee439585fa820cd - when
inline assembly has multiple outputs, they are returned as members of a
struct, and `getAsmOperandType` needs to be called for each member
of the struct. The difference between this and the single-output case is
that in the latter, there isn't a struct wrapping the outputs.
I noticed this when trying to use the same mechanism in the RISC-V
backend.
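For example, an asm with two outputs returns them wrapped in a struct (a sketch; the asm string and constraints are placeholders):

```ll
; Two outputs come back as a struct, so getAsmOperandType must be
; queried once per struct member, not once for the whole result.
%res = call { i32, i32 } asm "...", "=r,=r"()
%out0 = extractvalue { i32, i32 } %res, 0
%out1 = extractvalue { i32, i32 } %res, 1
```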
Committing two tests:
- One that shows a crash before this change, which is fixed by this
change.
- One (commented out) that shows a different crash with tied
inputs/outputs. This is commented as it is not fixed by this change and
needs more work in target-independent inline asm handling code.
This patch introduces an experimental intrinsic for matching the
elements of one vector against the elements of another.
For AArch64 targets that support SVE2, the intrinsic lowers to a MATCH
instruction for supported fixed and scalable vector types.
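A rough sketch of the call shape (the mangling and operand order here are assumptions based on the description above, not a verbatim copy of the LangRef):

```ll
; For each lane of %op1 whose mask bit is set, test whether that value
; appears anywhere in %op2; the result is a mask of matches.
%m = call <16 x i1> @llvm.experimental.vector.match.v16i8.v16i8(<16 x i8> %op1, <16 x i8> %op2, <16 x i1> %mask)
```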
When the chain is not the entry node, there is a risk the stores are
within a (CALLSEQ_START, CALLSEQ_END) pair, which, when the node is
expanded, will lead to nested call sequences.
It should be possible to check for this and allow more cases, but for
now, let's limit this to cases where it's definitely safe.
Fixes #115323
We already support computing known bits for extending loads, but not for
masked loads. For now I've only added support for zero-extends because
that's the only thing currently tested. Even when the passthru value is
poison we still know the top X bits are zero.
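An IR-level analogue of the pattern this helps (at the DAG level the zext is folded into a zero-extending MLOAD):

```ll
declare <8 x i8> @llvm.masked.load.v8i8.p0(ptr, i32, <8 x i1>, <8 x i8>)

; Even with a poison passthru, every lane of %z has its top 8 bits
; known to be zero after the zero-extend.
%l = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %p, i32 1, <8 x i1> %m, <8 x i8> poison)
%z = zext <8 x i8> %l to <8 x i16>
```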
For some reason there was a hasOneUse check on the splat for the
second operand and it's not obvious to me why. The check blocks
optimisations for lowering of nodes like AVGFLOORU and AVGCEILU.
In a follow-on patch I also plan to improve the generated code
for AVGCEILU further by teaching computeKnownBits about
zero-extending masked loads.
This merges the logic for expanding both FFREXP and FSINCOS into one
method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication
and also allows FFREXP to benefit from the stack slot elimination
implemented for FSINCOS. This method will also be used in future to
implement more multiple-result intrinsics (such as modf and sincospi).
So far we have assumed that we only rethrow the exception caught in the
innermost EH pad. This is true in code we directly generate, but after
inlining this may not be the case. For example, consider this code:
```ll
ehcleanup:
%0 = cleanuppad ...
call @destructor
cleanupret from %0 unwind label %catch.dispatch
```
If `destructor` gets inlined into this function, the code can end up like
```ll
ehcleanup:
%0 = cleanuppad ...
invoke @throwing_func
to label %unreachable unwind label %catch.dispatch.i
catch.dispatch.i:
catchswitch ... [ label %catch.start.i ]
catch.start.i:
%1 = catchpad ...
invoke @some_function
to label %invoke.cont.i unwind label %terminate.i
invoke.cont.i:
catchret from %1 to label %destructor.exit
destructor.exit:
cleanupret from %0 unwind label %catch.dispatch
```
We lower a `cleanupret` into `rethrow`, which assumes it rethrows the
exception caught by the nearest dominating EH pad. But after the
inlining, the nearest dominating EH pad is not `ehcleanup` but
`catch.start.i`.
The problem exists in the same manner in the new (exnref) EH, because it
assumes the exception comes from the nearest EH pad and saves an exnref
from that EH pad and rethrows it (using `throw_ref`).
This problem can be fixed easily if `cleanupret` has the basic block
where its matching `cleanuppad` is. The bitcode instruction `cleanupret`
kind of has that info (it has a token from the `cleanuppad`), but that
info is lost when we enter ISel, because `TargetSelectionDAG.td`'s
`cleanupret` node does not have any arguments:
5091a359d9/llvm/include/llvm/Target/TargetSelectionDAG.td (L700)
Note that `catchret` already has two basic block arguments, even though
neither of them refers to the `catchpad`'s BB.
This PR adds the `cleanuppad`'s BB as an argument to `cleanupret` node
in ISel and uses it in the Wasm backend. Because this node is also used
in the X86 backend, we need to account for its new argument there too,
but nothing more needs to change as long as X86 doesn't use it.
---
- Details about changes in the Wasm backend:
After this PR, our pseudo `RETHROW` instruction takes a BB, which denotes
the EH pad whose exception it needs to rethrow. There are currently two
ways to generate a `RETHROW`: one is from `llvm.wasm.rethrow` intrinsic
and the other is from `CLEANUPRET` we discussed above. In case of
`llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is
lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As
written in the comments, this PR doesn't change how this BB is computed.
The BB argument will be converted to an immediate argument as with other
control flow instructions in CFGStackify.
In case of `CLEANUPRET`, it already has a BB argument pointing to an EH
pad, so it is just converted to a `RETHROW` with the same BB argument in
LateEHPrepare. This will also be lowered to an immediate in CFGStackify
with other control flow instructions.
---
Fixes #114600.
In #70357, I changed an isLegalOrCustom check to isLegalOrCustomOrPromote in
visitSelect to enable integer min/max to be formed when the operation
was promoted. Unfortunately, this also affected floating point. For
floating point, fmaxnum may require a libcall so we also need to check
if the operation on the promoted type is legal or custom.
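For reference, an IR-level sketch of the kind of select that can form fmaxnum (the type and flags shown are illustrative of what the combine requires):

```ll
; With these fast-math flags the select can become fmaxnum; if half is
; promoted to float, fmaxnum on the promoted type may be a libcall.
%cmp = fcmp ogt half %a, %b
%sel = select nnan nsz i1 %cmp, half %a, half %b
```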
Other changes to RISC-V have since made the original change untested, so
this patch restores the original isLegalOrCustom.
Fixes #114520.
The dpbusd_const.ll test change is due to us losing the expanded add reduction pattern, as one of the elements is known to be zero (removing one of the adds from the reduction pyramid). I don't think it's of concern.
Noticed while working on #107423
If we're masking the LSB of a SRL node result and that is shifting down an extended sign bit, see if we can change the SRL to shift down the MSB directly.
These patterns can occur during legalisation when we've sign extended to a wider type but the SRL is still shifting from the subreg.
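An IR-level sketch of the idea (the combine itself runs on the SRL/AND DAG nodes):

```ll
; After the sext, bits 8..31 of %ext are all copies of %x's sign bit,
; so masking the LSB of the shifted value...
%ext = sext i8 %x to i32
%srl = lshr i32 %ext, 8
%res = and i32 %srl, 1
; ...is equivalent to shifting the MSB down directly:
%msb = lshr i32 %ext, 31
```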
Alternative to #114967
Fixes the remaining regression in #112588
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.
We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.
Same fold as #114389 which added this for SimplifyMultipleUseDemandedBits
When trying to evaluate an expression in a narrower type, the
DAGCombine should propagate the disjoint flag, as it's equally
valid on the narrower expression.
This helps make better use of addressing modes for some
Arm SME instructions, for example.
Original commit: 19c8475871faee5ebb06281034872a85a2552675
Multiple 2-stage sanitizer buildbots were reporting failures; the issue
has been addressed separately in 29246a92aee87e86cbc2bb64ee520d7385644f34.
Crashes on ARM builds
https://lab.llvm.org/buildbot/#/builders/85/builds/2548
```
DAGCombiner.cpp:16157: SDValue (anonymous namespace)::DAGCombiner::visitFREEZE(SDNode *):
Assertion `DAG.isGuaranteedNotToBeUndefOrPoison(R, false) && "Can't create node that may be undef/poison!"' failed.
```
Issue #114648
This reverts commit 19c8475871faee5ebb06281034872a85a2552675.
If we've handled == and < above, the only case left can be >. We don't
need to branch on this; we can instead assert, which reduces indentation
and simplifies reasoning about the fallthrough path.
Check to see if we are only demanding (shifted) signbits from a SRL node that are also signbits in the source node.
We can't demand any upper zero bits that the SRL will shift in (up to max shift amount), and the lower demanded bits bound must already be all signbits.
This shares most of its code with the scalar sincos expansion. It allows
expanding vector FSINCOS nodes to a library call from the specified
`-vector-library`. The upside is that the vectorizer
only needs to handle the sincos intrinsic, which has no memory effects,
and this can handle lowering the intrinsic to a call that takes output
pointers.
This teaches dagcombiner to fold:
`(asr (add nsw x, y), 1) -> (avgfloors x, y)`
`(lsr (add nuw x, y), 1) -> (avgflooru x, y)`
as well as combining them into a ceil variant:
`(avgfloors (add nsw x, y), 1) -> (avgceils x, y)`
`(avgflooru (add nuw x, y), 1) -> (avgceilu x, y)`
iff valid for the target.
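In IR terms the floor fold looks like this (illustrative; the combine operates on the equivalent DAG nodes):

```ll
; With nuw the add cannot wrap, so this halving add is exactly
; avgflooru(%x, %y) when AVGFLOORU is legal for the target.
%add = add nuw i8 %x, %y
%shr = lshr i8 %add, 1
```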
Removes some of the ARM MVE patterns that are now dead code.
It adds the avg opcodes to `IsQRMVEInstruction` so as to preserve the
immediate splatting as before.
On ANDNOT-capable targets we can always do this profitably; without ANDNOT we only attempt this if we don't introduce an additional NOT.
Followup to #112547
This adds the `llvm.sincos` intrinsic, legalization, and lowering.
The `llvm.sincos` intrinsic takes a floating-point value and returns
both the sine and cosine (as a struct).
```ll
declare { float, float } @llvm.sincos.f32(float %Val)
declare { double, double } @llvm.sincos.f64(double %Val)
declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val)
declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val)
declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val)
declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val)
```
The lowering is built on top of the existing FSINCOS ISD node, with
additional type legalization to allow for f16, f128, and vector values.
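Typical usage of the scalar form then looks like:

```ll
%res = call { float, float } @llvm.sincos.f32(float %x)
%sin = extractvalue { float, float } %res, 0
%cos = extractvalue { float, float } %res, 1
```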
In SelectionDAG, `TargetTransformInfo::hasBranchDivergence()` can be
called when both `NDEBUG` and `LLVM_ENABLE_ABI_BREAKING_CHECKS` are
enabled. In that case, the class member `TTI` is still initialized to
`nullptr`, causing a segfault.
Fix this by ensuring that all the calls to `hasBranchDivergence` and
`VerifyDAGDivergence` only occur when `NDEBUG` is disabled and
`LLVM_ENABLE_ABI_BREAKING_CHECKS` is enabled.