llvm-project

Author	SHA1	Message	Date
Noah Goldstein	01d8e1ca01	[ValueTracking] Handle non-canonical operand order in `isImpliedCondICmps` We don't always have canonical order here, so do it manually. Closes #85575	2024-03-17 17:46:06 -05:00
Paul Walker	fd07b8f809	[LLVM][tests/Transforms/InstCombine] Convert instances of ConstantExpr based splats to use splat(). This is mostly NFC but some output does change due to consistently inserting into poison rather than undef and using i64 as the index type for inserts.	2024-02-27 13:37:23 +00:00
Nikita Popov	90ba33099c	[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882 ) This patch canonicalizes getelementptr instructions with constant indices to use the `i8` source element type. This makes it easier for optimizations to recognize that two GEPs are identical, because they don't need to see past many different ways to express the same offset. This is a first step towards https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699. This is limited to constant GEPs only for now, as they have a clear canonical form, while we're not yet sure how exactly to deal with variable indices. The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives two representative examples of the kind of optimization improvement we expect from this change. In the first test SimplifyCFG can now realize that all switch branches are actually the same. In the second test it can convert it into simple arithmetic. These are representative of common optimization failures we see in Rust. Fixes https://github.com/llvm/llvm-project/issues/69841.	2024-01-24 15:25:29 +01:00
Nikita Popov	d77067d08a	[ValueTracking] Add dominating condition support in computeKnownBits() (#73662 ) This adds support for using dominating conditions in computeKnownBits() when called from InstCombine. The implementation uses a DomConditionCache, which stores which branches may provide information that is relevant for a given value. DomConditionCache is similar to AssumptionCache, but does not try to do any kind of automatic tracking. Relevant branches have to be explicitly registered and invalidated values explicitly removed. The necessary tracking is done inside InstCombine. The reason why this doesn't just do exactly the same thing as AssumptionCache is that a lot more transforms touch branches and branch conditions than assumptions. AssumptionCache is an immutable analysis and mostly gets away with this because only a handful of places have to register additional assumptions (mostly as a result of cloning). This is very much not the case for branches. This change regresses compile-time by about ~0.2%. It also improves stage2-O0-g builds by about ~0.2%, which indicates that this change results in additional optimizations inside clang itself. Fixes https://github.com/llvm/llvm-project/issues/74242.	2023-12-06 14:17:18 +01:00
Craig Topper	7ec4f6094e	[InstCombine] Infer disjoint flag on Or instructions. (#72912 ) The disjoint flag was recently added to IR in #72583 We already set it when we turn an add into an or. This patch sets it on Ors that weren't converted from an Add.	2023-12-02 14:11:12 -08:00
Nikita Popov	ac75171d41	[InstCombine] Fix incorrect nneg inference on shift amount Whether this is valid depends on the bit widths of the involved integers. Fixes https://github.com/llvm/llvm-project/issues/72927.	2023-11-21 15:47:55 +01:00
Nikita Popov	a1652fdb5e	[InstCombine] Add tests for incorrect shift nneg inference (NFC) The second test is a miscompile.	2023-11-21 15:47:55 +01:00
Yingwei Zheng	6da4ecdf92	[InstCombine] Infer shift flags with unknown shamt (#72535 ) Alive2: https://alive2.llvm.org/ce/z/82Wr3q Related patch: `2dd52b4527`	2023-11-18 15:15:14 +08:00
Yingwei Zheng	4fdc289d4a	[InstCombine] Infer nsw flag for `(X <<nuw C1) >>u C --> X << (C1 - C)` (#72407 ) Alive2: https://alive2.llvm.org/ce/z/nnHAPy This missed optimization is discovered with the help of https://github.com/AliveToolkit/alive2/pull/962.	2023-11-16 02:34:59 +08:00
Nikita Popov	5918f62301	[InstCombine] Infer zext nneg flag (#71534 ) Use KnownBits to infer the nneg flag on zext instructions. Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.	2023-11-08 09:34:40 +01:00
Nikita Popov	b47ff36134	[InstCombine] Drop exact flag instead of increasing demanded bits (#70311 ) Demanded bit simplification for lshr/ashr will currently demand the low bits if the exact flag is set. This is because these bits must be zero to satisfy the flag. However, this means that our demanded bits simplification is worse for lshr/ashr exact than it is for plain lshr/ashr, which is generally not desirable. Instead, drop the exact flag if a demanded bits simplification of the operand succeeds, which may no longer satisfy the exact flag. This matches what we do for the exact flag on udiv, as well as the nuw/nsw flags on add/sub/mul.	2023-10-26 13:12:30 +02:00
Nikita Popov	cf3ac964dc	[InstCombine] Add additional demanded bits tests for shifts (NFC)	2023-10-26 11:03:58 +02:00
Dmitriy Smirnov	e13bed4c5f	[PATCH] [llvm] [InstCombine] Canonicalise ADD+GEP This patch tries to canonicalise add + gep to gep + gep. Co-authored-by: Paul Walker <paul.walker@arm.com> Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D155688	2023-10-06 12:29:06 +01:00
Nikita Popov	41895843b5	[InstCombine] Only perform one iteration InstCombine is a worklist-driven algorithm, which works roughly as follows: * All instructions are initially pushed to the worklist. The initial order is in RPO program order. * All newly inserted instructions get added to the worklist. * When an instruction is folded, its users get added back to the worklist. * When the use-count of an instruction decreases, it gets added back to the worklist. * And a few other heuristics on when we should revisit instructions. On top of the worklist algorithm, InstCombine layers an additional fix-point iteration: If any fold was performed in the previous iteration, then InstCombine will re-populate the worklist from scratch and fold the entire function again. This continues until a fix-point is reached. In the vast majority of cases, InstCombine will reach a fix-point within a single iteration. However, a second iteration is performed to verify that this is indeed the fixpoint. We can see this in the statistics for llvm-test-suite: "instcombine.NumOneIteration": 411380, "instcombine.NumTwoIterations": 117921, "instcombine.NumThreeIterations": 236, "instcombine.NumFourOrMoreIterations": 2, The way to read these numbers is that in 411380 cases, InstCombine performs no folds. In 117921 cases it performs a fold and reaches the fix-point within one iteration (the second iteration verifies the fixpoint). In the remaining 238 cases, more than one iteration is needed to reach the fixpoint. In other words, only in 0.04% of cases are additional iterations needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine performs a completely useless extra iteration to verify the fix point. This patch removes the fixpoint iteration from InstCombine, and always performs only a single iteration. This results in a major compile-time improvement of around 4% at negligible codegen impact. This explicitly does accept that we will not reach a fixpoint in all cases. However, this is mitigated by two factors: First, the data suggests that this happens very rarely in practice. Second, InstCombine runs many times during the optimization pipeline (8 times even without LTO), so there are many chances to recover such cases. In order to prevent accidental optimization regressions in the future, this implements a verify-fixpoint option, which is enabled by default when instcombine is specified in -passes and disabled when InstCombinePass() is constructed from C++. This means that test cases need to explicitly use the no-verify-fixpoint option if they fail to reach a fixed point (for a well-understood reason we cannot / do not want to avoid). Differential Revision: https://reviews.llvm.org/D154579	2023-07-31 10:56:49 +02:00
Nikita Popov	0db5d8e123	Reapply [InstSimplify] Make simplifyWithOpReplaced() recursive (PR63104) A similar assumption as for the x^x case also existed for the absorber case, which led to a stage2 miscompile. That assumption is now fixed. ----- Support replacement of operands not only in the immediate instruction, but also instructions it uses. To the most part, this extension is straightforward, but there are two bits worth highlighting: First, we can now no longer assume that if the Op is a vector, the instruction also returns a vector. If Op is a vector and the instruction returns a scalar, we should consider it as a cross-lane operation. Second, for the x ^ x special case and the absorber special case, we can no longer assume that one of the operands is RepOp, as we might have a replacement higher up the instruction chain. There is one optimization regression, but it is in a fuzzer-generated test case. Fixes https://github.com/llvm/llvm-project/issues/63104.	2023-07-18 10:36:39 +02:00
Nikita Popov	2bc7d02312	Revert "[InstSimplify] Make simplifyWithOpReplaced() recursive (PR63104)" This is very likely the cause of a stage 2 failure in Transforms/LoopVectorize/check-prof-info.ll. Revert until I can investigate this. This reverts commit 3d199d086e076f0b9b90d4c59f2226a417a639b5.	2023-07-14 18:33:39 +02:00
Nikita Popov	3d199d086e	[InstSimplify] Make simplifyWithOpReplaced() recursive (PR63104) Support replacement of operands not only in the immediate instruction, but also instructions it uses. To the most part, this extension is straightforward, but there are two bits worth highlighting: First, we can now no longer assume that if the Op is a vector, the instruction also returns a vector. If Op is a vector and the instruction returns a scalar, we should consider it as a cross-lane operation. Second, for the x ^ x special case, we can no longer assume that the operand is RepOp, as we might have a replacement higher up the instruction chain. There is one optimization regression, but it is in a fuzzer-generated test case. Fixes https://github.com/llvm/llvm-project/issues/63104.	2023-07-14 16:33:40 +02:00
Nikita Popov	21827268ad	[InstCombine] Fold add of zext and sext of i1 (zext a) + (sext a) is 0 if a is a bool. The regression is in a fuzzer-generated test. Proof: https://alive2.llvm.org/ce/z/KotnN6	2023-07-14 14:52:13 +02:00
Nikita Popov	fa45fb7f0c	[InstCombine] Handle assumes in multi-use demanded bits simplification This fixes the largest remaining discrepancy between results of computeKnownBits() and SimplifyDemandedBits(). We only care about the multi-use case here, because the assume necessarily introduces an extra use.	2023-06-02 14:24:24 +02:00
Nikita Popov	f7d1baa414	[KnownBits] Return zero instead of unknown for always poison shifts For always poison shifts, any KnownBits return value is valid. Currently we return unknown, but returning zero is generally more profitable. We had some code in ValueTracking that tried to do this, but was actually dead code. Differential Revision: https://reviews.llvm.org/D150648	2023-05-23 14:41:22 +02:00
Zhongyunde	90d30fde12	[InstCombine] Add frozen for the condition value of SelectInst If the condition value of SelectInst may be a poison or undef value, infer constant range at SelectInst use is incorrect, similar to D143883. Fixes https://github.com/llvm/llvm-project/issues/62401 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D149339	2023-04-27 21:35:54 +08:00
Noah Goldstein	5a3d9e0617	[InstCombine] Transform `(shift X,Or(Y,BitWidth-1))` -> `(shift X,BitWidth-1)` shl : https://alive2.llvm.org/ce/z/_B7Qca lshr: https://alive2.llvm.org/ce/z/6eXz_W ashr: https://alive2.llvm.org/ce/z/oGEx-q Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D145326	2023-03-06 20:30:06 -06:00
Noah Goldstein	9bb409ff1d	[InstCombine] Add tests for transform `(shift X,(Or Y, BitWidth-1))`; NFC Differential Revision: https://reviews.llvm.org/D145334	2023-03-06 20:29:57 -06:00
Sanjay Patel	8e8467d9d8	[InstCombine] canonicalize "extract lowest set bit" away from cttz intrinsic 1 << (cttz X) --> -X & X https://alive2.llvm.org/ce/z/qv3E9e This creates an extra use of the input value, so that's generally not preferred, but there are advantages to this direction: 1. 'negate' and 'and' allow for better analysis than 'cttz'. 2. This is more likely to induce follow-on transforms (in the example from issue #60801, we'll get the decrement pattern). 3. The more basic ALU ops are more likely to result in better codegen across a variety of targets. This won't solve the motivating bugs (see issue #60799) because we do not recognize the redundant icmp+sel, and the x86 backend may not have the pattern-matching to produce the optimal BMI instructions. Differential Revision: https://reviews.llvm.org/D144329	2023-02-19 17:29:40 -05:00
Sanjay Patel	a8831631c7	[InstCombine] add tests for 1<<cttz(x); NFC issue #60799 issue #60801	2023-02-18 08:34:55 -05:00
Sanjay Patel	d4493dd1ed	[InstCombine] add nuw to any (1<<x) https://alive2.llvm.org/ce/z/9EjDKE This was mentioned as a missing fold in D139598. It can unlock follow-on folds in some cases. This verifies one of the changed tests: https://alive2.llvm.org/ce/z/B_btDM	2022-12-15 12:03:47 -05:00
William Huang	be4b1dd35b	[InstCombine] Revert D125845 Reverting D125845 `[InstCombine] Canonicalize GEP of GEP by swapping constant-indexed GEP to the back` because multiple users reported performance regression Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D138950	2022-11-29 22:02:40 +00:00
Philip Reames	656e53e544	[instcombine] Add basic test coverage for demanded bits of scalable vectors	2022-10-21 07:59:04 -07:00
William Huang	6c767cef5a	[InstCombine] Canonicalize GEP of GEP by swapping constant-indexed GEP to the back Canonicalize GEP of GEP by swapping GEP with some suffix constant indices to the back (and GEP with all constant indices to the back of that), this allows more constant index GEP merging to happen. Exceptions are: If swapping violates use-def relations, or anti-optimizes LICM For constant indexed GEP of GEP, if they cannot be merged directly, they will be casted to i8* and merged. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D125845	2022-10-20 17:41:26 +00:00
Bjorn Pettersson	4ab40eca08	[test][InstCombine] Update some test cases to use opaque pointers These tests cases were converted using the script at https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 Differential Revision: https://reviews.llvm.org/D135094	2022-10-03 22:17:59 +02:00
Sanjay Patel	1d1d1e6f22	[InstCombine] fold full-shift of sdiv to icmp+extend This is a disguised sign-bit test with offset: (X / +DivC) >> (Width - 1) --> ext (X <= -DivC) (X / -DivC) >> (Width - 1) --> ext (X >= +DivC) https://alive2.llvm.org/ce/z/cO8JO4 We don't match/test poison in the sdiv constant because that would be immediate undefined behavior.	2022-09-18 13:13:14 -04:00
Sanjay Patel	0ae6bc0771	[InstCombine] add tests for full-right-shift of sdiv; NFC	2022-09-18 13:13:14 -04:00
Sanjay Patel	73919a87e9	[InstCombine] try multi-use demanded bits folds for 'add' This patch enables a multi-use demanded bits fold (motivated by issue #57576): https://alive2.llvm.org/ce/z/DsZakh This mimics transforms that we already do on the single-use path. Originally, this patch did not include the last part to form a constant, but that can be removed independently to reduce risk. It's not clear what the effect of either change will be when viewed end-to-end. This is expected to be neutral or a slight win for compile-time. See the "add-demand2" series for experimental timing results: https://llvm-compile-time-tracker.com/?config=NewPM-O3&stat=instructions&remote=rotateright Differential Revision: https://reviews.llvm.org/D133788	2022-09-14 09:30:59 -04:00
Craig Topper	d76c8f5127	[InstCombine] Add mul with negated power of 2 constant to canEvaluateShifted. If we are right shifting a multiply by a negated power of 2 where the power of 2 is the same as the shift amount, we can replace with a negate followed by an And. New tests have not been committed yet but the patch shows the diffs. Let me know if you want any changes or additional tests. Differential Revision: https://reviews.llvm.org/D130103	2022-07-20 11:00:22 -07:00
Craig Topper	3aff7870a7	[InstCombine] Pre-commit test for D130103.	2022-07-20 11:00:21 -07:00
Nuno Lopes	952e069393	[NFC] remove 'br undef' from InstCombine test cases This is UB and allows the compiler to give any result, so these tests weren't meaningful InstCombine tests are now clean of 'br undef'	2022-06-10 15:28:57 +01:00
Sanjay Patel	a004438959	[InstCombine] add/move tests for shift-of-constant-by-same-shift-by-constant; NFC	2022-05-30 15:17:54 -04:00
Nikita Popov	0863abe3ac	[InstCombine] Fold icmp of select with non-constant operand Try to push an icmp into a select even if the icmp operand isn't constant - perform a generic SimplifyICmpInst instead. This doesn't appear to impact compile-time much, and forming logical and/or is generally profitable, as we have very good support for them.	2022-05-06 16:04:39 +02:00
Bjorn Pettersson	acdc419c89	[test] Use -passes=instcombine instead of -instcombine in lots of tests. NFC Another step moving away from the deprecated syntax of specifying pass pipeline in opt. Differential Revision: https://reviews.llvm.org/D119081	2022-02-07 14:26:59 +01:00
Sanjay Patel	2d031ec5e5	[InstCombine] add one-use check to opposite shift folds Test comments say this might be intentional, but I don't see any hard evidence to support it. The extra instruction shows up as a potential regression in D117680. One test does show a missed fold that might be recovered with better demanded bits analysis.	2022-01-20 13:49:23 -05:00
Sanjay Patel	09c575e728	[InstCombine] add/move tests for shl with binop; NFC	2021-09-28 14:46:27 -04:00
Sanjay Patel	21429cf43a	[InstCombine] generalize fold for (trunc (X u>> C1)) u>> C This is another step towards trying to re-apply D110170 by eliminating conflicting transforms that cause infinite loops. a47c8e40c734 was a previous patch in this direction. The diffs here are mostly cosmetic, but intentional: 1. The existing code that would handle this pattern in FoldShiftByConstant() is limited to 'shl' only now. The formatting change to IsLeftShift shows that we could move several transforms into visitShl() directly for efficiency because they are not common shift transforms. 2. The tests are regenerated to show new instruction names to prove that we are getting (almost) identical logic results. 3. The one case where we differ ("trunc_sandwich_small_shift1") shows that we now use a narrow 'and' instruction. Previously, we relied on another transform to do that, but it is limited to legal types. That seems to be a legacy constraint from when IR analysis and codegen were less robust. https://alive2.llvm.org/ce/z/JxyGA4 declare void @llvm.assume(i1) define i8 @src(i32 %x, i32 %c0, i8 %c1) { ; The sum of the shifts must not overflow the source width. %z1 = zext i8 %c1 to i32 %sum = add i32 %c0, %z1 %ov = icmp ult i32 %sum, 32 call void @llvm.assume(i1 %ov) %sh1 = lshr i32 %x, %c0 %tr = trunc i32 %sh1 to i8 %sh2 = lshr i8 %tr, %c1 ret i8 %sh2 } define i8 @tgt(i32 %x, i32 %c0, i8 %c1) { %z1 = zext i8 %c1 to i32 %sum = add i32 %c0, %z1 %maskc = lshr i8 -1, %c1 %s = lshr i32 %x, %sum %t = trunc i32 %s to i8 %a = and i8 %t, %maskc ret i8 %a }	2021-09-27 10:57:31 -04:00
Simon Pilgrim	5a14edd8ed	[InstCombine] Ensure shifts are in range for (X << C1) / C2 -> X fold. We can get here before out of range shift amounts have been handled - limit to BW-2 for sdiv and BW-1 for udiv Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=38078	2021-09-25 12:57:43 +01:00
Simon Pilgrim	10c982e0b3	Revert rG1c9bec727ab5c53fa060560dc8d346a911142170 : [InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) Reverted (manually due to merge conflicts) while regressions reported on PR51540 are investigated As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-08-23 21:09:26 +01:00
Simon Pilgrim	1c9bec727a	[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069) As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead. I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst. https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!) Differential Revision: https://reviews.llvm.org/D106450	2021-07-22 10:58:51 +01:00
Sanjay Patel	0be0a1237c	[ValueTracking] improve analysis for "C << X" and "C >> X" This is based on the example/comments in: https://llvm.org/PR48984 I tried just lifting the restriction in computeKnownBitsFromShiftOperator() as suggested in the bug report, but that doesn't catch all of the cases shown here. I didn't step through to see exactly why that happened. But it seems like a reasonable compromise to cheaply check the special-case of shifting a constant. There's a slight regression on a cmp transform as noted, but this is likely the more important/common pattern, so we can fix that icmp pattern later if needed. Differential Revision: https://reviews.llvm.org/D95959	2021-02-09 12:38:06 -05:00
Sanjay Patel	9d230295d9	[InstCombine] add tests for demanded/known bits of shifted constant; NFC These are variations of a missed analysis noted in: https://llvm.org/PR48984	2021-02-04 10:31:22 -05:00
Nikita Popov	a6df39236f	[InstSimplify] Fold out-of-bounds shift to poison Make InstSimplify return poison rather than undef for out-of-bounds shifts, as specified by LangRef: > If op2 is (statically or dynamically) equal to or larger than the > number of bits in op1, this instruction returns a poison value. Differential Revision: https://reviews.llvm.org/D93998	2021-01-06 20:41:37 +01:00
Nikita Popov	766cf7f32e	[InstSimplify] Fold division by zero to poison Div/rem by zero is immediate undefined behavior and anything goes. Currently we fold it to undef, this patch changes it to fold to poison instead, which is slightly stronger. Differential Revision: https://reviews.llvm.org/D93995	2021-01-03 20:52:45 +01:00
Roman Lebedev	0e76a9bc58	[NFC][InstCombine] Update few comment updates i missed in 0ac56e8eaaeb As pointed out in post-commit review in that commit	2020-11-06 17:38:00 +03:00
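
The percentages quoted in commit 41895843b5 ("Only perform one iteration") can be re-derived from the four llvm-test-suite counters given in that message. The following is a small sketch of that arithmetic only; the function name `iteration_shares` and the dictionary keys it returns are hypothetical names chosen for illustration and are not part of LLVM.

```python
# Counters copied from the commit message of 41895843b5 (llvm-test-suite run).
STATS = {
    "NumOneIteration": 411380,        # no fold was performed
    "NumTwoIterations": 117921,       # folds done; second iteration only verified
    "NumThreeIterations": 236,
    "NumFourOrMoreIterations": 2,
}


def iteration_shares(counts):
    """Return the total number of runs and the two percentages discussed.

    needs_more_pct: runs where a fixpoint was NOT reached in one iteration.
    verify_only_pct: runs where the extra iteration only confirmed the fixpoint.
    """
    total = sum(counts.values())
    needs_more = counts["NumThreeIterations"] + counts["NumFourOrMoreIterations"]
    verify_only = counts["NumTwoIterations"]
    return {
        "total": total,
        "needs_more": needs_more,
        "needs_more_pct": 100.0 * needs_more / total,
        "verify_only_pct": 100.0 * verify_only / total,
    }


if __name__ == "__main__":
    shares = iteration_shares(STATS)
    print(f"runs: {shares['total']}")
    print(f"more than one iteration needed: {shares['needs_more_pct']:.2f}%")
    print(f"verification-only iteration:    {shares['verify_only_pct']:.1f}%")
```

Rounded the same way as in the commit message, the two shares come out at 0.04% and 22.3%, which matches the figures the message uses to justify dropping the fixpoint loop.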


165 Commits
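
Two of the folds listed above are simple enough to check exhaustively at a small bit width: `1 << (cttz X) --> -X & X` from 8e8467d9d8 and `(zext a) + (sext a) == 0` for a bool from 21827268ad. The sketch below models 8-bit integers in Python as an informal sanity check. It is not a substitute for the Alive2 proofs linked in the commits, and every helper name in it is hypothetical. The case `X == 0` is skipped on the assumption that the shift amount would then equal the bit width, which LangRef defines as poison.

```python
BITS = 8                  # small width so every value can be enumerated
MASK = (1 << BITS) - 1    # keeps results inside the modeled bit width


def cttz(x):
    """Count trailing zero bits of a nonzero value."""
    assert x != 0
    n = 0
    while x & 1 == 0:
        x >>= 1
        n += 1
    return n


def lowest_set_bit_via_cttz(x):
    """Source pattern: 1 << cttz(x)."""
    return (1 << cttz(x)) & MASK


def lowest_set_bit_via_neg(x):
    """Target pattern: -x & x, truncated to BITS bits."""
    return (-x) & x & MASK


def zext_plus_sext_of_bool(a):
    """(zext a) + (sext a) for a one-bit value, truncated to BITS bits."""
    zext = 1 if a else 0
    sext = MASK if a else 0   # all-ones is -1 in two's complement
    return (zext + sext) & MASK


def check_identities():
    """Return True if both folds hold for every modeled input."""
    cttz_ok = all(
        lowest_set_bit_via_cttz(x) == lowest_set_bit_via_neg(x)
        for x in range(1, MASK + 1)
    )
    bool_ok = all(zext_plus_sext_of_bool(a) == 0 for a in (False, True))
    return cttz_ok and bool_ok


if __name__ == "__main__":
    print("identities hold:", check_identities())
```

Running the exhaustive loop at 8 bits is cheap; the same helpers work at other widths by changing `BITS`, although the enumeration grows exponentially.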