llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	c59ea32f82	[InstCombine] Canonicalize `icmp pred (X +/- C1), C2` into `icmp pred X, C2 -/+ C1` with nowrap flag implied by with.overflow intrinsic (#75511) This patch tries to canonicalize the pattern `Overflow | icmp pred Res, C2` into `Overflow | icmp pred X, C2 -/+ C1`, where `Overflow` and `Res` are return values of `xxx.with.overflow X, C1`. Alive2: https://alive2.llvm.org/ce/z/PhR_3S Fixes #75360.	2023-12-16 17:58:57 +08:00
Yingwei Zheng	9cf3e31172	[InstCombine] Explicitly fold `~(~X >>u Y)` into `X >>s Y` (#75473) Fixes #75369. This patch explicitly folds `~(~X >>u Y)` into `X >>s Y` to fix assertion failure in #75369.	2023-12-14 23:06:38 +08:00
Nikita Popov	6e8b17d821	[InstCombine] Support or disjoint in displaced shift fold When I originally added this fold, it did not actually fix my motivation case, where the add was represented as an or. Now that we have the disjoint flag this can finally be cleanly supported.	2023-12-07 15:00:40 +01:00
Craig Topper	56248caa3b	[InstCombine] Explicitly set disjoint flag when converting xor to or. (#74229)	2023-12-06 09:41:59 -08:00
Nikita Popov	a1b9736e9b	[PatternMatch] Add m_c_DisjointOr (NFC) Add commutative variant of m_DisjointOr.	2023-12-06 14:05:02 +01:00
Craig Topper	3e7ca05e93	[InstCombine] Use disjoint flag instead of calling haveNoCommonBitsSet. (#74222)	2023-12-03 12:34:49 -08:00
Craig Topper	5db1c6ed48	Revert "[InstCombine] Fix missed opportunity to fold 'or' into 'mul' operand. (#74225)" This reverts commit e3b3c91dd0bbc8bd6f1ee562641daf1e554eb1b6. This is causing an infinite loop on stage 2 builds.	2023-12-03 01:50:44 -08:00
Craig Topper	e3b3c91dd0	[InstCombine] Fix missed opportunity to fold 'or' into 'mul' operand. (#74225) We were able to fold or (mul X, Y), X --> mul X, (add Y, 1) (when the multiply has no common bits with X). This patch makes the transform work if the mul operands are commuted.	2023-12-03 00:51:22 -08:00
Nikita Popov	93636581d3	[InstCombiner] Make isFreeToInvert() and friends instance functions (NFC) In order to use SQ inside of these. There doesn't seem to be any strong need for these to be static.	2023-12-01 15:40:12 +01:00
Jeremy Morse	2425e2940e	[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149) Part of the "RemoveDIs" project to remove debug intrinsics requires passing block-positions around in iterators rather than as instruction pointers, allowing some debug-info to reside in BasicBlock::iterator. This means getInsertionPointAfterDef has to return an iterator, and as it can return no-instruction that means returning an optional iterator. This patch changes the signature for getInsertionPtAfterDef and then patches up the various places that use it to handle the different type. This would overall be an NFC patch, however in InstCombinerImpl::freezeOtherUses I've started skipping any debug intrinsics at the returned insert-position. This should not have any _meaningful_ effect on the compiler output: at worst it means variable assignments that are skipped will now cover the freeze instruction and anything inserted before it, which should be inconsequential. Sadly: this makes the function signature ugly. This is probably the ugliest piece of fallout for the "RemoveDIs" work, but it serves the overall purpose of improving compile times and not allowing `-g` to affect compiler output, so should be worthwhile in the end.	2023-11-30 12:19:57 +00:00
Noah Goldstein	b7c0f79926	[InstCombine] Replace `isFreeToInvert` + `CreateNot` with `getFreelyInverted` This is nearly NFC; the only potential change is the order in which values are created/named. Otherwise it is a slight speed boost/simplification to avoid having to go through the `getFreelyInverted` recursive logic twice to simplify the extra `not` op.	2023-11-20 17:59:27 -06:00
Noah Goldstein	3039691f53	[InstCombine] add `getFreeInverted` to perform folds for free inversion of op With the current logic of `if(isFreeToInvert(Op)) return Not(Op)` it's fairly easy to cause either 1) regressions or 2) infinite loops if the folds we have for `Not(Op)` ever de-sync with the cases we know are freely invertible. This patch adds `getFreeInverted`, which builds the freely inverted op along with the check for free inversion to alleviate this problem.	2023-11-20 17:59:27 -06:00
HaohaiWen	95d584c6ac	[InstCombine] Convert or concat to fshl if opposite or concat exists (#68502) If two 'or' instructions concatenate variables in opposite order and the first 'or' dominates the second one, the second 'or' can be optimized to an fshl that rotates the first 'or'. This can eliminate an shl and expose more optimization opportunities for bswap/bitreverse.	2023-11-20 13:12:55 +08:00
Noah Goldstein	ad9147399f	[InstCombine] Improve eq/ne by parts to handle `ult/ugt` equality pattern. `(icmp eq/ne (lshr x, C), (lshr y, C))` gets optimized to `(icmp ult/uge (xor x, y), (1 << C))`. This can cause the current equal by parts detection to miss the high-bits as it may get optimized to the new pattern. This commit adds support for detecting / combining the ult/ugt pattern. Closes #69884	2023-11-04 19:00:28 -05:00
Nikita Popov	95e4ad3f0f	[InstCombine] Remove redundant add+and fold (NFCI) This is handling a special case of demanded bits simplification (which has multi-use support for adds, so it's not applicable in that case either).	2023-10-24 17:10:27 +02:00
Nikita Popov	b5c44564e5	[InstCombine] Remove redundant folds in foldCastedBitwiseLogic() (NFCI) The vector sext limitation the comment talks about has been removed a long time ago, in https://reviews.llvm.org/D36213.	2023-10-24 17:06:39 +02:00
Nikita Popov	14b0ae439f	[InstCombine] Remove redundant fold in foldUnsignedUnderflowCheck() (NFCI) Base - Offset == 0 will get canonicalized to Base == Offset even in multi-use contexts, at which point all of these patterns already get handled by generic code.	2023-10-24 16:57:18 +02:00
HaohaiWen	8ff3e4f39b	[InstCombine] Refactor matchFunnelShift to allow more pattern (NFC) (#68474) The current implementation of matchFunnelShift only allows the opposite-shift pattern. Refactor it to allow more patterns.	2023-10-19 09:06:30 +08:00
Yingwei Zheng	8a7e547798	[InstCombine] Canonicalize `(X +/- Y) & Y` into `~X & Y` when Y is a power of 2 (#67915) This patch canonicalizes the pattern `(X +/- Y) & Y` into `~X & Y` when `Y` is a power of 2 or zero. It will reduce the patterns to match in #67836 and exploit more optimization opportunities. Alive2: https://alive2.llvm.org/ce/z/LBpvRF	2023-10-12 17:18:12 +08:00
Yingwei Zheng	a7f962c007	[InstCombine] Canonicalize `and(zext(A), B)` into `select A, B & 1, 0` (#66740) This patch canonicalizes the pattern `and(zext(A), B)` into `select A, B & 1, 0`. Thus, we can reuse transforms `select B == even, B & 1, 0 -> 0` and `select B == odd, B & 1, 0 -> zext(B == odd)` in `InstCombine`. It is an alternative to #66676. Alive2: https://alive2.llvm.org/ce/z/598phE Fixes #66733. Fixes #66606. Fixes #28612.	2023-09-29 02:51:58 +08:00
Nikita Popov	6cd5eb1f54	[InstCombine] Avoid some uses of ConstantExpr::getZExt() (NFC) Add helpers getLosslessUnsignedTrunc/getLosslessSignedTrunc for this common pattern.	2023-09-28 17:02:33 +02:00
Yingwei Zheng	4c241a9335	[InstCombine] Fold `(-1 + A) & B` into `A ? 0 : B` where A is effectively a bool Solves issue https://github.com/llvm/llvm-project/issues/63321. This patch explicitly folds `(-1 + A) & B` into `A ? 0 : B`. Additional trunc will be created when `A` is neither i1 nor <N x i1>. https://alive2.llvm.org/ce/z/pWv9jJ Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D153148	2023-09-24 19:10:47 +08:00
Marc Auberer	1f313034cb	[InstCombine] Remove unnecessary one-use-check (#66419) This removes a oneUse check, that is actually unnecessary. Alive2: https://alive2.llvm.org/ce/z/qEkUEf Original patch: https://reviews.llvm.org/D159380	2023-09-15 06:46:30 +02:00
Noah Goldstein	2a904f456a	[InstCombine] Rename some shadow variables; NFC Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159325	2023-09-13 15:50:18 -05:00
Yingwei Zheng	780b046bd0	[InstCombine] Use m_c_And/m_c_Or instead of duplicate logic. NFC. See also https://reviews.llvm.org/D153148#inline-1535588	2023-09-10 23:34:23 +08:00
Marc Auberer	904ac6fe6b	[InstCombine] Fold ((A&B)^A)|((A&B)^B) to A^B Depends on D159379 ((A & B) ^ A) | ((A & B) ^ B) -> A ^ B (A ^ (A & B)) | (B ^ (A & B)) -> A ^ B ((A & B) ^ B) | ((A & B) ^ A) -> A ^ B (B ^ (A & B)) | (A ^ (A & B)) -> A ^ B Alive2: https://alive2.llvm.org/ce/z/i44xmq Baseline tests: https://reviews.llvm.org/D159379 Reviewed By: huihuiz Differential Revision: https://reviews.llvm.org/D159380	2023-09-07 17:33:06 -07:00
Matt Arsenault	70aede228a	InstCombine: Recognize fneg(fabs) as bitcasted integer Technically increases the number of instructions if the result isn't cast back to float. Even in this case it's still probably a better canonical form since it enables FP value tracking. https://reviews.llvm.org/D151939	2023-08-31 19:07:36 -04:00
Matt Arsenault	5c0da5839d	InstCombine: Recognize fabs as bitcasted integer In the past we sort of pretended float might be implementable as a non-IEEE type but that never realistically would work. Exotic FP types would need to be added to the IR. Turning these into FP operations enables FP tracking optimizations. https://reviews.llvm.org/D151937	2023-08-31 19:03:48 -04:00
Matt Arsenault	50a9b3d8a5	InstCombine: Recognize fneg when performed as bitcasted integer This is a resurrection of D18874. This was previously wrong with fneg conflated with fsub, but we now have a proper fneg instruction. Additionally, I think it is now clearer that IR float=IEEE float, and a different bit layout would require adding a different IR type. https://reviews.llvm.org/D151934	2023-08-31 18:59:34 -04:00
XChy	8a0b2ca821	[InstCombine] Transform bitwise (A >> C - 1, zext(icmp)) -> zext (bitwise(A < 0, icmp)) This extends foldCastedBitwiseLogic to handle the similar cases. I have recently submitted a patch to implement a single fold like: (A > 0) | (A < 0) -> zext (A != 0) But it is not general enough, and some problems like a < b & a >= b - 1 happen again. So I generalize this fold by matching the pattern bitwise(A >> C - 1, zext(icmp)), and replace A >> C - 1 with zext(A < 0) here. (C is the scalar size bits of the type of A.) Then we get bitwise(zext(A < 0), zext(icmp)), this will be folded by original code in foldCastedBitwiseLogic, into zext(bitwise(A < 0, icmp)). And finally, any related icmp fold will be automatically implemented because bitwise(icmp,icmp) had been implemented. The proof of the correctness is obvious, because the folds below were previously proved and implemented. A >> C - 1 -> zext(A < 0) bitwise(zext(A), zext(B)) -> zext(bitwise(A, B)) And the fold of this patch is the combination of folds above. Fixes https://github.com/llvm/llvm-project/issues/63751. Differential Revision: https://reviews.llvm.org/D154791	2023-07-24 13:04:32 +02:00
Nikita Popov	218f97578b	[IR] Accept non-Instruction in BinaryOperator::CreateWithCopiedFlags() (NFC) The underlying copyIRFlags() API accepts arbitrary values and can work with flags on operators (i.e. instructions or constant expressions). Remove the arbitrary limitation that the CreateWithCopiedFlags() API imposes, so we can directly pass through values matched by PatternMatch, which can be constant expressions. The attached test case works fine now, but would crash with an upcoming change to not produce `and` constant expressions.	2023-07-21 10:05:52 +02:00
Dhruv Chawla	20ae2d200d	[InstCombine] Generalize foldAndOrOfICmpEqZeroAndICmp This patch generalizes the folds implemented by foldAndOrOfICmpEqZeroAndICmp, which are: (icmp eq X, 0) | (icmp ult Other, X) -> (icmp ule Other, X-1) (icmp ne X, 0) & (icmp uge Other, X) -> (icmp ugt Other, X-1) to the following: (icmp eq X, C) | (icmp ult Other, (X - C)) -> (icmp ule Other, (X - (C + 1))) (icmp ne X, C) & (icmp uge Other, (X - C)) -> (icmp ugt Other, (X - (C + 1))) The function foldAndOrOfICmpEqZeroAndICmp is also renamed to foldAndOrOfICmpEqConstantAndICmp to reflect the changes. Proofs: https://alive2.llvm.org/ce/z/yXGv6q Fixes #63749. Differential Revision: https://reviews.llvm.org/D154937	2023-07-12 11:13:37 +05:30
Nikita Popov	bc49103015	[InstCombine] Don't handle constants in de morgan folds (PR63791) If the and/or operand is an immediate constant, it will get folded away anyway. Don't try to freely invert those operands. A particularly degenerate case of this arises when both operands are constant and the result is a constant, in which case we try to invert users of a constant, resulting in an assertion failure. Fixes https://github.com/llvm/llvm-project/issues/63791.	2023-07-11 15:18:53 +02:00
Nikita Popov	b53e16ca0c	[InstCombine] Extract "freely invert" related helpers (NFC)	2023-07-11 15:00:23 +02:00
XChy	bfb5d2e6f8	[InstCombine] Transform (A > 0) | (A < 0) -> zext (A != 0) fold This extends foldCastedBitwiseLogic to handle the similar cases. Actually, for `(A > B) | (A < B)`, when B != 0, it can be optimized to `zext( A != B )` by foldAndOrOfICmpsUsingRanges. However, when B = 0, transformZExtICmp will transform `zext(A < 0) to i32` into `A >> 31`, which cannot be optimized by foldAndOrOfICmpsUsingRanges. Because I'm new to LLVM and have no precise knowledge about how LLVM decides the order of optimization, I choose to extend foldCastedBitwiseLogic to fold `(A >> (X - 1)) | ((A > 0) zext to iX) -> (A != 0) zext to iX`. And the equivalent fold follows: ``` (A >> (X - 1)) | ((A > 0) zext to iX) -> A < 0 | A > 0 -> (A != 0) zext to iX ``` It's proved by alive-tv: https://alive2.llvm.org/ce/z/33HzjE Related issue: https://github.com/llvm/llvm-project/issues/62586 ((a > b) | (a < b) is not simplified only for the case b=0) Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D154126	2023-07-06 02:02:43 -05:00
Nikita Popov	b45a73f4d0	[InstCombine] Fold binop of shifts with related amounts Fold binop(shift(ShiftedC1, ShAmt), shift(ShiftedC2, add(ShAmt, AddC))) -> shift(binop(ShiftedC1, shift(ShiftedC2, AddC)), ShAmt) where both shifts are the same and AddC is a valid shift amount. Proofs: https://alive2.llvm.org/ce/z/PhVVeg Differential Revision: https://reviews.llvm.org/D152927	2023-06-28 09:54:36 +02:00
Noah Goldstein	91cdffcb2f	[InstCombine] Transform `(binop1 (binop2 (lshift X,Amt),Mask),(lshift Y,Amt))` If `Mask` and `Amt` are not constants and `binop1` and `binop2` are the same we can transform to: `(binop (lshift (binop X, Y), Amt), Mask)` If `binop` is `add`, `lshift` must be `shl`. If `Mask` and `Amt` are constants `C` and `C1` respectively. We can transform to: `(lshift1 (binop1 (binop2 X, (inv_lshift1 C, C1), Y)), C1)` Saving an instruction IFF: `lshift1` is same opcode as `lshift2` Either `bitwise1` and/or `bitwise2` is `and`. Proofs(1/2): https://alive2.llvm.org/ce/z/BjN-m_ Proofs(2/2): https://alive2.llvm.org/ce/z/bZn5QB This is to help fix the regression caused in D151807 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D152568	2023-06-13 20:08:35 -05:00
John McIver	1001f9031f	[InstCombine] Optimize and of icmps with power-of-2 and contiguous masks Add an instance combine optimization for expressions of the form: (%arg u< C1) & ((%arg & C2) != C2) -> %arg u< C2 Where C1 is a power-of-2 and C2 is a contiguous mask starting 1 bit below C1. This commit resolves GitHub missed-optimization issue #54856. Validation of scalar tests: - https://alive2.llvm.org/ce/z/JfKjiU - https://alive2.llvm.org/ce/z/AruHY_ - https://alive2.llvm.org/ce/z/JAiR6t - https://alive2.llvm.org/ce/z/S2X2e5 - https://alive2.llvm.org/ce/z/4cycdE - https://alive2.llvm.org/ce/z/NcDiLP Validation of vector tests: - https://alive2.llvm.org/ce/z/ABY6tE - https://alive2.llvm.org/ce/z/BTJi3s - https://alive2.llvm.org/ce/z/3BKWpu - https://alive2.llvm.org/ce/z/RrAbkj - https://alive2.llvm.org/ce/z/nM6fsN Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D125717	2023-06-09 16:07:01 -06:00
Shivam Gupta	46aba711ab	[InstCombine] (icmp eq A, -1) & (icmp eq B, -1) --> (icmp eq (A&B), -1) This patch adds another icmp fold for the -1 case. This fixes https://github.com/llvm/llvm-project/issues/62311, where we want instcombine to merge all compare instructions together so later passes like simplifycfg and slpvectorize can better optimize this chained comparison. Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D151660	2023-06-08 09:00:05 +05:30
Eric Gullufsen	b9bbe2f603	[InstCombine] Preserve nsw/nuw flags in canonicalization canonicalizeLogicFirst reorders logic op / math op for suitable constants, and this commit makes this function pass through nsw/nuw flags on the Add. Differential Revision: https://reviews.llvm.org/D147568	2023-04-05 10:12:54 -04:00
Matt Arsenault	3b44109b71	InstCombine: Introduce new is.fpclass from logic of fcmp Fixes regressions from patch to turn more classes into fcmp.	2023-03-26 09:34:29 -04:00
Matt Arsenault	8a37512924	ValueTracking: Extract fcmpToClassTest out of InstCombine Also update unsigned to FPClassTest	2023-03-16 23:14:40 -04:00
Matt Arsenault	0d18f315d8	InstCombine: Handle folding fcmp of 0 into llvm.is.fpclass This needs to consider the denormal mode.	2023-03-15 07:07:55 -04:00
Matt Arsenault	08f0388711	InstCombine: Fold and/or of fcmp into class This is motivated by patterns like !isfinite || zero. The AMDGPU math libraries have a lot of patterns like this, and I'm trying to fix the code to be more portable and less dependent on directly calling class intrinsics. I believe this is the first place where new is.fpclass calls are introduced. There are more class-like compares that could be recognized; this is a set I currently care about plus a few extras. Keep the == 0 cases disabled for now. It depends on the denormal mode. If we just check IEEE mode now, it will break my use case without another patch I'm working on.	2023-02-23 06:19:40 -04:00
Kazu Hirata	f8f3db2756	Use APInt::count{l,r}_{zero,one} (NFC)	2023-02-19 22:04:47 -08:00
Yingchi Long	f9e2fb9d8e	[InstCombine] combine intersection for inequality icmps ``` define i1 @src(i32 %A) { %mask1 = and i32 %A, 15 ; 0x0f %tst1 = icmp eq i32 %mask1, 3 ; 0x03 %mask2 = and i32 %A, 255 ; 0xff %tst2 = icmp eq i32 %mask2, 243; 0xf3 %res = or i1 %tst1, %tst2 ret i1 %res } ``` -> ``` define i1 @tgt(i32 %A) { %1 = and i32 %A, 15 %res = icmp eq i32 %1, 3 ret i1 %res } ``` Proof: https://alive2.llvm.org/ce/z/4AyvcE Assume that `(B & D) & (C ^ E) == 0`, and `(B & D) == D || (B & D) == B`, transforms: ``` (icmp ne (A & B), C) & (icmp ne (A & D), E) -> (icmp ne (A & (B&D)), (C&E)) ``` Fixes: https://github.com/llvm/llvm-project/issues/59680 Reviewed By: spatel, bcl5980 Differential Revision: https://reviews.llvm.org/D140666	2023-02-10 12:50:39 +08:00
Matt Arsenault	9ad6bdd747	InstCombine: Fold and (fcmp), (is.fpclass) into is.fpclass Fold class test performed by an fcmp into another class. For now this avoids introducing new class calls when there isn't one that already exists.	2023-02-08 21:40:20 -04:00
chenglin.bi	4a66b3b20e	[InstCombine] Fold pattern xor(and, or) to select (A & B) ^ (A | C) --> A ? ~B : C https://alive2.llvm.org/ce/z/KCBfXr https://alive2.llvm.org/ce/z/Pm-zJN https://alive2.llvm.org/ce/z/VT8uC2 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D141660	2023-02-03 17:12:16 +08:00
Sanjay Patel	2d7bb60667	[InstCombine] improve description of not+shift transform; NFC This was added recently with: e44a305690add9f75	2023-01-26 08:58:45 -05:00
Sanjay Patel	e44a305690	[InstCombine] invert canonicalization of sext (x > -1) --> not (ashr x) https://alive2.llvm.org/ce/z/2iC4oB This is similar to changes made for zext + lshr: 21d3871b7c90 6c39a3aae1dc The existing fold did not account for extra uses, so we see some instruction count reductions in the test diffs. This is intended to improve analysis (icmp likely has more transforms than any other opcode), make other transforms more symmetric with zext/lshr, and it can be inverted in codegen if profitable. As with the earlier changes, there is potential to uncover infinite combine loops, but I have not found any yet.	2023-01-24 16:44:15 -05:00
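
Several of the folds listed above are plain integer identities, so they can be sanity-checked by brute force at a small bit width. The following is a minimal sketch of such a check, not a substitute for the Alive2 proofs linked in the commits. The bit width, the helper names, the per-bit reading of `A ? ~B : C`, and the reading of "contiguous mask starting 1 bit below C1" as `C1 - (1 << low)` are all assumptions made for illustration; none of this code comes from LLVM.

```python
# Exhaustive check of a few integer identities quoted in the log above.
# WIDTH and every helper name here are illustrative assumptions only.
WIDTH = 5
MASK = (1 << WIDTH) - 1          # plays the role of the all-ones value (-1)
VALUES = range(1 << WIDTH)


def bnot(x):
    """Bitwise NOT truncated to WIDTH bits."""
    return ~x & MASK


def add(x, y):
    """Wrapping addition modulo 2**WIDTH."""
    return (x + y) & MASK


def sub(x, y):
    """Wrapping subtraction modulo 2**WIDTH."""
    return (x - y) & MASK


def check_add_sub_and_pow2():
    # (X +/- Y) & Y == ~X & Y when Y is a power of 2 or zero.
    pow2_or_zero = [0] + [1 << k for k in range(WIDTH)]
    return all(
        (add(x, y) & y) == (bnot(x) & y) and (sub(x, y) & y) == (bnot(x) & y)
        for x in VALUES for y in pow2_or_zero
    )


def check_xor_of_masked():
    # ((A & B) ^ A) | ((A & B) ^ B) == A ^ B
    return all(
        (((a & b) ^ a) | ((a & b) ^ b)) == (a ^ b)
        for a in VALUES for b in VALUES
    )


def check_xor_and_or_select():
    # (A & B) ^ (A | C) == A ? ~B : C, read here as a per-bit select.
    return all(
        ((a & b) ^ (a | c)) == ((a & bnot(b)) | (bnot(a) & c))
        for a in VALUES for b in VALUES for c in VALUES
    )


def check_all_ones_compare():
    # (A == -1) & (B == -1) == ((A & B) == -1)
    return all(
        ((a == MASK) and (b == MASK)) == ((a & b) == MASK)
        for a in VALUES for b in VALUES
    )


def check_eq_const_or_ult():
    # (X == C) | (Other u< X - C) == (Other u<= X - (C + 1)), with wrapping.
    return all(
        ((x == c) or (o < sub(x, c))) == (o <= sub(x, add(c, 1)))
        for x in VALUES for c in VALUES for o in VALUES
    )


def check_pow2_and_mask():
    # (arg u< C1) & ((arg & C2) != C2) == (arg u< C2), where C1 = 1 << k and
    # C2 = C1 - (1 << low) is a run of ones ending just below bit k.
    return all(
        ((x < (1 << k)) and ((x & c2) != c2)) == (x < c2)
        for k in range(1, WIDTH)
        for c2 in ((1 << k) - (1 << low) for low in range(k))
        for x in VALUES
    )


if __name__ == "__main__":
    for check in (check_add_sub_and_pow2, check_xor_of_masked,
                  check_xor_and_or_select, check_all_ones_compare,
                  check_eq_const_or_ult, check_pow2_and_mask):
        print(check.__name__, check())
```

A check at one small width only catches mistakes in the algebra; it says nothing about poison, undef, or flag handling, which is what the Alive2 links address.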


706 Commits