llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	caa2258250	[LLVM] Remove nuw neg (#86295 ) This patch removes APIs that create NUW neg. It is a trivial case because `sub nuw 0, X` always gets simplified into zero. I believe there are no optimization opportunities in real-world applications that can take advantage of the nuw flag. Motivated by https://github.com/llvm/llvm-project/pull/84792#discussion_r1524891134. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=da7b7478b7cbb32c09d760f6b8d0e67901e0d533&stat=instructions:u	2024-03-26 20:56:16 +08:00
Yingwei Zheng	9eb399b854	[InstCombine] Support zext nneg in `foldLogicCastConstant` (#82355 ) This patch extends [D36234](https://reviews.llvm.org/D36234) to handle `zext nneg` instructions. I found this while adding support for cast instructions in `getFreelyInvertedImpl`.	2024-02-20 23:09:00 +08:00
SahilPatidar	4b483ecd55	[InstCombine] Fix failure to fold (and %x, (sext i1 %m)) -> (select %m, %x, 0) with multiple uses of %m (#81409 ) Resolves #81288.	2024-02-19 11:07:16 +01:00
Eikansh Gupta	db870cfc9e	[InstCombine] Extract helper from matchFunnelShift (NFC) The matchFunnelShift function was doing pattern matching and creating the fshl/fshr instruction if needed. Moved the pattern matching code to function convertOrOfShiftsToFunnelShift. It can be reused for other optimizations.	2024-02-16 16:46:41 +01:00
Yingwei Zheng	470c5b8011	[InstSimplify][InstCombine] Remove unnecessary `m_c_` matchers. (#81712 ) This patch removes unnecessary `m_c_` matchers since we always canonicalize `commutative_op Cst, X` into `commutative_op X, Cst`. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u	2024-02-14 16:40:36 +08:00
Nikita Popov	074f7c2235	[InstCombine] Remove redundant fold (NFCI) This has been subsumed by simplifyAndOrWithOpReplaced().	2024-02-12 09:54:32 +01:00
Nikita Popov	35d6ae8110	[InstCombine] Handle multi-use in simplifyAndOrWithOpReplaced() (#81006 ) Slightly generalize simplifyAndOrWithOpReplaced() by allowing it to perform simplifications (without creating new instructions) in multi-use cases. This way we can remove existing patterns without worrying about multi-use edge cases. I've opted to change the general way the implementation works to be more similar to the standard simplifyWithOpReplaced(). We perform the operand replacement generically, and then try to simplify the result or create a new instruction if we're allowed to do so.	2024-02-08 09:44:51 +01:00
Yingwei Zheng	65bf93dd7b	[InstCombine] Clean up bitwise folds without one-use check (#80587 ) This patch removes some bitwise folds that fail to check the one-use constraint on the operands. See also the comments https://github.com/llvm/llvm-project/pull/77231#issuecomment-1904090035.	2024-02-08 01:15:05 +08:00
Yingwei Zheng	f37d81f8a3	[PatternMatch] Add a matching helper `m_ElementWiseBitCast`. NFC. (#80764 ) This patch introduces a matching helper `m_ElementWiseBitCast`, which is used for matching element-wise int <-> fp casts. The motivation of this patch is to avoid duplicating checks in https://github.com/llvm/llvm-project/pull/80740 and https://github.com/llvm/llvm-project/pull/80414.	2024-02-07 21:02:13 +08:00
Yingwei Zheng	4858e9c9fe	[InstCombine] Canonicalize the fcmp range check idiom into `fabs + fcmp` (#76367 ) This patch canonicalizes the fcmp range check idiom into `fabs + fcmp` since the canonicalized form is better than the original form for the backends. Godbolt: https://godbolt.org/z/x3eqPb1fz ``` and (fcmp olt/ole/ult/ule x, C), (fcmp ogt/oge/ugt/uge x, -C) --> fabs(x) olt/ole/ult/ule C or (fcmp ogt/oge/ugt/uge x, C), (fcmp olt/ole/ult/ule x, -C) --> fabs(x) ogt/oge/ugt/uge C ``` Alive2: https://alive2.llvm.org/ce/z/MRtoYq	2024-02-07 04:33:26 +08:00
elhewaty	2614672cc1	[InstCombine] Fold ((cst << x) & 1) --> x == 0 when cst is odd (#79772 ) Fold ((cst << x) & 1) to zext(x == 0) when cst is odd. Fixes: https://github.com/llvm/llvm-project/issues/73384 Alive2: https://alive2.llvm.org/ce/z/5RbaK6	2024-02-05 16:27:53 +01:00
Yingwei Zheng	f2816ff60c	[InstCombine] Simplify and/or by replacing operands with constants (#77231 ) This patch tries to simplify `X \| Y` by replacing occurrences of `Y` in `X` with 0. Similarly, it tries to simplify `X & Y` by replacing occurrences of `Y` in `X` with -1. Alive2: https://alive2.llvm.org/ce/z/cNjDTR Note: As the current implementation is too conservative in the one-use checks, I cannot remove other existing hard-coded simplifications if they involve more than two instructions (e.g., `A & ~(A ^ B) --> A & B`). Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=a085402ef54379758e6c996dbaedfcb92ad222b5&to=9d655c6685865ffce0ad336fed81228f3071bd03&stat=instructions%3Au \|stage1-O3\|stage1-ReleaseThinLTO\|stage1-ReleaseLTO-g\|stage1-O0-g\|stage2-O3\|stage2-O0-g\|stage2-clang\| \|--\|--\|--\|--\|--\|--\|--\| \|+0.01%\|-0.00%\|+0.00%\|-0.02%\|+0.01%\|+0.02%\|-0.01%\| Fixes #76554.	2024-01-31 14:30:55 +08:00
Yingwei Zheng	9acc404230	[InstCombine] Recognize more rotation patterns (#78107 ) InstCombine already handles the pattern `(shl ShVal, (X & (Width - 1))) \| (lshr ShVal, ((-X) & (Width - 1)))`. Under certain circumstances, `X & (Width - 1)` will be simplified to `X`. Therefore, this patch adds support for the pattern `(shl ShVal, X) \| (lshr ShVal, ((-X) & (Width - 1)))`. Alive2: https://alive2.llvm.org/ce/z/P7JQ2V	2024-01-18 20:29:53 +08:00
Noah Goldstein	60e8915d22	[InstCombine] Add folds for `(add/sub/disjoint_or/icmp C, (ctpop (not x)))` `(ctpop (not x))` <-> `(sub nuw nsw BitWidth(x), (ctpop x))`. The `sub` expression can sometimes be constant folded depending on the use case of `(ctpop (not x))`. This patch adds folds for the following cases: `(add/sub/disjoint_or C, (ctpop (not x)))` -> `(add/sub/disjoint_or C', (ctpop x))` `(cmp pred C, (ctpop (not x)))` -> `(cmp swapped_pred C', (ctpop x))` Where `C'` depends on how we constant fold `C` with `BitWidth(x)` for the given opcode. Proofs: https://alive2.llvm.org/ce/z/qUgfF3 Closes #77859	2024-01-15 12:05:38 -08:00
Yingwei Zheng	29f98d6c25	[InstCombine] Fold bitwise logic with intrinsics (#77460 ) This patch does the following folds: ``` bitwise(fshl (A, B, ShAmt), fshl(C, D, ShAmt)) -> fshl(bitwise(A, C), bitwise(B, D), ShAmt) bitwise(fshr (A, B, ShAmt), fshr(C, D, ShAmt)) -> fshr(bitwise(A, C), bitwise(B, D), ShAmt) bitwise(bswap(A), bswap(B)) -> bswap(bitwise(A, B)) bitwise(bswap(A), C) -> bswap(bitwise(A, bswap(C))) bitwise(bitreverse(A), bitreverse(B)) -> bitreverse(bitwise(A, B)) bitwise(bitreverse(A), C) -> bitreverse(bitwise(A, bitreverse(C))) ``` Alive2: https://alive2.llvm.org/ce/z/iZN_TL	2024-01-10 19:33:18 +08:00
Yingwei Zheng	90802e652d	[InstCombine] Handle commuted cases of the fold `((B\|C)&A)\|B -> B\|(A&C)` (#76565 ) Alive2: https://alive2.llvm.org/ce/z/Qdsqk6 The commit `f1eda23514` didn't handle other cases that commute operands.	2023-12-29 23:58:58 +08:00
Yingwei Zheng	7a1a476116	[InstCombine] Fold `(X & C1) \| C2` into `X & (C1 \| C2)` iff `(X & C2) == C2` (#76470 ) Alive2: https://alive2.llvm.org/ce/z/VKJYaS	2023-12-28 20:47:40 +08:00
Yingwei Zheng	0d454d6e59	[InstCombine] Fold xor of icmps using range information (#76334 ) This patch folds xor of icmps into a single comparison using range-based reasoning as `foldAndOrOfICmpsUsingRanges` does. Fixes #70928.	2023-12-25 07:14:31 +08:00
Yingwei Zheng	c59ea32f82	[InstCombine] Canonicalize `icmp pred (X +/- C1), C2` into `icmp pred X, C2 -/+ C1` with nowrap flag implied by with.overflow intrinsic (#75511 ) This patch tries to canonicalize the pattern `Overflow \| icmp pred Res, C2` into `Overflow \| icmp pred X, C2 +/- C1`, where `Overflow` and `Res` are return values of `xxx.with.overflow X, C1`. Alive2: https://alive2.llvm.org/ce/z/PhR_3S Fixes #75360.	2023-12-16 17:58:57 +08:00
Yingwei Zheng	9cf3e31172	[InstCombine] Explicitly fold `~(~X >>s Y)` into `X >>s Y` (#75473 ) Fixes #75369. This patch explicitly folds `~(~X >>s Y)` into `X >>s Y` to fix assertion failure in #75369.	2023-12-14 23:06:38 +08:00
Nikita Popov	6e8b17d821	[InstCombine] Support or disjoint in displaced shift fold When I originally added this fold, it did not actually fix my motivation case, where the add was represented as an or. Now that we have the disjoint flag this can finally be cleanly supported.	2023-12-07 15:00:40 +01:00
Craig Topper	56248caa3b	[InstCombine] Explicitly set disjoint flag when converting xor to or. (#74229 )	2023-12-06 09:41:59 -08:00
Nikita Popov	a1b9736e9b	[PatternMatch] Add m_c_DisjointOr (NFC) Add commutative variant of m_DisjointOr.	2023-12-06 14:05:02 +01:00
Craig Topper	3e7ca05e93	[InstCombine] Use disjoint flag instead of calling haveNoCommonBitsSet. (#74222 )	2023-12-03 12:34:49 -08:00
Craig Topper	5db1c6ed48	Revert "[InstCombine] Fix missed opportunity to fold 'or' into 'mul' operand. (#74225 )" This reverts commit e3b3c91dd0bbc8bd6f1ee562641daf1e554eb1b6. This is causing an infinite loop on stage 2 builds.	2023-12-03 01:50:44 -08:00
Craig Topper	e3b3c91dd0	[InstCombine] Fix missed opportunity to fold 'or' into 'mul' operand. (#74225 ) We were able to fold or (mul X, Y), X --> mul X, (add Y, 1) (when the multiply has no common bits with X) This patch makes the transform work if the mul operands are commuted.	2023-12-03 00:51:22 -08:00
Nikita Popov	93636581d3	[InstCombiner] Make isFreeToInvert() and friends instance functions (NFC) In order to use SQ inside of these. There doesn't seem to be any strong need for these to be static.	2023-12-01 15:40:12 +01:00
Jeremy Morse	2425e2940e	[DebugInfo][RemoveDIs] Have getInsertionPtAfterDef return an iterator (#73149 ) Part of the "RemoveDIs" project to remove debug intrinsics requires passing block-positions around in iterators rather than as instruction pointers, allowing some debug-info to reside in BasicBlock::iterator. This means getInsertionPointAfterDef has to return an iterator, and as it can return no-instruction that means returning an optional iterator. This patch changes the signature for getInsertionPtAfterDef and then patches up the various places that use it to handle the different type. This would overall be an NFC patch, however in InstCombinerImpl::freezeOtherUses I've started skipping any debug intrinsics at the returned insert-position. This should not have any _meaningful_ effect on the compiler output: at worst it means variable assignments that are skipped will now cover the freeze instruction and anything inserted before it, which should be inconsequential. Sadly: this makes the function signature ugly. This is probably the ugliest piece of fallout for the "RemoveDIs" work, but it serves the overall purpose of improving compile times and not allowing `-g` to affect compiler output, so should be worthwhile in the end.	2023-11-30 12:19:57 +00:00
Noah Goldstein	b7c0f79926	[InstCombine] Replace `isFreeToInvert` + `CreateNot` with `getFreelyInverted` This is nearly NFC; the only potential change is to the order in which values are created/named. Otherwise it is a slight speed boost/simplification to avoid having to go through the `getFreelyInverted` recursive logic twice to simplify the extra `not` op.	2023-11-20 17:59:27 -06:00
Noah Goldstein	3039691f53	[InstCombine] add `getFreeInverted` to perform folds for free inversion of op With the current logic of `if(isFreeToInvert(Op)) return Not(Op)` it's fairly easy to cause either 1) regressions or 2) infinite loops if the folds we have for `Not(Op)` ever de-sync with the cases we know are freely invertible. This patch adds `getFreeInverted`, which builds the freely inverted op along with checking for free inversion, to alleviate this problem.	2023-11-20 17:59:27 -06:00
HaohaiWen	95d584c6ac	[InstCombine] Convert or concat to fshl if opposite or concat exists (#68502 ) If two 'or' instructions concatenate variables in opposite order and the first 'or' dominates the second one, the second 'or' can be optimized to an fshl that rotates the result of the first 'or'. This can eliminate a shl and expose more optimization opportunities for bswap/bitreverse.	2023-11-20 13:12:55 +08:00
Noah Goldstein	ad9147399f	[InstCombine] Improve eq/ne by parts to handle `ult/ugt` equality pattern. `(icmp eq/ne (lshr x, C), (lshr y, C))` gets optimized to `(icmp ult/uge (xor x, y), (1 << C))`. This can cause the current equal by parts detection to miss the high-bits as it may get optimized to the new pattern. This commit adds support for detecting / combining the ult/ugt pattern. Closes #69884	2023-11-04 19:00:28 -05:00
Nikita Popov	95e4ad3f0f	[InstCombine] Remove redundant add+and fold (NFCI) This is handling a special case of demanded bits simplification (which has multi-use support for adds, so it's not applicable in that case either).	2023-10-24 17:10:27 +02:00
Nikita Popov	b5c44564e5	[InstCombine] Remove redundant folds in foldCastedBitwiseLogic() (NFCI) The vector sext limitation the comment talks about has been removed a long time ago, in https://reviews.llvm.org/D36213.	2023-10-24 17:06:39 +02:00
Nikita Popov	14b0ae439f	[InstCombine] Remove redundant fold in foldUnsignedUnderflowCheck() (NFCI) Base - Offset == 0 will get canonicalized to Base == Offset even in multi-use contexts, at which point all of these patterns already get handled by generic code.	2023-10-24 16:57:18 +02:00
HaohaiWen	8ff3e4f39b	[InstCombine] Refactor matchFunnelShift to allow more patterns (NFC) (#68474 ) The current implementation of matchFunnelShift only allows the opposite shift pattern. Refactor it to allow more patterns.	2023-10-19 09:06:30 +08:00
Yingwei Zheng	8a7e547798	[InstCombine] Canonicalize `(X +/- Y) & Y` into `~X & Y` when Y is a power of 2 (#67915 ) This patch canonicalizes the pattern `(X +/- Y) & Y` into `~X & Y` when `Y` is a power of 2 or zero. It will reduce the patterns to match in #67836 and exploit more optimization opportunities. Alive2: https://alive2.llvm.org/ce/z/LBpvRF	2023-10-12 17:18:12 +08:00
Yingwei Zheng	a7f962c007	[InstCombine] Canonicalize `and(zext(A), B)` into `select A, B & 1, 0` (#66740 ) This patch canonicalizes the pattern `and(zext(A), B)` into `select A, B & 1, 0`. Thus, we can reuse transforms `select B == even, B & 1, 0 -> 0` and `select B == odd, B & 1, 0 -> zext(B == odd)` in `InstCombine`. It is an alternative to #66676. Alive2: https://alive2.llvm.org/ce/z/598phE Fixes #66733. Fixes #66606. Fixes #28612.	2023-09-29 02:51:58 +08:00
Nikita Popov	6cd5eb1f54	[InstCombine] Avoid some uses of ConstantExpr::getZExt() (NFC) Add helpers getLosslessUnsignedTrunc/getLosslessSignedTrunc for this common pattern.	2023-09-28 17:02:33 +02:00
Yingwei Zheng	4c241a9335	[InstCombine] Fold `(-1 + A) & B` into `A ? 0 : B` where A is effectively a bool Solves issue https://github.com/llvm/llvm-project/issues/63321. This patch explicitly folds `(-1 + A) & B` into `A ? 0 : B`. Additional trunc will be created when `A` is neither i1 nor <N x i1>. https://alive2.llvm.org/ce/z/pWv9jJ Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D153148	2023-09-24 19:10:47 +08:00
Marc Auberer	1f313034cb	[InstCombine] Remove unnecessary one-use-check (#66419 ) This removes a oneUse check that is actually unnecessary. Alive2: https://alive2.llvm.org/ce/z/qEkUEf Original patch: https://reviews.llvm.org/D159380	2023-09-15 06:46:30 +02:00
Noah Goldstein	2a904f456a	[InstCombine] Rename some shadow variables; NFC Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D159325	2023-09-13 15:50:18 -05:00
Yingwei Zheng	780b046bd0	[InstCombine] Use m_c_And/m_c_Or instead of duplicate logic. NFC. See also https://reviews.llvm.org/D153148#inline-1535588	2023-09-10 23:34:23 +08:00
Marc Auberer	904ac6fe6b	[InstCombine] Fold ((A&B)^A)\|((A&B)^B) to A^B Depends on D159379 ((A & B) ^ A) \| ((A & B) ^ B) -> A ^ B (A ^ (A & B)) \| (B ^ (A & B)) -> A ^ B ((A & B) ^ B) \| ((A & B) ^ A) -> A ^ B (B ^ (A & B)) \| (A ^ (A & B)) -> A ^ B Alive2: https://alive2.llvm.org/ce/z/i44xmq Baseline tests: https://reviews.llvm.org/D159379 Reviewed By: huihuiz Differential Revision: https://reviews.llvm.org/D159380	2023-09-07 17:33:06 -07:00
Matt Arsenault	70aede228a	InstCombine: Recognize fneg(fabs) as bitcasted integer Technically increases the number of instructions if the result isn't cast back to float. Even in this case it's still probably a better canonical form since it enables FP value tracking. https://reviews.llvm.org/D151939	2023-08-31 19:07:36 -04:00
Matt Arsenault	5c0da5839d	InstCombine: Recognize fabs as bitcasted integer In the past we sort of pretended float might be implementable as a non-IEEE type but that never realistically would work. Exotic FP types would need to be added to the IR. Turning these into FP operations enables FP tracking optimizations. https://reviews.llvm.org/D151937	2023-08-31 19:03:48 -04:00
Matt Arsenault	50a9b3d8a5	InstCombine: Recognize fneg when performed as bitcasted integer This is a resurrection of D18874. This was previously wrong with fneg conflated with fsub, but we now have a proper fneg instruction. Additionally, I think it is now clearer that IR float=IEEE float, and a different bit layout would require adding a different IR type. https://reviews.llvm.org/D151934	2023-08-31 18:59:34 -04:00
XChy	8a0b2ca821	[InstCombine] Transform bitwise (A >> C - 1, zext(icmp)) -> zext (bitwise(A < 0, icmp)) This extends foldCastedBitwiseLogic to handle the similar cases. I have recently submitted a patch to implement a single fold like: (A > 0) \| (A < 0) -> zext (A != 0) But it is not general enough, and some problems like a < b & a >= b - 1 happen again. So I generalize this fold by matching the pattern bitwise(A >> C - 1, zext(icmp)), and replace A >> C - 1 with zext(A < 0) here. (C is the scalar size bits of the type of A.) Then we get bitwise(zext(A < 0), zext(icmp)), this will be folded by original code in foldCastedBitwiseLogic, into zext(bitwise(A < 0, icmp)). And finally, any related icmp fold will be automatically implemented because bitwise(icmp,icmp) had been implemented. The proof of the correctness is obvious, because the folds below were previously proved and implemented. A >> C - 1 -> zext(A < 0) bitwise(zext(A), zext(B)) -> zext(bitwise(A, B)) And the fold of this patch is the combination of folds above. Fixes https://github.com/llvm/llvm-project/issues/63751. Differential Revision: https://reviews.llvm.org/D154791	2023-07-24 13:04:32 +02:00
Nikita Popov	218f97578b	[IR] Accept non-Instruction in BinaryOperator::CreateWithCopiedFlags() (NFC) The underlying copyIRFlags() API accepts arbitrary values and can work with flags on operators (i.e. instructions or constant expressions). Remove the arbitrary limitation that the CreateWithCopiedFlags() API imposes, so we can directly pass through values matched by PatternMatch, which can be constant expressions. The attached test case works fine now, but would crash with an upcoming change to not produce `and` constant expressions.	2023-07-21 10:05:52 +02:00
Dhruv Chawla	20ae2d200d	[InstCombine] Generalize foldAndOrOfICmpEqZeroAndICmp This patch generalizes the folds implemented by foldAndOrOfICmpEqZeroAndICmp, which are: (icmp eq X, 0) \| (icmp ult Other, X) -> (icmp ule Other, X-1) (icmp ne X, 0) & (icmp uge Other, X) -> (icmp ugt Other, X-1) to the following: (icmp eq X, C) \| (icmp ult Other, (X - C)) -> (icmp ule Other, (X - (C + 1))) (icmp ne X, C) & (icmp uge Other, (X - C)) -> (icmp ugt Other, (X - (C + 1))) The function foldAndOrOfICmpEqZeroAndICmp is also renamed to foldAndOrOfICmpEqConstantAndICmp to reflect the changes. Proofs: https://alive2.llvm.org/ce/z/yXGv6q Fixes #63749. Differential Revision: https://reviews.llvm.org/D154937	2023-07-12 11:13:37 +05:30
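
Several of the integer folds listed above are plain bit-vector identities, so they can be sanity-checked by brute force at a small bit width. The following is only an illustrative sketch: it is not LLVM code, it is not how InstCombine or Alive2 works, and it says nothing about poison, flags, or one-use constraints. The names `WIDTH`, `not_`, `popcount`, and `check_folds` are hypothetical, and the 6-bit width is an assumption chosen to keep the enumeration fast.

```python
from itertools import product

WIDTH = 6                      # illustrative bit width (assumption), small enough to enumerate
MASK = (1 << WIDTH) - 1
VALUES = range(1 << WIDTH)


def not_(x):
    """Bitwise NOT truncated to WIDTH bits."""
    return ~x & MASK


def popcount(x):
    """Number of set bits (ctpop) of a non-negative integer."""
    return bin(x).count("1")


def check_folds():
    """Return {fold name: True if both sides agree on every enumerated input}."""
    pow2_or_zero = [0] + [1 << i for i in range(WIDTH)]
    odd_consts = [c for c in VALUES if c & 1]
    return {
        # ((A & B) ^ A) | ((A & B) ^ B) -> A ^ B
        "xor_of_and": all(
            (((a & b) ^ a) | ((a & b) ^ b)) == (a ^ b)
            for a, b in product(VALUES, repeat=2)),
        # ((B | C) & A) | B -> B | (A & C)
        "or_and_or": all(
            (((b | c) & a) | b) == (b | (a & c))
            for a, b, c in product(VALUES, repeat=3)),
        # (X & C1) | C2 -> X & (C1 | C2), only when (X & C2) == C2
        "and_or_const": all(
            ((x & c1) | c2) == (x & (c1 | c2))
            for x, c1, c2 in product(VALUES, repeat=3)
            if (x & c2) == c2),
        # (X +/- Y) & Y -> ~X & Y, when Y is a power of 2 or zero
        "add_sub_pow2": all(
            ((x + y) & MASK & y) == (not_(x) & y) == ((x - y) & MASK & y)
            for x in VALUES for y in pow2_or_zero),
        # ((cst << x) & 1) -> zext(x == 0), when cst is odd
        "shl_odd_and_1": all(
            ((c << x) & MASK & 1) == int(x == 0)
            for c in odd_consts for x in range(WIDTH)),
        # ctpop(not x) <-> BitWidth(x) - ctpop(x)
        "ctpop_not": all(
            popcount(not_(x)) == WIDTH - popcount(x) for x in VALUES),
        # icmp eq (lshr x, C), (lshr y, C) -> icmp ult (xor x, y), (1 << C)
        "eq_by_parts": all(
            ((x >> c) == (y >> c)) == ((x ^ y) < (1 << c))
            for x, y in product(VALUES, repeat=2) for c in range(WIDTH)),
        # (-1 + A) & B -> A ? 0 : B, where A is effectively a bool
        "dec_bool_and": all(
            ((a - 1) & MASK & b) == (0 if a else b)
            for a in (0, 1) for b in VALUES),
    }


if __name__ == "__main__":
    for name, ok in check_folds().items():
        print(f"{name}: {'holds' if ok else 'FAILS'}")
```

An exhaustive check at one small width is only a quick plausibility test; the Alive2 links in the commit messages remain the actual proofs.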


724 Commits