llvm-project

Author	SHA1	Message	Date
Noah Goldstein	81cdd35c0c	[ValueTracking] Add support for `xor`/`disjoint or` in `isKnownNonZero` Handles cases like `X ^ Y == X` / `X disjoint| Y == X`. Both of these cases have identical logic to the existing `add` case, so just converting the `add` code to a more general helper. Proofs: https://alive2.llvm.org/ce/z/Htm7pe Closes #87706	2024-04-10 13:13:43 -05:00
Noah Goldstein	2646790155	[ValueTracking] Add tests for `xor`/`disjoint or` in `isKnownNonZero`; NFC	2024-04-10 13:13:43 -05:00
Noah Goldstein	0c57a2e4b4	[ValueTracking] Add support for `xor`/`disjoint or` in `getInvertibleOperands` This strengthens our `isKnownNonEqual` logic with some fairly trivial cases. Proofs: https://alive2.llvm.org/ce/z/4pxRTj Closes #87705	2024-04-10 13:13:43 -05:00
Noah Goldstein	195d278d50	[ValueTracking] Add tests for `xor`/`disjoint or` in `getInvertibleOperands`; NFC	2024-04-10 13:13:43 -05:00
Noah Goldstein	9c545a14c0	[ValueTracking] Add support for `insertelement` in `isKnownNonZero` Inserts don't modify the data, so if all elements that end up in the destination are non-zero the result is non-zero. Closes #87703	2024-04-10 13:13:43 -05:00
Noah Goldstein	8a28b9b8ec	[ValueTracking] Add tests for `insertelement` in `isKnownNonZero`; NFC	2024-04-10 13:13:43 -05:00
Noah Goldstein	87528bfefb	[ValueTracking] Add support for `shufflevector` in `isKnownNonZero` Shuffles don't modify the data, so if all elements that end up in the destination are non-zero the result is non-zero. Closes #87702	2024-04-10 13:13:42 -05:00
Noah Goldstein	c1d3f39ae9	[ValueTracking] Add tests for `shufflevector` in `isKnownNonZero`	2024-04-10 13:13:42 -05:00
Noah Goldstein	f1ee458ddb	[ValueTracking] improve `isKnownNonZero` precision for `smax` Instead of relying on known-bits for strictly positive, use the `isKnownPositive` API. This will use `isKnownNonZero` which is more accurate. Closes #88170	2024-04-10 10:40:49 -05:00
Noah Goldstein	2ff82c2c64	[ValueTracking] Add tests for improving `isKnownNonZero` of `smax`; NFC	2024-04-10 10:40:49 -05:00
Noah Goldstein	678f32ab66	[ValueTracking] Add more conditions in to `isTruePredicate` There is one notable "regression". This patch replaces the bespoke `or disjoint` logic with a direct match. This means we fail some simplification during `instsimplify`. All the cases we fail in `instsimplify` we do handle in `instcombine` as we add `disjoint` flags. Other than that, just some basic cases. See proofs: https://alive2.llvm.org/ce/z/_-g7C8 Closes #86083	2024-04-04 12:42:58 -05:00
Noah Goldstein	74447cf46f	[ValueTracking] Add tests for deducing more conditions in `isTruePredicate`; NFC	2024-04-04 12:42:58 -05:00
Andreas Jonson	e66cfebb04	[ValueTracking] Handle range attributes (#85143) Handle the range attribute in ValueTracking.	2024-03-20 12:43:00 +01:00
Nikita Popov	0f46e31cfb	[IR] Change representation of getelementptr inrange (#84341) As part of the migration to ptradd (https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699), we need to change the representation of the `inrange` attribute, which is used for vtable splitting. Currently, inrange is specified as follows: ``` getelementptr inbounds ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, inrange i32 1, i64 2) ``` The `inrange` is placed on a GEP index, and all accesses must be "in range" of that index. The new representation is as follows: ``` getelementptr inbounds inrange(-16, 16) ({ [4 x ptr], [4 x ptr] }, ptr @vt, i64 0, i32 1, i64 2) ``` This specifies which offsets are "in range" of the GEP result. The new representation will continue working when canonicalizing to ptradd representation: ``` getelementptr inbounds inrange(-16, 16) (i8, ptr @vt, i64 48) ``` The inrange offsets are relative to the return value of the GEP. An alternative design could make them relative to the source pointer instead. The result-relative format was chosen on the off-chance that we want to extend support to non-constant GEPs in the future, in which case this variant is more expressive. This implementation "upgrades" the old inrange representation in bitcode by simply dropping it. This is a very niche feature, and I don't think trying to upgrade it is worthwhile. Let me know if you disagree.	2024-03-20 10:59:45 +01:00
Nikita Popov	00ca80938b	[ConstantFold] Fix comparison between special pointer constants This code was assuming that the LHS would always be one of GlobalVariable, BlockAddress or ConstantExpr. However, it can also be a special constant like dso_local_equivalent or no_cfi. Make sure this is handled gracefully.	2024-03-19 12:24:11 +01:00
Noah Goldstein	5265be11b1	[InstSimply] Simplify `(fmul -x, +/-0)` -> `-/+0` We already handle the `+x` case, and noticed it was missing in the bug affecting #82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes #85345	2024-03-18 15:11:55 -05:00
Noah Goldstein	6984ba7b94	[InstSimply] Add tests for simplify `(fmul -x, +/-0)`; NFC	2024-03-18 15:11:55 -05:00
Matt Arsenault	6cfd3439d4	APFloat: Fix signed zero handling in minnum/maxnum (#83376) Follow the 2019 rules and order -0 as less than +0 and +0 as greater than -0. As currently defined this isn't required for the intrinsics, but is a better QoI. This will avoid the workaround in libc added by #83158	2024-02-29 16:51:33 +05:30
Paul Walker	6a17929e9f	[LLVM][tests/Transforms/InstSimplify] Convert instances of ConstantExpr based splats to use splat(). This is mostly NFC but some output does change due to consistently inserting into poison rather than undef and using i64 as the index type for inserts.	2024-02-27 13:37:23 +00:00
Björn Pettersson	7677453886	[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854) Teaching ConstantFoldLoadFromUniformValue that types that are padded in memory can't be considered as uniform. Using the big hammer to prevent optimizations when loading from a constant for which DataLayout::typeSizeEqualsStoreSize would return false. Main problem solved would be something like this: store i17 -1, ptr %p, align 4 %v = load i8, ptr %p, align 1 If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't really tell where the padding goes. And even if we assume that the 15 most significant bits are padding, then they should be considered as undefined (even if LLVM backend typically would pad with zeroes). Anyway, for a big-endian target the load would read those most significant bits, which aren't guaranteed to be ones. So it would be wrong to constant fold the load as returning -1. If LLVM IR had been more explicit about the placement of padding, then we could allow the constant fold of the load in the example, but only for little-endian. Fixes: https://github.com/llvm/llvm-project/issues/81793	2024-02-15 15:40:21 +01:00
Yingwei Zheng	470c5b8011	[InstSimplify][InstCombine] Remove unnecessary `m_c_` matchers. (#81712) This patch removes unnecessary `m_c_` matchers since we always canonicalize `commutive_op Cst, X` into `commutive_op X, Cst`. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u	2024-02-14 16:40:36 +08:00
Danila Malyutin	cb1a9f70ec	[InstSimplify] Add trivial simplifications for gc.relocate intrinsic (#81639) Fold gc.relocate of undef and null to undef and null respectively. Similar transform is currently done by instcombine, but there is no reason to not include it here as well.	2024-02-14 02:16:32 +03:00
Yingwei Zheng	e17dded8d7	[InstSimplify] Generalize `simplifyAndOrOfFCmps` (#81027) This patch generalizes `simplifyAndOrOfFCmps` to simplify patterns like: ``` define i1 @src(float %x, float %y) { %or.cond.i = fcmp ord float %x, 0.000000e+00 %cmp.i.i34 = fcmp olt float %x, %y %cmp.i2.sink.i = and i1 %or.cond.i, %cmp.i.i34 ret i1 %cmp.i2.sink.i } define i1 @tgt(float %x, float %y) { %cmp.i.i34 = fcmp olt float %x, %y ret i1 %cmp.i.i34 } ``` Alive2: https://alive2.llvm.org/ce/z/9rydcx This patch and #80986 will fix the regression introduced by #80941. See also the IR diff https://github.com/dtcxzyw/llvm-opt-benchmark/pull/199#discussion_r1480974120.	2024-02-08 15:07:35 +08:00
Yingwei Zheng	f37d81f8a3	[PatternMatch] Add a matching helper `m_ElementWiseBitCast`. NFC. (#80764) This patch introduces a matching helper `m_ElementWiseBitCast`, which is used for matching element-wise int <-> fp casts. The motivation of this patch is to avoid duplicating checks in https://github.com/llvm/llvm-project/pull/80740 and https://github.com/llvm/llvm-project/pull/80414.	2024-02-07 21:02:13 +08:00
Yingwei Zheng	50e80e06d1	[ValueTracking] Merge `cannotBeOrderedLessThanZeroImpl` into `computeKnownFPClass` (#76360) This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into `computeKnownFPClass` to improve the signbit inference. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2024-01-31 18:26:50 +08:00
Matt Arsenault	a46422a776	Reapply "ValueTracking: Identify implied fp classes by general fcmp (#66505)" This reverts commit 0d0c2298552222b049fa3b8db5efef4b161e51e9. Includes a bug fix for fcmp one handling, as well as for positive constants.	2024-01-25 13:38:23 +05:30
Nikita Popov	49e3e75143	[ConstantFold] Clean up binop identity folding Resolve the two FIXMEs: Perform the binop identity fold with AllowRHSConstant, and remove redundant folds later in the code.	2024-01-18 10:37:48 +01:00
Nikita Popov	97e3220d63	[InstSimplify] Consider bitcast as potential cross-lane operation The bitcast might change the number of vector lanes, in which case it will be a cross-lane operation. Fixes https://github.com/llvm/llvm-project/issues/77320.	2024-01-08 15:52:58 +01:00
Nikita Popov	ade7ae4760	[InstSimplify] Add test for #77320 (NFC)	2024-01-08 15:52:58 +01:00
Yingwei Zheng	554feb0058	[InstSimplify] Simplify `select cond, undef, val` to `val` if `val = poison` implies `cond = poison` (#76465) This patch folds: ``` select cond, undef, val -> val select cond, val, undef -> val ``` iff `impliesPoison(val, cond)` returns true. Example: ``` define i32 @src1(i32 %retval.0.i.i) { %cmp.i = icmp sgt i32 %retval.0.i.i, -1 %spec.select.i = select i1 %cmp.i, i32 %retval.0.i.i, i32 undef ret i32 %spec.select.i } define i32 @tgt1(i32 %retval.0.i.i) { ret i32 %retval.0.i.i } ``` Alive2: https://alive2.llvm.org/ce/z/okJW3G Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=38c9390b59c4d2b9181614d6a909887497d3692f&to=e146f51ba278aa3bb6879a9ec651831ac8938e91&stat=instructions%3Au	2023-12-28 23:37:19 +08:00
Yingwei Zheng	8a4266a626	[InstSimplify] Fold `u/sdiv exact (mul nsw/nuw X, C), C --> X` when C is not a power of 2 (#76445) Alive2: https://alive2.llvm.org/ce/z/3D9R7d	2023-12-28 17:36:25 +08:00
Yingwei Zheng	1fea712cd1	[ValueTracking] Infer `X u<= X +nuw Y` for any Y (#75524) Alive2: https://alive2.llvm.org/ce/z/kiGxCf Fixes #70374.	2023-12-15 16:33:39 +08:00
Nikita Popov	7686d49517	[ValueTracking] Handle returned attribute with mismatched type The returned attribute can be used when it is possible to "losslessly bitcast" between the argument and return type, including between two vector types. computeKnownBits() would crash in this case, isKnownNonZero() would potentially produce a miscompile. Fixes https://github.com/llvm/llvm-project/issues/74722.	2023-12-08 17:05:13 +01:00
Mikhail Goncharov	0d0c229855	Revert "Reapply "ValueTracking: Identify implied fp classes by general fcmp (#66505)"" This reverts commit d55692d60d218f402ce107520daabed15f2d9ef6. See discussion in #66505: assertion fires in OSS build of TensorFlow.	2023-12-05 11:10:24 +01:00
Nikita Popov	460faa0c87	[InstSimplify] Check common operand with constant earlier If both icmps have the same operands and the RHS is constant, we would currently go into the isImpliedCondMatchingOperands() code path, instead of the isImpliedCondCommonOperandWithConstants() path. Both are correct, but the latter can produce more accurate results if the implication is dependent on the sign.	2023-12-01 12:18:59 +01:00
Nikita Popov	89b0044ca9	[InstSimplify] Add test for implied cond with equal ops and constant (NFC)	2023-12-01 12:18:27 +01:00
Nikita Popov	cd31cf5989	[InstSimplify] Fix or disjoint miscompile with op replacement Make sure %x does not get folded to "or disjoint %x, %x" without dropping the flag, as this would be a derefinement.	2023-12-01 11:45:09 +01:00
Nikita Popov	5a1020bb00	[InstSimplify] Add test for disjoint or miscompile (NFC) The absorption case is already handled correctly, but the idempotence case is not.	2023-12-01 11:45:09 +01:00
Matt Arsenault	d55692d60d	Reapply "ValueTracking: Identify implied fp classes by general fcmp (#66505)" This reverts commit 96a0d714d58e48c363ee6abbbcdfd7a6ce646ac1. Avoid assert with dynamic denormal-fp-math We don't recognize compares with 0 as an exact class test if we don't know the denormal mode. We could try to do better here, but it's probably not worth it. Fixes asserts reported after 1adce7d8e47e2438f99f91607760b825e5e3cc37	2023-12-01 17:51:46 +09:00
Nikita Popov	ea602cb806	[IR] Support or disjoint in hasPoisonGeneratingFlags() This fixed incorrect removal of freeze instructions.	2023-11-30 17:26:23 +01:00
Nikita Popov	ca5a01d8e4	[InstSimplify] Add test for incorrect freeze of or disjoint (NFC)	2023-11-30 17:26:23 +01:00
Nikita Popov	07c18a05e2	[InstSimplify] Fix select bit test miscompile with disjoint The select condition ensures the disjointness here. The transform is not valid without dropping the flag, which InstSimplify can't do.	2023-11-30 16:55:32 +01:00
Nikita Popov	c89553ae82	[InstSimplify] Add test for or disjoint miscompile (NFC)	2023-11-30 16:55:32 +01:00
Graham Hunter	4028dd2e93	[InstSimplify] Fold converted urem to 0 if there's no overlapping bits (#71528) When folding urem instructions we can end up not recognizing that the output will always be 0 due to Value*s being different, despite generating the same data (in this case, 2 different calls to vscale). This patch recognizes the (x << N) & (add (x << M), -1) pattern that instcombine replaces urem with after the two vscale calls have been reduced to one via CSE, then replaces with 0 when x is a power of 2 and N >= M.	2023-11-20 10:27:16 +00:00
Nikita Popov	56c1d30183	[IR] Remove support for lshr/ashr constant expressions (#71955) Remove support for the lshr and ashr constant expressions. All places creating them have been removed beforehand, so this just removes the APIs and uses of these constant expressions in tests. This is part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.	2023-11-14 09:25:14 +01:00
Matt Arsenault	0e1a52f556	ValueTracking: Handle compare gt to -inf in class identification (#72086) This apparently shows up somewhere in chromium. We also are missing a canonicalization to an equality compare with inf.	2023-11-14 10:05:38 +09:00
Matt Arsenault	d4912e8050	ValueTracking: Add some tests to cover asserts in fcmpImpliesClass Catch asserts hit after 1adce7d8e47e2438f99f91607760b825e5e3cc37	2023-11-13 15:05:52 +09:00
Hans Wennborg	96a0d714d5	Revert "ValueTracking: Identify implied fp classes by general fcmp (#66505)" This causes asserts to fire: llvm/lib/Analysis/ValueTracking.cpp:4262: std::tuple<Value *, FPClassTest, FPClassTest> llvm::fcmpImpliesClass(CmpInst::Predicate, const Function &, Value *, const APFloat *, bool): Assertion `(RHSClass == fcPosNormal || RHSClass == fcNegNormal || RHSClass == fcPosSubnormal || RHSClass == fcNegSubnormal) && "should have been recognized as an exact class test"' failed. See comments on the PR. > Previously we could recognize exact class tests performed by > an fcmp with special values (0s, infs and smallest normal). > Expand this to recognize the implied classes by a compare with a general > constant. e.g. fcmp ogt x, 1 implies positive and non-0. > > The API should be better merged with fcmpToClassTest but that > made the diff way bigger, will try to do that in a future > patch. This reverts commit dc3faf0ed0e3f1ea9e435a006167d9649f865da1.	2023-11-10 14:45:52 +01:00
Matt Arsenault	dc3faf0ed0	ValueTracking: Identify implied fp classes by general fcmp (#66505) Previously we could recognize exact class tests performed by an fcmp with special values (0s, infs and smallest normal). Expand this to recognize the implied classes by a compare with a general constant. e.g. fcmp ogt x, 1 implies positive and non-0. The API should be better merged with fcmpToClassTest but that made the diff way bigger, will try to do that in a future patch.	2023-11-10 11:39:19 +09:00
Graham Hunter	34f83e86b4	[InstSimplify] Precommit extra tests for PR71528	2023-11-08 17:02:10 +00:00
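
Several of the integer folds in the log above (`X u<= X +nuw Y` in 1fea712cd1, `udiv exact (mul nuw X, C), C --> X` in 8a4266a626, and the `(x << N) & (add (x << M), -1)` fold in 4028dd2e93) can be sanity-checked by brute force at a small bit width. The sketch below is not LLVM code and does not replace the linked Alive2 proofs. The 8-bit width and all function names are assumptions chosen for illustration, and only the unsigned variants are covered.

```python
from itertools import product

BITS = 8          # assumed width, small enough to enumerate exhaustively
MOD = 1 << BITS   # unsigned values wrap modulo 2**BITS


def check_nuw_add_ule():
    """X u<= X +nuw Y: whenever the unsigned add does not wrap, X <= X + Y."""
    for x, y in product(range(MOD), repeat=2):
        if x + y < MOD:  # models the nuw flag: skip inputs that would wrap
            if not x <= (x + y) % MOD:
                return False
    return True


def check_udiv_exact_mul_nuw():
    """udiv exact (mul nuw X, C), C --> X for every non-zero C."""
    for x, c in product(range(MOD), range(1, MOD)):
        if x * c < MOD:  # models the nuw flag on the multiply
            if (x * c) // c != x:
                return False
    return True


def check_urem_fold():
    """(x << N) & ((x << M) - 1) == 0 when x is a power of 2 and N >= M."""
    for k in range(BITS):
        x = 1 << k
        for m in range(BITS):
            for n in range(m, BITS):  # shift amounts stay below the bit width
                lhs = (x << n) % MOD
                mask = ((x << m) - 1) % MOD
                if lhs & mask != 0:
                    return False
    return True


if __name__ == "__main__":
    print("X u<= X +nuw Y:", check_nuw_add_ule())
    print("udiv exact (mul nuw X, C), C == X:", check_udiv_exact_mul_nuw())
    print("(x << N) & ((x << M) - 1) == 0:", check_urem_fold())
```

Inputs that violate a flag (for example a wrapping `mul nuw`) are skipped rather than checked, since the result is poison there and the fold has nothing to preserve.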


1421 Commits
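
The endianness argument in 7677453886 ([ConstantFolding] Do not consider padded-in-memory types as uniform) can be made concrete with a small model. This is a sketch only. It assumes, as the commit message does for the sake of argument, that the `i17` occupies a 32-bit slot with the 15 most significant bits used as padding. The helper names `store_i17` and `load_i8` are hypothetical and are not LLVM APIs.

```python
VALUE_BITS = 17   # width of the stored i17
SLOT_BYTES = 4    # assumed in-memory size of the i17 (32 bits)


def store_i17(value, padding, byteorder):
    """Model `store i17 value` into a 4-byte slot.

    `padding` supplies the 15 upper bits, which the IR leaves unspecified.
    """
    mask = (1 << VALUE_BITS) - 1
    word = ((padding << VALUE_BITS) | (value & mask)) & 0xFFFFFFFF
    return word.to_bytes(SLOT_BYTES, byteorder)


def load_i8(memory):
    """Model `load i8` from the start of the same slot."""
    return memory[0]


if __name__ == "__main__":
    for padding in (0x0000, 0x7FFF):
        little = load_i8(store_i17(-1, padding, "little"))
        big = load_i8(store_i17(-1, padding, "big"))
        print(f"padding={padding:#06x} little={little:#04x} big={big:#04x}")
```

Under these assumptions the little-endian load reads only value bits, while the big-endian load reads only padding bits, so its result depends on whatever the padding happens to contain.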