llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	2f1f6b704d	[LLVM] Use `std::move` for APInt. NFC. (#86257 ) This patch adjusts argument passing for `APInt` to improve the compile-time. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=ba3e326def3a6e5cd6d72ff5a49c74fba18de1df&stat=instructions:u	2024-03-23 14:58:25 +08:00
Andreas Jonson	e66cfebb04	[ValueTracking] Handle range attributes (#85143 ) Handle the range attribute in ValueTracking.	2024-03-20 12:43:00 +01:00
Noah Goldstein	5265be11b1	[InstSimplify] Simplify `(fmul -x, +/-0)` -> `-/+0` We already handle the `+x` case, and noticed it was missing in the bug affecting #82555 Proofs: https://alive2.llvm.org/ce/z/WUSvmV Closes #85345	2024-03-18 15:11:55 -05:00
Artem Tyurin	141145232f	[IRBuilder] Fold binary intrinsics (#80743 ) Fixes https://github.com/llvm/llvm-project/issues/61240.	2024-03-15 09:58:25 +01:00
Andreas Jonson	a3b52509d5	[InstSimplify] Use range attribute to simplify comparisons (#84627 ) Use the new range attribute from https://github.com/llvm/llvm-project/pull/84617 to simplify comparisons where both sides have range information.	2024-03-12 10:39:37 +01:00
Andreas Jonson	54bb4be018	[InstSimplify] Handle vec values when simplifying comparisons using range metadata (#84673 ) Found that this failed with an assertion when vec was used in this optimization while working on https://github.com/llvm/llvm-project/pull/84627.	2024-03-10 12:54:37 +01:00
Björn Pettersson	7677453886	[ConstantFolding] Do not consider padded-in-memory types as uniform (#81854 ) Teaching ConstantFoldLoadFromUniformValue that types that are padded in memory can't be considered as uniform. Using the big hammer to prevent optimizations when loading from a constant for which DataLayout::typeSizeEqualsStoreSize would return false. Main problem solved would be something like this: `store i17 -1, ptr %p, align 4` followed by `%v = load i8, ptr %p, align 1`. If for example the i17 occupies 32 bits in memory, then LLVM IR doesn't really tell where the padding goes. And even if we assume that the 15 most significant bits are padding, then they should be considered as undefined (even if LLVM backend typically would pad with zeroes). Anyway, for a big-endian target the load would read those most significant bits, which aren't guaranteed to be ones. So it would be wrong to constant fold the load as returning -1. If LLVM IR had been more explicit about the placement of padding, then we could allow the constant fold of the load in the example, but only for little-endian. Fixes: https://github.com/llvm/llvm-project/issues/81793	2024-02-15 15:40:21 +01:00
Yingwei Zheng	470c5b8011	[InstSimplify][InstCombine] Remove unnecessary `m_c_` matchers. (#81712 ) This patch removes unnecessary `m_c_` matchers since we always canonicalize `commutative_op Cst, X` into `commutative_op X, Cst`. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=bfc0b7c6891896ee8e9818f22800472510093864&to=d27b058bb9acaa43d3cadbf3cd889e8f79e5c634&stat=instructions:u	2024-02-14 16:40:36 +08:00
Yingwei Zheng	dc866ae49e	[ValueTracking] Move the `isSignBitCheck` helper into ValueTracking. NFC. (#81704 ) This patch moves the `isSignBitCheck` helper into ValueTracking to reuse the logic in ValueTracking/InstSimplify. Addresses the comment https://github.com/llvm/llvm-project/pull/80740#discussion_r1488440050.	2024-02-14 15:33:08 +08:00
Danila Malyutin	cb1a9f70ec	[InstSimplify] Add trivial simplifications for gc.relocate intrinsic (#81639 ) Fold gc.relocate of undef and null to undef and null respectively. A similar transform is currently done by instcombine, but there is no reason not to include it here as well.	2024-02-14 02:16:32 +03:00
Yingwei Zheng	e17dded8d7	[InstSimplify] Generalize `simplifyAndOrOfFCmps` (#81027 ) This patch generalizes `simplifyAndOrOfFCmps` to simplify patterns like: ``` define i1 @src(float %x, float %y) { %or.cond.i = fcmp ord float %x, 0.000000e+00 %cmp.i.i34 = fcmp olt float %x, %y %cmp.i2.sink.i = and i1 %or.cond.i, %cmp.i.i34 ret i1 %cmp.i2.sink.i } define i1 @tgt(float %x, float %y) { %cmp.i.i34 = fcmp olt float %x, %y ret i1 %cmp.i.i34 } ``` Alive2: https://alive2.llvm.org/ce/z/9rydcx This patch and #80986 will fix the regression introduced by #80941. See also the IR diff https://github.com/dtcxzyw/llvm-opt-benchmark/pull/199#discussion_r1480974120.	2024-02-08 15:07:35 +08:00
Yingwei Zheng	f37d81f8a3	[PatternMatch] Add a matching helper `m_ElementWiseBitCast`. NFC. (#80764 ) This patch introduces a matching helper `m_ElementWiseBitCast`, which is used for matching element-wise int <-> fp casts. The motivation of this patch is to avoid duplicating checks in https://github.com/llvm/llvm-project/pull/80740 and https://github.com/llvm/llvm-project/pull/80414.	2024-02-07 21:02:13 +08:00
Yingwei Zheng	930996e9e4	[ValueTracking][NFC] Pass `SimplifyQuery` to `computeKnownFPClass` family (#80657 ) This patch refactors the interface of the `computeKnownFPClass` family to pass `SimplifyQuery` directly. The motivation of this patch is to compute known fpclass with `DomConditionCache`, which was introduced by https://github.com/llvm/llvm-project/pull/73662. With `DomConditionCache`, we can do more optimization with context-sensitive information. Example (extracted from [fmt/format.h](`e17bc67547/include/fmt/format.h (L3555-L3566)`)): ``` define float @test(float %x, i1 %cond) { %i32 = bitcast float %x to i32 %cmp = icmp slt i32 %i32, 0 br i1 %cmp, label %if.then1, label %if.else if.then1: %fneg = fneg float %x br label %if.end if.else: br i1 %cond, label %if.then2, label %if.end if.then2: br label %if.end if.end: %value = phi float [ %fneg, %if.then1 ], [ %x, %if.then2 ], [ %x, %if.else ] %ret = call float @llvm.fabs.f32(float %value) ret float %ret } ``` We can prove the signbit of `%value` is always zero. Then the fabs can be eliminated.	2024-02-06 02:30:12 +08:00
Yingwei Zheng	50e80e06d1	[ValueTracking] Merge `cannotBeOrderedLessThanZeroImpl` into `computeKnownFPClass` (#76360 ) This patch merges the logic of `cannotBeOrderedLessThanZeroImpl` into `computeKnownFPClass` to improve the signbit inference. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2024-01-31 18:26:50 +08:00
Nikita Popov	97e3220d63	[InstSimplify] Consider bitcast as potential cross-lane operation The bitcast might change the number of vector lanes, in which case it will be a cross-lane operation. Fixes https://github.com/llvm/llvm-project/issues/77320.	2024-01-08 15:52:58 +01:00
ChipsSpectre	4444a7e89a	[InstSimplify] Simplify the expression `(a^c)&(a^~c)` to zero and `(a^c)|(a^~c)` to minus one (#76637 ) Changes the InstSimplify pass of the LLVM optimizer, such that the aforementioned expression is reduced to zero if c2==~c1. Alive2: https://alive2.llvm.org/ce/z/xkQiid Fixes https://github.com/llvm/llvm-project/issues/75692.	2024-01-03 12:01:02 +01:00
Yingwei Zheng	554feb0058	[InstSimplify] Simplify `select cond, undef, val` to `val` if `val = poison` implies `cond = poison` (#76465 ) This patch folds: ``` select cond, undef, val -> val select cond, val, undef -> val ``` iff `impliesPoison(val, cond)` returns true. Example: ``` define i32 @src1(i32 %retval.0.i.i) { %cmp.i = icmp sgt i32 %retval.0.i.i, -1 %spec.select.i = select i1 %cmp.i, i32 %retval.0.i.i, i32 undef ret i32 %spec.select.i } define i32 @tgt1(i32 %retval.0.i.i) { ret i32 %retval.0.i.i } ``` Alive2: https://alive2.llvm.org/ce/z/okJW3G Compile-time impact: http://llvm-compile-time-tracker.com/compare.php?from=38c9390b59c4d2b9181614d6a909887497d3692f&to=e146f51ba278aa3bb6879a9ec651831ac8938e91&stat=instructions%3Au	2023-12-28 23:37:19 +08:00
Yingwei Zheng	8a4266a626	[InstSimplify] Fold `u/sdiv exact (mul nsw/nuw X, C), C --> X` when C is not a power of 2 (#76445 ) Alive2: https://alive2.llvm.org/ce/z/3D9R7d	2023-12-28 17:36:25 +08:00
Paul Walker	dea16ebd26	[LLVM][IR] Replace ConstantInt's specialisation of getType() with getIntegerType(). (#75217 ) The specialisation will not be valid when ConstantInt gains native support for vector types. This is largely a mechanical change but with extra attention paid to constant folding, InstCombineVectorOps.cpp, LoopFlatten.cpp and Verifier.cpp to remove the need to call `getIntegerType()`. Co-authored-by: Nikita Popov <github@npopov.com>	2023-12-18 11:58:42 +00:00
Yingwei Zheng	741975df92	[InstCombine][InstSimplify] Pass `SimplifyQuery` to `computeKnownBits` directly. NFC. (#74246 ) This patch passes `SimplifyQuery` to `computeKnownBits` directly in `InstSimplify` and `InstCombine`. As the `DomConditionCache` in #73662 is only used in `InstCombine`, it is inconvenient to introduce a new argument `DC` to `computeKnownBits`.	2023-12-04 02:26:39 +08:00
Nikita Popov	cd31cf5989	[InstSimplify] Fix or disjoint miscompile with op replacement Make sure %x does not get folded to "or disjoint %x, %x" without dropping the flag, as this would be a derefinement.	2023-12-01 11:45:09 +01:00
Nikita Popov	07c18a05e2	[InstSimplify] Fix select bit test miscompile with disjoint The select condition ensures the disjointness here. The transform is not valid without dropping the flag, which InstSimplify can't do.	2023-11-30 16:55:32 +01:00
Nikita Popov	d9e8ae7d2f	[ValueTracking] Convert MaskedValueIsZero() to use SimplifyQuery (NFC)	2023-11-29 11:18:42 +01:00
Nikita Popov	9ca9c2cf7e	[InstSimplify] Remove redundant gep zero fold (NFC) We already handle the all zero indices case above, no need to also handle the special case of a single zero index.	2023-11-20 16:25:48 +01:00
Graham Hunter	4028dd2e93	[InstSimplify] Fold converted urem to 0 if there's no overlapping bits (#71528 ) When folding urem instructions we can end up not recognizing that the output will always be 0 due to Value*s being different, despite generating the same data (in this case, 2 different calls to vscale). This patch recognizes the (x << N) & (add (x << M), -1) pattern that instcombine replaces urem with after the two vscale calls have been reduced to one via CSE, then replaces with 0 when x is a power of 2 and N >= M.	2023-11-20 10:27:16 +00:00
Nikita Popov	2310066faa	[InstSimplify] Simplify calculation of GEP result pointer type (NFC) The result type is the same as the input pointer type, except for splat geps.	2023-11-17 17:14:07 +01:00
Nikita Popov	ebb8ffde94	[InstSimplify] Extract commutative and folds into helper (NFCI) There are a number of and folds that are repeated for both operand orders. Move these into a helper that is invoked with both orders. This is conceptually NFC, but may not be entirely so, as the order of folds may change.	2023-11-15 16:31:55 +01:00
annamthomas	98d8b688bd	[InstSimplify] Check call for FMF instead of CtxI (#71585 ) This code was incorrectly checking that the CtxI has required FMF, but the context instruction need not always be the intrinsic call. Check that the intrinsic call has the required FMF. Fixes PR71548.	2023-11-08 10:25:11 -05:00
Anna Thomas	f0cdf4b468	[InstCombine] Check FPMathOperator for Ctx before FMF check We need to check FPMathOperator for Ctx instruction before checking fast math flag on this Ctx. Ctx is not always an FPMathOperator, so explicitly check for it. Fixes #71548.	2023-11-07 10:50:19 -05:00
Nikita Popov	0c6a77baa6	[InstSimplify] Remove redundant simplifyAndOrOfICmpsWithZero() fold (NFCI) This has been subsumed by simplifyAndOrWithICmpEq().	2023-11-07 14:53:32 +01:00
Nikita Popov	fb01f683af	[InstSimplify] Remove redundant simplifyAndOrOfICmpsWithLimitConst() fold (NFCI) This fold has been subsumed by simplifyAndOrWithICmpEq().	2023-11-07 14:35:03 +01:00
Nikita Popov	060de415af	Reapply [InstCombine] Simplify and/or of icmp eq with op replacement (#70335 ) Relative to the first attempt, this contains two changes: First, we only handle the case where one side simplifies to true or false, instead of calling simplification recursively. The previous approach would return poison if one operand simplified to poison (under the equality assumption), which is incorrect. Second, we do not fold llvm.is.constant in simplifyWithOpReplaced(). We may be assuming that a value is constant, if the equality holds, but it may not actually be constant. This is nominally just a QoI issue, but the std::list implementation in libstdc++ relies on the precise behavior in a way that causes miscompiles. ----- and/or in logical (select) form benefit from generic simplifications via simplifyWithOpReplaced(). However, the corresponding fold for plain and/or currently does not exist. Similar to selects, there are two general cases for this fold (illustrated with `and`, but there are `or` conjugates). The basic case is something like `(a == b) & c`, where the replacement of a with b or b with a inside c allows it to fold to true or false. Then the whole operation will fold to either false or `a == b`. The second case is something like `(a != b) & c`, where the replacement inside c allows it to fold to false. In that case, the operand can be replaced with c, because in the case where a == b (and thus the icmp is false), c itself will already be false. As the test diffs show, this catches quite a lot of patterns in existing test coverage. This also obsoletes quite a few existing special-case and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst), but I haven't removed anything as part of this patch in the interest of risk mitigation. Fixes #69050. Fixes #69091.	2023-11-03 10:16:15 +01:00
Noah Goldstein	8c2fcf5b77	[InstSimplify] Add some basic simplifications for `llvm.ptrmask` Mostly the same as `and`. We also have a check for a useless `llvm.ptrmask` if the ptr is already known aligned. Differential Revision: https://reviews.llvm.org/D156633	2023-11-01 23:50:35 -05:00
Nikita Popov	e91812792a	[InstSimplify] Avoid ConstantExpr::getIntegerCast() (NFCI) This always works on a constant integer or integer splat, so the constant fold here should always succeed.	2023-11-01 11:15:18 +01:00
Nikita Popov	e46dd6fbc0	Revert "[InstCombine] Simplify and/or of icmp eq with op replacement (#70335 )" This reverts commit 1770a2e325192f1665018e21200596da1904a330. Stage 2 llvm-tblgen crashes when generating X86GenAsmWriter.inc and other files.	2023-10-30 18:33:03 +01:00
Nikita Popov	1770a2e325	[InstCombine] Simplify and/or of icmp eq with op replacement (#70335 ) and/or in logical (select) form benefit from generic simplifications via simplifyWithOpReplaced(). However, the corresponding fold for plain and/or currently does not exist. Similar to selects, there are two general cases for this fold (illustrated with `and`, but there are `or` conjugates). The basic case is something like `(a == b) & c`, where the replacement of a with b or b with a inside c allows it to fold to true or false. Then the whole operation will fold to either false or `a == b`. The second case is something like `(a != b) & c`, where the replacement inside c allows it to fold to false. In that case, the operand can be replaced with c, because in the case where a == b (and thus the icmp is false), c itself will already be false. As the test diffs show, this catches quite a lot of patterns in existing test coverage. This also obsoletes quite a few existing special-case and/or of icmp folds we have (e.g. simplifyAndOrOfICmpsWithLimitConst), but I haven't removed anything as part of this patch in the interest of risk mitigation. Fixes #69050. Fixes #69091.	2023-10-30 10:05:39 +01:00
Pierre van Houtryve	4fc1e7db27	[InstSimplify] Fold (a != 0) ? abs(a) : 0 (#70305 ) Solves #70204	2023-10-27 14:52:09 +02:00
Nikita Popov	4638c29c3d	[InstSimplify] Remove redundant pointer icmp fold (NFCI) This fold is already performed as part of simplifyICmpWithZero().	2023-10-26 14:30:33 +02:00
Craig Topper	a5686c2b55	[InstSimplify] Avoid use of ConstantExpr::getICmp. NFC (#67873 )	2023-09-30 13:01:29 -07:00
Craig Topper	abcaebfe3a	[InstSimplify] Use cast instead of dyn_cast+assert. NFC	2023-09-29 22:08:53 -07:00
Nikita Popov	b35f2940e9	[InstSimplify] Avoid use of ConstantExpr::getCast() Use the constant folding API instead. One of these uses actually improves results, because the bitcast expression gets folded away.	2023-09-29 10:23:40 +02:00
Nikita Popov	a09e32e5fe	[InstSimplify] Respect UseInstrInfo in more folds Some folds using m_NUW, m_NSW style matchers were missed, make sure they respect UseInstrInfo. This is part of #53218, but not a complete fix for the issue.	2023-09-26 13:54:03 +02:00
Yingwei Zheng	0d821b22e0	[InstSimplify] Generalize fold for icmp ugt/ule (pow2 << X), signmask Alive2: https://alive2.llvm.org/ce/z/wZ41t7	2023-09-25 00:07:20 +08:00
Nikita Popov	c41b4b6397	[InstCombine] Make flag drop during select equiv fold more generic Instead of unsetting flags on the instruction, attempting the fold, and then resetting the flags if it failed, add support to simplifyWithOpReplaced() to ignore poison-generating flags/metadata and collect all instructions where they may need to be dropped. This allows us to perform the fold a) with poison-generating metadata, which was previously not handled and b) poison-generating flags/metadata that are not on the root instruction. Proof for the ctpop case: https://alive2.llvm.org/ce/z/3H3HFs Fixes https://github.com/llvm/llvm-project/issues/62450.	2023-09-19 14:54:25 +02:00
Yingwei Zheng	be2723da5c	[InstSimplify] Fold icmp of `X and/or C1` and `X and/or C2` into constant (#65905 ) This patch simplifies the pattern `icmp X and/or C1, X and/or C2` when one constant mask is the subset of the other. If `C1 & C2 == C1`, `A = X and/or C1`, `B = X and/or C2`, we can do the following folds: `icmp ule A, B -> true` `icmp ugt A, B -> false` We can apply similar folds for signed predicates when `C1` and `C2` are the same sign: `icmp sle A, B -> true` `icmp sgt A, B -> false` Alive2: https://alive2.llvm.org/ce/z/Q4ekP5 Fixes #65833.	2023-09-18 21:32:48 +08:00
Paul Walker	c7d65e4466	[IR] Enable load/store/alloca for arrays of scalable vectors. Differential Revision: https://reviews.llvm.org/D158517	2023-09-14 13:49:01 +00:00
Matt Arsenault	00061843bd	InstSimplify: Simplifications for ldexp Ported from old amdgcn intrinsic which will soon be deleted. https://reviews.llvm.org/D149587	2023-09-13 08:38:48 +03:00
Matt Arsenault	6f2e943de6	InstSimplify: Handle folding fcmp with literal nans without a context instruction Fixes reported assert after ddb3f12c428bc4bd5a98913d74dfd7f2402bdfd8	2023-09-02 10:22:09 -04:00
Matt Arsenault	5dcd6669ff	InstSimplify: Handle exp10(log10(x)) -> x Copy from exp/exp2 case. https://reviews.llvm.org/D157894	2023-09-02 09:21:47 -04:00
Matt Arsenault	da077a52c4	InstSimplify: Handle log10(exp10(x)) Copied from the exp/exp2 cases https://reviews.llvm.org/D157894	2023-09-02 08:57:54 -04:00
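
Two of the folds listed above (4444a7e89a and the unsigned part of be2723da5c) are plain bitwise identities, so they can be sanity-checked by brute force on a small integer width. The sketch below is only an illustration: it models an i8 with Python integers, the function names are hypothetical, and it is neither LLVM code nor a replacement for the Alive2 proofs linked in the commit messages.

```python
# Brute-force check over all 8-bit values; an illustration, not LLVM code.
MASK = 0xFF  # model an i8 as an unsigned value in [0, 255]


def xor_pair_folds_hold():
    """Check (a^c)&(a^~c) == 0 and (a^c)|(a^~c) == -1 for every i8 a, c."""
    for a in range(256):
        for c in range(256):
            not_c = ~c & MASK  # bitwise NOT truncated to 8 bits
            if (a ^ c) & (a ^ not_c) != 0:
                return False
            if (a ^ c) | (a ^ not_c) != MASK:  # all ones is -1 in i8
                return False
    return True


def subset_mask_folds_hold():
    """If C1 & C2 == C1, check X&C1 ule X&C2 and X|C1 ule X|C2."""
    for c1 in range(256):
        for c2 in range(256):
            if c1 & c2 != c1:
                continue  # C1 is not a subset of C2; the fold does not apply
            for x in range(256):
                # Values are non-negative, so > is an unsigned comparison.
                if (x & c1) > (x & c2) or (x | c1) > (x | c2):
                    return False
    return True
```

Both functions should return `True`. The first holds because `a^c` and `a^~c` differ in every bit, so their AND is zero and their OR is all ones. The second holds because every bit set on the left-hand side is also set on the right-hand side.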


1069 Commits