llvm-project

Author	SHA1	Message	Date
Yingwei Zheng	a77dedcacb	[InstSimplify][InstCombine][ConstantFold] Move vector div/rem by zero fold to InstCombine (#114280 ) Previously we fold `div/rem X, C` into `poison` if any element of the constant divisor `C` is zero or undef. However, it is incorrect when threading udiv over a vector select: https://alive2.llvm.org/ce/z/3Ninx5 ``` define <2 x i32> @vec_select_udiv_poison(<2 x i1> %x) { %sel = select <2 x i1> %x, <2 x i32> <i32 -1, i32 -1>, <2 x i32> <i32 0, i32 1> %div = udiv <2 x i32> <i32 42, i32 -7>, %sel ret <2 x i32> %div } ``` In this case, `threadBinOpOverSelect` folds `udiv <i32 42, i32 -7>, <i32 -1, i32 -1>` and `udiv <i32 42, i32 -7>, <i32 0, i32 1>` into `zeroinitializer` and `poison`, respectively. One solution is to introduce a new flag indicating that we are threading over a vector select. But it requires modifying both `InstSimplify` and `ConstantFold`. However, this optimization doesn't provide benefits to real-world programs: https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/IR/ConstantFold.cpp.html#L908 https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp.html#L1107 This patch moves the fold into InstCombine to avoid breaking numerous existing tests. Fixes #114191 and #113866 (only poison-safety issue).	2024-11-01 22:56:22 +08:00
Yingwei Zheng	e577f14b67	[InstCombine] Use `m_NotForbidPoison` when folding `(X u< Y) ? -1 : (~X + Y) --> uadd.sat(~X, Y)` (#114345 ) Alive2: https://alive2.llvm.org/ce/z/mTGCo- We cannot reuse `~X` if `m_AllOnes` matches a vector constant with some poison elts. An alternative solution is to create a new not instead of reusing `~X`. But it isn't worth the effort because we need to add a one-use check. Fixes https://github.com/llvm/llvm-project/issues/113869.	2024-11-01 22:18:44 +08:00
Yingwei Zheng	96b14f2ccb	[Reland][InstCombine] Fix FMF propagation in `foldSelectIntoOp` (#114499 ) Relands #114356. Compared to the last version, this patch only merges poison-generating/nsz flags from the select to fix LV regression in `llvm/test/Transforms/PhaseOrdering/AArch64/predicated-reduction.ll`.	2024-11-01 12:22:57 +08:00
gulfemsavrun	d183dc7c24	Revert "[InstCombine] Fix FMF propagation in `foldSelectIntoOp`" (#114458 ) Reverts llvm/llvm-project#114356 because it caused test failures. https://lab.llvm.org/buildbot/#/builders/190/builds/8601 https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-base-linux-x64/b8732549597609293617/overview	2024-10-31 13:21:52 -07:00
Yingwei Zheng	cf1963afad	[InstCombine] Fix FMF propagation in `foldSelectIntoOp` (#114356 ) Closes https://github.com/llvm/llvm-project/issues/113423.	2024-10-31 23:26:45 +08:00
Yingwei Zheng	18311093ab	[InstCombine] Do not fold `shufflevector(select)` if the select condition is a vector (#113993 ) Since `shufflevector` is not element-wise, we cannot fold it into the select when the select condition is a vector. For shufflevector that doesn't change the length, it doesn't crash, but it is still a miscompilation: https://alive2.llvm.org/ce/z/s8saCx Fixes https://github.com/llvm/llvm-project/issues/113986.	2024-10-29 10:39:07 +08:00
David Majnemer	902acde341	[InstCombine] Optimize away certain additions using modular arithmetic We can turn: ``` %add = add i8 %arg, C1 %and = and i8 %add, C2 %cmp = icmp eq i8 %and, C3 ``` into: ``` %and = and i8 %arg, C2 %cmp = icmp eq i8 %and, (C3 - C1) & C2 ``` This is only worth doing if the sequence is the sole user of the addition operation.	2024-10-28 22:51:35 +00:00
Matthias Braun	5903c6af44	InstCombine: Fold shufflevector(select) and shufflevector(phi) (#113746 ) - Transform `shufflevector(select(c, x, y), C)` to `select(c, shufflevector(x, C), shufflevector(y, C))` by re-using the `FoldOpIntoSelect` helper. - Transform `shufflevector(phi(x, y), C)` to `phi(shufflevector(x, C), shufflevector(y, C))` by re-using the `foldOpIntoPhi` helper.	2024-10-28 15:35:17 -07:00
Yingwei Zheng	f78610af3f	[InstCombine] Add function attribute `instcombine-no-verify-fixpoint` (#113822 ) This patch introduces a function attribute `instcombine-no-verify-fixpoint` to avoid disabling fix-point verification for unrelated tests in the same file. Addresses comment https://github.com/llvm/llvm-project/pull/112642#discussion_r1804714387.	2024-10-28 17:45:08 +08:00
Yingwei Zheng	5155c38cee	[InstCombine] Don't check uses of constant exprs (#113684 ) This patch skips constant expressions to avoid iterating over uses on other functions. Fix crash reported in https://github.com/llvm/llvm-project/pull/105510#issuecomment-2437521147.	2024-10-28 15:09:20 +08:00
David Majnemer	5d4a0d54b5	[InstCombine] Teach takeLog2 about right shifts, truncation and bitwise-and We left some easy opportunities for further simplifications. log2(trunc(x)) is simply trunc(log2(x)). This is safe if we know that trunc is NUW because it means that the truncation didn't drop any bits. It is also safe if the caller is OK with zero as a possible answer. log2(x >>u y) is simply `log2(x) - y`. log2(x & y) is a funny one. It comes up when doing something like: ``` unsigned int f(unsigned int x, unsigned int y) { unsigned char a = 1u << x; return y / a; } ``` LLVM would canonicalize this to: ``` %shl = shl nuw i32 1, %x %conv1 = and i32 %shl, 255 %div = udiv i32 %y, %conv1 ``` In cases like these, we can ignore the mask entirely. This is equivalent to `y >> x`.	2024-10-28 05:13:04 +00:00
Jay Foad	90cdc03e7f	[IR] Fix undiagnosed cases of structs containing scalable vectors (#113455 ) Type::isScalableTy and StructType::containsScalableVectorType failed to detect some cases of structs containing scalable vectors because containsScalableVectorType did not call back into isScalableTy to check the element types. Fix this, which requires sharing the same Visited set in both functions. Also change the external API so that callers are never required to pass in a Visited set, and normalize the naming to isScalableTy.	2024-10-25 12:56:10 +01:00
Noah Goldstein	294726d738	Reapply "[InstCombine] Folding `(icmp eq/ne (and X, -P2), INT_MIN)`" (#111236 ) The underlying issue with msan was fixed by #113200	2024-10-23 09:12:08 -05:00
Andreas Jonson	00b47b98d4	[NFC] Fix misplaced comment	2024-10-22 20:51:46 +02:00
XChy	a2ba438f3e	[InstCombine] Preserve the flag from RHS only if the `and` is bitwise (#113164 ) Fixes #113123 Alive proof: https://alive2.llvm.org/ce/z/hnqeLC	2024-10-21 22:30:31 +08:00
Kazu Hirata	8819267747	[InstCombine] Simplify code with SmallMapVector::operator[] (NFC) (#113022 )	2024-10-19 14:38:40 -07:00
Ramkumar Ramachandra	7b65971e1f	InstCombine: sink loads with invariant.load metadata (#112692 )	2024-10-18 10:35:56 +01:00
goldsteinn	c85611e858	[SimplifyLibCall][Attribute] Fix bug where we may keep `range` attr with incompatible type (#112649 ) In a variety of places we change the bitwidth of a parameter but don't update the attributes. The issue in this case is from the `range` attribute when inlining `__memset_chk`. `optimizeMemSetChk` will replace an `i32` with an `i8`, and if the `i32` had a `range` attr associated it will cause an error. Fixes #112633	2024-10-17 10:32:55 -05:00
Jay Foad	85c17e4092	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706 ) Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.	2024-10-17 16:20:43 +01:00
Yingwei Zheng	095d49da76	[InstCombine] Set `samesign` when converting signed predicates into unsigned (#112642 ) Alive2: https://alive2.llvm.org/ce/z/6cqdt-	2024-10-17 20:43:48 +08:00
Nikita Popov	0f7d148db4	[InstCombine] Add shared helper for logical and bitwise and/or (NFC) Add a helper for shared folds between logical and bitwise and/or and move the and/or of icmp and fcmp folds in there. This makes it easier to extend to more folds. A possible extension would be to base the current and/or of icmp reassociation logic on this helper, so that it for example also applies to fcmp.	2024-10-17 14:25:44 +02:00
Ramkumar Ramachandra	682fa797b7	InstCombine/Select: remove redundant code (NFC) (#112388 ) InstCombinerImpl::foldSelectInstWithICmp has some inlined code for select-icmp-xor simplification, but this simplification is already done by other code, via another path: (X & Y) == 0 ? X : X ^ Y -> ((X & Y) == 0 ? 0 : Y) ^ X -> (X & Y) ^ X -> X & ~Y Cover the cases that it claims to simplify, and demonstrate that stripping it doesn't cause test changes.	2024-10-16 12:44:09 +01:00
Yingwei Zheng	0936195311	[InstCombine] Drop `samesign` in InstCombine (#112480 ) Closes https://github.com/llvm/llvm-project/issues/112476.	2024-10-16 19:13:52 +08:00
Yingwei Zheng	3bf2295ee0	[InstCombine] Drop `samesign` flag in `foldAndOrOfICmpsWithConstEq` (#112489 ) In `5dbfca30c1` we assume that RHS is poison implies LHS is also poison. It doesn't hold after introducing samesign flag. This patch drops the `samesign` flag on RHS if the original expression is a logical and/or. Closes #112467.	2024-10-16 16:24:44 +08:00
Alexey Bader	583fa4f5b7	[InstCombine] Extend fcmp+select folding to minnum/maxnum intrinsics (#112088 ) Today, InstCombine can fold fcmp+select patterns to minnum/maxnum intrinsics when the nnan and nsz flags are set. The ordering of the operands in both the fcmp and select instructions is important for the folding to occur. maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ogt, oge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ule, ult} The second pattern is supposed to make the order of the operands in the select instruction irrelevant. However, the pattern matching code uses the CmpInst::getInversePredicate method to invert the comparison predicate. This method doesn't take into account the fast-math flags, which can lead to missing the folding opportunity. The patch extends the pattern matching code to handle unordered fcmp instructions. This allows the folding to occur even when the select instruction has the operands in the inverse order. New maxnum patterns: 1. (a op b) ? a : b -> maxnum(a, b), where op is one of {ugt, uge} 2. (a op b) ? b : a -> maxnum(a, b), where op is one of {ole, olt} The same changes are applied to the minnum intrinsic.	2024-10-15 22:05:16 +04:00
Ramkumar Ramachandra	1c6c850937	InstCombine: extend select-equiv to support vectors (#111966 ) foldSelectEquivalence currently doesn't support GVN-like replacements on vector types. Put in the checks for potentially lane-crossing operations, and lift the limitation.	2024-10-15 11:10:45 +01:00
Yingwei Zheng	9edc454ee6	[InstCombine] Drop range attributes in `foldIsPowerOf2OrZero` (#112178 ) Closes https://github.com/llvm/llvm-project/issues/112078.	2024-10-14 20:52:55 +08:00
Ramkumar Ramachandra	c5f82f7893	ValueTracking: introduce llvm::isNotCrossLaneOperation (#112011 ) Factor out and unify common code from InstSimplify and InstCombine that partially guard against cross-lane vector operations into llvm::isNotCrossLaneOperation in ValueTracking. Alive2 proofs for changed tests: https://alive2.llvm.org/ce/z/68H4ka	2024-10-14 11:37:30 +01:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Yingwei Zheng	6a65e98fa7	[InstCombine] Drop range attributes in `foldIsPowerOf2` (#111946 ) Fixes https://github.com/llvm/llvm-project/issues/111934.	2024-10-11 18:19:21 +08:00
Arthur Eubanks	e34d614e7d	[Passes] Remove -enable-infer-alignment-pass flag (#111873 ) This flag has been on for a while without any complaints.	2024-10-10 12:28:46 -07:00
Kazu Hirata	2d8cd32ae5	[InstCombine] Avoid repeated hash lookups (NFC) (#111618 )	2024-10-08 20:37:33 -07:00
David Green	d2408c417c	[InstCombine] Canonicalize more geps with constant gep bases and constant offsets. (#110033 ) This is another small but hopefully not performance negative step to canonicalizing towards i8 geps. We look for geps with a constant offset base pointer of the form `gep (gep @glob, C1), x, C2` and expand the gep instruction, so that the constant can hopefully be combined together (or the x offset can be computed in common).	2024-10-06 10:44:21 +01:00
Vitaly Buka	574266ce33	Revert "[InstCombine] Folding `(icmp eq/ne (and X, -P2), INT_MIN)`" (#111236 ) Reverts #110880 because of an exposed issue in Msan instrumentation, #111212. This reverts commit a64643688526114b50c25b3eda8a57855bd2be87.	2024-10-04 23:20:40 -07:00
Benjamin Maxwell	6b3220afa6	[InstCombine] Avoid crash on aggregate types in SimplifyDemandedUseFPClass (#111128 ) This disables folding for FP aggregates that are not poison/posZero types, which is currently not supported. Note: Fully handling these aggregates would also likely require teaching `computeKnownFPClass()` to handle array and struct constants (which does not seem to be implemented outside of zero init).	2024-10-04 15:00:24 +01:00
Nikita Popov	67d247a441	[InstCombine] Decompose more icmps into masks (#110836 ) Extend decomposeBitTestICmp() to handle cases where the resulting comparison is of the form `icmp (X & Mask) pred C` with non-zero `C`. Add a flag to allow code to opt-in to this behavior and use it in the "log op of icmp" fold infrastructure. This addresses regressions from #97289. Proofs: https://alive2.llvm.org/ce/z/hUhdbU	2024-10-04 10:17:23 +02:00
Noah Goldstein	a646436885	[InstCombine] Folding `(icmp eq/ne (and X, -P2), INT_MIN)` Folds to `(icmp slt/sge X, (INT_MIN + P2))` Proofs: https://alive2.llvm.org/ce/z/vpNFY5 Closes #110880	2024-10-03 13:05:08 -05:00
Stephen Tozer	caa265e01c	[DebugInfo][InstCombine] Do not overwrite prior DILocation for new Insts (#108565 ) When InstCombine replaces an old instruction with a new instruction, it copies !dbg and !annotation metadata from old to new. For some InstCombine patterns we set a specific DILocation on the new instruction prior to insertion, however, which more accurately reflects the new instruction. This more specific DILocation may be overwritten on insertion by a less appropriate one, resulting in a less correct line mapping. This patch changes this behaviour to only copy the DILocation from old to new if the new instruction has no existing DILocation (which will always be the case for a new instruction unless InstCombine has specifically set one).	2024-10-03 17:08:45 +01:00
Marina Taylor	d0d12fc78a	[InstCombine] Fold (X==Z) ? (Y==Z) : (!(Y==Z) && X==Y) --> X==Y (#108619 ) This corresponds to the canonicalized form of some logic that was seen in Swift-generated code for comparing optional pointers: `(X==Z || Y==Z) ? (X==Z && Y==Z) : X==Y --> X==Y` where `Z` was the constant `0`. https://alive2.llvm.org/ce/z/J_3aa9	2024-10-03 15:33:30 +01:00
Nikita Popov	7de492f90d	[InstCombine] Preserve nuw flag in indexed compare fold If all the involved GEPs have the nuw flag, also preserve it on the resulting adds and GEPs.	2024-10-02 16:03:47 +02:00
Yingwei Zheng	62cd07fb67	[InstCombine] Canonicalize `sub mask, X -> ~X` when high bits are ignored (#110635 ) Alive2: https://alive2.llvm.org/ce/z/NJgBPL The motivating case of this patch is to emit `andn` on RISC-V with zbb for expressions like `(sub 63, X) & 63`.	2024-10-02 12:48:06 +08:00
Nikita Popov	e565a4fa0b	[IR] Extract helper for GEPNoWrapFlags intersection (NFC) When combining two geps into one by adding the offsets, we have to take some care when intersecting the flags, because nusw flags cannot be straightforwardly preserved. Add a helper for this on GEPNoWrapFlags so we won't have to repeat this logic in various places.	2024-10-01 16:58:23 +02:00
Yingwei Zheng	2a2c35a9a6	[InstCombine] Fold `icmp spred (mul nsw X, Z), (mul nsw Y, Z)` into `icmp spred X, Y` (#110630 ) ``` icmp spred (mul nsw X, Z), (mul nsw Y, Z) -> icmp spred X, Y iff Z > 0 icmp spred (mul nsw X, Z), (mul nsw Y, Z) -> icmp spred Y, X iff Z < 0 ``` Alive2: https://alive2.llvm.org/ce/z/9fXFfn	2024-10-01 22:16:05 +08:00
Nikita Popov	e2a855def5	[InstCombine] Fix SimplifyDemandedBits recursion cutoff for Arguments There was a discrepancy between how SimplifyDemandedBits and computeKnownBits handled the Argument case. computeKnownBits() would use information from range attributes even once the recursion limit has been reached. Fixes https://github.com/llvm/llvm-project/issues/110631.	2024-10-01 11:44:13 +02:00
Yingwei Zheng	1efd1227b2	[InstCombine] Fold `icmp eq/ne (X nw Z), (Y nw Z) -> icmp eq/ne Z, 0` when `X != Y` (#110413 ) Alive2: https://alive2.llvm.org/ce/z/9oDP6K I found this pattern in `04e75858d7/casadi/core/repmat.cpp (L70-L78)`.	2024-09-30 10:21:20 +08:00
Simon Pilgrim	795c24c6fb	[InstCombine] foldVecExtTruncToExtElt - extend to handle trunc(lshr(extractelement(x,c1),c2)) -> extractelement(bitcast(x),c3) patterns. (#109689 ) This patch moves the existing trunc+extractelement -> extractelement+bitcast fold into a foldVecExtTruncToExtElt helper and extends the helper to handle trunc+lshr+extractelement cases as well. Fixes #107404	2024-09-28 17:52:10 +01:00
Ramkumar Ramachandra	1832d609f7	InstCombine/Demanded: simplify srem case (NFC) (#110260 ) The srem case of SimplifyDemandedUseBits partially duplicates KnownBits::srem. It is guarded by a statement that takes the absolute value of the RHS and checks whether it is a power of 2, but the abs() call here is useless, since an srem with a negative RHS is flipped into one with a positive RHS, adjusting LHS appropriately. Stripping the abs call allows us to call KnownBits::srem instead of partially duplicating it.	2024-09-27 19:12:35 +01:00
Nikita Popov	5ef02a3fd4	[InstCombine] Fall through to computeKnownBits() for sdiv by -1 When dividing by -1 we were breaking out of the code entirely, while we should fall through to computeKnownBits(). This fixes an instcombine-verify-known-bits discrepancy. Fixes https://github.com/llvm/llvm-project/issues/109957.	2024-09-25 14:23:06 +02:00
Nikita Popov	b8d1bae648	[CmpInstAnalysis] Return decomposed bit test as struct (NFC) (#109819 ) decomposeBitTestICmp() currently returns the result via two out parameters plus an in-place modification of Pred. This changes it to return an optional struct instead. The motivation here is twofold. First, I'd like to extend this code to handle cases where the comparison is against a value other than zero, which would mean yet another out parameter. Second, while doing that I was badly bitten by the in-place modification, so I'd like to get rid of it.	2024-09-25 10:14:15 +02:00
Marina Taylor	5cd0900ef6	[InstCombine] Compare `icmp inttoptr, inttoptr` values directly (#107012 ) InstCombine already has some rules for `icmp ptrtoint, ptrtoint` to drop the casts and compare the source values. This change adds the same for the reverse case with `inttoptr`.	2024-09-24 09:39:07 +02:00
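
The modular-arithmetic rewrite described in commit 902acde341 can be sanity-checked by brute force. The sketch below is not LLVM code: it models the two IR sequences as plain Python functions on a toy 5-bit width (the commit message uses i8), and it assumes two preconditions that the commit message does not spell out, namely that `C2` is a low-bit mask and that `C3` has no bits outside `C2`. All function names are hypothetical.

```python
BITS = 5                      # toy width so the exhaustive loop stays small
WRAP = (1 << BITS) - 1        # models wrap-around of fixed-width integers


def before(x, c1, c2, c3):
    """Models: icmp eq (and (add x, c1), c2), c3"""
    return (((x + c1) & WRAP) & c2) == c3


def after(x, c1, c2, c3):
    """Models: icmp eq (and x, c2), (c3 - c1) & c2"""
    return (x & c2) == (((c3 - c1) & WRAP) & c2)


def fold_holds(c2):
    """Compare both forms for every x, every c1, and every c3 contained in c2."""
    values = range(1 << BITS)
    return all(
        before(x, c1, c2, c3) == after(x, c1, c2, c3)
        for x in values
        for c1 in values
        for c3 in values
        if c3 & ~c2 == 0      # assumed precondition: c3 is a subset of c2
    )


# Assumed precondition: masks of the form 2^k - 1 (0, 1, 3, 7, ...).
LOW_BIT_MASKS = [(1 << k) - 1 for k in range(BITS + 1)]
```

Under these assumptions `fold_holds` succeeds for every entry of `LOW_BIT_MASKS`. It fails for a mask such as `0b10`, where a carry out of the masked-off low bit changes the result, which suggests why the fold needs a condition on the constants.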


6354 Commits
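
The signed-compare fold from commit 2a2c35a9a6 (`icmp spred (mul nsw X, Z), (mul nsw Y, Z)`) can be checked the same way. This is a minimal sketch rather than the InstCombine implementation: it uses a toy 5-bit signed range, represents a poison result from `mul nsw` overflow as `None`, and skips those inputs, since a fold may return anything when the original value is poison. The helper names are hypothetical.

```python
import operator

BITS = 5  # toy width so the exhaustive loop stays small
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1

SIGNED_PREDICATES = {
    "slt": operator.lt,
    "sle": operator.le,
    "sgt": operator.gt,
    "sge": operator.ge,
}


def mul_nsw(a, b):
    """Signed multiply; None stands in for poison on signed overflow."""
    product = a * b
    return product if LO <= product <= HI else None


def fold_holds_for(z):
    """Check icmp spred (x*z), (y*z) against the folded compare for one z."""
    for x in range(LO, HI + 1):
        for y in range(LO, HI + 1):
            xz, yz = mul_nsw(x, z), mul_nsw(y, z)
            if xz is None or yz is None:
                continue  # poison operand: nothing to preserve
            for pred in SIGNED_PREDICATES.values():
                # Z > 0 keeps the operand order, Z < 0 swaps it.
                folded = pred(x, y) if z > 0 else pred(y, x)
                if pred(xz, yz) != folded:
                    return False
    return True
```

The check passes for every non-zero `z` in the toy range and fails for `z == 0`, where both products collapse to zero and the comparison no longer says anything about `X` and `Y`.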