llvm-project

Author	SHA1	Message	Date
Vladimir Radosavljevic	57d1fbf62c	[InstCombine] Limit (icmp eq/ne (and (add A, Addend), Msk), C) fold to one use of and (#172858 ) If the and has multiple uses, the fold can increase the instruction count.	2026-02-07 03:09:27 +08:00
Andreas Jonson	faa4b97b10	[InstCombine] fold icmp ne (and X, 1), 0 --> trunc X to i1 (#178977) Remove the vector check so that this fold is always done. Proof: https://alive2.llvm.org/ce/z/oabD6J Closes #172888	2026-02-03 19:14:27 +01:00
Nathan Corbyn	efe75626cd	[InstCombine] Add combines for unsigned comparison of absolute value to constant (#176148 ) This patch implements the following two peephole optimisations: 1. ``` abs(X) u> K --> K >= 0 ? `X + K u> 2 * K` : `false` ```; 2. If `abs(INT_MIN)` is `poison`, ```abs(X) u< K --> K >= 1 ? `X + (K - 1) u<= 2 * (K - 1)` : K != 0```. See the following Alive2 proofs: [1](https://alive2.llvm.org/ce/z/J2SRSv) and [2](https://alive2.llvm.org/ce/z/tfxTrU).	2026-01-29 02:49:55 +08:00
Alan Zhao	f2921e536b	[InstCombine][profcheck] More fixes for missing branch data in InstCombineCompares.cpp (#178084 ) Again, these fixes are trivial as we're creating new select instructions with predicates from existing select instructions. In this case, we create one select instruction from two existing select instructions, but since both existing select instructions have the same predicate, their profile data should be the same, so we can reuse the profile data from either instruction. Therefore, we arbitrarily reuse the profile data from the first select instruction. Tracking issue: #147390	2026-01-27 11:20:15 -08:00
Alan Zhao	88257505b0	[InstCombine][profcheck] Fix missing branch data in InstCombineCompares.cpp (#178070 ) These are trivial fixes where we create a new select instruction with the same conditional as an existing select. Tracking issue: #147390	2026-01-26 23:25:21 +00:00
Manasij Mukherjee	4bc2e4b4c1	[InstCombine] Add new pattern to foldICmpAddConstant (#175876 ) icmp ult (add nuw X, (lshr A, ShAmtC)), C --> icmp ult A, C when C <= (1 << ShAmtC) Pattern found in ffmpeg according to the report https://alive2.llvm.org/ce/z/rpY8LY Fixes https://github.com/llvm/llvm-project/issues/167178	2026-01-17 15:45:13 +00:00
Justin Lebar	bbcab0bf57	[InstCombine] Fix i1 ssub.sat compare folding (#173742 ) For every type other than i1, ssub.sat x, y = 0 implies x == y. But ssub.sat.i1 0, -1 = 0 (because the result of 1 saturates to 0). The changes to instcombine are not strictly necessary. Instcombine canonicalizes the ssub.sat.i1 before we arrive at these pattern-matches. The real fix is in ValueTracking. Nonetheless we agreed in review it makes sense to add these checks to instcombine, even though they're currently unreachable: https://github.com/llvm/llvm-project/pull/173742#issuecomment-3696631396 This was found by a fuzzer I'm working on!	2026-01-12 11:03:00 -08:00
Justin Lebar	5243501cca	[InstCombine] Guard foldICmpSRemConstant against zero divisors (#173702 ) instcombine can create srem X, 0 or icmp ult X, 0 mid-pass when operands fold to zero, which trips assertions in foldICmpSRemConstant. Bail out on zero divisors / zero ULT constants instead of asserting, and add a regression test from the minimized reproducer. This was found by a fuzzer I'm working on. The high-level design is to randomly generate LLVM IR, run a pass on it, and then run the original and new IR through the interpreter. They should produce the same results. Right now I'm only fuzzing instcombine.	2026-01-09 10:32:22 +01:00
Wenju He	993054d96f	[InstCombine] Fold redundant FP clamp selects; relax min-max-pattern bailout in visitFCmp (#173452 ) visitFCmp() previously bailed out when a following select matched a clamp pattern. This blocks simplifications when the clamp is provably redundant. This PR allows simplification for clamp selects of flavor SPF_FMAXNUM/ SPF_FMINNUM when one arm is a constant and the other is a sitofp/uitofp of an integer value, and the constant equals the exact min/max of that integer domain: * SPF_FMAXNUM (pattern max(X,C)): redundant if C is the minimum integer mapped exactly to FP (e.g. X = sitofp i8, C = -128.0f). * SPF_FMINNUM (pattern min(X,C)): redundant if C is the maximum integer mapped exactly to FP (e.g. X = uitofp i8, C = 255.0f). This fixes a regression in #173454 --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2026-01-05 11:34:53 +08:00
Hongyu Chen	0952ccc712	[InstCombine] Bail out on type mismatch in foldICmpBinOpWithConstantViaTruthTable (#173179 ) Fixes https://github.com/llvm/llvm-project/issues/173177 The previous implementation doesn't consider cases like `<2 x i1> icmp(binop(sel <2 x i1>, sel i1))`.	2025-12-21 16:40:38 +08:00
Nikita Popov	bc19a0ad49	[InstCombine] Use getSigned() for negative numbers	2025-12-11 17:30:37 +01:00
Tirthankar Mazumder	2ce17ba347	[InstCombine][CmpInstAnalysis] Use consistent spelling and function names. NFC. (#171645 ) Both `decomposeBitTestICmp` and `decomposeBitTest` have a parameter called `lookThroughTrunc`. This was spelled in full (i.e. `lookThroughTrunc`) in the header. However, in the implementation, it's written as `lookThruTrunc`. I opted to convert all instances of `lookThruTrunc` into `lookThroughTrunc` to reduce surprise while reading the code and for conformity. --- The other change in this PR is the renaming of the wrapper around `decomposeBitTest()`. Even though it was a wrapper around `CmpInstAnalysis.h`'s `decomposeBitTest`, the function was called `decomposeBitTestICmp`. This is quite confusing because such a function _also_ exists in `CmpInstAnalysis.h`, but it is _not_ the one actually being used in `InstCombineAndOrXor.cpp`.	2025-12-11 07:40:04 +00:00
Tirthankar Mazumder	d94958b2f2	[InstCombine] Fold `icmp samesign u{gt/lt} (X +nsw C2), C` -> `icmp s{gt/lt} X, (C - C2)` (#169960 ) Fixes #166973 Partially addresses #134028 Alive2 proof: https://alive2.llvm.org/ce/z/BqHQNN	2025-12-08 13:05:37 +01:00
actink	583fba3524	[InstCombine] fold icmp of select with invertible shl (#147182 ) Proof: https://alive2.llvm.org/ce/z/a5fzlJ Closes https://github.com/llvm/llvm-project/issues/146642 --------- Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2025-11-28 08:54:47 +08:00
Pedro Lobo	e8af134bb7	[InstCombine] Generalize trunc-shift-icmp fold from (1 << Y) to (Pow2 << Y) (#169163 ) Extends the `icmp(trunc(shl))` fold to handle any power of 2 constant as the shift base, not just 1. This generalizes the following patterns by adjusting the comparison offsets by `log2(Pow2)`. ```llvm (trunc (1 << Y) to iN) == 0 --> Y u>= N (trunc (1 << Y) to iN) != 0 --> Y u< N (trunc (1 << Y) to iN) == 2C --> Y == C (trunc (1 << Y) to iN) != 2C --> Y != C ; to (trunc (Pow2 << Y) to iN) == 0 --> Y u>= N - log2(Pow2) (trunc (Pow2 << Y) to iN) != 0 --> Y u< N - log2(Pow2) (trunc (Pow2 << Y) to iN) == 2C --> Y == C - log2(Pow2) (trunc (Pow2 << Y) to iN) != 2C --> Y != C - log2(Pow2) ``` Proof: https://alive2.llvm.org/ce/z/2zwTkp	2025-11-22 15:44:06 +00:00
Peter Collingbourne	b3c54914ef	InstCombine: Stop transforming EQ/NE of SHR to 0 to ULT/UGT if >1 use This is a small code size optimization that lets us avoid both shifting and comparing to a constant if we need the shifted value anyway. On most architectures the zero comparison is cheaper than a constant comparison (or free if the shift sets flags). Although this change appears to remove the optimization entirely, we continue to do this transform if there is one use because of the code below the removed code that transforms the shift into an and, followed by the PR10267 case in InstCombinerImpl::foldICmpAndConstConst that transforms the and into a ult/ugt. Added a test case to verify this explicitly. Per [1] reduces clang .text size by 0.09% and dynamic instruction count by 0.01%. [1] https://llvm-compile-time-tracker.com/compare.php?from=1f38d49ebe96417e368a567efa4d650b8a9ac30f&to=0873787a12b8f2eab019d8211ace4bccc1807343&stat=size-text Reviewers: nikic, dtcxzyw Reviewed By: dtcxzyw Pull Request: https://github.com/llvm/llvm-project/pull/168007	2025-11-17 19:39:20 -08:00
kper	fcb5293ad0	[InstCombine]: Canonicalize to a mask when trunc nuw (#163628) The canonicalization is also triggered when the `trunc` is `nuw`. Proof: https://alive2.llvm.org/ce/z/eWvWe3 Fixes: https://github.com/llvm/llvm-project/issues/162451	2025-10-18 00:13:29 +08:00
Brandon	11faf88d8f	[InstCombine] Fold icmp with clamp into unsigned bound check (#161303 ) Fix #157315 alive2: https://alive2.llvm.org/ce/z/TEnuFV The equality comparison of `min(max(X, Lo), Hi)` and `X` is actually a range check on `X`. This PR folds this into an unsigned bound check `(X - Lo) u< (Hi - Lo + 1)`. --------- Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2025-10-02 21:51:39 +02:00
Yingwei Zheng	0b7129afcc	[InstCombine] Fix FMF propagation in `foldFCmpFSubIntoFCmp` (#161539 ) Proof: https://alive2.llvm.org/ce/z/orSP-S Closes https://github.com/llvm/llvm-project/issues/161525.	2025-10-03 01:44:03 +08:00
Ramkumar Ramachandra	7fb3a91418	[PatternMatch] Introduce match functor (NFC) (#159386 ) A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-17 21:04:33 +01:00
Nikita Popov	1cbb35e044	[InstCombine] Support GEP chains in foldCmpLoadFromIndexedGlobal() (#157447 ) Currently this fold only supports a single GEP. However, in ptradd representation, it may be split across multiple GEPs. In particular, PR #151333 will split off constant offset GEPs. To support this, add a new helper decomposeLinearExpression(), which decomposes a pointer into a linear expression of the form BasePtr + Index * Scale + Offset. I plan to also extend this helper to look through mul/shl on the index and use it in more places that currently use collectOffset() to extract a single index * scale. This will make sure such optimizations are not affected by the ptradd migration.	2025-09-09 16:50:45 +02:00
Hongyu Chen	75b0c89e62	[InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597 ) This patch addresses https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663. This patch adds a helper function to put the inverse cast on constants, with cast flags preserved(optional). Follow-up patches will add trunc/ext handling on VectorCombine and flags preservation on InstCombine.	2025-09-08 13:30:06 +00:00
Nikita Popov	305cf0e912	[InstCombine] Make foldCmpLoadFromIndexedGlobal() GEP-type independent (#157089 ) foldCmpLoadFromIndexedGlobal() currently checks that the global type, the GEP type and the load type match in certain ways. Replace this with generic logic based on offsets. This is a reboot of https://github.com/llvm/llvm-project/pull/67093. This PR is less ambitious by requiring that the constant offset is smaller than the stride, which avoids the additional complexity of that PR.	2025-09-08 12:54:24 +02:00
Seraphimt	9f620b8f62	[InstCombine] Slightly optimize visitFcmp (NFC) (#156097) While studying the floating-point code, I found a slightly more efficient sequence of actions.	2025-08-31 17:48:56 +02:00
Yingwei Zheng	49144f7e49	[InstCombine] Improve range computation in `foldICmpAddConstant` (#155096 ) Address comment https://github.com/llvm/llvm-project/pull/110511#discussion_r1788946221.	2025-08-24 14:32:21 +08:00
zGoldthorpe	a8d25683ee	[PatternMatch] Allow `m_ConstantInt` to match integer splats (#153692 ) When matching integers, `m_ConstantInt` is a convenient alternative to `m_APInt` for matching unsigned 64-bit integers, allowing one to simplify ```cpp const APInt *IntC; if (match(V, m_APInt(IntC))) { if (IntC->ule(UINT64_MAX)) { uint64_t Int = IntC->getZExtValue(); // ... } } ``` to ```cpp uint64_t Int; if (match(V, m_ConstantInt(Int))) { // ... } ``` However, this simplification is only true if `V` is a scalar type. Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt` does not. This patch ensures that the matching behaviour of `m_ConstantInt` parallels that of `m_APInt`, and also incorporates it in some obvious places.	2025-08-15 10:43:54 -06:00
Pavel Skripkin	30144226a4	[llvm] [InstCombine] fold "icmp eq (X + (V - 1)) & -V, X" to "icmp eq (and X, V - 1), 0" (#152851 ) This fold optimizes ```llvm define i1 @src(i32 %num, i32 %val) { %mask = add i32 %val, -1 %neg = sub nsw i32 0, %val %num.biased = add i32 %num, %mask %_2.sroa.0.0 = and i32 %num.biased, %neg %_0 = icmp eq i32 %_2.sroa.0.0, %num ret i1 %_0 } ``` to ```llvm define i1 @tgt(i32 %num, i32 %val) { %mask = add i32 %val, -1 %tmp = and i32 %num, %mask %ret = icmp eq i32 %tmp, 0 ret i1 %ret } ``` For power-of-two `val`. Observed in real life for following code ```rust pub fn is_aligned(num: usize) -> bool { num.next_multiple_of(1 << 12) == num } ``` which verifies that num is aligned to 4096. Alive2 proof https://alive2.llvm.org/ce/z/QisECm	2025-08-14 10:23:03 +03:00
Paul Walker	fb4a8f67b9	[LLVM][InstCombine] foldICmpEquality: Compare APInt values rather than addresses. (#151726 )	2025-08-04 13:54:44 +01:00
David Green	d9971be83e	[InstCombine] Make foldCmpLoadFromIndexedGlobal more resilient to non-array geps. (#150639 ) My understanding is that gep [n x i8] and gep i8 can be treated equivalently - the array type conveys no extra information and could be removed. This goes through foldCmpLoadFromIndexedGlobal and tries to make it work for non-array gep types, so long as the index type still matches the array being loaded.	2025-08-03 10:19:42 +01:00
Nikita Popov	2672719a09	[InstCombine] Don't handle non-canonical index type in icmp of load fold (#151346 ) We should just bail out and wait for it to be canonicalized. The current implementation could emit a trunc without actually performing the transform.	2025-07-30 17:52:08 +02:00
Nikita Popov	f0f3194e19	[InstCombine] Fold icmp of gep chains (#146714 ) This extends https://github.com/llvm/llvm-project/pull/144065 to the general case of an icmp between two GEP chains that have a common base.	2025-07-23 17:08:34 +02:00
Nikita Popov	1e24b53534	[InstCombine] Add limit for expansion of gep chains (#147065 ) When converting gep subtraction / comparison to offset subtraction / comparison, avoid expanding very long multi-use gep chains.	2025-07-23 09:47:53 +02:00
kissholic	baf2953097	Optimize fptrunc(x)>=C1 --> x>=C2 (#99475 ) Fix https://github.com/llvm/llvm-project/issues/85265#issue-2186848949	2025-07-19 17:52:06 +09:00
Ross Kirsling	b1a93cfc32	[InstCombine] foldOpIntoPhi should apply to icmp with non-constant operand (#147676 ) Alive2: https://alive2.llvm.org/ce/z/4MeCzA Fixes #146263.	2025-07-16 10:03:25 +09:00
Yingwei Zheng	c9d9c3e349	[InstCombine] Fold `icmp pred X + K, Y -> icmp pred2 X, Y` if both X and Y are divisible by K (#147130) This patch generalizes `icmp ule X +nuw 1, Y -> icmp ult X, Y`-like optimizations to handle the case where the added RHS constant is a common power-of-2 divisor of both X and Y. We can further generalize this pattern to handle non-power-of-2 divisors as well. Alive2: https://alive2.llvm.org/ce/z/QgpeM_ Compile-time improvement (Stage2-O3 -0.09%): https://llvm-compile-time-tracker.com/compare.php?from=0ba59587fa98849ed5107fee4134e810e84b69a3&to=f80e5fe0bb2e63c05401bde7cd42899ea270909b&stat=instructions:u The original case is from the comparison of expanded GEP offsets: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2530/files#r2183005292	2025-07-05 23:42:53 +08:00
Nikita Popov	83272a4849	[InstCombine] Fold icmp of gep chain with base (#144065 ) Fold icmp between a chain of geps and its base pointer. Previously only a single gep was supported. This will be extended to handle the case of two gep chains with a common base in a followup. This helps to avoid regressions after #137297.	2025-07-02 09:23:36 +02:00
Nikita Popov	bedd7ddb7f	[InstCombine] Fix use after free Load the nowrap flags before calling EmitGEPOffset(), as this may free the instruction.	2025-07-01 15:18:49 +02:00
Nikita Popov	b8b7494551	[InstCombine] Rewrite multi-use GEPs when simplifying comparison (#146100 ) We already do this when both sides are a GEP, but not if only one is. This ensures that the offset arithmetic is not duplicated.	2025-07-01 14:26:47 +02:00
Iris Shi	32f911f3e8	[InstCombine] Fold `ceil(X / (2 ^ C)) == 0` -> `X == 0` (#143683 ) Co-authored-by: Yingwei Zheng <dtcxzyw2333@gmail.com>	2025-06-23 10:51:17 +08:00
Acthinks Yang	f2734aa25e	[InstCombine] fold icmp with add/sub instructions having the same operands (#143241 ) Closes #143211.	2025-06-16 17:05:30 +02:00
Ramkumar Ramachandra	b40e4ceaa6	[ValueTracking] Make Depth last default arg (NFC) (#142384) Having a finite Depth (or recursion limit) for computeKnownBits is very limiting, but is currently a load-bearing necessity, as all KnownBits are recomputed on each call and there is no caching. As a prerequisite for an effort to remove the recursion limit altogether, either using a clever caching technique, or writing an easily invalidated KnownBits analysis, make the Depth argument in APIs in ValueTracking uniformly the last argument with a default value. This would aid in removing the argument when the time comes, as many callers that currently pass 0 explicitly are now updated to omit the argument altogether.	2025-06-03 17:12:24 +01:00
AZero13	eaf911bb98	[InstCombine] Fix comment typo that incorrectly described fold (NFC) (#141105 ) icmp ne X, (sext (icmp ne X, 0)) --> X != 0 && X != -1, not X != 0 && X == -1, which would go to X == -1 anyway.	2025-05-22 20:28:45 +02:00
Antonio Frighetto	adfd59fdb8	[InstCombine] Introduce `foldICmpBinOpWithConstantViaTruthTable` folding Match icmps of binops where both operands are select with constant arms, i.e., `icmp pred (select A ? C1 : C2) binop (select B ? C3 : C4), C5`. Fold such patterns by creating a truth table of the possible four constant variants, and materialize back the optimal logic from it via `createLogicFromTable` helper. This also generalizes an existing fold, which has therefore been dropped. Proofs: https://alive2.llvm.org/ce/z/NS7Vzu. Fixes: https://github.com/llvm/llvm-project/issues/138212.	2025-05-13 09:04:25 +02:00
Simon Pilgrim	26da8870ed	Fix MSVC "not all control paths return a value" warning. NFC.	2025-04-30 12:32:22 +01:00
Yingwei Zheng	d20796dab7	[InstCombine] Offset both sides of an equality icmp (#134086 ) Proof: https://alive2.llvm.org/ce/z/zQ2UW4 Closes https://github.com/llvm/llvm-project/issues/134024	2025-04-30 00:19:23 +08:00
Matt Arsenault	48585caf72	InstCombine: Avoid counting uses of constants (#136566 ) Logically it does not matter; getFreelyInvertedImpl doesn't depend on the value for the m_ImmConstant case. This use count logic should probably sink into getFreelyInvertedImpl, every use of this appears to just be a hasOneUse or hasNUse count, so this could change to just be a use count threshold.	2025-04-23 10:51:55 +02:00
Yingwei Zheng	65ed35393c	[IR] Add helper `CmpPredicate::dropSameSign` (#134071 ) Address review comment https://github.com/llvm/llvm-project/pull/133711#discussion_r2024519641	2025-04-02 22:25:01 +08:00
Veera	4cdcf3b193	[InstCombine] Fold `(trunc nuw A to i1) == (trunc nuw B to i1)` to `A == B` (#133368 ) Fixes #133344 Proof: https://alive2.llvm.org/ce/z/X3Uh23 InstCombine couldn't optimize `i1` because `canonicalizeICmpBool()` was transforming the comparison into bitwise operations before `foldICmpTruncWithTruncOrExt()` was called. This PR solves the ordering issue by placing `foldICmpTruncWithTruncOrExt()` before `canonicalizeICmpBool()`. I believe this will not cause any regressions since all tests are passing.	2025-03-28 08:32:45 -04:00
Nikita Popov	e56a6a2683	Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 ) (#128020 ) Relative to the previous attempt this includes two fixes: * Adjust callCapturesBefore() to not skip captures(ret: address, provenance) arguments, as these will not count as a capture at the call-site. * When visiting uses during stack slot optimization, don't skip the ModRef check for passthru captures. Calls can both modref and be passthru for captures. ------ This extends CaptureTracking to support inferring non-trivial CaptureInfos. The focus of this patch is to only support FunctionAttrs, other users of CaptureTracking will be updated in followups. The key API changes here are: * DetermineUseCaptureKind() now returns a UseCaptureInfo where the UseCC component specifies what is captured at that Use and the ResultCC component specifies what may be captured via the return value of the User. Usually only one or the other will be used (corresponding to previous MAY_CAPTURE or PASSTHROUGH results), but both may be set for call captures. * The CaptureTracking::captures() extension point is passed this UseCaptureInfo as well and then can decide what to do with it by returning an Action, which is one of: Stop: stop traversal. ContinueIgnoringReturn: continue traversal but don't follow the instruction return value. Continue: continue traversal and follow the instruction return value if it has additional CaptureComponents. For now, this patch retains the (unsound) special logic for comparison of null with a dereferenceable pointer. I'd like to switch key code to take advantage of address/address_is_null before dropping it. This PR mainly intends to introduce necessary API changes and basic inference support, there are various possible improvements marked with TODOs.	2025-02-27 09:38:29 +01:00
Nico Weber	e2ba1b6ffd	Revert "Reapply [CaptureTracking][FunctionAttrs] Add support for CaptureInfo (#125880 )" This reverts commit 0fab404ee874bc5b0c442d1841c7d2005c3f8729. Seems to break LTO builds of clang on Windows, see comments on https://github.com/llvm/llvm-project/pull/125880	2025-02-19 11:32:57 -05:00
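
Several of the folds listed above are stated as algebraic identities, and the commit messages link Alive2 proofs for them. As a rough sanity check, three of them can be brute-forced over 8-bit values. The sketch below is not LLVM code and does not replace the Alive2 proofs; the function names are hypothetical, and it assumes an 8-bit width, a signed clamp with Lo <= Hi that is not the full range, and abs(INT_MIN) == INT_MIN for the abs fold.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// #161303: min(max(X, Lo), Hi) == X  <=>  (X - Lo) u< (Hi - Lo + 1).
// Assumption: signed clamp, Lo <= Hi, and the full-range clamp is skipped
// because Hi - Lo + 1 would wrap to 0.
bool clampFoldHolds() {
  for (int lo = -128; lo <= 127; ++lo)
    for (int hi = lo; hi <= 127; ++hi) {
      if (hi - lo == 255) continue;
      for (int x = -128; x <= 127; ++x) {
        bool src = std::min(std::max(x, lo), hi) == x;
        bool tgt = static_cast<uint8_t>(x - lo) <
                   static_cast<uint8_t>(hi - lo + 1);
        if (src != tgt) return false;
      }
    }
  return true;
}

// #152851: ((X + (V - 1)) & -V) == X  <=>  (X & (V - 1)) == 0,
// checked for every power-of-two V that fits in 8 bits.
bool alignFoldHolds() {
  for (int shift = 0; shift < 8; ++shift) {
    uint8_t v = static_cast<uint8_t>(1u << shift);
    uint8_t mask = static_cast<uint8_t>(v - 1);
    uint8_t neg = static_cast<uint8_t>(0u - v);
    for (int i = 0; i < 256; ++i) {
      uint8_t x = static_cast<uint8_t>(i);
      uint8_t rounded = static_cast<uint8_t>((x + mask) & neg);
      bool src = rounded == x;
      bool tgt = (x & mask) == 0;
      if (src != tgt) return false;
    }
  }
  return true;
}

// #176148 (first fold): abs(X) u> K  <=>  K s>= 0 ? (X + K u> 2 * K) : false.
// Assumption: abs(INT_MIN) wraps to INT_MIN (0x80) rather than being poison.
bool absFoldHolds() {
  for (int x = -128; x <= 127; ++x)
    for (int k = -128; k <= 127; ++k) {
      uint8_t absx = static_cast<uint8_t>(x < 0 ? -x : x);
      bool src = absx > static_cast<uint8_t>(k);
      bool tgt = k >= 0 && static_cast<uint8_t>(x + k) >
                               static_cast<uint8_t>(2 * k);
      if (src != tgt) return false;
    }
  return true;
}
```

Each function returns true only if the source and target forms agree on every input it enumerates, so calling all three from a small test driver is enough to exercise the identities at this width.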


1154 Commits