llvm-project

Author	SHA1	Message	Date
Nikita Popov	8a09adc22a	[InstCombine] Split GEPs with multiple variable indices (#137297 ) Split GEPs that have more than one variable index into two. This is in preparation for the ptradd migration, which will not support multi-index GEPs. This also enables the split off part to be CSEd and LICMed.	2025-07-30 12:54:06 +02:00
Luke Lau	9a5b0f302b	Reapply "[InstCombine] Match scalable splats in m_ImmConstant (#132522 )" (#134262 ) This reapplies #132522. Previously casts of scalable m_ImmConstant splats weren't being folded by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant should not fail" assertion. There are no changes to the code in this PR, instead we just needed #133207 to land first. A test has been added for the assertion in llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll @icmp_ult_sext_scalable_splat_is_true. <hr/> #118806 fixed an infinite loop in FoldShiftByConstant that could occur when the shift amount was a ConstantExpr. However this meant that FoldShiftByConstant no longer kicked in for scalable vectors because scalable splats are represented by ConstantExprs. This fixes it by allowing scalable splats of non-ConstantExprs in m_ImmConstant, which also fixes a few other test cases where scalable splats were being missed. But I'm also hoping that UseConstantIntForScalableSplat will eventually remove the need for this. I noticed this when trying to reverse a combine on RISC-V in #132245, and saw that the resulting vector and scalar forms were different.	2025-04-03 18:03:16 +01:00
Luke Lau	b61e3874fa	Revert "[InstCombine] Match scalable splats in m_ImmConstant (#132522 )" This reverts commit df9e5ae5b40c4d245d904a2565e46f5b7ab9c7c8. This is triggering an assertion failure on llvm-test-suite with -enable-vplan-native-path: https://lab.llvm.org/buildbot/#/builders/198/builds/3365	2025-04-03 15:16:56 +01:00
Luke Lau	df9e5ae5b4	[InstCombine] Match scalable splats in m_ImmConstant (#132522 ) #118806 fixed an infinite loop in FoldShiftByConstant that could occur when the shift amount was a ConstantExpr. However this meant that FoldShiftByConstant no longer kicked in for scalable vectors because scalable splats are represented by ConstantExprs. This fixes it by allowing scalable splats of non-ConstantExprs in m_ImmConstant, which also fixes a few other test cases where scalable splats were being missed. But I'm also hoping that UseConstantIntForScalableSplat will eventually remove the need for this. I noticed this when trying to reverse a combine on RISC-V in #132245, and saw that the resulting vector and scalar forms were different. --------- Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2025-04-02 21:21:52 +01:00
Nikita Popov	462cb3cd6c	[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144 ) If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw. Proof: https://alive2.llvm.org/ce/z/ihztLy	2024-12-05 14:36:40 +01:00
Paul Walker	56c091ea71	[LLVM][IR] Use splat syntax when printing ConstantExpr based splats. (#116856 ) This brings the printing of scalable vector constant splats inline with their fixed length counterparts.	2024-11-21 11:21:12 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
Yingwei Zheng	095d49da76	[InstCombine] Set `samesign` when converting signed predicates into unsigned (#112642 ) Alive2: https://alive2.llvm.org/ce/z/6cqdt-	2024-10-17 20:43:48 +08:00
Yingwei Zheng	62cd07fb67	[InstCombine] Canonicalize `sub mask, X -> ~X` when high bits are ignored (#110635 ) Alive2: https://alive2.llvm.org/ce/z/NJgBPL The motivating case of this patch is to emit `andn` on RISC-V with zbb for expressions like `(sub 63, X) & 63`.	2024-10-02 12:48:06 +08:00
Nikita Popov	a105877646	[InstCombine] Remove some of the complexity-based canonicalization (#91185 ) The idea behind this canonicalization is that it allows us to handle less patterns, because we know that some will be canonicalized away. This is indeed very useful to e.g. know that constants are always on the right. However, this is only useful if the canonicalization is actually reliable. This is the case for constants, but not for arguments: Moving these to the right makes it look like the "more complex" expression is guaranteed to be on the left, but this is not actually the case in practice. It fails as soon as you replace the argument with another instruction. The end result is that it looks like things correctly work in tests, while they actually don't. We use the "thwart complexity-based canonicalization" trick to handle this in tests, but it's often a challenge for new contributors to get this right, and based on the regressions this PR originally exposed, we clearly don't get this right in many cases. For this reason, I think that it's better to remove this complexity canonicalization. It will make it much easier to write tests for commuted cases and make sure that they are handled.	2024-08-21 12:02:54 +02:00
Yingwei Zheng	0c03b4ce10	[InstCombine] Infer sub nuw from dominating conditions (#100164 ) Alive2: https://alive2.llvm.org/ce/z/g3xxnM	2024-07-24 23:39:30 +08:00
Nikita Popov	2a1169088c	[InstCombine] Preserve all gep flags when emitting offset	2024-06-19 12:36:36 +02:00
Nikita Popov	cbe1760f02	[InstCombine] Allow multi-use OptimizePointerDifference() with two GEPs (#90017 ) Currently, the OptimizePointerDifference fold does not trigger when working on the sub of two geps where one of the geps has multiple uses, to avoid duplicating the offset arithmetic too much. However, there are cases where performing it would still be clearly profitable, e.g. test_sub_ptradd_multiuse. This patch drops the one-use restriction using the same strategy we use in GEP comparison folds: If there are multiple uses, we rewrite the GEP to use the expanded offset arithmetic instead (effectively canonicalizing it into ptradd representation). Fixes https://github.com/llvm/llvm-project/issues/88231.	2024-04-26 10:53:03 +09:00
Nikita Popov	7a77b763af	[InstCombine] Add additional multi-use test for sub of gep (NFC) A case where still performing the fold is clearly profitable.	2024-04-25 14:24:55 +09:00
Nikita Popov	d9a5aa8e2d	[PatternMatch] Do not accept undef elements in m_AllOnes() and friends (#88217 ) Change all the cstval_pred_ty based PatternMatch helpers (things like m_AllOnes and m_Zero) to only allow poison elements inside vector splats, not undef elements. Historically, we used to represent non-demanded elements in vectors using undef. Nowadays, we use poison instead. As such, I believe that support for undef in vector splats is no longer useful. At the same time, while poison splat elements are pretty much always safe to ignore, this is not generally the case for undef elements. We have existing miscompiles in our tests due to this (see the masked-merge-*.ll tests changed here) and it's easy to miss such cases in the future, now that we write tests using poison instead of undef elements. I think overall, keeping support for undef elements no longer makes sense, and we should drop it. Once this is done consistently, I think we may also consider allowing poison in m_APInt by default, as doing that change is much less risky than doing the same with undef. This change involves a substantial amount of test changes. For most tests, I've just replaced undef with poison, as I don't think there is value in retaining both. For some tests (where the distinction between undef and poison is important), I've duplicated tests.	2024-04-17 18:22:05 +09:00
Noah Goldstein	17162b61c2	[KnownBits] Make `nuw` and `nsw` support in `computeForAddSub` optimal Just some improvements that should hopefully strengthen analysis. Closes #83580	2024-03-05 12:59:58 -06:00
Paul Walker	fd07b8f809	[LLVM][tests/Transforms/InstCombine] Convert instances of ConstantExpr based splats to use splat(). This is mostly NFC but some output does change due to consistently inserting into poison rather than undef and using i64 as the index type for inserts.	2024-02-27 13:37:23 +00:00
Nikita Popov	90ba33099c	[InstCombine] Canonicalize constant GEPs to i8 source element type (#68882 ) This patch canonicalizes getelementptr instructions with constant indices to use the `i8` source element type. This makes it easier for optimizations to recognize that two GEPs are identical, because they don't need to see past many different ways to express the same offset. This is a first step towards https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699. This is limited to constant GEPs only for now, as they have a clear canonical form, while we're not yet sure how exactly to deal with variable indices. The test llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll gives two representative examples of the kind of optimization improvement we expect from this change. In the first test SimplifyCFG can now realize that all switch branches are actually the same. In the second test it can convert it into simple arithmetic. These are representative of common optimization failures we see in Rust. Fixes https://github.com/llvm/llvm-project/issues/69841.	2024-01-24 15:25:29 +01:00
Noah Goldstein	dbf6f30926	[InstCombine] Add folds for `(X + Y) - (W + Z)` If `Y` and `Z` are constant then we can simplify to `(X - W) + (Y - Z)`. If `Y == Z` we can fold to `X - W`. Note these transform exist outside of InstCombine. The purpose of this commit is primarily to make it so that folds can generate these simplifiable patterns without having to worry about creating an inf loop.	2023-11-20 17:59:26 -06:00
Noah Goldstein	c4d8d91679	[InstCombine] Add tests for folding `(X + Y) - (W + Z)`; NFC	2023-11-20 17:59:26 -06:00
Yingwei Zheng	6da4ecdf92	[InstCombine] Infer shift flags with unknown shamt (#72535 ) Alive2: https://alive2.llvm.org/ce/z/82Wr3q Related patch: `2dd52b4527`	2023-11-18 15:15:14 +08:00
Sanjay Patel	1ce06176ec	[InstSimplify] reduce "mul nsw i1" to "false" https://alive2.llvm.org/ce/z/XYGvYg Follow-up for: 68c197f07eeae71	2023-01-18 13:51:40 -05:00
Sanjay Patel	1720ec6da0	[InstCombine] restrict no-wrap propagation for i1/i2 to avoid miscompiles This transform was added with 68c197f07eeae71b9b7, and the post-commit review noted the potential for miscompiles at narrow bitwidths. I'm not sure how to expose the i1 nuw bug because we already simplify that, but other cases show that there are missing transforms to add in follow-up patches.	2023-01-18 10:32:12 -05:00
Sanjay Patel	eb42b67e8c	[InstCombine][InstSimplify] add tests for i1/i2 mul with no-wrap; NFC A bug was introduced with 68c197f07eeae71 as noted in the post-commit review comments, and there are potentially missed smaller transforms/simplifications because no-wrap multiply with only 1 or 2 bits eliminates some potential results.	2023-01-18 10:17:06 -05:00
Sanjay Patel	f81151edbf	[InstCombine] add tests for diff-of-squares; NFC Ideally, these negative tests would have been added before 68c197f07eea to make sure the specific operand matching is working as expected, but I did not remember to include them.	2023-01-18 08:13:37 -05:00
Sanjay Patel	68c197f07e	[InstCombine] factor difference-of-squares to reduce multiplication (X * X) - (Y * Y) --> (X + Y) * (X - Y) https://alive2.llvm.org/ce/z/BAuRCf The no-wrap propagation could be relaxed in some cases, but there does not seem to be an obvious rule for that.	2023-01-17 14:58:40 -05:00
Sanjay Patel	bf44905183	[InstCombine] add tests for difference-of-squares; NFC	2023-01-17 14:58:40 -05:00
Paul Walker	eae26b6640	[IRBuilder] Use canonical i64 type for insertelement index used by vector splats. Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as an integer so we may as well use the prefered form from creation. NOTE: All test changes are mechanical with nothing else expected beyond a change of index type from i32 to i64. Differential Revision: https://reviews.llvm.org/D140983	2023-01-11 14:08:06 +00:00
chenglin.bi	6703290361	[InstCombine] fold `sub + and` pattern with specific const value `C1 - ((C3 - X) & C2) --> (X & C2) + (C1 - (C2 & C3))` when: (C3 - ((C2 & C3) - 1)) is pow2 && ((C2 + C3) & ((C2 & C3) - 1)) == ((C2 & C3) - 1) && C2 is negative pow2 \|\| (C3 - X) is nuw https://alive2.llvm.org/ce/z/HXQJV- Fix: #58523 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D136582	2022-11-05 12:58:45 +08:00
chenglin.bi	a3a9fffea1	[InstCombine] Precommit test for D136582; NFC	2022-11-01 04:46:19 +08:00
Sanjay Patel	d2d23795ca	[InstCombine] improve demanded bits for Sub operand 0 This is copying the code that was added for 'add' with D130075. (That patch removed a fallthrough in the cases, but we can probably still share at least some code again as a follow-up cleanup, but I didn't want to risk it here.) The reasoning is similar to the carry propagation for 'add': if we don't demand low bits of the subtraction and the subtrahend (aka RHS or operand 1) is known zero in those low bits, then there can't be any borrowing required from the higher bits of operand 0, so the low bits don't matter. Also, the no-wrap flags can be propagated (and I think that should be true for add too). Here's an attempt to prove that in Alive2: https://alive2.llvm.org/ce/z/xqh7Pa (can add nsw or nuw to src and tgt, and it should still pass) Differential Revision: https://reviews.llvm.org/D136788	2022-10-27 09:41:57 -04:00
Sanjay Patel	7d3a37a4b4	[InstCombine] add tests for demanded bits of sub; NFC	2022-10-26 17:23:33 -04:00
Sanjay Patel	1bd856fbe5	[InstCombine] add tests for demanded bits of sub; NFC	2022-10-26 14:04:46 -04:00
Bjorn Pettersson	4ab40eca08	[test][InstCombine] Update some test cases to use opaque pointers These tests cases were converted using the script at https://gist.github.com/nikic/98357b71fd67756b0f064c9517b62a34 Differential Revision: https://reviews.llvm.org/D135094	2022-10-03 22:17:59 +02:00
Sanjay Patel	64d309131a	[InstCombine] try multi-use demanded bits fold for 'sub' This is similar to D133788 / 73919a87e9a6, but for sub the transform is valid only for low zeros in operand 1. https://alive2.llvm.org/ce/z/EmRsXC	2022-09-21 14:13:05 -04:00
Sanjay Patel	eecaa86314	[InstCombine] add tests for multi-use sub demanded bits: NFC	2022-09-21 14:13:05 -04:00
Sanjay Patel	0f32a5dea0	[InstCombine] don't canonicalize shl+sub to mul+add This stops Negator from transforming: `C1 - shl X, C2 --> mul X, (1<<C2) + C1` ...in the general case. There does not seem to be any analysis benefit to using mul in IR, and there's definitely downside in codegen (particularly when the multiply has to be expanded). If `C1` is 0, then there's a stronger argument that the single mul is a better canonicalization than negate-of-shl, but we may want to remove that too. This was noted as a potential conflict for D133667. Differential Revision: https://reviews.llvm.org/D134310	2022-09-21 08:39:07 -04:00
Eric Gullufsen	eb1e2b3997	[InstCombine] Canonicalize "and, add", "or, add", "xor, add" Canonicalize ``` ((x + C1) & C2) --> ((x & C2) + C1) ((x + C1) ^ C2) --> ((x ^ C2) + C1) ((x + C1) \| C2) --> ((x \| C2) + C1) ``` for suitable constants `C1` and `C2`. Alive2 proofs: [[ https://alive2.llvm.org/ce/z/BqMDVZ \| add, or --> or, add ]] [[ https://alive2.llvm.org/ce/z/BhAeCl \| add, xor --> xor, add ]] [[ https://alive2.llvm.org/ce/z/jYRHEt \| add, and --> and, add ]] Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D131142	2022-08-26 17:23:29 -04:00
Philip Reames	c58791c286	Revert "[InstCombine] Canonicalize "and, add", "or, add", "xor, add"" This reverts commit d2f110c693c88d1bb7caee4f72ebb14766f85239. test/Transforms/InstCombine/freeze.ll fails on ninja check-llvm on x86_64.	2022-08-26 11:18:31 -07:00
Eric Gullufsen	d2f110c693	[InstCombine] Canonicalize "and, add", "or, add", "xor, add" Canonicalize ``` ((x + C1) & C2) --> ((x & C2) + C1) ((x + C1) ^ C2) --> ((x ^ C2) + C1) ((x + C1) \| C2) --> ((x \| C2) + C1) ``` for suitable constants `C1` and `C2`. Alive2 proofs: [[ https://alive2.llvm.org/ce/z/BqMDVZ \| add, or --> or, add ]] [[ https://alive2.llvm.org/ce/z/BhAeCl \| add, xor --> xor, add ]] [[ https://alive2.llvm.org/ce/z/jYRHEt \| add, and --> and, add ]] Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D131142	2022-08-26 14:07:43 -04:00
Sanjay Patel	0cfc651032	[InstCombine] ease use constraint in tryFactorization() The stronger one-use checks prevented transforms like this: (x * y) + x --> x * (y + 1) (x * y) - x --> x * (y - 1) https://alive2.llvm.org/ce/z/eMhvQa This is one of the IR transforms suggested in issue #57255. This should be better in IR because it removes a use of a variable operand (we already fold the case with a constant multiply operand). The backend should be able to re-distribute the multiply if that's better for the target. Differential Revision: https://reviews.llvm.org/D132412	2022-08-24 12:10:54 -04:00
Sanjay Patel	4376afe727	[InstCombine] add tests for mul+sub common factor; NFC	2022-08-24 11:31:18 -04:00
Jay Foad	f82c55fa08	[InstCombine] Change order of canonicalization of ADD and AND Canonicalize ((x + C1) & C2) --> ((x & C2) + C1) for suitable constants C1 and C2, instead of the other way round. This should allow more constant ADDs to be matched as part of addressing modes for loads and stores. Differential Revision: https://reviews.llvm.org/D130080	2022-08-22 20:03:53 +01:00
Sanjay Patel	4022551a15	[ValueTracking] recognize sub X, (X -nuw Y) as not overflowing This extends a similar pattern from D125500 and D127754. If we know that operand 1 (RHS) of a subtract is itself a non-overflowing subtract from operand 0 (LHS), then the final/outer subtract is also non-overflowing: https://alive2.llvm.org/ce/z/Bqan8v InstCombine uses this analysis to trigger a narrowing optimization, so that is what the first changed test shows. The last test models a motivating case from issue #48013. In that example, we determine 'nuw' on the first sub from the urem, then we determine that the 2nd sub can be narrowed, and that leads to eliminating both subtracts. here are still several missing subtract narrowing optimizations demonstrated in the tests above the diffs shown here - those should be handled in InstCombine with another set of patches.	2022-06-19 15:12:19 -04:00
Sanjay Patel	bfb915ec8b	[InstCombine] add tests for 'sub nuw' with zext; NFC	2022-06-19 15:12:19 -04:00
Sanjay Patel	8605b4d8c5	[ValueTracking] recognize sub X, (X -nsw Y) as not overflowing This extends a similar pattern from D125500. If we know that operand 1 (RHS) of a subtract is itself a non-overflowing subtract from operand 0 (LHS), then the final/outer subtract is also non-overflowing: https://alive2.llvm.org/ce/z/Bqan8v InstCombine uses this analysis to trigger a narrowing optimization, so that is what the first changed test shows. The last test models the motivating case from issue #48013. In that example, we determine 'nsw' on the first sub from the srem, then we determine that the 2nd sub can be narrowed, and that leads to eliminating both subtracts. This works for unsigned sub too, but I left that out to keep the patch minimal. If this looks ok, I will follow up with that change. There are also several missing subtract narrowing optimizations demonstrated in the tests above the diffs shown here - those should be handled in InstCombine with another set of patches. Differential Revision: https://reviews.llvm.org/D127754	2022-06-14 14:51:49 -04:00
Sanjay Patel	304cda0b16	[InstCombine] add tests for sub with extended operands; NFC	2022-06-14 11:24:06 -04:00
Sanjay Patel	ee6754c277	[ValueTracking] recognize sub X, (X % Y) as not overflowing I fixed some poison-safety violations on related patterns in InstCombine and noticed that we missed adding nsw/nuw on them, so this adds clauses to the underlying analysis for that. We need the undef input restriction to make this safe according to Alive2: https://alive2.llvm.org/ce/z/48g9K8 Differential Revision: https://reviews.llvm.org/D125500	2022-05-13 09:59:41 -04:00
Sanjay Patel	0fefb56da7	[InstCombine] add tests for sub with rem operand; NFC	2022-05-13 09:59:40 -04:00
Nikita Popov	a266af7211	[InstCombine] Canonicalize SPF to min/max intrinsics Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max. Once this sticks, we can stop matching SPF min/max in various places, and can remove hacks we have for preventing infinite loops and breaking of SPF canonicalization. Differential Revision: https://reviews.llvm.org/D98152	2022-02-24 09:01:20 +01:00

1 2 3 4

166 Commits