llvm-project

Author	SHA1	Message	Date
Kiva	f4b523d669	[PatternMatch][NFC] Add `m_IToFP` and `m_FPToI` (#188040 ) Added two patterns for IR pattern matching, `m_IToFP` and `m_FPToI` which are basically shortcuts of `m_CombinedOr(..., ...)` > if there isn't already one, PatternMatch should have an m_ItoFP which covers both _Originally posted by @arsenm in https://github.com/llvm/llvm-project/pull/185826#discussion_r2967473936_ /cc @arsenm	2026-03-23 14:10:18 +00:00
Lewis Crawford	a5c88632e7	[InstSimplify] Optimize fcmp(min/maximumnum) (#185868 ) Extend the fcmp(min/maxnum(X, C2), C1) optimization to apply to min/maximumnum too.	2026-03-13 16:35:52 +00:00
Andreas Jonson	0918e0f1ed	[InstSimplify] Simplify and/or of trunc nuw to i1 with op replacement (#185517 ) Regression noticed in https://github.com/llvm/llvm-project/pull/184182 proof: https://alive2.llvm.org/ce/z/CMjuSC	2026-03-10 19:06:21 +01:00
Andreas Jonson	6e8e8eaa34	Revert "[InstSimplify] Simplify and/or of trunc nuw to i1 with op replacement" This reverts commit dacb62989db8084cc2865d6b9ef85bbdf34e112d.	2026-03-09 21:51:46 +01:00
Andreas Jonson	dacb62989d	[InstSimplify] Simplify and/or of trunc nuw to i1 with op replacement	2026-03-09 21:46:24 +01:00
Lewis Crawford	fa6eef8378	Revert "Avoid maxnum(sNaN, x) optimizations / folds (#170181 )" (#184125 ) This reverts commit ea3fdc5972db7f2d459e543307af05c357f2be26. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in #172012 .	2026-03-03 12:45:26 +00:00
Nathan Gauër	717a9ab442	[InstSimplify] Add support for llvm.structured.gep (#182874 ) Similar to GEP, the SGEP instruction with no indices can be simplified by directly using the base pointer.	2026-02-25 14:07:51 +01:00
Kunqiu Chen	85e07bad93	[InstructionSimplify] Extend simplifyICmpWithZero to handle equivalent zero RHS (#179055 ) Add a new helper function `matchEquivZeroRHS()` that recognizes comparisons with constants that are equivalent to comparisons with zero, and transforms the predicate accordingly. This handles the following transformations: - icmp sgt X, -1 --> icmp sge X, 0 - icmp sle X, -1 --> icmp slt X, 0 - icmp [us]ge X, 1 --> icmp [us]gt X, 0 - icmp [us]lt X, 1 --> icmp [us]le X, 0 This enables more optimization opportunities in `simplifyICmpWithZero`, such as folding icmp sgt X, -1 when X is known to be non-negative. --- - IR Impact: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/3414	2026-02-13 00:06:32 +08:00
Luke Lau	cee36b23cc	[IR] Allow non-constant offsets in @llvm.vector.splice.{left,right} (#174693 ) Following on from #170796, this PR implements the second part of https://discourse.llvm.org/t/rfc-allow-non-constant-offsets-in-llvm-vector-splice/88974 by allowing non-constant offsets in the vector splice intrinsics. Previously @llvm.vector.splice had a restriction enforced by the verifier that the offset had to be known to be within the range of the vector at compile time. Because we can't enforce this with non-constant offsets, it's been relaxed so that offsets that would slide the vector out of bounds return a poison value, similar to insertelement/extractelement. @llvm.vector.splice.left also previously only allowed offsets within the range 0 <= Offset < N, but this has been relaxed to 0 <= Offset <= N so that it's consistent with @llvm.vector.splice.right. In lieu of the verifier checks that were removed, InstSimplify has been taught to fold splices to poison when the offset is out of bounds. The cost model isn't implemented in this PR, and just returns invalid for any non-constant offsets for now. I think the correct way to cost these non-constant offsets isn't through getShuffleCost because they can't handle variable masks, but instead just through getIntrinsicInstCost.	2026-01-21 10:58:40 +00:00
Matt Arsenault	3d009a7b0d	InstSimplify: Handle nsz in fabs of known positive fold (#176923 ) The current tests for the fold use fma with a squared input. This isn't entirely correct because fma can return -0 in this case. Extend the fold to perform it with nsz. Also extend the tests to test with an unknown value for the addend. The known normal constant is almost a special case that disproves a -0 result. Split out from https://github.com/llvm/llvm-project/pull/175614	2026-01-20 16:26:40 +01:00
Karol Zwolak	295256f59a	[InstSimplify] Fall back to the rest of the logic if folding of the consts isn't successful when simplifying fcmp (#176159 ) Fixes #175949.	2026-01-16 11:59:12 +00:00
Nikita Popov	fb2c62cf68	[InstSimplify] Avoid poison value for ctz/abs in simplifyWithOpsReplaced() (#176168 ) If we drop flags, we'll set the zero_is_poison/int_min_is_poison flag to false as part of the transform. However, the constant folding was still performed with the value true, which made constant folding return poison. This resulted in the pattern failing to match, as the poison value is not equal to the other select arm. To avoid this, add some special handling to set the argument to false during constant folding as well. Fixes https://github.com/llvm/llvm-project/issues/175282.	2026-01-16 09:34:08 +01:00
Ramkumar Ramachandra	d69335bac9	[LLVM] Clean up code using [not_]equal_to (NFC) (#175824 ) Use llvm::[not_]equal_to landed in d2a521750 ([ADT] Introduce bind_{front,back}, [not_]equal_to, #175056) across LLVM for cleaner code.	2026-01-13 21:19:39 +00:00
Justin Lebar	a4ee3d9d7c	[Instcombine] Fix crash in foldMinimumMaximumSharedOp (#173705 ) We were missing a check that the inner intrinsic is in fact a min/max op. We'd crash if it was any other intrinsic! This was found by a fuzzer I'm working on. The high-level design is to randomly generate LLVM IR, run a pass on it, and then run the original and new IR through the interpreter. They should produce the same results. Right now I'm only fuzzing instcombine.	2026-01-09 10:31:47 +01:00
Kevin Per	d953c3604e	[Instcombine] Fold select of ucmp/scmp (#168505 ) Folds `select(icmp(eq, X, Y), 0, llvm.cmp(X, Y))` -> `llvm.cmp(X, Y)` for `llvm.ucmp` and `llvm.scmp`. Alive Proof: https://alive2.llvm.org/ce/z/sPJhgr Closes https://github.com/llvm/llvm-project/issues/166579	2026-01-03 16:01:57 +01:00
Nikita Popov	80b900e91c	[InstSimplify] Support ptrtoaddr in simplifyICmpInst() (#171985 ) This is basically the same change as #162653, but for InstSimplify instead of ConstantFolding. It folds `icmp (ptrtoaddr x, ptrtoaddr y)` to `icmp (x, y)` and `icmp (ptrtoaddr x, C)` to `icmp (x, inttoptr C)`. The fold is restricted to the case where the result type is the address type, as icmp only compares the icmp bits. As in the other PR, I think in practice all the folds are also going to work if the ptrtoint result type is larger than the address size, but it's unclear how to justify this in general.	2025-12-15 09:06:28 +00:00
Nikita Popov	d0d8359c01	[InstSimplify] Remove redundant icmp+ptrtoint fold (#171807 ) There is a generic fold for recursively calling simplifyICmpInst with the ptrtoint cast stripped: `9b6b52b534/llvm/lib/Analysis/InstructionSimplify.cpp (L3850-L3867)` As such, we shouldn't have to explicitly do this for the computePointerICmp() fold. This is not strictly NFC because the recursion limit applies to the generic fold, though I wouldn't expect this to matter in practice.	2025-12-12 08:55:25 +01:00
Craig Topper	8b87edfa68	[InstSimplify] Ignore mask when combining vp.reverse(vp.reverse). (#171542 ) The mask doesn't really affect the reverse. It only poisons the masked off elements in the results. It should be ok to ignore the mask if we can eliminate the pair. I don't have a specific use case for this, but it matches what I had implemented in our downstream before the current upstream implementation. Submitting upstream so I can remove the delta in my downstream.	2025-12-09 19:44:00 -08:00
Lewis Crawford	ea3fdc5972	Avoid maxnum(sNaN, x) optimizations / folds (#170181 ) The behaviour of constant-folding `maxnum(sNaN, x)` and `minnum(sNaN, x)` has become controversial, and there are ongoing discussions about which behaviour we want to specify in the LLVM IR LangRef. See: - https://github.com/llvm/llvm-project/issues/170082 - https://github.com/llvm/llvm-project/pull/168838 - https://github.com/llvm/llvm-project/pull/138451 - https://github.com/llvm/llvm-project/pull/170067 - https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006 This patch removes optimizations and constant-folding support for `maxnum(sNaN, x)` but keeps it folded/optimized for `qNaN`. This should allow for some more flexibility so the implementation can conform to either the old or new version of the semantics specified without any changes. As far as I am aware, optimizations involving constant `sNaN` should generally be edge-cases that rarely occur, so there should hopefully be very little real-world performance impact from disabling these optimizations.	2025-12-02 12:43:03 +00:00
Pedro Lobo	76d614b7c1	[InstSimplify] Extend icmp-of-add simplification to sle/sgt/sge (#168900 ) When comparing additions with the same base where one has `nsw`, the following simplification can be performed: ```llvm icmp slt/sgt/sle/sge (x + C1), (x +nsw C2) => icmp slt/sgt/sle/sge C1, C2 ``` Previously this was only done for `slt`. This patch extends it to the `sgt`, `sle`, and `sge` predicates when either of the conditions hold: - `C1 <= C2 && C1 >= 0`, or - `C2 <= C1 && C1 <= 0` This patch also handles the `C1 == C2` case, which was previously excluded. Proof: https://alive2.llvm.org/ce/z/LtmY4f	2025-11-20 21:35:14 +00:00
Paul Walker	f2b5d04f29	[LLVM][InstSimplify] Add folds for SVE integer reduction intrinsics. (#167519 ) [andv, eorv, orv, s/uaddv, s/umaxv, s/uminv] sve_reduce_##(none, ?) -> op's neutral value sve_reduce_##(any, neutral) -> op's neutral value [andv, orv, s/umaxv, s/uminv] sve_reduce_##(all, splat(X)) -> X [eorv] sve_reduce_##(all, splat(X)) -> 0	2025-11-18 14:33:43 +00:00
Igor Gorban	dd7a000a31	[InstSimplify] Fix crash when optimizing minmax with bitcast constant vectors (#168055 ) When simplifying min/max intrinsics with fixed-size vector constants, InstructionSimplify attempts to optimize element-wise. However, getAggregateElement() can return null for certain constant expressions like bitcasts, leading to a null pointer dereference. This patch adds a check to bail out of the optimization when getAggregateElement() returns null, preventing the crash while maintaining correct behavior for normal constant vectors. Fixes crash with patterns like: call <2 x half> @llvm.minnum.v2f16(<2 x half> %x, <2 x half> bitcast (<1 x i32> <i32 N> to <2 x half>))	2025-11-15 03:05:30 +08:00
Congzhe	e246fffb25	Reland "[InstructionSimplify] Enhance simplifySelectInst() (#163453 )" (#164694 ) This reverts commit f1c1063. PR #163453 was merged and reverted since it exposed a crash. After investigation the crash was unrelated and is then fixed in #164628. This is an attempt to reland #163453.	2025-10-26 00:39:49 -04:00
Arthur Eubanks	f1c1063acb	Revert "[InstCombinePHI] Enhance PHI CSE to remove redundant phis" (#164520 ) Reverts llvm/llvm-project#163453 Causes crashes, see https://github.com/llvm/llvm-project/pull/163453#issuecomment-3429922732	2025-10-22 08:51:10 +00:00
Congzhe	9a9fbbba5c	[InstructionSimplify] Enhance simplifySelectInst() (#163453 ) Fold select instructions with true and false values that act as the same phi, which cleans up the IR and opens up opportunities for other passes such as loop vectorization.	2025-10-21 16:12:26 -04:00
Nikita Popov	ec26f219ac	[InstSimplify] Support ptrtoaddr in simplifyGEPInst() (#164262 ) This adds support for ptrtoaddr in the `ptradd p, ptrtoaddr(p2) - ptrtoaddr(p) -> p2` fold. This fold requires that p and p2 have the same underlying object (otherwise the provenance may not be the same). The argument I would like to make here is that because the underlying objects are the same (and the pointers in the same address space), the non-address bits of the pointer must be the same. Looking at some specific cases of underlying object relationship: * phi/select: Trivially true. * getelementptr: Only modifies address bits, non-address bits must remain the same. * addrspacecast round-trip cast: Must preserve all bits because we optimize such round-trip casts away. * non-interposable global alias: I'm a bit unsure about this one, but I guess the alias and the aliasee must have the same non-address bits? * various intrinsics like launder.invariant.group, ptrmask. I think these all either preserve all pointer bits (like the invariant.group ones) or at least the non-address bits (like ptrmask). There are some interesting cases like amdgcn.make.buffer.rsrc, but those are cross address-space. ----- There is a second `gep (gep p, C), (sub 0, ptrtoint(p)) -> C` transform in this function, which I am not extending to handle ptrtoaddr, adding negative tests instead. This transform is overall dubious for provenance reasons, but especially dubious with ptrtoaddr, as then we don't have the guarantee that provenance of `p` has been exposed.	2025-10-21 09:27:07 +02:00
Nikita Popov	ee50839700	[InstSimplify] Support ptrtoaddr in simplifyCastInst() Handle ptrtoaddr the same way as ptrtoint. The fold already only operates on the index/address bits.	2025-10-20 14:18:34 +02:00
Nikita Popov	573ca36753	[IR] Replace alignment argument with attribute on masked intrinsics (#163802 ) The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter` intrinsics currently accept a separate alignment immarg. Replace this with an `align` attribute on the pointer / vector of pointers argument. This is the standard representation for alignment information on intrinsics, and is already used by all other memory intrinsics. This means the signatures now match llvm.expandload, llvm.vp.load, etc. (Things like llvm.memcpy used to have a separate alignment argument as well, but were already migrated a long time ago.) It's worth noting that the masked.gather and masked.scatter intrinsics previously accepted a zero alignment to indicate the ABI type alignment of the element type. This special case is gone now: If the align attribute is omitted, the implied alignment is 1, as usual. If ABI alignment is desired, it needs to be explicitly emitted (which the IRBuilder API already requires anyway).	2025-10-20 08:50:09 +00:00
Nikita Popov	cf3765752b	[InstSimplify] Support ptrtoaddr in ptrmask fold Treat it the same way as ptrtoint. ptrmask only operates on the address bits of the pointer.	2025-10-14 13:55:04 +02:00
Nikita Popov	261580cacd	[InstSimplify] Support non-inbounds GEP in ptrdiff fold (#162676 ) We can fold ptrdiff(ptradd(p, x), p) to x regardless of whether the ptradd is inbounds. Proof: https://alive2.llvm.org/ce/z/Xuvc7N	2025-10-10 08:03:25 +00:00
Nikita Popov	187a8e3e08	[InstSimplify] Support ptrtoaddr in pointer subtraction fold (#162672 ) Add a new m_PtrToIntOrAddr() matcher which matches both ptrtoint and ptrtoaddr. Pointer arithmetic only works on the address bits, so supporting ptrtoaddr is always fine here.	2025-10-10 09:30:30 +02:00
Nikita Popov	7e5bb1e58a	[IR] Require DataLayout for pointer cast elimination (#162279 ) isEliminableCastPair() currently tries to support elimination of ptrtoint/inttoptr cast pairs by assuming that the maximum possible pointer size is 64 bits. Of course, this is no longer the case nowadays. This PR changes isEliminableCastPair() to accept an optional DataLayout argument, which is required to eliminate pointer casts. This means that we no longer eliminate these cast pairs during ConstExpr construction, and instead only do it during DL-aware constant folding. This had a lot of annoying fallout on tests, most of which I've addressed in advance of this change.	2025-10-07 17:19:48 +02:00
Lewis Crawford	17efa572c3	[InstSimplify] Optimize maximumnum and minimumnum (#139581 ) Add support for the new maximumnum and minimumnum intrinsics in various optimizations in InstSimplify. Also, change the behavior of optimizing maxnum(sNaN, x) to simplify to qNaN instead of x to better match the LLVM IR spec, and add more tests for sNaN behavior for all 3 max/min intrinsic types.	2025-10-07 14:23:32 +01:00
Yingwei Zheng	ca5ece8939	[InstSimplify] Simplify fcmp implied by dominating fcmp (#161090 ) This patch simplifies an fcmp into true/false if it is implied by a dominating fcmp. As an initial support, it only handles two cases: + `fcmp pred1, X, Y -> fcmp pred2, X, Y`: use set operations. + `fcmp pred1, X, C1 -> fcmp pred2, X, C2`: use `ConstantFPRange` and set operations. Note: It doesn't fix https://github.com/llvm/llvm-project/issues/70985, as the second fcmp in the motivating case is not dominated by the edge. We may need to adjust JumpThreading to handle this case. Comptime impact (~+0.1%): https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u IR diff: https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2848	2025-10-05 16:15:51 +08:00
Matthew Devereau	819e6b2043	[InstSimplify] Consider vscale_range for get active lane mask (#160073 ) Scalable get_active_lane_mask intrinsic calls can be simplified to i1 splat (ptrue) when its constant range is larger than or equal to the maximum possible number of elements, which can be inferred from vscale_range(x, y)	2025-09-24 11:35:15 +01:00
Rajveer Singh Bharadwaj	5d39cae6ba	[InstCombine] Generalise optimisation of redundant floating point comparisons with `ConstantFPRange` (#159315 ) Follow up of #158097 Similar to `simplifyAndOrOfICmpsWithConstants`, we can do so for floating point comparisons.	2025-09-20 12:42:17 +00:00
Ramkumar Ramachandra	7fb3a91418	[PatternMatch] Introduce match functor (NFC) (#159386 ) A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-17 21:04:33 +01:00
Rajveer Singh Bharadwaj	08a58b2cea	[InstCombine] Optimize redundant floating point comparisons in `or`/`and` inst's (#158097 ) Resolves #157371 We can eliminate one of the `fcmp` when we have two same `olt` or `ogt` instructions matched in `or`/`and` simplification.	2025-09-16 20:52:11 +05:30
David Sherwood	1f49c9494e	[InstSimplify] Simplify get.active.lane.mask when 2nd arg is zero (#158018 ) When the second argument passed to the get.active.lane.mask intrinsic is zero we can simplify the instruction to return an all-false mask regardless of the first operand.	2025-09-12 10:39:29 +01:00
Florian Hahn	b50ad945dd	[InstSimplify] Simplify extractvalue (umul_with_overflow(x, 1)). (#157307 ) Look through extractvalue to simplify umul_with_overflow where one of the operands is 1. This removes some redundant instructions when expanding SCEVs, which in turn makes the runtime check cost estimate more accurate, reducing the minimum iterations for which vectorization is profitable. PR: https://github.com/llvm/llvm-project/pull/157307	2025-09-07 18:32:40 +01:00
Kazu Hirata	c7cd1d0ae3	[Analysis] Remove an unnecessary cast (NFC) (#150838 ) getOpcode() already returns Instruction::CastOps.	2025-07-27 10:43:30 -07:00
jjasmine	68309adef3	[NFC] Clean up poison folding in simplifyBinaryIntrinsic (#147259 ) Fixes #147116.	2025-07-07 23:21:09 +08:00
jjasmine	07286b1fcd	[InstCombine] Propagate poison pow[i], [us]add, [us]sub and [us]mul (#146750 ) Fixes #146560 as well as propagate poison for [us]add, [us]sub and [us]mul	2025-07-04 22:55:07 +01:00
Iris Shi	f51d8730b3	[InstSimplify] Simplify 'x u>= 1' to true when x is known non-zero (#145204 )	2025-06-22 13:32:19 +08:00
Philip Reames	6f9cd79fa2	[InstSimplify] Add basic simplifications for vp.reverse (#144112 ) Directly modeled after what we do for vector.reverse, but with restrictions on EVL and mask added.	2025-06-16 10:07:56 -07:00
Ramkumar Ramachandra	b40e4ceaa6	[ValueTracking] Make Depth last default arg (NFC) (#142384 ) Having a finite Depth (or recursion limit) for computeKnownBits is very limiting, but is currently a load-bearing necessity, as all KnownBits are recomputed on each call and there is no caching. As a prerequisite for an effort to remove the recursion limit altogether, either using a clever caching technique, or writing an easily-invalidable KnownBits analysis, make the Depth argument in APIs in ValueTracking uniformly the last argument with a default value. This would aid in removing the argument when the time comes, as many callers that currently pass 0 explicitly are now updated to omit the argument altogether.	2025-06-03 17:12:24 +01:00
Nadharm	f71e4e9bc2	[InstSimplify] Handle nsz when simplifying X * 0.0 (#142181 ) If ValueTracking can guarantee non-NaN and non-INF and the `nsz` fast-math flag is set, we can simplify X * 0.0 ==> 0.0. https://alive2.llvm.org/ce/z/XacRQZ	2025-05-31 13:50:22 +08:00
Tim Gymnich	571a24c314	Reland [llvm] add GenericFloatingPointPredicateUtils #140254 (#141065 ) #140254 was previously missing 2 files in the bazel build config.	2025-05-22 17:17:02 +02:00
Kewen12	c47a5fbb22	Revert "[llvm] add GenericFloatingPointPredicateUtils (#140254 )" (#140968 ) This reverts commit d00d74bb2564103ae3cb5ac6b6ffecf7e1cc2238. The PR breaks our buildbots and blocks downstream merge.	2025-05-21 19:31:14 -04:00
Tim Gymnich	d00d74bb25	[llvm] add GenericFloatingPointPredicateUtils (#140254 ) add `GenericFloatingPointPredicateUtils` in order to generalize effects of floating point comparisons on `KnownFPClass` for both IR and MIR. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-05-21 23:45:31 +02:00
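
The predicate rewrites listed for commit 85e07bad93 (`matchEquivZeroRHS()`) can be sanity-checked outside LLVM. The following is a minimal sketch, not the LLVM implementation: it models `i8` values in plain Python and exhaustively compares each original comparison with its zero-RHS form. The names `to_signed` and `count_mismatches` are hypothetical helpers introduced only for this illustration.

```python
def to_signed(bits, width=8):
    """Interpret an unsigned bit pattern as a two's-complement signed value."""
    return bits - (1 << width) if bits >> (width - 1) else bits


def count_mismatches(width=8):
    """Count inputs where an original icmp and its zero-RHS form disagree."""
    mismatches = 0
    for bits in range(1 << width):
        s = to_signed(bits, width)  # signed view of X
        u = bits                    # unsigned view of X
        pairs = [
            (s > -1, s >= 0),   # icmp sgt X, -1 --> icmp sge X, 0
            (s <= -1, s < 0),   # icmp sle X, -1 --> icmp slt X, 0
            (s >= 1, s > 0),    # icmp sge X, 1  --> icmp sgt X, 0
            (u >= 1, u > 0),    # icmp uge X, 1  --> icmp ugt X, 0
            (s < 1, s <= 0),    # icmp slt X, 1  --> icmp sle X, 0
            (u < 1, u <= 0),    # icmp ult X, 1  --> icmp ule X, 0
        ]
        mismatches += sum(1 for before, after in pairs if before != after)
    return mismatches


if __name__ == "__main__":
    print("mismatches over i8:", count_mismatches(8))
```

An exhaustive check over a small bit width is only a quick plausibility test for integer folds like these; it does not replace the proofs or IR-impact links attached to the commit.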


1182 Commits
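
The icmp-of-add fold described for commit 76d614b7c1 states that `icmp pred (x + C1), (x +nsw C2)` can be replaced by `icmp pred C1, C2` when `C1 <= C2 && C1 >= 0` or `C2 <= C1 && C1 <= 0`. A rough way to see why the preconditions suffice is to brute-force a narrow integer width. The sketch below assumes a 5-bit signed type, treats an overflowing `nsw` add as poison (those inputs are skipped), and lets the left-hand add wrap. `wrap_signed` and `count_counterexamples` are hypothetical names for this illustration, not LLVM APIs.

```python
import operator

# Signed predicates named in the commit message, mapped to Python comparisons.
PREDICATES = {
    "slt": operator.lt,
    "sgt": operator.gt,
    "sle": operator.le,
    "sge": operator.ge,
}


def wrap_signed(value, width):
    """Reduce value to a two's-complement signed integer of the given width."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def count_counterexamples(width=5):
    """Count (x, C1, C2, pred) tuples where the folded result differs."""
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    bad = 0
    for c1 in range(lo, hi + 1):
        for c2 in range(lo, hi + 1):
            # Preconditions quoted from the commit message.
            if not ((c1 <= c2 and c1 >= 0) or (c2 <= c1 and c1 <= 0)):
                continue
            for x in range(lo, hi + 1):
                rhs = x + c2
                if not lo <= rhs <= hi:
                    continue  # the nsw add overflows: poison, nothing to check
                lhs = wrap_signed(x + c1, width)  # plain add may wrap
                for pred in PREDICATES.values():
                    if pred(lhs, rhs) != pred(c1, c2):
                        bad += 1
    return bad


if __name__ == "__main__":
    print("counterexamples at width 5:", count_counterexamples(5))
```

Under the preconditions, `x + C1` always lies between `x` and `x + C2`, so it cannot wrap whenever the `nsw` add does not overflow; the brute-force loop only confirms that reasoning on a small type.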