llvm-project

Author	SHA1	Message	Date
Peter Collingbourne	75bb30ddbf	Move {load,store}(llvm.protected.field.ptr) lowering to InstCombine. The previous position of llvm.protected.field.ptr lowering for loads and stores was problematic as it not only inhibited optimizations such as DSE (as stores to a llvm.protected.field.ptr were not considered to must-alias stores to the non-protected.field pointer) but also required changes to other optimization passes to avoid transformations that would reduce PFP coverage. Address this by moving the load/store part of the lowering to InstCombine, where it will run earlier than the PFP-breaking and AA-relying transformations. The deactivation symbol, null comparison and EmuPAC parts of the lowering remain in PreISelLowering. Now that the transformation inhibitions are no longer needed, remove them (i.e. partially revert #151649, and revert #182976). This change resulted in a 2.4% reduction in Fleetbench .text size and the following improvements to PFP performance overhead for BM_PROTO_Arena on various microarchitectures: Apple M2 Ultra: 3.5% before, 3.3% after; Google Axion C4A: 3.3% before, 2.9% after; Google Axion N4A: 2.7% before, 2.2% after. Reviewers: fmayer, nikic, vitalybuka Reviewed By: fmayer Pull Request: https://github.com/llvm/llvm-project/pull/186548	2026-04-06 17:47:24 -07:00
Yunbo Ni	017b9f9c7a	[InstCombine] Fix crash in `foldReversedIntrinsicOperands` for struct-return intrinsics (#186339 ) Fixes #186334 Similar to #176556 , add the missing result type check in `foldReversedIntrinsicOperands()`. This prevents `CreateVectorReverse()` from being applied to struct-returning intrinsics.	2026-03-13 10:28:21 +00:00
tudinhh	36de257076	[InstCombine] Handle fixed-width results in get_active_lane_mask fold (#185317 ) The optimization introduced in #183329 incorrectly assumed that any extraction from a scalable active lane mask used a scalable index. When the result of a `llvm.vector.extract` is a fixed-width vector, the index should not be multiplied by vscale. This PR adds a check to ensure the index is only scaled by VScaleMin when the return type of the extraction is a scalable vector, not fixed-width. Fixes #185271	2026-03-09 12:51:47 +01:00
Nikolas Klauser	ce79fb3712	[InstCombine] Always fold nonnull assumptions into operand bundles (#169923 ) Fixes #168688	2026-03-02 13:58:08 +01:00
Matt Arsenault	82a1905c4b	InstCombine: Pass SimplifyQuery through SimplifyDemandedFPClass (#184096 )	2026-03-02 12:00:25 +00:00
Nikita Popov	3ad43f2d1c	[LangRef] Clarify nsz semantics (#180906 ) The current LangRef wording says that the sign of a zero argument or result is "insignificant", which is not really clear on what this means. Alive2 models this as non-deterministically flipping the zero sign bits for both inputs and outputs. This PR proposes to specify this flag as non-deterministically flipping inputs only. A consequence of this is that fabs is guaranteed to have an unset sign bit even if the input is zero (that is, https://alive2.llvm.org/ce/z/irCftQ is no longer a valid transform), and that the copysign result sign only depends on the second operand (that is, https://alive2.llvm.org/ce/z/VnHdfh is no longer a valid transform). These rules are still more liberal than we'd really like them to be, but at least avoid some of the issues with nsz. This is based on the discussion in: https://discourse.llvm.org/t/rfc-clarify-the-behavior-of-fp-operations-on-bit-strings-with-nsz-flag/85981	2026-03-02 10:53:07 +01:00
Kerry McLaughlin	10b48e41e7	[InstCombine] Combine extract from get_active_lane_mask where all lanes inactive (#183329 ) When extracting a subvector from the result of a get_active_lane_mask, return a constant zero vector if it can be proven that all lanes will be inactive. For example, the result of the extract below will be a subvector where every lane is inactive if X & Y are const, and `Y * VScale >= X`: vector.extract(get.active.lane.mask(Start, X), Y)	2026-02-27 09:46:49 +00:00
Nathiyaa Sengodan	4fec4df12a	[InstCombine] Fold min/max of two subtracts with common RHS (#183240 ) Fold: minmax(sub X, Z , sub Y, Z) -> sub minmax(X, Y), Z When both sub instructions have no-wrap flags and share the same RHS operand, we can fold: smin (sub nsw X, Z), (sub nsw Y, Z) -> sub nsw (smin X, Y), Z smax (sub nsw X, Z), (sub nsw Y, Z) -> sub nsw (smax X, Y), Z umin (sub nuw X, Z), (sub nuw Y, Z) -> sub nuw (umin X, Y), Z umax (sub nuw X, Z), (sub nuw Y, Z) -> sub nuw (umax X, Y), Z This is valid because subtraction by a common value preserves relative ordering when no signed/unsigned overflow occurs. Proof: https://alive2.llvm.org/ce/z/n9gwj2 Closes https://github.com/llvm/llvm-project/issues/167059	2026-02-26 01:21:28 +08:00
Hongxu Xu	8d3e6e709c	[InstCombine] Transform splat before n x i1 for vec.reduce.add (#182213 ) ```llvm define i1 @src(i1 %0) { %2 = insertelement <8 x i1> poison, i1 %0, i32 0 %3 = shufflevector <8 x i1> %2, <8 x i1> poison, <8 x i32> zeroinitializer %4 = tail call i1 @llvm.vector.reduce.add.v8i1(<8 x i1> %3) ret i1 %4 } define i1 @tgt(i1 %0) { ret i1 0 } ``` alive2: https://alive2.llvm.org/ce/z/vejxot `vector_reduce_add(<n x i1>)` to `Trunc(ctpop(bitcast <n x i1> to in))` interferes with the `vector_reduce_add(<splat>)` to `mul`, so I exchanged their order. Relevant PR: #161020	2026-02-21 15:03:25 +00:00
Nikita Popov	0a740668a4	[InstCombine] Support minimumnum/maximumnum (#180529 ) Support minimumnum/maximumnum intrinsics in various existing minnum/maxnum/minimum/maximum folds. The test coverage has been copied from minnum/maxnum. Proofs: https://alive2.llvm.org/ce/z/YMlLwO Proofs that time out: https://alive2.llvm.org/ce/z/dJN8wj	2026-02-09 15:55:59 +00:00
Nikolas Klauser	6dbdfd824a	[InstCombine] Drop nonnull assumes if the pointer is already known to be nonnull (#180434 )	2026-02-09 13:13:32 +01:00
Nikita Popov	a654a27fcd	[InstCombine] Fold min/max(fpext x, C) to fpext(min/max(x, fptrunc C)) (#179968 ) Fold `min/max(fpext x, C)` to `fpext(min/max(x, fptrunc C))` in cases where the truncation of the constant is lossless. This helps eliminate fpext/fptrunc pairs around min/max and addresses the regression from https://github.com/llvm/llvm-project/pull/177988. Proof: https://alive2.llvm.org/ce/z/y_Bcdd	2026-02-09 09:13:26 +00:00
Nikita Popov	531430b614	[InstCombine] Relax one-use check for min/max(fpext x, fpext y) to fpext(min/max(x, y)) fold (#180164 ) If only one of the operands is one-use, the total number of fpexts stays the same, but the min/max is performed on a narrowed type. Additionally, the fpext may fold with a following fptrunc.	2026-02-09 09:34:17 +01:00
Snehasish Kumar	7449d32d7e	[InstCombine][profcheck] Fix profile metadata propagation for umax in InstCombine (#179332 ) Select instructions created from the expansion of an umax intrinsic do not have profile data even though the function may have profile data. This is because PGO instrumentation does not support intrinsics. Assisted-by: gemini	2026-02-05 21:09:06 -08:00
Nikolas Klauser	3064291c9f	Reapply "[InstCombine] Always fold alignment assumptions into operand bundles (#177597 )" (#179497 ) Truncating at 32 bits is now avoided by removing a cast to `unsigned`. This would also break at 64 bits (with the pointer size > 64 bit), but I don't think LLVM supports such a thing. This reverts commit bc7315749d6d16d0f162f816b3ec0ef7169615f2.	2026-02-03 19:30:48 +01:00
Nico Weber	bc7315749d	Revert "[InstCombine] Always fold alignment assumptions into operand bundles (#177597 )" This reverts commit b74e1bca6d77b3de5c05822d1631006ce2a30cc6. Makes clang assert: https://github.com/llvm/llvm-project/pull/177597#issuecomment-3824553291	2026-01-30 11:32:46 -05:00
Matt Arsenault	909041e480	InstCombine: Check one use before trying to simplify copysign sign (#178251 ) Fixes #178245	2026-01-27 17:21:44 +00:00
Matt Arsenault	10b539f13e	InstCombine: Try SimplifyDemandedBits on copysign signs (#177942 )	2026-01-26 18:43:13 +01:00
Nikolas Klauser	b74e1bca6d	[InstCombine] Always fold alignment assumptions into operand bundles (#177597 )	2026-01-23 16:54:17 +01:00
Dan Blackwell	c63a744f3f	[CodeGen][InstCombine][Sanitizers] Emit lifetimes when compiling with memtag-stack (#177130 ) Currently we do not emit lifetimes by default when compiling with memtag-stack - which means we don't catch use-after-scope (when compiling without optimization). This patch fixes that by mirroring ASan, HWASan and MSan, and always emitting lifetime markers. The patch is based on the changes made in aeca569. rdar://163713381	2026-01-22 14:22:44 +00:00
Yingwei Zheng	9696c8bd62	[InstCombine] Bail out on intrinsics with struct return types (#176556 ) After https://github.com/llvm/llvm-project/pull/174835, overflow intrinsics can be vectorized. But `foldShuffledIntrinsicOperands` doesn't support shuffling vectors inside the struct return value. Closes https://github.com/llvm/llvm-project/issues/176548.	2026-01-17 12:04:37 +00:00
Gábor Spaits	3424447645	[InstCombine] Remove unnecessary type equality check when creating zext or trunc (NFC) (#175947 ) This came up during discussions under PR #161101.	2026-01-14 15:50:04 +01:00
Henry	c6f6efba3b	[NFC] Implicit container copy cleanup (#174702 ) A set of cleanup for redundant implicit container copies. Fixed with const reference or move semantics. e8996cb24 [AMDGPU] replace copy with const reference (NFC) 25ceecee8 [-Wunsafe-buffer-usage] Replace vector copy with reference (NFC) e1f5254e0 [AMDGPU] Replace copy with move semantics (NFC) 8261250d7 [InstCombine] Replace vector copy with move semantic (NFC) 749bb21de [CommandLine] Avoid vector copy for const argument (NFC) b89526f90 [LoongArch] Remove unnecessary vector copy (NFC) 6b22bcf56 [TextAPI] Replace map copy with const reference (NFC) a121519d8 [BlockExtract] Avoid copy semantic for ctor (NFC) 3034d3063 [LifetimeSafety] Avoid map copy for dump methods (NFC) --------- Co-authored-by: sfu <afwbu8tp6@mozmail.com>	2026-01-14 11:16:32 +01:00
Aryan Kadole	362b653c69	[InstCombine] Fold Minimum over trailing or leading zeros (#173768 ) Add support for `umin(clz(x), clz(y)) => clz(x | y)` `umin(ctz(x), ctz(y)) => ctz(x | y)` [C++ source](https://godbolt.org/z/E8abbjT7G) [alive proof](https://alive2.llvm.org/ce/z/mh94_n) Fixes #173691	2026-01-11 20:56:52 +08:00
Kshitij Paranjape	2daf321660	[InstCombine] Add support for Instruction combining of hyperbolic functions (#173730 ) Fixes llvm/llvm-project#173706	2026-01-09 00:06:58 -05:00
Valeriy Savchenko	55eaa6c27b	[InstCombine][AArch64] Lower NEON shift intrinsics when possible (#172465 )	2026-01-07 07:46:22 +00:00
Benjamin Maxwell	49e601a3a2	[InstCombine] Don't fold struct-ret intrinsics into vector selects (#173062 ) Folding struct-ret intrinsics like `@llvm.sincos.v4f32` into selects with vector conditions is invalid (the result must be a vector).	2025-12-20 09:51:35 +00:00
Matt Arsenault	6e47d4ef45	Reapply "InstCombine: Fold ldexp with constant exponent to fmul" (#171895 ) (#171977 )	2025-12-12 12:55:55 +01:00
Matt Arsenault	757c5b3bc7	Revert "InstCombine: Fold ldexp with constant exponent to fmul" (#171895 ) Reverts llvm/llvm-project#171731 Fails on a libc test	2025-12-11 21:12:59 +00:00
Matt Arsenault	5eb2ec2179	InstCombine: Fold ldexp with constant exponent to fmul (#171731 ) If we can represent this with an fmul, prefer it as a canonical form. More optimizations will understand fmul, and allows contract to fma.	2025-12-11 19:20:45 +01:00
valadaptive	7f2bbba60d	[AArch64][ARM] Optimize more `tbl`/`tbx` calls into `shufflevector` (#169748 ) Resolves #169701. This PR extends the existing InstCombine operation which folds `tbl1` intrinsics to `shufflevector` if the mask operand is constant. Before this change, it only handled 64-bit `tbl1` intrinsics with no out-of-bounds indices. I've extended it to support both 64-bit and 128-bit vectors, and it now handles the full range of `tbl1`-`tbl4` and `tbx1`-`tbx4`, as long as at most two of the input operands are actually indexed into. For the purposes of `tbl`, we need a dummy vector of zeroes if there are any out-of-bounds indices, and for the purposes of `tbx`, we use the "fallback" operand. Both of those take up an operand for the purposes of `shufflevector`. This works a lot like https://github.com/llvm/llvm-project/pull/169110, with some added complexity because we need to handle multiple operands. I raised a couple questions in that PR that still need to be answered: - Is it correct to check `IsA<UndefValue>` for each mask index, and set the output mask index to -1 if so? This is later folded to a poison value, and I'm not sure about the subtle differences between poison and undef and when you can substitute one for the other. As I mentioned in #169110, the existing x86 pass (`simplifyX86vpermilvar`) already behaves this way when it comes to undef. - How can I write an Alive2 proof for this? It's very hard to find good documentation or tutorials about Alive2. As with #169110, most of the regression test cases were generated using Claude. Everything else was written by me.	2025-12-09 16:11:26 +00:00
Nikita Popov	7b652195d7	[IR] Add ImplicitTrunc argument to ConstantInt::get() (#170865 ) Add an ImplicitTrunc argument to ConstantInt::get(), which allows controlling whether implicit truncation of the value is permitted. This argument currently defaults to true, but will be switched to false in the future to guard against signed/unsigned confusion, similar to what has already happened for APInt. The argument gives an opt-out for cases where the truncation is intended. The patch contains one illustrative example where this happens.	2025-12-08 08:42:59 +01:00
David Green	f741851731	Revert "[AArch64][ARM] Move ARM-specific InstCombine transforms into `Transforms/Utils` (#169589 )" This reverts commit 1c32b6f51ccaaf9c65be11d7dca9e5a476cddb5a due to failures on BUILD_SHARED_LIBS builds.	2025-12-02 11:46:50 +00:00
valadaptive	1c32b6f51c	[AArch64][ARM] Move ARM-specific InstCombine transforms into `Transforms/Utils` (#169589 ) Back when `TargetTransformInfo::instCombineIntrinsic` was added in https://reviews.llvm.org/D81728, several transforms common to both ARM and AArch64 were kept in the non-target-specific `InstCombineCalls.cpp` so they could be shared between the two targets. I want to extend the transform of the `tbl` intrinsics into static `shufflevector`s in a similar manner to https://github.com/llvm/llvm-project/pull/169110 (right now it only works with a 64-bit `tbl1`, but `shufflevector` should allow it to work with up to 2 operands, and it can definitely work with 128-bit vectors). I think separating out the transform into a TTI hook is a prerequisite. ~~I'm not happy about creating an entirely new module for this and having to wire it up through CMake and everything, but I'm not sure about the alternatives. If any maintainers can think of a cleaner way of doing this, I'm very open to it.~~ I've moved the transforms into `Transforms/Utils/ARMCommonInstCombineIntrinsic.cpp`, which is a lot simpler.	2025-12-02 11:17:12 +00:00
Luke Lau	bb9449d5bb	[InstCombine] Fold @llvm.experimental.get.vector.length when cnt <= max_lanes (#169293 ) On RISC-V, some loops that the loop vectorizer vectorizes pre-LTO may turn out to have the exact trip count exposed after LTO, see #164762. If the trip count is small enough we can fold away the @llvm.experimental.get.vector.length intrinsic based on this corollary from the LangRef: > If %cnt is less than or equal to %max_lanes, the return value is equal to %cnt. This on its own doesn't remove the @llvm.experimental.get.vector.length in #164762 since we also need to teach computeKnownBits about @llvm.experimental.get.vector.length and the sub recurrence, but this PR is a starting point. I've added this in InstCombine rather than InstSimplify since we may need to insert a truncation (@llvm.experimental.get.vector.length can take an i64 %cnt argument, the result is always i32). Note that there was something similar done in VPlan in #167647 for when the loop vectorizer knows the trip count.	2025-11-27 07:16:03 +00:00
Peter Collingbourne	d2379effe9	Add deactivation symbol operand to ConstantPtrAuth. Deactivation symbol operands are supported in the code generator by building on the previously added support for IRELATIVE relocations. Reviewers: ojhunt, fmayer, ahmedbougacha, nikic, efriedma-quic Reviewed By: fmayer Pull Request: https://github.com/llvm/llvm-project/pull/133537	2025-11-26 12:39:40 -08:00
Peter Collingbourne	6227eb90da	Add IR and codegen support for deactivation symbols. Deactivation symbols are a mechanism for allowing object files to disable specific instructions in other object files at link time. The initial use case is for pointer field protection. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 Reviewers: ojhunt, nikic, fmayer, arsenm, ahmedbougacha Reviewed By: fmayer Pull Request: https://github.com/llvm/llvm-project/pull/133536	2025-11-26 12:37:09 -08:00
Daniel Thornburgh	c9ff2df8c3	[IR] "modular-format" attribute for functions using format strings (#147429 ) A new InstCombine transform uses this attribute to rewrite calls to a modular version of the implementation along with llvm.reloc.none relocations against aspects of the implementation needed by the call. This change only adds support for the 'float' aspect, but it also builds the structure needed for others. See issue #146159	2025-11-11 11:52:56 -08:00
Nikita Popov	7900e63fbb	[InstCombine] Support ptrtoaddr when converting to align assume bundle ptrtoaddr can be treated the same way as ptrtoint here.	2025-10-28 12:02:47 +01:00
Mihail Mihov	6034ab3d98	[InstCombine] Add CTLZ -> CTTZ simplification (#164733 ) This PR adds the simplification `ctlz(~x & (x - 1)) -> bitwidth - cttz(x, false)` ([Alive2](https://alive2.llvm.org/ce/z/vVDRCu)). Closes issue #164436	2025-10-25 00:40:11 +08:00
Benjamin Maxwell	c80495c1b0	[InstCombine] Allow folding cross-lane operations into PHIs/selects (#164388 ) Previously, cross-lane operations were disallowed here, but they are only problematic if the `select` condition is a vector, as the input of the operation is not simply one of the arms of the phi/select.	2025-10-23 09:27:57 +01:00
Nikita Popov	573ca36753	[IR] Replace alignment argument with attribute on masked intrinsics (#163802 ) The `masked.load`, `masked.store`, `masked.gather` and `masked.scatter` intrinsics currently accept a separate alignment immarg. Replace this with an `align` attribute on the pointer / vector of pointers argument. This is the standard representation for alignment information on intrinsics, and is already used by all other memory intrinsics. This means the signatures now match llvm.expandload, llvm.vp.load, etc. (Things like llvm.memcpy used to have a separate alignment argument as well, but were already migrated a long time ago.) It's worth noting that the masked.gather and masked.scatter intrinsics previously accepted a zero alignment to indicate the ABI type alignment of the element type. This special case is gone now: If the align attribute is omitted, the implied alignment is 1, as usual. If ABI alignment is desired, it needs to be explicitly emitted (which the IRBuilder API already requires anyway).	2025-10-20 08:50:09 +00:00
Ramkumar Ramachandra	36b543ab20	[InstComb] Handle undef in simplifyMasked(Store|Scatter) (#161825 )	2025-10-03 16:52:48 +01:00
Gábor Spaits	d29798767c	[InstCombine] Transform `vector.reduce.add` and `splat` into multiplication (#161020 ) Fixes #160066 Whenever we have a vector with all the same elements, created with `insertelement` and `shufflevector`, and we sum the vector, we have a multiplication.	2025-09-29 00:06:20 +02:00
Axel Sorenson	dee28f9555	[InstCombine] Rotate transformation port from SelectionDAG to InstCombine (#160628 ) The rotate transformation from `72c04bb882/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (L10312-L10337)` has no middle-end equivalent in InstCombine. The following is a port of that transformation to InstCombine. --------- Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2025-09-26 21:04:52 -07:00
Florian Hahn	d45a135918	[InstCombine] Remove redundant align 1 assumptions. (#160695 ) It seems like we have a bunch of align 1 assumptions in practice and unless I am missing something they should not add any value. See https://github.com/dtcxzyw/llvm-opt-benchmark/pull/2861/files PR: https://github.com/llvm/llvm-project/pull/160695	2025-09-25 18:00:18 +01:00
Ramkumar Ramachandra	7fb3a91418	[PatternMatch] Introduce match functor (NFC) (#159386 ) A common idiom is the usage of the PatternMatch match function within a functional algorithm like all_of. Introduce a match functor to shorten this idiom. Co-authored-by: Luke Lau <luke@igalia.com>	2025-09-17 21:04:33 +01:00
Matthew Devereau	ead4f3e271	[InstCombine] Canonicalize active lane mask params (#158065 ) Rewrite active lane mask intrinsics to begin their range from 0 when both parameters are constant integers.	2025-09-12 16:35:58 +01:00
Florian Hahn	93a1470a97	[InstCombine] Remove redundant alignment assumptions. (#123348 ) Use known bits to remove redundant alignment assumptions. Libc++ now adds alignment assumptions for std::vector::begin() and std::vector::end(), so I expect we will see quite a bit more assumptions in C++ [1]. Try to clean up some redundant ones to start with. [1] https://github.com/llvm/llvm-project/pull/108961 PR: https://github.com/llvm/llvm-project/pull/123348	2025-09-12 13:45:36 +01:00
Hongyu Chen	75b0c89e62	[InstCombine][VectorCombine][NFC] Unify uses of lossless inverse cast (#156597 ) This patch addresses https://github.com/llvm/llvm-project/pull/155216#discussion_r2297724663. This patch adds a helper function to put the inverse cast on constants, with cast flags preserved(optional). Follow-up patches will add trunc/ext handling on VectorCombine and flags preservation on InstCombine.	2025-09-08 13:30:06 +00:00
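
The bit-counting folds quoted in the entries for #173768 (`umin(clz(x), clz(y)) => clz(x | y)`, and the `ctz` variant) and #164733 (`ctlz(~x & (x - 1)) -> bitwidth - cttz(x, false)`) can be sanity-checked by brute force. The sketch below is not taken from the commits and is no substitute for the linked Alive2 proofs. It assumes an 8-bit integer type and assumes that both counts return the bit width for a zero input; the helper names `ctlz`, `cttz`, `check_umin_folds` and `check_ctlz_to_cttz` are made up for illustration.

```python
BITS = 8                      # assumed bit width for the exhaustive check
MASK = (1 << BITS) - 1


def ctlz(x):
    """Count leading zeros of a BITS-wide value; yields BITS for x == 0."""
    return BITS - x.bit_length()


def cttz(x):
    """Count trailing zeros of a BITS-wide value; yields BITS for x == 0."""
    if x == 0:
        return BITS
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def check_umin_folds():
    """umin(clz(x), clz(y)) == clz(x | y), and the same for ctz."""
    for x in range(1 << BITS):
        for y in range(1 << BITS):
            assert min(ctlz(x), ctlz(y)) == ctlz(x | y)
            assert min(cttz(x), cttz(y)) == cttz(x | y)
    return True


def check_ctlz_to_cttz():
    """ctlz(~x & (x - 1)) == BITS - cttz(x), using wrapping BITS-wide math."""
    for x in range(1 << BITS):
        trailing_mask = ~x & (x - 1) & MASK   # ones exactly where x has trailing zeros
        assert ctlz(trailing_mask) == BITS - cttz(x)
    return True


if __name__ == "__main__":
    print(check_umin_folds(), check_ctlz_to_cttz())
```

Both checks walk every 8-bit input, so they only show that the identities hold at that width under the stated zero-input assumption.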


1165 Commits
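
The entry for #183240 states that `minmax(sub X, Z, sub Y, Z) -> sub minmax(X, Y), Z` is valid only when both subtractions carry no-wrap flags. A small exhaustive check makes the role of the flags concrete. This is an illustrative sketch, not code from the commit: it assumes a 4-bit type, models `nuw`/`nsw` as "the exact result fits in the type", and all function names are hypothetical.

```python
BITS = 4                                   # assumed width, small enough to enumerate
UMAX = (1 << BITS) - 1
SMIN, SMAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def wrap_u(v):
    """Reduce an integer to the unsigned BITS-wide range."""
    return v & UMAX


def wrap_s(v):
    """Reduce an integer to the signed BITS-wide range (two's complement)."""
    v &= UMAX
    return v - (1 << BITS) if v > SMAX else v


def umin_fold_holds(x, y, z):
    """Does umin(x - z, y - z) equal umin(x, y) - z with wrapping math?"""
    return min(wrap_u(x - z), wrap_u(y - z)) == wrap_u(min(x, y) - z)


def check_unsigned_nuw():
    """With nuw on both subs (x >= z and y >= z) the umin/umax folds hold."""
    for x in range(UMAX + 1):
        for y in range(UMAX + 1):
            for z in range(UMAX + 1):
                if x >= z and y >= z:
                    assert umin_fold_holds(x, y, z)
                    assert max(wrap_u(x - z), wrap_u(y - z)) == wrap_u(max(x, y) - z)
    return True


def check_signed_nsw():
    """With nsw on both subs (no signed overflow) the smin/smax folds hold."""
    for x in range(SMIN, SMAX + 1):
        for y in range(SMIN, SMAX + 1):
            for z in range(SMIN, SMAX + 1):
                if SMIN <= x - z <= SMAX and SMIN <= y - z <= SMAX:
                    for op in (min, max):
                        assert op(wrap_s(x - z), wrap_s(y - z)) == wrap_s(op(x, y) - z)
    return True


def counterexample_without_nuw():
    """Find unsigned inputs where the umin fold fails once wrapping is allowed."""
    for x in range(UMAX + 1):
        for y in range(UMAX + 1):
            for z in range(UMAX + 1):
                if not umin_fold_holds(x, y, z):
                    return (x, y, z)
    return None


if __name__ == "__main__":
    print(check_unsigned_nuw(), check_signed_nsw(), counterexample_without_nuw())
```

The counterexample search shows why the flags matter: once one subtraction is allowed to wrap, the relative order of the two differences can flip, which matches the commit's remark that ordering is preserved only when no signed/unsigned overflow occurs.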