The goal is simply to reduce direct usage of getLength and setLength so that
if we end up moving memset.pattern (whose length is in elements) there
are fewer places to audit.
When matching integers, `m_ConstantInt` is a convenient alternative to
`m_APInt` for matching unsigned 64-bit integers, allowing one to
simplify
```cpp
const APInt *IntC;
if (match(V, m_APInt(IntC))) {
  if (IntC->ule(UINT64_MAX)) {
    uint64_t Int = IntC->getZExtValue();
    // ...
  }
}
```
to
```cpp
uint64_t Int;
if (match(V, m_ConstantInt(Int))) {
  // ...
}
```
However, this simplification is only valid if `V` has a scalar type.
Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt`
does not.
This patch makes the matching behaviour of `m_ConstantInt` parallel that
of `m_APInt`, and puts it to use in some obvious places.
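For illustration, a minimal sketch of the new behaviour (the values
involved are hypothetical):
```cpp
// With this patch, a match against m_ConstantInt(Int) succeeds for both a
// scalar constant (i64 5) and an integer splat (<2 x i64> splat of 5),
// mirroring m_APInt.
uint64_t Int;
if (match(V, m_ConstantInt(Int))) {
  // Int == 5 in both the scalar and the splat case.
}
```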
For loads that operate on aggregate types, instcombine unpacks the loads
but does not preserve the invariant.load metadata. This patch fixes
that: it looks for the metadata on the parent load and attaches it to
the unpacked loads.
```
%struct.double2 = type { double, double }
%struct.double1 = type { double }
define %struct.double2 @func1(ptr %a) {
  %1 = load %struct.double2, ptr %a, align 16, !invariant.load !1
  ret %struct.double2 %1
}
!1 = !{}
```
Reproducer: https://godbolt.org/z/hcY8MMvYh
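With this patch, the unpacked loads keep the metadata, roughly as
follows (exact value names, offsets, and alignments are illustrative):
```
  %1 = load double, ptr %a, align 16, !invariant.load !1
  %2 = getelementptr inbounds i8, ptr %a, i64 8
  %3 = load double, ptr %2, align 8, !invariant.load !1
  %4 = insertvalue %struct.double2 poison, double %1, 0
  %5 = insertvalue %struct.double2 %4, double %3, 1
  ret %struct.double2 %5
```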
In visitShuffleVectorInst there's an if block that's meant to turn
shufflevector followed by bitcast into extractelement where possible.
It assumes that there will never be bitcasts performed on vectors of ptr
as such operations are almost always illegal, and ptrtoint instructions
should be used instead.
There is however an edge case where a bitcast instruction can be
performed on a vector of type `<1 x ptr>` to turn it into type `ptr`.
In this edge case, the code initializes the variable `VecBitWidth` to 0.
Then, when iterating over users that are bitcasts, an attempt is made to
create a vector of size 0, which triggers an assert.
This commit changes the initialization of `VecBitWidth` to use the
DataLayout to find the size of the vector, instead of the
getPrimitiveSizeInBits method, which returns 0 for ptr and vectors of
ptr.
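A minimal sketch of the edge case (a hypothetical reproducer, not the
exact test from the patch):
```
define ptr @f(<1 x ptr> %v) {
  %s = shufflevector <1 x ptr> %v, <1 x ptr> poison, <1 x i32> zeroinitializer
  %b = bitcast <1 x ptr> %s to ptr
  ret ptr %b
}
```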
For value-accumulating recurrences of the kind:
```
%umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ]
%umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b)
```
The binary intrinsic may be simplified into a call on the init value
and the other operand, if the latter is loop-invariant:
```
%umax = call i8 @llvm.umax.i8(i8 %a, i8 %b)
```
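For context, a hedged sketch of a full loop exhibiting the pattern (the
control flow here is illustrative):
```
declare i8 @llvm.umax.i8(i8, i8)

define i8 @f(i8 %a, i8 %b, i1 %cond) {
entry:
  br label %backedge
backedge:
  %umax.acc = phi i8 [ %umax, %backedge ], [ %a, %entry ]
  %umax = call i8 @llvm.umax.i8(i8 %umax.acc, i8 %b)
  br i1 %cond, label %backedge, label %exit
exit:
  ret i8 %umax
}
```
Since `%b` is loop-invariant and umax(umax(%a, %b), %b) == umax(%a, %b),
the accumulated value stabilises after the first iteration, which is why
the single non-recurrent call computes the same final value.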
Proofs: https://alive2.llvm.org/ce/z/ea2cVC.
Fixes: https://github.com/llvm/llvm-project/issues/145875.
My understanding is that gep [n x i8] and gep i8 can be treated
equivalently - the array type conveys no extra information and could be
removed. This goes through foldCmpLoadFromIndexedGlobal and tries to
make it work for non-array gep types, so long as the index type still
matches the array being loaded.
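A hedged sketch of the kind of input this now covers (the global and
constants are illustrative):
```
@g = constant [4 x i32] [i32 1, i32 2, i32 4, i32 8]

define i1 @cmp(i64 %i) {
  ; gep i32 rather than gep [4 x i32]; the i32 index type still matches
  ; the elements of the array being loaded.
  %p = getelementptr i32, ptr @g, i64 %i
  %v = load i32, ptr %p
  %c = icmp eq i32 %v, 4
  ret i1 %c
}
```
For in-range indices this should fold to a compare on `%i` itself
(here, `%i == 2`).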
Extracting any element from a subvector starting at index 0 is
equivalent to extracting from the original vector, i.e.
extract_elt(vector_extract(x, 0), y) -> extract_elt(x, y)
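Sketched in IR, with illustrative types:
```
%sub = call <4 x float> @llvm.vector.extract.v4f32.v8f32(<8 x float> %x, i64 0)
%e = extractelement <4 x float> %sub, i64 %y
; ->
%e = extractelement <8 x float> %x, i64 %y
```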
Split GEPs that have more than one variable index into two. This is in
preparation for the ptradd migration, which will not support multi-index
GEPs.
This also enables the split-off part to be CSEd and LICMed.
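A minimal sketch of the split, with illustrative types:
```
; Before: one GEP with two variable indices.
%p = getelementptr [16 x i32], ptr %base, i64 %i, i64 %j
; After: each GEP has at most one variable index, and the split-off
; first GEP can be CSEd and LICMed independently.
%p0 = getelementptr [16 x i32], ptr %base, i64 %i
%p = getelementptr i32, ptr %p0, i64 %j
```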
shrinkSplatShuffle in InstCombine would only move truncs up through
shuffles if those shuffles' inputs had exactly the same type as their
output. This PR weakens that constraint to require only that the
scalar types of the input and output match.
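For example, a sketch of a case that is now handled even though the
shuffle's input and output vector types differ (the scalar type, i64,
still matches):
```
%s = shufflevector <2 x i64> %v, <2 x i64> poison, <4 x i32> zeroinitializer
%t = trunc <4 x i64> %s to <4 x i32>
; -> the trunc is moved up through the splat shuffle:
%v.t = trunc <2 x i64> %v to <2 x i32>
%t = shufflevector <2 x i32> %v.t, <2 x i32> poison, <4 x i32> zeroinitializer
```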
There is no need to first remove the instructions before, and then
the ones after, in two different worklist iterations. We don't need
to worry about change reporting here, as the functions do that
themselves.
This avoids the issue in #150338, but not really in a principled
way. It's possible that we will have to allow poison arguments
to lifetime.start/lifetime.end again if this turns out to be a
recurring problem.
These are identified by misc-include-cleaner. I've filtered out those
that break builds. Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
When using PatternMatch, there is a common problem where we want to both
match something against a pattern, but also capture the
value/instruction for various reasons (e.g. to access flags).
Currently, the two ways to do that are to either capture using
m_Value/m_Instruction and do a separate match on the result, or to use
the somewhat awkward `m_CombineAnd(m_XYZ, m_Value(V))` pattern.
This PR adds a variant of `m_Value`/`m_Instruction` which does both a
capture and a match: `m_Value(V, m_XYZ)` is basically equivalent to
`m_CombineAnd(m_XYZ, m_Value(V))`.
I've ported two InstCombine files to this pattern as a sample.
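A short sketch of the new form (the surrounding code is hypothetical):
```cpp
// Match an add and capture the matched instruction in a single step,
// e.g. to inspect its wrap flags afterwards (V is the value being
// visited).
Value *X, *Y;
Instruction *Add;
if (match(V, m_Instruction(Add, m_Add(m_Value(X), m_Value(Y))))) {
  if (Add->hasNoSignedWrap()) {
    // ...
  }
}
```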
When expanding a GEP chain, if there is a chain of one-use GEPs followed
by a multi-use GEP, rewrite the multi-use GEP to include the one-use
GEPs' offsets.
This means the offsets from the one-use GEPs can be reused by the offset
expansion without additional cost (from computing them again with a
different reassociation).
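My hedged reading of the rewrite, as a sketch (names and shapes are
illustrative):
```
; %mu is multi-use and follows the one-use GEP %g.
%g = getelementptr i8, ptr %p, i64 %a   ; one use
%mu = getelementptr i8, ptr %g, i64 %b  ; several uses
; After the rewrite, %mu starts from %p and includes %g's offset, so the
; offset expansion can reuse the combined offset directly.
%off = add i64 %a, %b
%mu = getelementptr i8, ptr %p, i64 %off
```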
This is another prune of dead code -- we never generate debug intrinsics
nowadays, therefore there's no need for these codepaths to run.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
At this stage I'm just opportunistically deleting any code using
debug-intrinsic types, largely adjacent to calls to findDbgUsers. I'll
get to deleting that in probably one or two more commits.
SROA and a few other facilities use generic lambdas and some overloaded
functions to deal with both intrinsics and debug-records at the same time.
As part of stripping out intrinsic support, delete a swathe of this code
from things in the Utils directory.
This is a large diff, but is mostly about removing functions that were
duplicated during the migration to debug records. I've taken a few
opportunities to replace comments about "intrinsics" with "records",
and replace generic lambdas with plain lambdas (I believe this makes
it more readable).
All of this is chipping away at intrinsic-specific code until we get to
removing parts of findDbgUsers, which is the final boss -- we can't
remove that until almost everything else is gone.
To push a freeze through an instruction, only one operand may produce
poison. However, this currently fails for identical operands, which are
treated as separate operands. This patch fixes that by treating them as
a single operand.
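For example, a sketch:
```
%a = add i32 %x, %x
%f = freeze i32 %a
; Both operands are the same value, so a single frozen operand suffices:
%x.fr = freeze i32 %x
%f = add i32 %x.fr, %x.fr
```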
With the advent of intrinsic-less debug-info, we no longer need to
scatter calls to getPrevNonDebugInstruction around the codebase. Remove
most of them -- there are one or two that have the "SkipPseudoOp" flag
turned on; however, they don't seem to be in positions where skipping
anything would be reasonable.
The testcase I added previously failed because a SelectInst with invalid
operands was created (one side `addrspace(4)`, the other
`addrspace(5)`).
PointerReplacer needs to dig deeper if the true and/or false
instructions of the select are not available.
Fixes SWDEV-542957
Try to optimize a call to the result of a ptrauth intrinsic, potentially
into the ptrauth call bundle:
call(ptrauth.resign(p)), ["ptrauth"()] -> call p, ["ptrauth"()]
call(ptrauth.sign(p)), ["ptrauth"()] -> call p
as long as the key/discriminator are the same in sign and auth-bundle,
and we don't change the key in the bundle (to a potentially invalid
key).
Generating a plain call to a raw unauthenticated pointer is generally
undesirable, but if we ended up seeing a naked ptrauth.sign in the first
place, we already have suspicious code. Unauthenticated calls are also
easier to spot than naked signs, so let the indirect call shine.
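A hedged IR sketch of the resign case (keys and discriminators are
illustrative; the ptrauth intrinsics operate on i64 values):
```
%p.i = ptrtoint ptr %p to i64
%r.i = call i64 @llvm.ptrauth.resign(i64 %p.i, i32 0, i64 %d0, i32 0, i64 %d1)
%r = inttoptr i64 %r.i to ptr
call void %r() [ "ptrauth"(i32 0, i64 %d1) ]
; -> the resign is folded into the bundle:
call void %p() [ "ptrauth"(i32 0, i64 %d0) ]
```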
Note that there is an arguably unsafe extension to this, where we don't
bother checking that the key in the bundle and the intrinsic are the
same (and also allow folding an auth away into a bundle).
This can end up generating calls with a bundle that has an invalid key
(which an informed frontend wouldn't have otherwise done), which can be
problematic. The C that generates that is straightforward but arguably
unreasonable. That wouldn't be an issue if we were to bite the bullet
and make these fully AArch64-specific, allowing key knowledge to be
embedded here.
Try to optimize a call to a ptrauth constant, into its ptrauth bundle:
call(ptrauth(f)), ["ptrauth"()] -> call f
as long as the key/discriminator are the same in constant and bundle.
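Sketched in IR (key and discriminator values are illustrative):
```
call void ptrauth (ptr @f, i32 0, i64 42)() [ "ptrauth"(i32 0, i64 42) ]
; ->
call void @f()
```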
Add a new fold to instcombine to move SExt/ZExt across identity
shuffles, applying the cast after the shuffle. This sinks extends and
can enable more general additional folding of both shuffles (and
related instructions) and extends. If backends prefer to split them up
and do the casts first, the extends can be hoisted again, for example
in VectorCombine.
A larger example is included in the load_i32_zext_to_v4i32 test. The wider
extend is easier to compute an accurate cost for and targets (like
AArch64) can lower a single wider extend more efficiently than multiple
separate extends.
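A sketch of the length-changing identity case (types are illustrative):
```
%e = zext <2 x i8> %v to <2 x i32>
%s = shufflevector <2 x i32> %e, <2 x i32> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
; -> shuffle first, then a single wider extend:
%v.s = shufflevector <2 x i8> %v, <2 x i8> poison, <4 x i32> <i32 0, i32 1, i32 poison, i32 poison>
%s = zext <4 x i8> %v.s to <4 x i32>
```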
This is a generalization of a VectorCombine version
(https://github.com/llvm/llvm-project/pull/141109) as suggested by
@preames.
PR: https://github.com/llvm/llvm-project/pull/146901
The motivation of this pattern is to check whether the product of a
variable and a constant would be mathematically (i.e., as integer
numbers instead of bit vectors) greater than a given constant bound. The
pattern appears to occur when compiling several Rust projects (it seems
to originate from the `smallvec` crate, but I have not checked this
further).
Unless `c1` is `0`, we can transform this pattern into `x > c2/c1` with
all operations working on unsigned integers. Due to undefined behavior
when an element of a non-splat vector is `0`, the transform is only
implemented for scalars and splat vectors.
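The matched pattern is not spelled out above; my hedged reading of it,
sketched with illustrative constants (`c1 = 24`, `c2 = 2^63 - 1`):
```
%m = call { i64, i1 } @llvm.umul.with.overflow.i64(i64 %x, i64 24)
%mul = extractvalue { i64, i1 } %m, 0
%ov = extractvalue { i64, i1 } %m, 1
%cmp = icmp ugt i64 %mul, 9223372036854775807
%r = or i1 %ov, %cmp
; -> x > c2/c1, with the udiv folded at compile time:
%r = icmp ugt i64 %x, 384307168202282325
```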
Alive proof: https://alive2.llvm.org/ce/z/LawTkm
Closes #142674.
If C1 is 1 and we're working with a power-of-two divisor, this will end
up replacing the `and` for the remainder with a multiply and a longer
dependency chain.
Fixes https://github.com/llvm/llvm-project/issues/147176.
The change adds a new instcombine pattern, and associated test, for
patterns like this:
```
%3 = shufflevector <2 x float> %1, <2 x float> poison, <4 x i32> zeroinitializer
%4 = extractelement <4 x float> %3, i64 %idx
```
The shufflevector has a splat (broadcast) mask, so the extracted value
must be the first element of %1; we therefore transform this to
```
%2 = extractelement <2 x float> %1, i64 0
```