llvm-project

Author	SHA1	Message	Date
Alexis Engelke	efcd3b6108	[IPO][InstCombine][Vectorize][NFCI] Drop uses of BranchInst (#186596 ) Refactor remaining parts of Transforms apart from Scalar and Utils.	2026-03-14 17:49:00 +00:00
azwolski	a209ff855b	[InstCombine] Limit canonicalization of extractelement(cast) to constant index or same basic block (#166227 ) The current canonicalization of extractelement(cast) requires that the CastInst has only one use. However, when that use occurs inside a loop, it still satisfies this condition, even though the cast is effectively used multiple times, once per iteration, rather than truly being used once. ```cpp } else if (auto CI = dyn_cast<CastInst>(I)) { // Canonicalize extractelement(cast) -> cast(extractelement). // Bitcasts can change the number of vector elements, and they cost // nothing. if (CI->hasOneUse() && (CI->getOpcode() != Instruction::BitCast)){ ``` Before ```llvm %34 = fptosi <4 x float> %33 to <4 x i32> ;/loop{ %40 = extractelement <4 x i32> %34, i32 %36 ``` After ```llvm ;/loop{ %37 = extractelement <4 x float> %30, i32 %32 %38 = fptosi float %37 to i32 ``` After canonicalization, for this particular example, it no longer uses a single instruction to cast the entire vector at once, but instead performs the cast for every element separately, which is less performant. Ideally, we would like to check if the cast instruction has one use and that this use is not called inside a loop. However, InstCombine/InstCombineVectorOps.cpp does not provide utilities like `LoopInfo` to check that. It might be possible to approximate this by analyzing basic block successors or by building a dominance tree, but that may be a costly solution. A solution to prevent this optimization could be to check if the index is an immediate value and if the use is inside the same basic block as the cast instruction: ```cpp if (CI->hasOneUse() && (CI->getOpcode() != Instruction::BitCast)) { Instruction *U = cast<Instruction>(*CI->user_begin()); if (U->getParent() == CI->getParent() || isa<ConstantInt>(Index)){ ``` Fixes: https://github.com/llvm/llvm-project/issues/165793	2026-01-08 22:43:13 +01:00
Paul Walker	3da3934336	[LLVM][LangRef] Redefine out-of-range stepvector values as being truncated. (#173494 ) The LangRef currently defines out-of-range stepvector values as poison. This property is at odds with both the expansion used for fixed-length vectors and the equivalent ISD node, both of which implicitly truncate out-of-range values.	2025-12-25 10:44:04 +00:00
Meredith Julian	5d2fc9408e	[InstCombine] Fix phi scalarization with binop (#169120 ) InstCombine phi scalarization would always create a new binary op with the phi as the first operand, which is not correct for non-commutable binary ops such as sub. This fix preserves the original binary op ordering in the new binary op and adds a test for this behavior. Currently, this transformation can produce silently incorrect IR, and in the case of the added test, would optimize it out entirely.	2025-11-24 12:48:32 +01:00
Princeton Ferro	9e04291fd2	[InstCombine] linearize complexity of `findDemandedEltsByAllUsers()` (#161436 ) Each call to `findDemandedEltsBySingleUser()` returns a new APInt that must be OR'd with the current APInt. For large vectors with many uses this can be slow, as the total number of operations is `{# uses} x {size of vector}`. Instead of OR'ing, use `setBit()` on the passed-in APInt.	2025-10-01 12:08:49 -07:00
Yingwei Zheng	6eff26094a	[InstCombine] Skip replaceExtractElements for ConstantData (#160575 ) Closes https://github.com/llvm/llvm-project/issues/160507. Note: Replacing other users except for `ExtElt` is a bit strange to me. I tried to only replace `ExtElt` with a new extractelement, but it caused regressions on `widen_extract2/3`.	2025-09-25 16:53:24 +00:00
Szymon Piotr Milczek	fd41700962	[InstCombine] visitShuffleVectorInst assert with vector of pointers fix. (#152341 ) In visitShuffleVectorInst there's an if block that's meant to turn shufflevector followed by bitcast into extractelement where possible. It assumes that there will never be bitcasts performed on vectors of ptr as such operations are almost always illegal, and ptrtoint instructions should be used instead. There is however an edge case where a bitcast instruction can be performed on a vector of type `<1 x ptr>` to turn it into type `ptr`. In this edge case, the code initializes the variable `VecBitWidth` to 0. Then, when iterating over users that are bitcasts, an attempt is made to create a vector of size 0, which triggers an assert. This commit changes initialization of `VecBitWidth` to use datalayout to find the size of the vector instead of the getPrimitiveSizeInBits method, which results in 0 for ptr and vectors of ptr.	2025-08-08 15:23:02 +02:00
Kerry McLaughlin	e170676351	[Instcombine] Combine extractelement from a vector_extract at index 0 (#151491 ) Extracting any element from a subvector starting at index 0 is equivalent to extracting from the original vector, i.e. extract_elt(vector_extract(x, 0), y) -> extract_elt(x, y)	2025-08-01 09:54:43 +01:00
Florian Hahn	08a8e1c6b6	[InstCombine] Move extends across identity shuffles. (#146901 ) Add a new fold to instcombine to move SExt/ZExt across identity shuffles, applying the cast after the shuffle. This sinks extends and can enable more general additional folding of both shuffles (and related instructions) and extends. If backends prefer splitting up doing casts first, the extends can be hoisted again in VectorCombine for example. A larger example is included in the load_i32_zext_to_v4i32. The wider extend is easier to compute an accurate cost for and targets (like AArch64) can lower a single wider extend more efficiently than multiple separate extends. This is a generalization of a VectorCombine version (https://github.com/llvm/llvm-project/pull/141109) as suggested by @preames. PR: https://github.com/llvm/llvm-project/pull/146901	2025-07-14 21:01:03 +01:00
agorenstein-nvidia	b0473c599b	[InstCombine] Pull extract through broadcast (#143380 ) The change adds a new instcombine pattern, and associated test, for patterns like this: ``` %3 = shufflevector <2 x float> %1, <2 x float> poison, <4 x i32> zeroinitializer %4 = extractelement <4 x float> %3, i64 %idx ``` The shufflevector has a splat, or broadcast, mask, so the extractelement simply must be the first element of %1, so we transform this to ``` %2 = extractelement <2 x float> %1, i64 0 ```	2025-07-04 18:19:50 +02:00
Luke Lau	d0c1ea928c	[InstCombine] Pull unary shuffles through fneg/fabs (#144933 ) This canonicalizes fneg/fabs (shuffle X, poison, mask) -> shuffle (fneg/fabs X), poison, mask. This undoes part of b331a7ebc1e02f9939d1a4a1509e7eb6cdda3d38 and a8f13dbdeb31be37ee15b5febb7cc2137bbece67, but keeps the binary shuffle case i.e. shuffle fneg, fneg, mask. By pulling out the shuffle we bring it in line with the same canonicalisation we perform on binary ops and intrinsics; the original commit acknowledges that it goes in the opposite direction. However nowadays VectorCombine is more powerful and can do more optimisations when the shuffle is pulled out, so I think we should revisit this. In particular we get more shuffles folded and can perform scalarization.	2025-06-30 10:40:12 +01:00
Ramkumar Ramachandra	b40e4ceaa6	[ValueTracking] Make Depth last default arg (NFC) (#142384 ) Having a finite Depth (or recursion limit) for computeKnownBits is very limiting, but is currently a load-bearing necessity, as all KnownBits are recomputed on each call and there is no caching. As a prerequisite for an effort to remove the recursion limit altogether, either using a clever caching technique, or writing an easily-invalidable KnownBits analysis, make the Depth argument in APIs in ValueTracking uniformly the last argument with a default value. This would aid in removing the argument when the time comes, as many callers that currently pass 0 explicitly are now updated to omit the argument altogether.	2025-06-03 17:12:24 +01:00
Kazu Hirata	89308de4b0	[llvm] Value-initialize values with *Map::try_emplace (NFC) (#141522 ) try_emplace value-initializes values, so we do not need to pass nullptr to try_emplace when the value types are raw pointers or std::unique_ptr<T>.	2025-05-26 15:13:02 -07:00
Ramkumar Ramachandra	f398f2aadc	[InstCombine] Preserve GEP no-wrap flags (#141113 )	2025-05-22 22:48:33 +01:00
Ricardo Jesus	c91c3f930c	[InstCombine] Do not combine shuffle+bitcast if the bitcast is eliminable. (#135769 ) If we are attempting to combine shuffle+bitcast but the bitcast is pairable with a subsequent bitcast, we should not fold the shuffle as doing so can block further simplifications. The motivation for this is a long-standing regression affecting SIMDe on AArch64, introduced indirectly by the AlwaysInliner (1a2e77cf). Some reproducers: * https://godbolt.org/z/53qx18s6M * https://godbolt.org/z/o5e43h5M7	2025-04-30 08:22:38 +01:00
Florian Hahn	cfd53ffb44	[InstCombine] Use MapVector for SourceAggregates. (#132564 ) foldAggregateConstructionIntoAggregateReuse iterates over the entries of SourceAggregates and the order of inserted instructions depends on the order of the iterator. Using a regular DenseMap can lead to non-deterministic value naming/numbering. I don't think it can actually impact the generated binary, but it makes diffing IR more difficult. PR: https://github.com/llvm/llvm-project/pull/132564	2025-03-23 11:16:50 +00:00
Kazu Hirata	5d2393a222	[InstCombine] Avoid repeated hash lookups (NFC) (#124243 )	2025-01-24 08:09:20 -08:00
Jeremy Morse	8e70273509	[NFC][DebugInfo] Use iterator moveBefore at many call-sites (#123583 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to moveBefore use iterators. This patch adds a (guaranteed dereferenceable) iterator-taking moveBefore, and changes a bunch of call-sites where it's obviously safe to change to use it by just calling getIterator() on an instruction pointer. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer insertBefore, but not before adding concise documentation of what considerations are needed (very few).	2025-01-24 10:53:11 +00:00
Ramkumar Ramachandra	4a0d53a0b0	PatternMatch: migrate to CmpPredicate (#118534 ) With the introduction of CmpPredicate in 51a895a (IR: introduce struct with CmpInst::Predicate and samesign), PatternMatch is one of the first key pieces of infrastructure that must be updated to match a CmpInst respecting samesign information. Implement this change to Cmp-matchers. This is a preparatory step in migrating the codebase over to CmpPredicate. Since no functional changes are desired at this stage, we have chosen not to migrate CmpPredicate::operator==(CmpPredicate) calls to use CmpPredicate::getMatching(), as that would have visible impact on tests that are not yet written: instead, we call CmpPredicate::operator==(Predicate), preserving the old behavior, while also inserting a few FIXME comments for follow-ups.	2024-12-13 14:18:33 +00:00
Matthias Braun	768754807f	[InstCombine] Optimistically allow multiple shufflevector uses in foldOpPhi (#114278 ) We would like to optimize situations of the form that happen after loop vectorization+SROA: ``` loop: %phi = phi zeroinitializer, %interleaved %deinterleave_a = shufflevector %phi, poison ; pick half of the lanes %deinterleave_b = shufflevector %phi, poison ; pick remaining lanes ... %a = ... %b = ... %interleaved = shufflevector %a, %b ; interleave lanes of a+b ``` where the interleave and de-interleave shuffle operations cancel each other out. This could be handled by `foldOpPhi` but does not currently work because it does not proceed when there are multiple uses of the `Phi` operation. This extends `foldOpPhi` to allow multiple `shufflevector` uses when they are shown to simplify for all `Phi` input values.	2024-12-12 17:20:48 -08:00
weiguozhi	9424f3dcc5	[InstCombine] Extend folding of aggregate construction to cases when source aggregates are partially available (#100828 ) Function foldAggregateConstructionIntoAggregateReuse can fold insertvalue(phi(extractvalue(src1), extractvalue(src2))) into phi(src1, src2) when we can find source aggregates in all predecessors. This patch extends it to handle the following case: insertvalue(phi(extractvalue(src1), elm2)) into phi(src1, insertvalue(elm2)), with the condition that the predecessor without a source aggregate has only one successor.	2024-11-21 08:34:27 -08:00
peterbell10	a3f2e01c95	[InstCombine] Only fold extract element to trunc if vector `hasOneUse` (#115627 ) This fixes a missed optimization caused by the `foldBitcastExtElt` pattern interfering with other combine patterns. In the case I was hitting, we have IR that combines two vectors into a new larger vector by extracting elements and inserting them into the new vector. ```llvm define <4 x half> @bitcast_extract_insert_to_shuffle(i32 %a, i32 %b) { %avec = bitcast i32 %a to <2 x half> %a0 = extractelement <2 x half> %avec, i32 0 %a1 = extractelement <2 x half> %avec, i32 1 %bvec = bitcast i32 %b to <2 x half> %b0 = extractelement <2 x half> %bvec, i32 0 %b1 = extractelement <2 x half> %bvec, i32 1 %ins0 = insertelement <4 x half> undef, half %a0, i32 0 %ins1 = insertelement <4 x half> %ins0, half %a1, i32 1 %ins2 = insertelement <4 x half> %ins1, half %b0, i32 2 %ins3 = insertelement <4 x half> %ins2, half %b1, i32 3 ret <4 x half> %ins3 } ``` With the current behavior, `InstCombine` converts each vector extract sequence to ```llvm %tmp = trunc i32 %a to i16 %a0 = bitcast i16 %tmp to half %a1 = extractelement <2 x half> %avec, i32 1 ``` where the extraction of `%a0` is now done by truncating the original integer. While on its own this is fairly reasonable, in this case it also blocks the pattern which converts `extractelement` - `insertelement` into shuffles which gives the overall simpler result: ```llvm define <4 x half> @bitcast_extract_insert_to_shuffle(i32 %a, i32 %b) { %avec = bitcast i32 %a to <2 x half> %bvec = bitcast i32 %b to <2 x half> %ins3 = shufflevector <2 x half> %avec, <2 x half> %bvec, <4 x i32> <i32 0, i32 1, i32 2, i32 3> ret <4 x half> %ins3 } ``` In this PR I fix the conflict by obeying the `hasOneUse` check even if there is no shift instruction required. In these cases we can't remove the vector completely, so the pattern has less benefit anyway.
Also fwiw, I think dropping the `hasOneUse` check for the 0th element might have been a mistake in the first place. Looking at `535c5d56a7` the commit message only mentions loosening the `isDesirableIntType` requirement and doesn't mention changing the `hasOneUse` check at all.	2024-11-20 13:06:57 -08:00
Yingwei Zheng	27bf45aa36	[InstCombine] Fix poison safety of folding shufflevector into select (#115483 ) We are allowed to fold shufflevector into select iff the condition is guaranteed not to be poison or the RHS is a poison. Alive2: https://alive2.llvm.org/ce/z/28zEWR Closes https://github.com/llvm/llvm-project/issues/115465.	2024-11-10 17:07:25 +08:00
Yingwei Zheng	18311093ab	[InstCombine] Do not fold `shufflevector(select)` if the select condition is a vector (#113993 ) Since `shufflevector` is not element-wise, we cannot fold it into a select when the select condition is a vector. For a shufflevector that doesn't change the length, it doesn't crash, but it is still a miscompilation: https://alive2.llvm.org/ce/z/s8saCx Fixes https://github.com/llvm/llvm-project/issues/113986.	2024-10-29 10:39:07 +08:00
Matthias Braun	5903c6af44	InstCombine: Fold shufflevector(select) and shufflevector(phi) (#113746 ) - Transform `shufflevector(select(c, x, y), C)` to `select(c, shufflevector(x, C), shufflevector(y, C))` by re-using the `FoldOpIntoSelect` helper. - Transform `shufflevector(phi(x, y), C)` to `phi(shufflevector(x, C), shufflevector(y, C))` by re-using the `foldOpIntoPhi` helper.	2024-10-28 15:35:17 -07:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Nikita Popov	dc6876fc98	[ValueTracking] Use isSafeToSpeculativelyExecuteWithVariableReplaced() in more places (#109149 ) This replaces some uses of isSafeToSpeculativelyExecute() with isSafeToSpeculativelyExecuteWithVariableReplaced(), in cases where we are guarding against operand changes rather than plain speculation. I believe that this is NFC with the current implementation of the function (as it only does something different from loads), but this makes us more defensive against future generalizations.	2024-09-19 09:38:20 +02:00
Maciej Gabka	95d2d1cba0	Move stepvector intrinsic out of experimental namespace (#98043 ) This patch moves the stepvector intrinsic out of the experimental namespace. This intrinsic has existed in LLVM for several years now, and is widely used.	2024-08-28 12:48:20 +01:00
Nikita Popov	4d2ae88d16	[InstCombine] Fix invalid scalarization of div If the binop is not speculatable, and the extract index is out of range, then scalarizing will perform the operation on a poison operand, resulting in immediate UB, instead of the previous poison result. Fixes https://github.com/llvm/llvm-project/issues/97053.	2024-07-03 11:05:33 +02:00
Simon Pilgrim	5b4000dc58	[VectorUtils] Add llvm::scaleShuffleMaskElts wrapper for narrowShuffleMaskElts/widenShuffleMaskElts, NFC. (#96646 ) Using the target number of vector elements, scaleShuffleMaskElts will try to use narrowShuffleMaskElts/widenShuffleMaskElts to scale the shuffle mask accordingly. Working on #58895 I didn't want to create yet another case where we have to handle both re-scaling cases.	2024-06-26 10:43:58 +01:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Nikita Popov	6be6c3a37b	[InstCombine] Use disjoint flag for alternate binops Check the or disjoint flag instead of the weaker MaskedValueIsZero query.	2024-06-18 16:51:37 +02:00
Nikita Popov	8052e94946	[InstCombine] Avoid use of ConstantExpr::getShl() Use the constant folding API instead. Use ImmConstant to make sure it actually folds.	2024-06-18 16:43:19 +02:00
Nikita Popov	530d4c9bf3	[InstCombine] Use m_Poison() instead of m_Undef() (NFCI) In this case the shuffle mask checks should already guarantee a single-source shuffle, so this is just for clarity.	2024-05-21 16:15:30 +02:00
Nikita Popov	8d5b7d4d11	[InstCombine] Use m_Poison() instead of m_Undef() (NFCI) In this case, the isIdentityWithExtract() checks should already guarantee that these are single-source shuffles, so this is just for clarity.	2024-05-21 16:07:46 +02:00
Nikita Popov	d0e0205bfc	[InstCombine] Check for poison instead of undef in single shuffle fold Otherwise we'll convert undef to poison. Alive2 was already flagging the existing test8 test as a miscompile.	2024-05-21 16:03:20 +02:00
Nikita Popov	fbc798e442	[InstCombine] Use m_Poison instead of m_Undef (NFCI) In this case, isIdentityWithExtract() should already ensure that this is a single-source shuffle. This just makes things more explicit.	2024-05-21 15:48:35 +02:00
Nikita Popov	8f1c984325	[InstCombine] Check for poison instead of undef in shuffle of unop transform Otherwise this may not actually be a single-source shuffle.	2024-05-21 15:43:50 +02:00
Nikita Popov	2f1e2325cf	[InstCombine] Use m_Poison instead of m_Undef in some places (NFCI) I believe that in these cases other conditions already ensure that the second operand is not used, this is mostly for clarity.	2024-05-21 15:33:25 +02:00
Nikita Popov	ecd269e830	[InstCombine] Check for poison instead of undef in splat shuffle fold We can't canonicalize these to a splat shuffle, as doing so would convert undef -> poison.	2024-05-21 15:21:31 +02:00
Nikita Popov	263224e448	[InstCombine] Require poison operand in canEvaluateShuffled transform This transform works on single-source shuffles, which require that the second operand is poison, not undef. Otherwise we may convert undef to poison. Fixes https://github.com/llvm/llvm-project/issues/92887.	2024-05-21 15:00:01 +02:00
Marc Auberer	b3fe27f2be	[InstCombine] Copy flags of extractelement for extelt -> icmp combine (#86366 ) Fixes #86164	2024-03-24 16:14:56 +01:00
Michele Scandale	536cb1fad3	[InstCombine] Fix for folding select-like `shufflevector` into floating point binary operators. (#85452 ) Folding a select-like `shufflevector` into a floating point binary operator can only be done if the result is preserved for both cases. In particular, if the common operand of the `shufflevector` and the floating point binary operator can be a NaN, then the transformation won't preserve the result value.	2024-03-21 12:40:18 -07:00
Jeremy Morse	2fe81edef6	[NFC][RemoveDIs] Insert instruction using iterators in Transforms/ As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. Noteworthy changes: * FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators, * I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes. * A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions), * SeparateConstOffsetFromGEP has its insertion-position field changed. Noteworthy because it's not a purely localised spelling change. All this should be NFC.	2024-03-05 15:12:22 +00:00
Nikita Popov	92fc4b482f	[InstCombine] Preserve poison in bitcast of insertelement fold If the base was poison, retain the poison value.	2023-12-19 13:06:04 +01:00
Nikita Popov	67fd4e3408	[InstCombine] Check for poison instead of undef in shuffle transform This one doesn't seem to make a practical difference because we'd canonicalize undef -> poison in the relevant cases anyway.	2023-12-19 12:56:52 +01:00
Nikita Popov	9d25b28b9e	[InstCombine] Explicitly canonicalize splat shuffles to use poison RHS This is usually handled by demanded elements simplification. However, as that is not supported for scalable vectors, also handle it explicitly here.	2023-12-18 16:30:40 +01:00
Nikita Popov	e93d324adb	[InstCombine] Preserve poison in evaluateInDifferentElementOrder() Don't unnecessarily replace poison with undef.	2023-12-18 15:36:22 +01:00
Nikita Popov	6c9813aa02	[InstCombine] Check for poison instead of undef in shuffle combine Otherwise we may replace undef with poison. Note that a lot of tests regressing here already have variants that use poison instead of undef (often in a separate inseltpoison file), which is why I'm not adjusting them to the new pattern.	2023-12-18 15:19:16 +01:00
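
The complexity change described in 9e04291fd2 (accumulating demanded elements in place instead of OR'ing a fresh mask per user) can be sketched roughly as follows. This is a toy model and not LLVM code: a Python list of booleans stands in for an APInt, each "user" is just a list of the lanes it reads, and all function names here are hypothetical.

```python
# Toy model, NOT LLVM's API: a list of bools stands in for APInt, and a
# "user" is simply the list of vector lanes it demands. Names are invented.

def demanded_by_single_user_fresh(lanes, width):
    """Earlier shape: allocate and return a new width-sized mask per user."""
    mask = [False] * width
    for lane in lanes:
        mask[lane] = True
    return mask


def demanded_by_all_users_or(users, width):
    """Earlier shape: OR every fresh mask into the running union.

    Each OR touches all `width` entries, so the total work is roughly
    (number of users) x (size of vector).
    """
    union = [False] * width
    for lanes in users:
        fresh = demanded_by_single_user_fresh(lanes, width)
        union = [a or b for a, b in zip(union, fresh)]
    return union


def demanded_by_single_user_inplace(lanes, union):
    """Later shape: set bits directly on the passed-in mask (cf. setBit())."""
    for lane in lanes:
        union[lane] = True


def demanded_by_all_users_inplace(users, width):
    """Later shape: work is proportional to the lanes actually demanded."""
    union = [False] * width
    for lanes in users:
        demanded_by_single_user_inplace(lanes, union)
    return union


if __name__ == "__main__":
    users = [[0], [3], [0, 5]]
    assert demanded_by_all_users_or(users, 8) == demanded_by_all_users_inplace(users, 8)
```

Both variants compute the same union of demanded lanes; only the amount of per-user work differs, which is the point the commit message makes about large vectors with many uses.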

368 Commits