llvm-project

Author	SHA1	Message	Date
zGoldthorpe	a8d25683ee	[PatternMatch] Allow `m_ConstantInt` to match integer splats (#153692 ) When matching integers, `m_ConstantInt` is a convenient alternative to `m_APInt` for matching unsigned 64-bit integers, allowing one to simplify ```cpp const APInt *IntC; if (match(V, m_APInt(IntC))) { if (IntC->ule(UINT64_MAX)) { uint64_t Int = IntC->getZExtValue(); // ... } } ``` to ```cpp uint64_t Int; if (match(V, m_ConstantInt(Int))) { // ... } ``` However, this simplification is only true if `V` is a scalar type. Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt` does not. This patch ensures that the matching behaviour of `m_ConstantInt` parallels that of `m_APInt`, and also incorporates it in some obvious places.	2025-08-15 10:43:54 -06:00
Adar Dagan	1afb42bc10	[InstCombine] Let shrinkSplatShuffle act on vectors of different lengths (#148593 ) shrinkSplatShuffle in InstCombine would only move truncs up through shuffles if those shuffles inputs had the exact same type as their output, this PR weakens this constraint to only requiring that the scalar type of the input and output match.	2025-07-28 13:00:37 +02:00
mayanksolanki393	4e0dd007ac	[InstCombine] Combine trunc (lshr X, BW-1) to i1 --> icmp slt X, 0 (#142593 ) (#143846 ) Fixes #142593, the issue was fixed using the suggestion on the ticket itself. Godbolt: https://godbolt.org/z/oW5b74jc4 alive2 proof: https://alive2.llvm.org/ce/z/QHnD7e	2025-06-16 09:46:52 +02:00
Yingwei Zheng	cab09e76e0	[InstCombine] Propagate FMF from fptrunc when folding `fptrunc fabs(X) -> fabs(fptrunc X)` (#143352 ) Alive2: https://alive2.llvm.org/ce/z/DWV3G3 fptrunc yields infinity when the input cannot fit in the target type. So ninf should be propagated from fptrunc. For other intrinsics, the previous check ensures that the result is never an infinity: `5d3899d293/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L1910-L1917)` Closes https://github.com/llvm/llvm-project/issues/143122.	2025-06-16 12:07:47 +08:00
Paul Walker	f43aaf90df	[NFC][LLVM] Refactor IRBuilder::Create{VScale,ElementCount,TypeSize}. (#142803 ) CreateVScale took a scaling parameter that had a single use outside of IRBuilder with all other callers having to create a redundant ConstantInt. To work round this some code perferred to use CreateIntrinsic directly. This patch simplifies CreateVScale to return a call to the llvm.vscale() intrinsic and nothing more. As well as simplifying the existing call sites I've also migrated the uses of CreateIntrinsic. Whilst IRBuilder used CreateVScale's scaling parameter as part of the implementations of CreateElementCount and CreateTypeSize, I have follow-on work to switch them to the NUW varaiety and thus they would stop using CreateVScale's scaling as well. To prepare for this I have moved the multiplication and constant folding into the implementations of CreateElementCount and CreateTypeSize. As a final step I have replaced some callers of CreateVScale with CreateElementCount where it's clear from the code they wanted the latter.	2025-06-10 12:35:59 +01:00
Ramkumar Ramachandra	b40e4ceaa6	[ValueTracking] Make Depth last default arg (NFC) (#142384 ) Having a finite Depth (or recursion limit) for computeKnownBits is very limiting, but is currently a load-bearing necessity, as all KnownBits are recomputed on each call and there is no caching. As a prerequisite for an effort to remove the recursion limit altogether, either using a clever caching technique, or writing a easily-invalidable KnownBits analysis, make the Depth argument in APIs in ValueTracking uniformly the last argument with a default value. This would aid in removing the argument when the time comes, as many callers that currently pass 0 explicitly are now updated to omit the argument altogether.	2025-06-03 17:12:24 +01:00
Usman Nadeem	b93173185d	[InstCombine] Narrow trunc(lshr) in more cases (#139645 ) We can narrow `trunc(lshr(i32)) to i8` to `trunc(lshr(i16)) to i8` even when the bits that we are shifting in are not zero, in the cases where the MSBs of the shifted value don't actually matter and actually end up being truncated away. This kind of narrowing does not remove the trunc but can help the vectorizer generate better code in a smaller type. Motivation: libyuv, functions like ARGBToUV444Row_C(). Proof: https://alive2.llvm.org/ce/z/9Ao2aJ	2025-05-13 13:42:42 -07:00
Nikita Popov	50aacb9e1b	[InstCombine] Support ptrtoint of gep folds for chain of geps (#137323 ) Support the ptrtoint(gep null, x) -> x and ptrtoint(gep inttoptr(x), y) -> x+y folds for the case where there is a chain of geps that ends in null or inttoptr. This avoids some regressions from #137297. While here, also be a bit more careful about edge cases like pointer to vector splats and mismatched pointer and index size.	2025-04-28 09:02:29 +02:00
Luke Lau	27a437108b	[InstCombine] Handle scalable splats of constants in getMinimumFPType (#132960 ) We previously handled ConstantExpr scalable splats in 5d929794a87602cfd873381e11cc99149196bb49, but only fpexts. ConstantExpr fpexts have since been removed, and simultaneously we didn't handle splats of constants that weren't extended. This updates it to remove the fpext check and instead see if we can shrink the result of getSplatValue. Note that the test case doesn't get completely folded away due to #132922	2025-03-27 13:24:00 +00:00
Yingwei Zheng	a77346bad0	[IRBuilder] Refactor FMF interface (#121657 ) Up to now, the only way to set specified FMF flags in IRBuilder is to use `FastMathFlagGuard`. It makes the code ugly and hard to maintain. This patch introduces a helper class `FMFSource` to replace the original parameter `Instruction *FMFSource` in IRBuilder. To maximize the compatibility, it accepts an instruction or a specified FMF. This patch also removes the use of `FastMathFlagGuard` in some simple cases. Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=f87a9db8322643ccbc324e317a75b55903129b55&to=9397e712f6010be15ccf62f12740e9b4a67de2f4&stat=instructions%3Au	2025-01-06 14:37:04 +08:00
Andreas Jonson	855bc46bc8	[InstCombine] Fold trunc nuw/nsw X to i1 -> true IFF X != 0 (#119131 ) proof https://alive2.llvm.org/ce/z/prpPex	2024-12-08 22:28:16 +01:00
John Brawn	99dc396759	[InstCombine] Make fptrunc combine use intersection of fast math flags (#118808 ) These combines involve swapping the fptrunc with its operand, and using the intersection of fast math flags is the safest option as e.g. if we have (fptrunc (fneg ninf x)) then (fneg ninf (fptrunc x)) will not be correct as if x is a not within the range of the destination type the result of (fptrunc x) will be inf.	2024-12-06 11:40:30 +00:00
Nikita Popov	d09632ba81	[InstCombine] Remove nusw handling in ptrtoint of gep fold (NFCI) (#118804 ) Now that #111144 infers gep nuw, we no longer have to repeat the inference in this fold.	2024-12-05 15:59:35 +01:00
Nikita Popov	4d8eb009d8	[InstCombine] Remove SPF guard for trunc transforms (#117535 ) This shouldn't be necessary anymore now that SPF patterns are canonicalized to intrinsics.	2024-11-25 17:06:58 +01:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Simon Pilgrim	795c24c6fb	[InstCombine] foldVecExtTruncToExtElt - extend to handle trunc(lshr(extractelement(x,c1),c2)) -> extractelement(bitcast(x),c3) patterns. (#109689 ) This patch moves the existing trunc+extractlement -> extractelement+bitcast fold into a foldVecExtTruncToExtElt helper and extends the helper to handle trunc+lshr+extractelement cases as well. Fixes #107404	2024-09-28 17:52:10 +01:00
Alex MacLean	790f2eb16a	[InstCombine] Avoid simplifying bitcast of undef to a zeroinitializer vector (#108872 ) In some cases, if an undef value is the product of another instcombine simplification, a bitcast of undef is simplified to a zeroinitializer vector instead of undef.	2024-09-17 15:31:28 -07:00
Marius Kamp	170a21e7f0	[InstCombine] Extend Fold of Zero-extended Bit Test (#102100 ) Previously, (zext (icmp ne (and X, (1 << ShAmt)), 0)) has only been folded if the bit width of X and the result were equal. Use a trunc or zext instruction to also support other bit widths. This is a follow-up to commit 533190acdb9d2ed774f96a998b5c03be3df4f857, which introduced a regression: (zext (icmp ne (and (lshr X ShAmt) 1) 0)) is not folded any longer to (zext/trunc (and (lshr X ShAmt) 1)) since the commit introduced the fold of (icmp ne (and (lshr X ShAmt) 1) 0) to (icmp ne (and X (1 << ShAmt)) 0). The change introduced by this commit restores this fold. Alive proof: https://alive2.llvm.org/ce/z/MFkNXs Relates to issue #86813 and pull request #101838.	2024-08-21 20:09:02 +08:00
Yingwei Zheng	b455edbc45	[InstCombine] Recognize copysign idioms (#101324 ) This patch folds `(bitcast (or (and (bitcast X to int), signmask), nneg Y) to fp)` into `copysign((bitcast Y to fp), X)`. I found this pattern exists in some graphics applications/math libraries. Alive2: https://alive2.llvm.org/ce/z/ggQZV2	2024-08-01 00:14:29 +08:00
Simon Pilgrim	abacc5220e	Fix unused variable warning. NFC.	2024-07-25 16:56:00 +01:00
James Y Knight	dfeb3991fb	Remove the `x86_mmx` IR type. (#98505 ) It is now translated to `<1 x i64>`, which allows the removal of a bunch of special casing. This _incompatibly_ changes the ABI of any LLVM IR function with `x86_mmx` arguments or returns: instead of passing in mmx registers, they will now be passed via integer registers. However, the real-world incompatibility caused by this is expected to be minimal, because Clang never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>` or `double`, depending on ABI. This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type. That type simply no longer corresponds to an IR type, and is used only by MMX intrinsics and inline-asm operands. Because SelectionDAGBuilder only knows how to generate the operands/results of intrinsics based on the IR type, it thus now generates the intrinsics with the type MVT::v1i64, instead of MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus have the X86 backend fix them up in DAGCombine. (This may be a short-lived hack, if all the MMX intrinsics can be removed in upcoming changes.) Works towards issue #98272.	2024-07-25 09:19:22 -04:00
Nikita Popov	d873630fda	Revert "[InstCombine] Generalize ptrtoint(gep) fold (NFC)" This reverts commit c45f939e34dafaf0f57fd1d93df7df5cc89f1dec. This refactoring turned out to not be useful for the case I had originally in mind, so revert it for now.	2024-07-12 16:54:24 +02:00
Nikita Popov	c45f939e34	[InstCombine] Generalize ptrtoint(gep) fold (NFC) We're currently handling a special case of ptrtoint gep -> add ptrtoint. Reframe the code to make it easier to add more patterns for this transform.	2024-07-12 16:34:26 +02:00
Nikita Popov	4502ea89b9	[InstCombine] More precise nuw preservation in ptrtoint of gep fold We can transfer a nuw flag from the gep to the add. Additionally, the inbounds + nneg case can be relaxed to nusw + nneg. Finally, don't forget to pass the correct context instruction to SimplifyQuery.	2024-07-11 16:55:16 +02:00
Nikita Popov	440af98a04	[InstCombine] Avoid use of ConstantExpr::getShl() Use IRBuilder instead. Also use ImmConstant to guarantee that this will fold.	2024-06-18 16:27:17 +02:00
Nikita Popov	534f8569a3	[InstCombine] Don't preserve context across div We can't preserve the context across a non-speculatable instruction, as this might introduce a trap. Alternatively, we could also insert all the replacement instruction at the use-site, but that would be a more intrusive change for the sake of this edge case. Fixes https://github.com/llvm/llvm-project/issues/95547.	2024-06-17 15:38:59 +02:00
Monad	6bf1601a0d	[InstCombine] Fold pointer adding in integer to arithmetic add (#91596 ) Fold ``` llvm define i32 @src(i32 %x, i32 %y) { %base = inttoptr i32 %x to ptr %ptr = getelementptr inbounds i8, ptr %base, i32 %y %r = ptrtoint ptr %ptr to i32 ret i32 %r } ``` where both `%base` and `%ptr` have only one use, to ``` llvm define i32 @tgt(i32 %x, i32 %y) { %r = add i32 %x, %y ret i32 %r } ``` The `add` can be `nuw` if the GEP is `inbounds` and the offset is non-negative. The relevant Alive2 proof is https://alive2.llvm.org/ce/z/nP3RWy. ### Motivation It seems unnecessary to convert `int` to `ptr` just to get its offset. In most cases, they generates the same assembly, but sometimes it may miss some optimizations since the analysis of `GEP` is not as perfect as that of arithmetic operation. One example is `e3c822bf41/bench/protobuf/optimized/generated_message_reflection.cc.ll (L39860-L39873)` ``` llvm %conv.i188 = zext i32 %145 to i64 %add.i189 = add i64 %conv.i188, %125 %146 = load i16, ptr %num_aux_entries10.i, align 2 %conv2.i191 = zext i16 %146 to i64 %mul.i192 = shl nuw nsw i64 %conv2.i191, 3 %add3.i193 = add i64 %add.i189, %mul.i192 %147 = inttoptr i64 %add3.i193 to ptr %sub.ptr.lhs.cast.i195 = ptrtoint ptr %144 to i64 %sub.ptr.rhs.cast.i196 = ptrtoint ptr %143 to i64 %sub.ptr.sub.i197 = sub i64 %sub.ptr.lhs.cast.i195, %sub.ptr.rhs.cast.i196 %add.ptr = getelementptr inbounds i8, ptr %147, i64 %sub.ptr.sub.i197 %sub.ptr.lhs.cast = ptrtoint ptr %add.ptr to i64 %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %125 ``` where `%conv.i188` first adds `%125` and then subtracts `%125` (the result is `%sub.ptr.sub`), which can be optimized.	2024-05-20 12:20:47 +08:00
Monad	34c89eff64	[InstCombine] Fold `trunc nuw/nsw (x xor y) to i1` to `x != y` (#90408 ) Fold: ``` llvm define i1 @src(i8 %x, i8 %y) { %xor = xor i8 %x, %y %r = trunc nuw/nsw i8 %xor to i1 ret i1 %r } define i1 @tgt(i8 %x, i8 %y) { %r = icmp ne i8 %x, %y ret i1 %r } ``` Proof: https://alive2.llvm.org/ce/z/dcuHmn	2024-04-30 18:07:40 +08:00
Nikita Popov	873889b7fa	[InstCombine] Extract logic for "emit offset and rewrite gep" (NFC)	2024-04-25 14:18:11 +09:00
Nikita Popov	1baa385065	[IR][PatternMatch] Only accept poison in getSplatValue() (#89159 ) In #88217 a large set of matchers was changed to only accept poison values in splats, but not undef values. This is because we now use poison for non-demanded vector elements, and allowing undef can cause correctness issues. This patch covers the remaining matchers by changing the AllowUndef parameter of getSplatValue() to AllowPoison instead. We also carry out corresponding renames in matchers. As a followup, we may want to change the default for things like m_APInt to m_APIntAllowPoison (as this is much less risky when only allowing poison), but this change doesn't do that. There is one caveat here: We have a single place (X86FixupVectorConstants) which does require handling of vector splats with undefs. This is because this works on backend constant pool entries, which currently still use undef instead of poison for non-demanded elements (because SDAG as a whole does not have an explicit poison representation). As it's just the single use, I've open-coded a getSplatValueAllowUndef() helper there, to discourage use in any other places.	2024-04-18 15:44:12 +09:00
Noah Goldstein	da04e4afd3	[InstCombine] Use `auto *` instead of `auto` in `visitSIToFP`; NFC	2024-04-17 12:36:25 -05:00
Noah Goldstein	b6bd41db31	[InstCombine] Add canonicalization of `sitofp` -> `uitofp nneg` This is essentially the same as #82404 but has the `nneg` flag which allows the backend to reliably undo the transform. Closes #88299	2024-04-16 15:26:25 -05:00
Yingwei Zheng	b109477615	[InstCombine] Infer nsw/nuw for trunc (#87910 ) This patch adds support for inferring trunc's nsw/nuw flags.	2024-04-11 19:10:53 +08:00
Monad	56b3222b79	[InstCombine] Remove the canonicalization of `trunc` to `i1` (#84628 ) Remove the canonicalization of `trunc` to `i1` according to the suggestion of https://github.com/llvm/llvm-project/pull/83829#issuecomment-1986801166 `a84e66a92d/llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp (L737-L745)` Alive2: https://alive2.llvm.org/ce/z/cacYVA	2024-03-29 21:47:35 +08:00
Noah Goldstein	6960ace534	Revert "[InstCombine] Canonicalize `(sitofp x)` -> `(uitofp x)` if `x >= 0`" This reverts commit d80d5b923c6f611590a12543bdb33e0c16044d44. It wasn't a particularly important transform to begin with and caused some codegen regressions on targets that prefer `sitofp` so dropping. Might re-visit along with adding `nneg` flag to `uitofp` so its easily reversable for the backend.	2024-03-20 00:50:45 -05:00
Yingwei Zheng	a747e86caa	[InstCombine] Fold `fpto{s\|u}i non-norm` to zero (#85569 ) This patch enables more optimization after canonicalizing `fmul X, 0.0` into a copysign. I decide to implement this fold in InstCombine because `computeKnownFPClass` may be expensive. Alive2: https://alive2.llvm.org/ce/z/ASM8tQ	2024-03-19 17:16:48 +08:00
Noah Goldstein	d80d5b923c	[InstCombine] Canonicalize `(sitofp x)` -> `(uitofp x)` if `x >= 0` Just a standard canonicalization. Proofs: https://alive2.llvm.org/ce/z/9W4VFm Closes #82404	2024-03-13 18:26:21 -05:00
Yingwei Zheng	c18e1215c4	[InstCombine] Simplify `zext nneg i1 X` to zero (#85043 ) Alive2: https://alive2.llvm.org/ce/z/Wm6kCk	2024-03-13 20:15:29 +08:00
Quentin Dian	e96c0c1d5e	[InstCombine] Fix shift calculation in InstCombineCasts (#84027 ) Fixes #84025.	2024-03-06 06:16:28 +08:00
Benjamin Kramer	d3f6dd6585	[InstCombine] Pick bfloat over half when shrinking ops that started with an fpext from bfloat (#82493 ) This fixes the case where we would shrink an frem to half and then bitcast to bfloat, producing invalid results. The transformation was written under the assumption that there is only one type with a given bit width. Also add a strategic assert to CastInst::CreateFPCast to turn this miscompilation into a crash.	2024-02-22 15:25:17 +01:00
Yingwei Zheng	f37d81f8a3	[PatternMatch] Add a matching helper `m_ElementWiseBitCast`. NFC. (#80764 ) This patch introduces a matching helper `m_ElementWiseBitCast`, which is used for matching element-wise int <-> fp casts. The motivation of this patch is to avoid duplicating checks in https://github.com/llvm/llvm-project/pull/80740 and https://github.com/llvm/llvm-project/pull/80414.	2024-02-07 21:02:13 +08:00
Alexey Bataev	5a667bee9c	[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636 ) Tries to remove extra trunc/ext instruction for shufflevector instructions. Differential Review: https://github.com/llvm/llvm-project/pull/78636	2024-01-22 05:50:20 -08:00
Pranav Kant	4482fd846a	Revert "[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636 )" This reverts commit 4d11f04b20f0bd7488e19e8f178ba028412fa519. This breaks some programs as mentioned in #78636	2024-01-19 21:02:20 +00:00
Alexey Bataev	4d11f04b20	[InstCombine] Try to fold trunc(shuffle(zext)) to just a shuffle (#78636 ) Tries to remove extra trunc/ext instruction for shufflevector instructions.	2024-01-19 09:29:01 -05:00
Nikita Popov	4b3ea337ad	[ValueTracking] Convert isKnownNonNegative() to use SimplifyQuery (NFC)	2023-11-29 10:52:52 +01:00
Nikita Popov	ac75171d41	[InstCombine] Fix incorrect nneg inference on shift amount Whether this is valid depends on the bit widths of the involved integers. Fixes https://github.com/llvm/llvm-project/issues/72927.	2023-11-21 15:47:55 +01:00
Yingwei Zheng	44cdbef715	[InstCombine] Infer nneg flag from shift users (#71947 ) This patch sets `nneg` flag when the zext is only used by a shift. Alive2: https://alive2.llvm.org/ce/z/h3xKjP Compile-time impact: https://llvm-compile-time-tracker.com/compare.php?from=bd611264993f64decbce178d460caf1d1cb05f59&to=26bc473b239010bb24ff1bc39d58b42ecbbc4730&stat=instructions:u This is an alternative to #71906.	2023-11-13 21:05:05 +08:00
Nikita Popov	8391f405cb	[InstCombine] Avoid uses of ConstantExpr::getLShr() Use the constant folding API instead.	2023-11-10 15:50:42 +01:00
Nikita Popov	5918f62301	[InstCombine] Infer zext nneg flag (#71534 ) Use KnownBits to infer the nneg flag on zext instructions. Currently we only set nneg when converting sext -> zext, but don't set it when we have a zext in the first place. If we want to use it in optimizations, we should make sure the flag inference is consistent.	2023-11-08 09:34:40 +01:00
Antonio Frighetto	7d39838948	[InstCombine] Favour `CreateZExtOrTrunc` in `narrowFunnelShift` (NFC) Use `CreateZExtOrTrunc`, reduce test and regenerate checks.	2023-11-07 22:48:14 +01:00

1 2 3 4 5 ...

558 Commits