llvm-project

Author	SHA1	Message	Date
Alexey Bataev	78490acb32	[SLP]Support for zext i1 %x modeling as select %x, 1, 0 Model zext i1 %x to in as select i1 %x, in 1, in 0 in case, if there are other select instructions, which can be combined into a bundle. Fixes #178403 Recommit after revert in 993e1f66afcfe9da03bd813e669eada341b11d2f Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/180635	2026-02-10 12:54:12 -08:00
Luke Lau	f81889da29	[VPlan] Fix convertToPhisToBlends folding non poison blend to poison (#180686 ) This fixes a miscompile in #180005 where we didn't check that the first incoming value isn't poison. We should use the first non-poison incoming value if it exists, or just poison if all the incoming values are poison.	2026-02-10 16:15:57 +00:00
Jonas Paulsson	d80a729572	[LoopVectorizer] Rename variable (NFC). (#180585 ) Since TargetTransformInfo::enableAggressiveInterleaving(bool HasReductions) takes the HasReductions argument, the LoopVectorizer should save its returned value in a variable called AggressivelyInterleave instead of AggressivelyInterleaveReductions.	2026-02-10 10:11:43 -06:00
Andrei Elovikov	f96c1ccc1e	[VPlan] Add `-vplan-print-after=` option (#178700 ) UpdateTestChecks support is updated in subsequent https://github.com/llvm/llvm-project/pull/178736.	2026-02-10 16:07:25 +00:00
Alexey Bataev	993e1f66af	Revert "[SLP]Support for zext i1 %x modeling as select %x, 1, 0" This reverts commit 70aebae2a13114f4e3d5e2460c052d8f3de295be to fix buildbots https://lab.llvm.org/buildbot/#/builders/85/builds/18614	2026-02-10 06:49:53 -08:00
Manasij Mukherjee	0fdf9b9676	[ConstraintElim] Infer linear constraints from udiv and urem (#180689 ) urem x, n: result < n (remainder is always less than divisor) urem x, n: result <= x (remainder is at most the dividend) udiv x, n: result <= x (quotient is at most the dividend) https://alive2.llvm.org/ce/z/ezzsjQ	2026-02-10 22:25:13 +08:00
Alexey Bataev	70aebae2a1	[SLP]Support for zext i1 %x modeling as select %x, 1, 0 Model zext i1 %x to in as select i1 %x, in 1, in 0 in case, if there are other select instructions, which can be combined into a bundle. Fixes #178403 Reviewers: hiraditya, RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/180635	2026-02-10 08:59:44 -05:00
Sander de Smalen	3157758190	[LV] Handle partial sub-reductions with sub in middle block. (#178919 ) Sub-reductions can be implemented in two ways: (1) negate the operand in the vector loop (the default way). (2) subtract the reduced value from the init value in the middle block. Note that both ways keep the reduction itself as an 'add' reduction, which is necessary because only llvm.vector.partial.reduce.add exists. The ISD nodes for partial reductions don't support folding the sub/negation into its operands because the following is not a valid transformation: ``` sub(0, mul(ext(a), ext(b))) -> mul(ext(a), ext(sub(0, b))) ``` It can therefore be better to choose option (2) such that the partial reduction is always positive (starting at '0') and to do a final subtract in the middle block. For AArch64 there are no dot-product instructions that can do a `partial.reduce.sub(acc, mul(ext(a), ext(b)))` operation. I'm not sure if such instructions exist for other targets. (If so then we may want to make this decision a target option) This PR also increases the AArch64 cost of a partial sub-reduction when this exists in an 'add-sub' reduction chain. Fixes https://github.com/llvm/llvm-project/issues/178703	2026-02-10 11:00:32 +00:00
Felipe de Azevedo Piovezan	41aed214a0	[CoroSplit][DebugInfo] Fix scope of continuation funclets (#180523 ) The heuristic for deciding which scope line to use for a continuation funclet relies on iterating on the instructions of the first BB of the continuation. Often, this contains a single unconditional branch, which is skipped by the heuristic. However, in coro-retcon, two such "jump-only" BBs are generated. This patch amends the heuristic to account for that.	2026-02-10 10:45:02 +00:00
Benjamin Maxwell	f22a178b13	Reland "[LV] Support conditional scalar assignments of masked operations" (#180708 ) This patch extends the support added in #158088 to loops where the assignment is non-speculatable (e.g. a conditional load or divide). For example, the following loop can now be vectorized: ``` int simple_csa_int_load( int* a, int* b, int default_val, int N, int threshold) { int result = default_val; for (int i = 0; i < N; ++i) if (a[i] > threshold) result = b[i]; return result; } ``` It does this by extending the recurrence matching from only looking for selects, to include phis where all operands are the header phi, except for one which can be an arbitrary value outside the recurrence. --- Reverts llvm/llvm-project#180275 (original PR: #178862) Additional type legalization for `ISD::VECTOR_FIND_LAST_ACTIVE` was added in #180290, which should resolve the backend crashes on x86.	2026-02-10 09:57:48 +00:00
Matt Arsenault	302ff8fd00	InstCombine: Use SimplifyDemandedFPClass on fmul (#177490 ) Start trying to use SimplifyDemandedFPClass on instructions, starting with fmul. This subsumes the old transform on multiply of 0. The main change is the introduction of nnan/ninf. I do not think anywhere was systematically trying to introduce fast math flags before, though a few odd transforms would set them. Previously we only called SimplifyDemandedFPClass on function returns with nofpclass annotations. Start following the pattern of SimplifyDemandedBits, where this will be called from relevant root instructions. I was wondering if this should go into InstCombineAggressive, but that apparently does not make use of InstCombineInternal's worklist.	2026-02-10 09:49:31 +00:00
Nikita Popov	59a8bd0a74	[SimplifyLibCalls] Directly canonicalize fminimum_num to intrinsic (#180555 ) Same as https://github.com/llvm/llvm-project/pull/177988, but for fminimum_num/fmaximum_num. Directly canonicalize these to the corresponding intrinsics, and let the shrinking happen directly on the intrinsics.	2026-02-10 10:07:14 +01:00
Mel Chen	7e5d9189d2	[VPlan] Simplify true && x -> x (#179426 )	2026-02-10 08:49:03 +00:00
Peter Collingbourne	1de721c414	LowerTypeTests: Optimize two-phase check used by llvm.cond.loop. When a type test has two phases and is used by llvm.cond.loop to implement a conditional trap, it is more efficient for two infinite loops to be generated. Arrange for this by having the pass detect the typical IR pattern used for conditional CFI traps and generate the second llvm.cond.loop if found. Part of this RFC: https://discourse.llvm.org/t/rfc-optimizing-conditional-traps/89456 Reviewers: fmayer, vitalybuka Reviewed By: vitalybuka Pull Request: https://github.com/llvm/llvm-project/pull/177687	2026-02-09 18:13:57 -08:00
Andrew Lazarev	cfbb9a66ae	Revert "[msan] Switch switch() from strict handling to (icmp eq)-style handling" (#180636 ) Reverts llvm/llvm-project#179851 Breaks https://lab.llvm.org/buildbot/#/builders/164/builds/18551 and https://lab.llvm.org/buildbot/#/builders/94/builds/15188	2026-02-09 19:23:52 -05:00
Florian Hahn	d1ec04dfd4	[VPlan] Simplify single-entry VPWidenPHIRecipe. Include VPWidenPHIRecipe in phi simplification if there's a single incoming value.	2026-02-09 22:10:13 +00:00
Rahul Joshi	2e34fecf02	[NFC][LLVM][IPO] Remove pass initialization from pass constructors (#180584 )	2026-02-09 12:46:33 -08:00
Justin Fargnoli	754fc78d71	[ForceFunctionAttrs] Fix handling of `alwaysinline` and `noinline` attributes. (#180026 ) Address https://github.com/llvm/llvm-project/pull/152365#issuecomment-3198921342	2026-02-09 11:49:36 -08:00
Rahul Joshi	9ab3e3312d	Revert "[NFC][LLVM][IPO] Remove pass initialization from pass constructors" (#180571 ) Reverts llvm/llvm-project#180154 It seems to cause llc build failures, likely due to missing `ipo` dependency.	2026-02-09 10:06:10 -08:00
Rahul Joshi	d62bc3ae0e	[NFC][LLVM][IPO] Remove pass initialization from pass constructors (#180154 )	2026-02-09 08:58:38 -08:00
Nikita Popov	0a740668a4	[InstCombine] Support minimumnum/maximumnum (#180529 ) Support minimumnum/maximumnum intrinsics in various existing minnum/maxnum/minimum/maximum folds. The test coverage has been copied from minnum/maxnum. Proofs: https://alive2.llvm.org/ce/z/YMlLwO Proofs that time out: https://alive2.llvm.org/ce/z/dJN8wj	2026-02-09 15:55:59 +00:00
Vishruth Thimmaiah	84f4b1e52d	Reland "[LoopVectorize] Support vectorization of overflow intrinsics" (#180526 ) Enables support for marking overflow intrinsics `uadd`, `sadd`, `usub`, `ssub`, `umul` and `smul` as trivially vectorizable. Fixes #174617 --- This patch is a reland of #174835. Reverts #179819	2026-02-09 15:32:04 +00:00
Kiva	03ab85cb87	[InstCombine] fold `gepi _, (srem x, y)` to `gepi _, (urem x, y)` if `y` is power-of-2 (#180148 ) This PR adds a small, targeted InstCombine fold for the pattern: ``` %idx = srem i64 %x, 2^k %p = getelementptr inbounds nuw i8, ptr %base, i64 %idx ``` When the GEP is inbounds + nuw, and the divisor is a non-zero power-of-two constant, the signed remainder cannot produce a negative offset without violating the inbounds/nuw constraints. In that case we can canonicalize the index to a non-negative form and expose the common power-of-two rewrite: - Rewrite the GEP index from `srem %x, 2^k` to `urem %x, 2^k` - Create a new GEP with the new index and replace the original GEP - The `urem %x, 2^k` will further fold to `and %x, (2^k-1)`, resulting in the following pattern ``` %idx = and i64 %x, (2^k-1) %p = getelementptr inbounds nuw i8, ptr %base, i64 %idx ``` Fixes #180097. Generalized alive2 proof: https://alive2.llvm.org/ce/z/8EBxug	2026-02-09 22:26:07 +08:00
David Sherwood	44031ae79f	[LV] Fix issue in VPFirstOrderRecurrencePHIRecipe::usesFirstLaneOnly (#179977 ) In some cases we decide to vectorise loops with first-order recurrences using VF=1, IC>1. We then attempt to unroll a vplan in replicateByVF, however when trying to erase the list of values from the parent we trigger the following assert: ``` virtual llvm::VPRecipeValue::~VPRecipeValue(): Assertion `Users.empty() && "trying to delete a VPRecipeValue with remaining users"' failed. ``` The problem seems to stem from this code: ``` DefR->replaceUsesWithIf(LaneDefs[0], [DefR](VPUser &U, unsigned) { return U.usesFirstLaneOnly(DefR); }); ``` since usesFirstLaneOnly returns false and we fail to replace uses of DefR with LaneDefs[0]. Upon inspection the only VPUser objects that return false are VPInstruction::FirstOrderRecurrenceSplice and VPFirstOrderRecurrencePHIRecipe. Since the values are all scalar it's simply not possible for us to be using anything other than the first lane. I've fixed this by bailing out of replicateByVF early for plans with only a scalar VF. Fixes https://github.com/llvm/llvm-project/issues/179671	2026-02-09 13:42:26 +00:00
Florian Hahn	7defb0a4a3	[VPlan] Skip applying InstsToScalarize with forced instr costs. (#168269 ) ForceTargetInstructionCost in the legacy cost model overrides any costs from InstsToScalarize. Match the behavior in the VPlan-based cost model. This fixes a crash with -force-target-instr-cost for the added test case. PR: https://github.com/llvm/llvm-project/pull/168269	2026-02-09 13:20:44 +00:00
Yingwei Zheng	d1b402b612	[InstCombine] Avoid overflow in `foldVecExtTruncToExtElt` (#180414 ) This weird pattern was introduced by LoopVectorize. But it was placed in an unreachable path, so we cannot assert that the indices are always valid in InstCombine. Closes https://github.com/llvm/llvm-project/issues/180233.	2026-02-09 21:09:48 +08:00
Nikolas Klauser	6dbdfd824a	[InstCombine] Drop nonnull assumes if the pointer is already known to be nonnull (#180434 )	2026-02-09 13:13:32 +01:00
Luke Lau	8cd86ff284	[VPlan] Propagate FastMathFlags from phis to blends (#180226 ) If a phi has fast math flags, we can propagate it to the widened select. To do this, this patch makes VPPhi and VPBlendRecipe subclasses of VPRecipeWithIRFlags, and propagates it through PlainCFGBuilder and VPPredicator. Alive2 proofs for some of the FMFs (it looks like it can't reason about the full "fast" set yet) nnan: https://alive2.llvm.org/ce/z/f0bRd4 nsz: https://alive2.llvm.org/ce/z/u9P96T The actual motivation for this to eventually be able to move the special casing for tail folding in LoopVectorizationPlanner::addReductionResultComputation into the CFG in #176143, which requires passing through FMFs.	2026-02-09 19:38:58 +08:00
Nikita Popov	4ef7be9b80	[SimplifyLibCalls] Directly convert fmin/fmax to intrinsics (#177988 ) Drop the custom shrinking code, which we'll also do for intrinsics. Having libcall-only optimizations is confusing, as these are typically directly emitted as intrinsics by the frontend.	2026-02-09 10:10:24 +00:00
Nikita Popov	a654a27fcd	[InstCombine] Fold min/max(fpext x, C) to fpext(min/max(x, fptrunc C)) (#179968 ) Fold `min/max(fpext x, C)` to `fpext(min/max(x, fptrunc C))` in cases where the truncation of the constant is lossless. This helps eliminate fpext/fptrunc pairs around min/max and addresses the regression from https://github.com/llvm/llvm-project/pull/177988. Proof: https://alive2.llvm.org/ce/z/y_Bcdd	2026-02-09 09:13:26 +00:00
Nikita Popov	531430b614	[InstCombine] Relax one-use check for min/max(fpext x, fpext y) to fpext(min/max(x, y)) fold (#180164 ) If only one of the operands is one-use, the total number of fpexts stays the same, but the min/max is performed on a narrowed type. Additionally, the fpext may fold with a following fptrunc.	2026-02-09 09:34:17 +01:00
Nikolas Klauser	6c31bf0474	[PredicateInfo] Fix crash on nonnull assume taking a constant (#180440 )	2026-02-09 09:18:21 +01:00
paperchalice	5c5677d7b8	[llvm] Remove "no-infs-fp-math" attribute support (#180083 ) This was one of the global options in `TargetMachine::resetTargetOptions`. No backend supports it any longer, so remove it.	2026-02-09 08:43:33 +08:00
Florian Hahn	6324ee32c1	[VPlan] Use PredBB's terminator as insert point for VPIRPhi extracts. Use PredBB's terminator as insert point in VPIRPhi::execute to make sure the extracts are placed after any possibly sunk instructions. Fixes https://github.com/llvm/llvm-project/issues/180363.	2026-02-08 20:36:36 +00:00
Florian Hahn	7509cad693	[VPlan] Support masked VPInsts, use for predication (NFC) (#142285 ) Add support for mask operands to most VPInstructions, using getNumOperandsForOpcode. This allows VPlan predication to predicate VPInstructions directly. The mask will then be dropped or handled when creating wide recipes. Depends on https://github.com/llvm/llvm-project/pull/142284. Depends on https://github.com/llvm/llvm-project/pull/168784. PR: https://github.com/llvm/llvm-project/pull/142285	2026-02-08 18:23:36 +00:00
Florian Hahn	3c5b05427d	[VPlan] Pass underlying instr to getMemoryOpCost in ::computeCost. Pass underlying instruction to getMemoryOpCost in VPReplicateRecipe::computeCost if UsedByLoadStoreAddress is true. Some targets use the underlying instruction to improve costs, and this is needed to match the legacy cost model. Fixes https://github.com/llvm/llvm-project/issues/177780. Fixes https://github.com/llvm/llvm-project/issues/177772.	2026-02-08 16:15:39 +00:00
Florian Hahn	3192fe2c7b	[VPlan] Fall back to legacy cost model if PtrSCEV is nullptr. There are some cases when PtrSCEV can be nullptr. Fall back to legacy cost model, to not call isLoopInvariant with nullptr. Fixes a crash after 0c4f8094939d2.	2026-02-08 11:55:12 +00:00
Florian Hahn	0c4f809493	[VPlan] Compute predicated load/store costs in VPlan. (NFC) (#179129 ) Update VPReplicateRecipe::computeCost to compute predicated load/store costs directly, unless the pointer is uniform. In that case, the legacy cost model uses a different logic, which will be migrated separately. PR: https://github.com/llvm/llvm-project/pull/179129	2026-02-07 20:02:54 +00:00
Aiden Grossman	ec059d81aa	[DSE] Handle variable offsets with sized dead_on_return (#180364 ) With a sized dead_on_return, we must not eliminate stores if they are to a pointer with a variable offset from the underlying object marked dead_on_return. This manifested as an assertion failure as BaseValue/V ended up not being equal. It's possible we could do a range analysis to try and prove the variable offset stays within bounds, but this case seems to come up relatively rarely (only reproducible with a UBSan build of LLVM) and is probably not worth the compile time. Fixes #180361.	2026-02-07 11:43:56 -08:00
Hongyu Chen	8b5e95b1fd	[InferAddressSpaces] Initialize op(generic const, generic const, ...) -> generic (#172143 ) Fixes #171890 If the pointer operands of an instruction are all constants with generic AS, we always infer the AS of the instruction as uninitialized finally. And the rewrite process will skip cloning the instruction, producing invalid IR. This patch fixes it by inferring the AS of this kind of instruction as flat. Maybe we can fold the operator with all constants to get better performance, but I think this case is rare in the real world.	2026-02-08 01:20:19 +08:00
Florian Hahn	e8908215de	[LSR] Support SCEVPtrToAddr in SCEVDbgValueBuilder. Allow SCEVPtrToAddr as cast in assertion in SCEVDbgValueBuilder. SCEVPtrToAddr is handled similarly to SCEVPtrToInt. Fixes a crash with debug info after bd40d1de9c9ee, which started to generate ptrtoaddr instead of ptrtoint expressions.	2026-02-07 14:02:45 +00:00
hanbeom	8d2078332c	[InstCombine] Shrink added constant using LHS known zeros (#174380 ) Previously, `SimplifyDemandedUseBits` for `add` instructions only used known zeros from the RHS to simplify the LHS. It failed to handle the symmetric case where the LHS has known zeros and the result does not demand the low bits. This patch implements this missing optimization, allowing the RHS constant to be shrunk when the LHS low bits are known zero and unused. Proof: https://alive2.llvm.org/ce/z/6v9iFY Fixed: https://github.com/llvm/llvm-project/issues/135411	2026-02-07 20:41:58 +09:00
Kewen Meng	703c2762d3	Revert "[LV] Support conditional scalar assignments of masked operations" (#180275 ) Reverts llvm/llvm-project#178862 revert to unblock bot: https://lab.llvm.org/buildbot/#/builders/206/builds/13225	2026-02-06 13:24:40 -08:00
Florian Hahn	bd40d1de9c	Reapply "[SCEVExp] Use SCEVPtrToAddr in tryToReuseLCSSAPhi if possible. (#180257 )" This reverts commit cb905605b2e95f88296afe136b21a7d2476cb058. Recommit the patch with a small change to check the destination type matches the address type, to avoid a crash on mismatch. Original message: This patch updates tryToReuseLCSSAPhi to use SCEVPtrToAddr, unless using SCEVPtrToInt allows re-use, because the IR already contains a re-usable phi using PtrToInt. This is a first step towards migrating to SCEVPtrToAddr and avoids regressions in follow-up changes. PR: https://github.com/llvm/llvm-project/pull/178727	2026-02-06 21:14:41 +00:00
Vladimir Radosavljevic	57d1fbf62c	[InstCombine] Limit (icmp eq/ne (and (add A, Addend), Msk), C) fold to one use of and (#172858 ) If the and has multiple uses, the fold can increase the instruction count.	2026-02-07 03:09:27 +08:00
Florian Hahn	cb905605b2	Revert "[SCEVExp] Use SCEVPtrToAddr in tryToReuseLCSSAPhi if possible." (#180257 ) Reverts llvm/llvm-project#178727 Triggers asserts on some build bots.	2026-02-06 18:26:37 +00:00
Mingming Liu	544caa627b	[StaticDataLayout] Reconcile string literal hotness from data access profiles and PGO profiles. (#178336 ) https://github.com/llvm/llvm-project/pull/178333 updates the memprof pass to annotate string literal section prefix. The StaticDataProfileInfo.cpp provides an analysis pass to reconcile global variable hotness. It's used by StaticDataAnnotator and AsmPrinter to look up global variable hotness. This PR updates the analysis pass to compute the hotness of string literals. * When both data access profiles and pgo counters provide a hotness attribute, use the hotter one. * Otherwise, use the hotness attribute that's available. Implementation-wise, the option `AnnotateStringLiteralSectionPrefix` is moved from MemProf (a transform pass) to StaticDataProfileInfo (an Analysis pass). Otherwise, there might be errors like caught by CI. Note https://github.com/llvm/llvm-project/pull/178336#issuecomment-3808537817 is an edited message, and its history shows the intermediate failures like below. ~My understanding is~ Preliminary LLM study (:)) shows that the error manifests in PowerPC but not X86 due to cmake variable differences. ``` FAILED: unittests/Target/PowerPC/PowerPCTests ... >>> referenced by CommandLine.h:1437 (/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/include/llvm/Support/CommandLine.h:1437) >>> StaticDataProfileInfo.cpp.o:(llvm::StaticDataProfileInfo::getConstantSectionPrefix(llvm::Constant const, llvm::ProfileSummaryInfo const) const) in archive lib/libLLVMAnalysis.a clang++: error: linker command failed with exit code 1 (use -v to see invocation) ```	2026-02-06 09:49:14 -08:00
Thurston Dang	64de25d183	[msan] Handle NEON floating-point absolute compare greater than/equal (#180120 ) Uses existing handleVectorComparePackedIntrinsic()	2026-02-06 09:43:25 -08:00
Florian Hahn	c32cde4182	[SCEVExp] Use SCEVPtrToAddr in tryToReuseLCSSAPhi if possible. (#178727 ) This patch updates tryToReuseLCSSAPhi to use SCEVPtrToAddr, unless using SCEVPtrToInt allows re-use, because the IR already contains a re-usable phi using PtrToInt. This is a first step towards migrating to SCEVPtrToAddr and avoids regressions in follow-up changes. PR: https://github.com/llvm/llvm-project/pull/178727	2026-02-06 17:38:24 +00:00
Luke Lau	d1d9413e7b	[VPlan] Don't use std::not_fn It looks like some Apple based toolchains can't compile this: https://github.com/llvm/llvm-project/pull/180005#issuecomment-3861614477 There aren't any other users of std::not_fn within LLVM so just use a lambda for now.	2026-02-07 01:30:55 +08:00
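
The three inequalities listed in the [ConstraintElim] commit 0fdf9b9676 above (urem result < n, urem result <= x, udiv result <= x) can be sanity-checked by brute force. The following is a minimal sketch only: it is not LLVM code and says nothing about how ConstraintElimination derives the facts. The function name `check_udiv_urem_facts` and the width `BITS` are hypothetical, chosen for illustration, and Python's `//` and `%` on non-negative integers are assumed as a model of `udiv` and `urem`.

```python
# Brute-force sanity check (illustrative only; not LLVM code).
# Models udiv/urem on unsigned integers of a small bit width and checks the
# three inequalities from the commit message for every x and every non-zero n.
BITS = 6  # hypothetical small width so the exhaustive loop stays fast


def check_udiv_urem_facts(bits: int) -> int:
    """Return the number of (x, n) pairs checked.

    Raises AssertionError if any pair violates one of the inequalities.
    """
    checked = 0
    for x in range(1 << bits):
        # n == 0 is skipped: udiv/urem by zero is undefined behavior in LLVM IR.
        for n in range(1, 1 << bits):
            q, r = x // n, x % n
            assert r < n    # urem x, n: remainder is less than the divisor
            assert r <= x   # urem x, n: remainder is at most the dividend
            assert q <= x   # udiv x, n: quotient is at most the dividend
            checked += 1
    return checked


if __name__ == "__main__":
    print(check_udiv_urem_facts(BITS))
```

The alive2 link in the commit message remains the authoritative proof; this loop only illustrates what the three facts say for one small width.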

42368 Commits