Currently hasEarlyExit returns true if there are multiple exit blocks.
ExitBlocks contains the wrapped original IR exit blocks. Without
checking the predecessors, we incorrectly return true for loops with
multiple countable exits that have been vectorized by requiring a
scalar epilogue. In that case, the exit blocks get disconnected.
Fix this by filtering out disconnected exit blocks.
Currently this should only impact the 'early-exit vectorized' statistic.
PR: https://github.com/llvm/llvm-project/pull/151718
There is a larger problem here in that we should not be performing
arbitrary pointer replacements for assumes. This is handled for
branches, but assume goes through a different code path.
Fixes https://github.com/llvm/llvm-project/issues/151785.
Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions.
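A minimal IR-level sketch of the idea (widths and values illustrative): when a wide load only feeds a broadcast of lane 0, the unused lanes need not be loaded.
```
; before: full 4-lane load, but only lane 0 is rebroadcast
%v = load <4 x i32>, ptr %p, align 16
%b = shufflevector <4 x i32> %v, <4 x i32> poison, <4 x i32> zeroinitializer
; after: load only the lane that is actually used
%s  = load i32, ptr %p, align 16
%v1 = insertelement <4 x i32> poison, i32 %s, i64 0
%b1 = shufflevector <4 x i32> %v1, <4 x i32> poison, <4 x i32> zeroinitializer
```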
---------
Co-authored-by: Leon Clark <leoclark@amd.com>
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
This slightly relaxes the invariant established in #149310 by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.
It's worth noting that this does not require any conservative
assumptions: lifetimes with poison arguments can simply be skipped.
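A minimal sketch of the pattern now accepted (assuming the two-argument intrinsic form):
```
; after the alloca is removed and its uses are RAUW'd with poison,
; these markers are simply skipped rather than treated conservatively
call void @llvm.lifetime.start.p0(i64 8, ptr poison)
call void @llvm.lifetime.end.p0(i64 8, ptr poison)
```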
Fixes https://github.com/llvm/llvm-project/issues/151119.
There is a case where branch profile metadata is OK to miss: cold functions. The goal of the RFC (see the referenced issue) is to avoid accidental omission (and, at a later date, corruption) of profile metadata. However, asking cold functions to have all their conditional branches marked with "0" probabilities would be overdoing it. We can just ask cold functions to have an explicit 0 entry count.
This patch:
- injects an entry count for functions, unless they have one (synthetic or not)
- if the entry count is 0, doesn't inject, nor does it verify the rest of the metadata
- at verification, if the entry count is missing, it reports an error
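As a sketch, a cold function then looks like this: an explicit zero entry count, and no branch_weights required on its conditional branches.
```
define void @cold_fn(i1 %c) !prof !0 {
entry:
  br i1 %c, label %a, label %b   ; no !prof needed here
a:
  ret void
b:
  ret void
}
!0 = !{!"function_entry_count", i64 0}
```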
Issue #147390
Explicitly compute the backedge-taken count using VPInstruction. This is
needed to model the full skeleton in VPlan.
NFC modulo some instruction re-ordering.
My understanding is that gep [n x i8] and gep i8 can be treated
equivalently - the array type conveys no extra information and could be
removed. This goes through foldCmpLoadFromIndexedGlobal and tries to
make it work for non-array gep types, so long as the index type still
matches the array being loaded.
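For example, the two forms below compute the same address, so the fold should handle both:
```
; array-typed gep over a global
%p1 = getelementptr inbounds [64 x i8], ptr @glob, i64 0, i64 %idx
; equivalent non-array form
%p2 = getelementptr inbounds i8, ptr @glob, i64 %idx
```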
The operands of the replicate recipe may have been narrowed, resulting
in a narrower result type. Update the type of the cloned instruction to
the correct type.
Fixes https://github.com/llvm/llvm-project/issues/151392.
Update isConditionTrueViaVFAndUF to use the vector trip count if
computable. This is the case when it has been materialized to a
constant. Otherwise fall back to the trip count.
PR: https://github.com/llvm/llvm-project/pull/151034
Make sure to check that the vector trip count is contained in the list
of incoming values, so it can serve as a tie-breaker for phis with
all-zero incoming values.
Fixes https://github.com/llvm/llvm-project/issues/151686.
We weren't performing node merging on newly created nodes in some cases.
Use a simple iteration over the node and its callers until no more
opportunities are found. I confirmed that for several large codes the
max iterations is 3 (meaning we only needed to do any work on the first
2, as expected). This can potentially be made more elegant in the
future, but it is a simple and effective solution.
Also fix a bug, exposed by the test case, in getting the function for a
call instruction in the FullLTO handling, by using an existing method to
look through aliases if needed.
We iterate over InstsToScalarize when printing costs, and currently the
iteration order is not deterministic. No tests yet check the output with
multiple instructions in InstsToScalarize, but such tests will come
soon.
Extracting any element from a subvector starting at index 0 is
equivalent to extracting from the original vector, i.e.
extract_elt(vector_extract(x, 0), y) -> extract_elt(x, y)
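Illustrated with the corresponding IR intrinsic (a sketch on fixed-width types):
```
%sub = call <4 x i32> @llvm.vector.extract.v4i32.v8i32(<8 x i32> %x, i64 0)
%e   = extractelement <4 x i32> %sub, i64 %y
; folds to
%e2  = extractelement <8 x i32> %x, i64 %y
```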
LICM tries to reassociate GEPs in order to hoist an invariant GEP.
Currently, it also does this in the case where the GEP has a constant
offset.
This is usually undesirable. From a back-end perspective, constant GEPs
are usually free because they can be folded into addressing modes, so
this just increases register pressure. From a middle-end perspective,
keeping constant offsets last in the chain makes it easier to analyze
the relationship between multiple GEPs on the same base, especially
after CSE.
The worst that can happen here is if we start with something like
```
loop {
  p + 4*x
  p + 4*x + 1
  p + 4*x + 2
  p + 4*x + 3
}
```
And LICM converts it into:
```
p.1 = p + 1
p.2 = p + 2
p.3 = p + 3
loop {
  p + 4*x
  p.1 + 4*x
  p.2 + 4*x
  p.3 + 4*x
}
```
Which is much worse than leaving it for CSE to convert to:
```
loop {
  p2 = p + 4*x
  p2 + 1
  p2 + 2
  p2 + 3
}
```
I don't think there is any benefit to lowering to ptrtoint + arithmetic
+ inttoptr over the newer ptradd lowering. Even if a target does not use
codegen AA, it probably still has IR passes that benefit from correct
representation.
As far as I can tell, no targets actually use this configuration anymore
(they either don't use the LowerGEP option, or they use UseAA and thus
get the ptradd lowering).
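For reference, the two lowerings of `%gep = getelementptr i32, ptr %p, i64 %i` look roughly like this (a sketch):
```
; ptrtoint + arithmetic + inttoptr: hides the pointer chain from IR passes
%pi   = ptrtoint ptr %p to i64
%off  = shl i64 %i, 2
%sum  = add i64 %pi, %off
%gep1 = inttoptr i64 %sum to ptr
; ptradd-style: a plain byte-offset gep keeps provenance visible
%off2 = shl i64 %i, 2
%gep2 = getelementptr i8, ptr %p, i64 %off2
```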
std::make_optional<T> is a lot like std::make_unique<T> in that it
performs perfect forwarding of arguments to T's constructor. As a
result, we don't have to spell out the type name twice.
When interleaved stores contain gaps, a mask is required to skip the
gaps, regardless of whether scalar epilogues are allowed.
This patch corrects the condition under which a gap mask is needed,
ensuring consistency between the legacy and VPlan-based cost models and
avoiding assertion failures.
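For example (an illustrative sketch), with interleave factor 4, VF 2, and member 3 of the group missing, the wide store must be masked so the gap lanes are not written:
```
; lanes 3 and 7 correspond to the missing member and are masked off
call void @llvm.masked.store.v8i32.p0(<8 x i32> %wide, ptr %base, i32 4,
    <8 x i1> <i1 1, i1 1, i1 1, i1 0, i1 1, i1 1, i1 1, i1 0>)
```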
Related #149981
This implements the first half of #151459 by changing the AVL so it's
no longer computed as `trip-count - EVL-based IV`, but is instead a
separate scalar phi that is decremented by the EVL each iteration.
This shortens the dependency chain for computing the AVL and should
eventually allow us to convert the branch condition to `branch-count
avl-next, 0`.
`simplifyBranchConditionForVFAndUF` had to be updated to prevent a
regression because this introduces a VPPhi in the header block.
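Sketched at the IR level (names and VF illustrative):
```
; before: AVL recomputed from the trip count each iteration
%avl = sub i64 %tc, %evl.based.iv

; after: a separate scalar phi, decremented by the EVL
loop:
  %avl      = phi i64 [ %tc, %preheader ], [ %avl.next, %loop ]
  %evl      = call i32 @llvm.experimental.get.vector.length.i64(i64 %avl, i32 4, i1 true)
  %evl.ext  = zext i32 %evl to i64
  %avl.next = sub i64 %avl, %evl.ext
```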
hwasan-globals does not instrument globals with custom sections, because
existing code may use `__start_`/`__stop_` symbols to iterate over
globals in a way that would cause hwasan assertions.
Introduce a new hwasan-all-globals option, which instruments all
user-defined globals (but not the globals generated by the hwasan
instrumentation itself), including those with custom sections.
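For example, a global like the following was previously skipped and is now instrumented under the new option:
```
; skipped by hwasan-globals, instrumented by hwasan-all-globals
@counter = global i64 0, section "my_counters"
```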
Fixes #142442
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.
An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
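As a sketch (assuming an i32 payload on the metadata node), a loop whose latch weights imply roughly eight iterations would carry:
```
loop:
  ...
  br i1 %cont, label %loop, label %exit, !prof !0, !llvm.loop !1

!0 = !{!"branch_weights", i32 7, i32 1}
!1 = distinct !{!1, !2}
!2 = !{!"llvm.loop.estimated_trip_count", i32 8}
```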
Similar to #150639, this fixes the AggressiveInstCombine fold that
converts tables to cttz instructions when the gep types are not array
types, i.e. `gep i16 @glob, i64 %idx` instead of `gep [64 x i16] @glob, i64 0, i64 %idx`.
Noticed this when checking the invariant that all phis in the header
block must be header phis. I think there's a missing set of parentheses
here, since otherwise it only performs the cast<VPInstruction> when
RecipeI isn't a VPInstruction.
Extend jump-threading to allow local defs that are live outside of the
threaded block. Allow threading to destinations where the local defs are
not live.
---------
Signed-off-by: John Lu <John.Lu@amd.com>
https://github.com/llvm/llvm-project/pull/147026 will enable sub
reductions, which require that the phi value is the first operand since
they aren't commutative. This re-orders the operands when executing
reductions, which actually matches other existing code in
VPReductionRecipe::execute.
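For example, with an in-loop `sub` reduction the accumulator must stay on the left:
```
; sub is not commutative: the reduction phi must be the first operand
%red.next = sub i32 %red.phi, %x   ; correct
%wrong    = sub i32 %x, %red.phi   ; not equivalent
```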
Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured access into types, which
is impossible with an untyped GEP instruction unless we add more info to
the IR. Finding a solution is a work in progress, but in the meantime,
we'd like to reduce the number of failures.
Preventing this optimization from rewriting extract/insert instructions
into a GEP helps us lower more code to SPIR-V. This change should be OK,
as it's only active when targeting SPIR-V and only disables a
non-recommended transformation.
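A sketch of the rewrite that is now disabled when targeting SPIR-V:
```
; original: structured vector access the SPIR-V backend can handle
%v = load <4 x float>, ptr %p
%e = extractelement <4 x float> %v, i64 %i
; rewritten: an untyped byte-offset access that loses the structure
%q  = getelementptr float, ptr %p, i64 %i
%e2 = load float, ptr %q
```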
Related to #145002