40745 Commits

Author SHA1 Message Date
Paul Walker
04f98889ae
[LLVM][NumericalStabilitySanitizer] Add support for vector ConstantFPs. (#151739) 2025-08-04 13:58:32 +01:00
Paul Walker
fb4a8f67b9
[LLVM][InstCombine] foldICmpEquality: Compare APInt values rather than addresses. (#151726) 2025-08-04 13:54:44 +01:00
Nikita Popov
e833bb0991 [Local] Do not pass Root to replaceDominatedUsesWith (NFC)
Capture it in the lambdas instead.
2025-08-04 14:22:17 +02:00
Florian Hahn
66a8341f6d
[VPlan] Skip disconnected exit blocks in hasEarlyExit. (#151718)
Currently hasEarlyExit returns true, if there are multiple exit blocks.
ExitBlocks contains the wrapped original IR exit blocks. Without
checking the predecessors we incorrectly return true for loops with
multiple countable exits, that have been vectorized by requiring a
scalar epilogue. In that case, the exit blocks will get disconnected.

Fix this by filtering out disconnected exit blocks.

Currently this should only impact the 'early-exit vectorized' statistic.

PR: https://github.com/llvm/llvm-project/pull/151718
2025-08-04 11:31:00 +01:00
Nikita Popov
4b5b36e5c4 [GVN] Avoid creating lifetime of non-alloca
There is a larger problem here in that we should not be performing
arbitrary pointer replacements for assumes. This is handled for
branches, but assume goes through a different code path.

Fixes https://github.com/llvm/llvm-project/issues/151785.
2025-08-04 12:06:40 +02:00
Leon Clark
1feed444aa
[VectorCombine] Shrink loads used in shufflevector rebroadcasts (#128938)
Attempt to shrink the size of vector loads where only some of the incoming lanes are used for rebroadcasts in shufflevector instructions.

---------

Co-authored-by: Leon Clark <leoclark@amd.com>
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-04 10:49:27 +01:00
Nikita Popov
86727fe9a1
[IR] Allow poison argument to lifetime markers (#151148)
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.

It's worth noting that this does not require any conservative
assumptions, lifetimes with poison arguments can simply be skipped.

Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-08-04 10:02:04 +02:00
Mircea Trofin
9a60841dc4
[PGO][profcheck] ignore explicitly cold functions (#151778)
There is a case when branch profile metadata is OK to miss, namely, cold functions. The goal of the RFC (see the referenced issue) is to avoid accidental omission (and, at a later date, corruption) of profile metadata. However, asking cold functions to have all their conditional branches marked with "0" probabilities would be overdoing it. We can just ask cold functions to have an explicit 0 entry count.

This patch:
- injects an entry count for functions, unless they have one (synthetic or not)
- if the entry count is 0, doesn't inject, nor does it verify the rest of the metadata
- at verification, if the entry count is missing, it reports an error

Issue #147390
2025-08-04 03:53:49 +02:00
Austin
c7bacc9f26
[llvm] using wrapper llvm::sort(nfc) (#151000)
using wrapper llvm::sort(nfc)
2025-08-04 09:27:01 +08:00
Kazu Hirata
3549134836
[Vectorize] Remove an unnecessary cast (NFC) (#151850)
getNumElements() already returns unsigned.
2025-08-03 08:44:50 -07:00
Kazu Hirata
c068f8b408
[Scalar] Remove an unnecessary cast (NFC) (#151849)
LoadType is already of Type *.
2025-08-03 08:44:43 -07:00
Florian Hahn
559d1dff89
[VPlan] Materialize BackedgeTakenCount using VPInstructions.
Explicitly compute the backedge-taken count using VPInstruction. This is
needed to model the full skeleton in VPlan.

NFC modulo some instruction re-ordering.
2025-08-03 12:21:28 +01:00
Simon Pilgrim
b983ce8145 [VPlan] handleMaxMinNumReductions - fix gcc Wparentheses warning. NFC. 2025-08-03 11:50:31 +01:00
David Green
d9971be83e
[InstCombine] Make foldCmpLoadFromIndexedGlobal more resilient to non-array geps. (#150639)
My understanding is that gep [n x i8] and gep i8 can be treated
equivalently - the array type conveys no extra information and could be
removed. This goes through foldCmpLoadFromIndexedGlobal and tries to
make it work for non-array gep types, so long as the index type still
matches the array being loaded.
2025-08-03 10:19:42 +01:00
Florian Hahn
39c30665e9
[VPlan] Update type of cloned instruction in scalarizeInstruction.
The operands of the replicate recipe may have been narrowed, resulting
in a narrower result type. Update the type of the cloned instruction to
the correct type.

Fixes https://github.com/llvm/llvm-project/issues/151392.
2025-08-02 19:49:59 +01:00
Florian Hahn
08f50e9665
[VPlan] Use vector tripcount if computable when simplifying conds. (#151034)
Update isConditionTrueViaVFAndUF to use the vector trip count if
computable. This is the case when it has been materialized to a
constant. Otherwise fall back to the trip count.

PR: https://github.com/llvm/llvm-project/pull/151034
2025-08-02 16:31:31 +01:00
Ramkumar Ramachandra
af0be76a35
[VPlan] Replace reverse RPOT with PO traversal (NFC) (#151757) 2025-08-02 08:46:27 +01:00
Florian Hahn
eee9755881
[LV] Refine check to find epilogue IV resume value.
Make sure to check that the vector trip count is containedin the list of
incoming values to serve as tie-breaker with phis with all-zero incoming
values.

Fixes https://github.com/llvm/llvm-project/issues/151686.
2025-08-01 20:54:39 +01:00
Teresa Johnson
dc90472532
[MemProf] Ensure node merging happens for newly created nodes (#151593)
We weren't performing node merging on newly created nodes in some cases.
Use a simple iteration over the node and its callers until no more
opportunities are found. I confirmed that for several large codes the
max iterations is 3 (meaning we only needed to do any work on the first
2, as expected). This can potentially be made more elegant in the
future, but it is a simple and effective solution.

Also fix a bug exposed by the test case, getting the function for a call
instruction in the FullLTO handling, using an existing method to look
through aliases if needed.
2025-08-01 12:51:12 -07:00
Florian Hahn
c300a99ea8
[LV] Use MapVector for InstsToScalarize for deterministic iter order (NFC)
We iterate over InstsToScalarize when printing costs, and currently the
iteration order is not deterministic. Currently no tests check the
output with multiple instructions in InstsToScalarize, but those will
come soon.
2025-08-01 14:29:53 +01:00
Florian Hahn
2ae996cbbe
[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)
This patch extends the logic added in
https://github.com/llvm/llvm-project/pull/128061 to support
dereferenceability information from assumptions as well.

Unfortunately both assumption cache and the dominator tree need to be
threaded through multiple layers to make them available where needed.

PR: https://github.com/llvm/llvm-project/pull/147047
2025-08-01 14:18:07 +01:00
Florian Hahn
7d815c7642
[LV] Remove unused variables after 965231ca0a9a. (NFC)
Clean up unused/dead variables after 965231ca0a9a
(https://github.com/llvm/llvm-project/pull/151311)
2025-08-01 10:12:20 +01:00
Nikita Popov
09dc08b707 [InstCombine] Handle repeated users in foldOpIntoPhi()
If the phi is used multiple times in the same user, it will appear
multiple times in users(), in which case make_early_inc_range()
is insufficient to prevent iterator invalidation.

Fixes the issue reported at:
https://github.com/llvm/llvm-project/pull/151115#issuecomment-3141542852
2025-08-01 11:07:06 +02:00
Kerry McLaughlin
e170676351
[Instcombine] Combine extractelement from a vector_extract at index 0 (#151491)
Extracting any element from a subvector starting at index 0 is
equivalent to extracting from the original vector, i.e.
  extract_elt(vector_extract(x, 0), y) -> extract_elt(x, y)
2025-08-01 09:54:43 +01:00
Florian Hahn
965231ca0a
[LV] Replace UncountableEdge with UncountableEarlyExitingBlock (NFC). (#151311)
Only the uncountable exiting BB is used. Store it instead of a piar of
Exiting BB and Exit BB.

PR: https://github.com/llvm/llvm-project/pull/151311
2025-08-01 09:37:02 +01:00
Nikita Popov
0a41e7c87e
[LICM] Do not reassociate constant offset GEP (#151492)
LICM tries to reassociate GEPs in order to hoist an invariant GEP.
Currently, it also does this in the case where the GEP has a constant
offset.

This is usually undesirable. From a back-end perspective, constant GEPs
are usually free because they can be folded into addressing modes, so
this just increases register pressume. From a middle-end perspective,
keeping constant offsets last in the chain makes it easier to analyze
the relationship between multiple GEPs on the same base, especially
after CSE.

The worst that can happen here is if we start with something like

```
loop {
   p + 4*x
   p + 4*x + 1
   p + 4*x + 2
   p + 4*x + 3
}
```

And LICM converts it into:

```
p.1 = p + 1
p.2 = p + 2
p.3 = p + 3
loop {
   p + 4*x
   p.1 + 4*x
   p.2 + 4*x
   p.3 + 4*x
}
```

Which is much worse than leaving it for CSE to convert to:
```
loop {
   p2 = p + 4*x
   p2 + 1
   p2 + 2
   p2 + 3
}
```
2025-08-01 09:43:15 +02:00
Nikita Popov
d4d2d7d785
[InstCombine] Preserve nuw in canonicalizeGEPOfConstGEPI8() (#151533)
Proof: https://alive2.llvm.org/ce/z/4j8U3f
2025-08-01 09:40:03 +02:00
Nikita Popov
f6161271a3
[SeparateConstOffsetFromGEP] Remove support for arithmetic lowering (#151477)
I don't think there is any benefit to lowering to ptrtoint + arithmetic
+ inttoptr over the newer ptradd lowering. Even if a target does not use
codegen AA, it probably still has IR passes that benefit from correct
representation.

As far as I can tell, no targets actually use this configuration anymore
(they either don't use the LowerGEP option, or they they UseAA and thus
the ptradd lowering).
2025-08-01 09:35:48 +02:00
Kazu Hirata
228e96b28a
[llvm] Use std::make_optional (NFC) (#151627)
std::make_optional<T> is a lot like std::make_unique<T> in that it
performs perfect forwarding of arguments for T's constructor.  As a
result, we don't have to repeat type names twice.
2025-08-01 00:24:40 -07:00
Mel Chen
86916ff0f0
[LV] Fix gap mask requirement for interleaved access (#151105)
When interleaved stores contain gaps, a mask is required to skip the
gaps, regardless of whether scalar epilogues are allowed.
This patch corrects the condition under which a gap mask is needed,
ensuring consistency between the legacy and VPlan-based cost models and
avoiding assertion failures.

Related #149981
2025-08-01 14:24:30 +08:00
Ramkumar Ramachandra
d07f48e4da
[VPlan] Use m_BinaryOr matcher for clarity (NFC) (#151541) 2025-08-01 06:56:27 +01:00
Luke Lau
7250b66240
[VPlan] Create AVL as a phi from TC -> 0 with EVL tail folding (#151481)
This implements the first half of #151459, by changing the AVL so it's
no longer computed as `trip-count - EVL-based IV`, but instead a
separate scalar phi that is decremented by EVL each iteration.

This shortens the dependency chain for computing the AVL and should
eventually allow us to convert the branch condition to `branch-count
avl-next, 0`.

`simplifyBranchConditionForVFAndUF` had to be updated to prevent a
regression because this introduces a VPPhi in the header block.
2025-08-01 11:00:05 +08:00
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As
requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
a85c725952 Revert "[Utils] Fix a warning"
This reverts commit 3a18fe33f0763cd9276c99c276448412100f6270.

So that we can revert PR #148758.
2025-07-31 15:54:01 -04:00
shuffle2
7b5a44c605
[hwasan] Add hwasan-all-globals option (#149621)
hwasan-globals does not instrument globals with custom sections, because
existing code may use `__start_`/`__stop_` symbols to iterate over
globals in such a way which will cause hwasan assertions.

Introduce new hwasan-all-globals option, which instruments all
user-defined globals (but not those globals which are generated by the
hwasan instrumentation itself), including those with custom sections.

fixes #142442
2025-07-31 11:38:42 -07:00
Kazu Hirata
3a18fe33f0 [Utils] Fix a warning
This patch fixes:

  llvm/lib/Transforms/Utils/LoopUtils.cpp:818:28: error: unused
  function 'operator<<' [-Werror,-Wunused-function]
2025-07-31 11:24:33 -07:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
2025-07-31 12:28:25 -04:00
David Green
8f968fe3ec
[AggressiveInstCombine] Make cttz fold more resiliant to non-array geps (#150896)
Similar to #150639 this fixes the AggressiveInstCombine fold for convert
tables to cttz instructions if the gep types are not array types. i.e
`gep i16 @glob, i64 %idx` instead of `gep [64 x i16] @glob, i64 0, i64 %idx`.
2025-07-31 16:53:55 +01:00
Florian Hahn
99d70e09a9
[SCEV] Allow adds of constants in tryToReuseLCSSAPhi. (#150693)
Update the logic added in
https://github.com/llvm/llvm-project/pull/147824 to also allow adds of
constants. There are a number of cases where this can help remove
redundant phis and replace some computation with a ptrtoint (which
likely is free in the backend).

PR: https://github.com/llvm/llvm-project/pull/150693
2025-07-31 16:33:25 +01:00
Luke Lau
08c5944222
[VPlan] Fix header phi VPInstruction verification. NFC (#151472)
Noticed this when checking the invariant that all phis in the header
block must be header phis. I think there's a missing set of parentheses
here, since otherwise it only cast<VPInstruction> when RecipeI isn't a
VPInstruction.
2025-07-31 23:09:20 +08:00
Nikita Popov
a71909156e
[InstCombine] Set flags when canonicalizing GEP indices (#151516)
When truncating set nsw/nuw based on nusw/nuw. When extending, use zext
nneg if nusw+nuw.

Proof: https://alive2.llvm.org/ce/z/JA2Yzr
2025-07-31 15:58:04 +02:00
LU-JOHN
a757f23404
[SimplifyCFG] Extend jump-threading to allow live local defs (#135079)
Extend jump-threading to allow local defs that are live outside of the
threaded block. Allow threading to destinations where the local defs are
not live.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-07-31 09:44:14 -04:00
Samuel Tebbs
339b0a1d74 [LV][NFCI] Format fcc419b05f62 2025-07-31 14:37:59 +01:00
Samuel Tebbs
fcc419b05f [LV][NFCI] Swap reduction recipe operand order
https://github.com/llvm/llvm-project/pull/147026 will enable sub
reductions, which require that the phi value is the first operand since
they aren't commutative. This re-orders the operands when executing
reductions, which actually matches other existing code in
VPReductionRecipe::execute.
2025-07-31 14:35:10 +01:00
Nathan Gauër
67273393b1
[VectorCombine][TTI] Prevent extract/ins rewrite to GEP (#150216)
Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured access into types, which
is impossible with an untyped GEP instruction unless we add more info to
the IR. Finding a solution is a work-in-progress, but in the meantime,
we'd like to reduce the amount of failures.

Preventing this optimizations from rewritting extract/insert
instructions into a GEP helps us lower more code to SPIR-V. This change
should be OK as it's only active when targeting SPIR-V and disabling a
non-recommended transformation.

Related to #145002
2025-07-31 14:14:00 +02:00
Ramkumar Ramachandra
b7d00b827e
[VPlan] Uniformly use VPlanPatternMatch in transforms (NFC) (#151488) 2025-07-31 12:01:40 +01:00
Ramkumar Ramachandra
20f6ec4b29
[VPlan] Make VPBuilder APIs uniformly take ArrayRef (NFC) (#151484) 2025-07-31 11:33:04 +01:00
Nikita Popov
16d73839b1
[InstCombine] Support folding intrinsics into phis (#151115)
Call foldOpIntoPhi() for speculatable intrinsics. We already do this for
FoldOpIntoSelect().

Among other things, this partially subsumes
https://github.com/llvm/llvm-project/pull/149858.
2025-07-31 12:32:37 +02:00
Mel Chen
6752415ce8
[VectorUtils] Simplify the code by new function InterleaveGroup::isFull. nfc (#151112) 2025-07-31 16:02:53 +08:00
Benjamin Maxwell
cd16c706ba
[IRCE] Use function_ref<> instead of optional<function_ref<>> (NFC) (#151308)
llvm::function_ref<> is nullable.
2025-07-31 07:56:05 +01:00