14280 Commits

Author SHA1 Message Date
Jasmine Tang
9c6bb18040
[WebAssembly] Constant fold wasm.dot (#149619)
Constant fold wasm.dot of constant vectors/splats.

Test case added in
`llvm/test/Transforms/InstSimplify/ConstProp/WebAssembly/dot.ll`

Related to https://github.com/llvm/llvm-project/issues/55933
2025-08-05 15:22:37 -07:00
Pedro Lobo
2bbc614713
[InstCombine] Support offsets in memset to load forwarding (#151924)
Adds support for load offsets when performing `memset` load forwarding.
2025-08-05 17:09:06 +01:00
Nikita Popov
c1b387e23d
[MemoryLocation] Compute lifetime size from alloca size (#151982)
Split out from #150248:

Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.

This adjusts MemoryLocation to determine the MemoryLocation size from
the alloca size, instead of using the argument.
2025-08-05 10:47:07 +02:00
Nikita Popov
ba099c516d
[StackLifetime] Remove handling for lifetime size mismatch (#151965)
Split out from #150248:

Since #150944 the size passed to lifetime.start/end is considered
meaningless. The lifetime always applies to the whole alloca.

Accordingly remove handling for size mismatch in the StackLifetime
analysis.
2025-08-05 09:19:10 +02:00
Nikita Popov
4b5b36e5c4 [GVN] Avoid creating lifetime of non-alloca
There is a larger problem here in that we should not be performing
arbitrary pointer replacements for assumes. This is handled for
branches, but assume goes through a different code path.

Fixes https://github.com/llvm/llvm-project/issues/151785.
2025-08-04 12:06:40 +02:00
Abhishek Kaushik
30728eb26b
[Reland][ValueTracking] Improve Bitcast handling to match SDAG (#145223)
Fixes #125228

---------

Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
2025-08-04 14:51:03 +05:30
Nikita Popov
86727fe9a1
[IR] Allow poison argument to lifetime markers (#151148)
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.

It's worth noting that this does not require any conservative
assumptions, lifetimes with poison arguments can simply be skipped.

Fixes https://github.com/llvm/llvm-project/issues/151119.
2025-08-04 10:02:04 +02:00
Florian Hahn
2ae996cbbe
[LAA] Support assumptions in evaluatePtrAddRecAtMaxBTCWillNotWrap (#147047)
This patch extends the logic added in
https://github.com/llvm/llvm-project/pull/128061 to support
dereferenceability information from assumptions as well.

Unfortunately both assumption cache and the dominator tree need to be
threaded through multiple layers to make them available where needed.

PR: https://github.com/llvm/llvm-project/pull/147047
2025-08-01 14:18:07 +01:00
Lewis Crawford
5146917407
[ConstantFolding] Fix incorrect nvvm_round folding (#151563)
The `nvvm_round` intrinsic should round to the nearest even number in
the case of ties. It lowers to PTX `cvt.rni`, which will "round to
nearest integer, choosing even integer if source is equidistant between
two integers", so it matches the semantics of `rint` (and not `round` as
the name suggests).
2025-08-01 10:31:43 +01:00
Muhammad Omair Javaid
176d54aa33 Revert "[VectorUtils] Trivially vectorize ldexp, [l]lround (#145545)"
This reverts commit 13366759c3b9db9366659d870cc73c938422b020.

This broke various LLVM testsuite buildbots for AArch64 SVE, but the
problem got masked because relevant buildbots were already failing
due to other breakage.

It has broken llvm-test-suite test:
gfortran-regression-compile-regression__vect__pr106253_f.test

https://lab.llvm.org/buildbot/#/builders/4/builds/8164
https://lab.llvm.org/buildbot/#/builders/17/builds/9858
https://lab.llvm.org/buildbot/#/builders/41/builds/8067
https://lab.llvm.org/buildbot/#/builders/143/builds/9607
2025-08-01 01:24:52 +05:00
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As
requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
2025-07-31 12:28:25 -04:00
Justin Bogner
3f066f5fcf
[HLSL][DirectX] Extract HLSLBinding out of DXILResource. NFC (#150633)
We extract the binding logic out of the DXILResource analysis passes into the
FrontendHLSL library. This will allow us to use this logic for resource and
root signature bindings in both the DirectX backend and the HLSL frontend.
2025-07-31 08:35:47 -07:00
Nathan Gauër
67273393b1
[VectorCombine][TTI] Prevent extract/ins rewrite to GEP (#150216)
Using GEP to index into a vector is not disallowed, but not recommended.
The SPIR-V backend needs to generate structured access into types, which
is impossible with an untyped GEP instruction unless we add more info to
the IR. Finding a solution is a work-in-progress, but in the meantime,
we'd like to reduce the amount of failures.

Preventing this optimizations from rewritting extract/insert
instructions into a GEP helps us lower more code to SPIR-V. This change
should be OK as it's only active when targeting SPIR-V and disabling a
non-recommended transformation.

Related to #145002
2025-07-31 14:14:00 +02:00
Florian Hahn
ab9b23c446
[SCEV] Use pattern match to check ZExt(Add()). (NFC)
Follow-up to
https://github.com/llvm/llvm-project/pull/151227#pullrequestreview-3074670031
to check the inner expression is an Add before calling getTruncateExpr.

Adds a new matcher that just matches and captures SCEVAddExpr, to
support matching a SCEVAddExpr with arbitrary number of operands.
2025-07-31 12:47:14 +01:00
Mel Chen
6752415ce8
[VectorUtils] Simplify the code by new function InterleaveGroup::isFull. nfc (#151112) 2025-07-31 16:02:53 +08:00
Florian Hahn
d74d841b65
[SECV] Try to push the op into ZExt: A + zext (-A + B) -> zext (B) (#151227)
Try to push the constant operand into a ZExt:
A + zext (-A + B) -> zext (B), if trunc (A) + -A + B does not
unsigned-wrap.

The actual code supports ZExts with arbitrary number of arguments, hence
the getAddExpr in the return.

This helps SCEV reasoning in some cases, commonly when adding an offset
to a zero-extended SCEV that subtracts the same offset.

Note that this is restricted to cases where we can fold away an operand
of the inner Add. This is needed to avoid bad interactions with patterns
when forming ZExts, which try to push to ZExt to add operands.

https://alive2.llvm.org/ce/z/q7d303

PR: https://github.com/llvm/llvm-project/pull/151227
2025-07-30 21:10:57 +01:00
Lewis Crawford
c5327b935b
[ConstantFolding] Fix typo in GetNVVMDenormMode (#151297)
Fix typo in function name of GetNVVMDenormMode
(Denrom vs Denorm).
2025-07-30 10:48:09 +01:00
Abhinav Garg
f527b319e3
[Uniformity Analysis] Fix print method to dump uniformity info (#151130) 2025-07-30 10:57:57 +05:30
Ramkumar Ramachandra
13366759c3
[VectorUtils] Trivially vectorize ldexp, [l]lround (#145545) 2025-07-29 19:23:09 +01:00
Paul Walker
1528ddbe76
[ConstantFolding][SVE] Do not fold fcmp of denormal without known mode. (#150614)
This is a follow on to
https://github.com/llvm/llvm-project/pull/115407 that introduced code
which bypasses the splat handling for scalable vectors. To maintain
existing tests I have moved the early return until after the splat
handling so all vector types are treated equally.
2025-07-29 12:37:59 +01:00
David Sherwood
6fbc397964
[IR] Add new CreateVectorInterleave interface (#150931)
This PR adds a new interface to IRBuilder called CreateVectorInterleave,
which can be used to create vector.interleave intrinsics of factors 2-8.

For convenience I have also moved getInterleaveIntrinsicID and
getDeinterleaveIntrinsicID from VectorUtils.cpp to Intrinsics.cpp where
it can be used by IRBuilder.
2025-07-29 08:47:07 +01:00
Shoreshen
a5deb59dfe
[AMDGPU] Add NoaliasAddrSpace to AAMDnodes (#149247)
This is the following PR of
https://github.com/llvm/llvm-project/pull/136553 which calculate
NoaliasAddrSpace.

This PR carries the info calculated into MIR by adding it into AAMDnodes
2025-07-29 10:10:06 +08:00
Kazu Hirata
c7cd1d0ae3
[Analysis] Remove an unnecessary cast (NFC) (#150838)
getOpcode() already returns Instruction::CastOps.
2025-07-27 10:43:30 -07:00
Pedro Lobo
67658af1cc
[ConstantFolding] Merge constant gep inrange attributes (#150546)
When folding a gep+gep into a single gep, intersect their `inrange`
attributes.
2025-07-25 20:02:06 +01:00
Ryotaro Kasuga
b06f10d96c
[DA] Add check for base pointer invariance (#148241)
As specified in #53942, DA assumes base pointer invariance in its
process. Some cases were fixed by #116628. However, that PR only
addressed the parts related to AliasAnalysis, so the original issue
persists in later stages, especially when the AliasAnalysis results in
`MustAlias`.
This patch insert an explicit loop-invariant checks for the base pointer
and skips analysis when it is not loop-invariant.

Fix the cases added in #148240.
2025-07-26 03:25:01 +09:00
Helena Kotas
f169af3ba7
[HLSL] Fix detection of overlapping binding with unbounded array (#150547)
Fixes #150534
2025-07-25 09:41:20 -07:00
DingdWang
0c6784c951
[MemDep] Optimize SortNonLocalDepInfoCache sorting strategy for large caches with few unsorted entries (#143107)
During compilation of large files with many branches, I observed that
the function `SortNonLocalDepInfoCache` in `MemoryDependenceAnalysis`
becomes a significant performance bottleneck. This is because
`Cache.size()` can be very large (around 20,000), but only a small
number of entries (approximately 5 to 8) actually need sorting. The
original implementation performs a full sort in all cases, which is
inefficient.

This patch introduces a lightweight heuristic to quickly estimate the
number of unsorted entries and choose a more efficient sorting method
accordingly.

As a result, the GVN pass runtime on a large file is reduced from
approximately 26.3 minutes to 16.5 minutes.
2025-07-25 16:45:01 +02:00
Gleb Popov
75346e33d9
TargetLibraryInfo: Bring FreeBSD function list up to date (#144846) 2025-07-25 14:39:49 +02:00
xur-llvm
c9a8e15494
[ICP] Add a few tunings to indirect-call-promotion (#149892)
[ICP] Add a few tunings to indirect-call-promtion

Indirect-call promotion (ICP) has been adjusted with the following
tunings:
(1) Candidate functions can be now ICP'd even if only a declaration is
     present.
(2) All non-cold candidate functions are now considered by ICP.
      Previously, only hot targets were considered.
(3) If one target cannot be ICP'd, proceed with the remaining targets
     instead of exiting the callsite.
    
This update hides all tunings under internal options and disables them
by default. They'll be enabled in a later update. There'll also be
another update to address the "not found" issue with indirect targets.
2025-07-24 09:55:28 -07:00
Kazu Hirata
31281da34b
[Analysis] Drop const from return types (NFC) (#150258)
We don't need const on APFloat.
2025-07-23 15:18:38 -07:00
Alexandros Lamprineas
3ab64c5b29
[NFC][Clang][FMV] Make FMV priority data type future proof. (#150079)
FMV priority is the returned value of a polymorphic function. On RISC-V
and X86 targets a 32-bit value is enough. On AArch64 we currently need
64 bits and we will soon exceed that. APInt seems to be a suitable
replacement for uint64_t, presumably with minimal compile time overhead.
It allows bit manipulation, comparison and variable bit width.
2025-07-23 10:37:29 +01:00
Florian Hahn
6c50e2b2dd
[SCEV] Don't require NUW at first add when checking A+C1 < (A+C2)<nuw> (#149795)
Relax the NUW requirements for isKnownPredicateViaNoOverflow, if the
second operand (Y) is an ADD. The code only simplifies the condition if
C1 < C2, so if the second ADD is NUW, it doesn't matter whether the
first operand also has the NUW flag, as it cannot wrap if C1 < C2.

https://alive2.llvm.org/ce/z/b3dM7N


PR: https://github.com/llvm/llvm-project/pull/149795
2025-07-23 09:33:34 +01:00
Nikita Popov
b59aaf7da7
[Sanitizers] Remove handling for lifetimes on non-alloca insts (NFC) (#149994)
After #149310 the pointer argument of lifetime.start/lifetime.end is
guaranteed to be an alloca, so we don't need to go through
findAllocaForValue() anymore, and don't have to have special handling
for the case where it fails.
2025-07-23 09:48:32 +02:00
Ramkumar Ramachandra
b692b239f0
[LAA] Rename var used to retry with RT-checks (NFC) (#147307)
FoundNonConstantDistanceDependence is a misleading name for a variable
that determines whether we retry with runtime checks. Rename it.
2025-07-22 13:36:33 +01:00
Lewis Crawford
0823f4ff08
[ConstantFolding] Fix nvvm_round folding on PPC (#149837)
Fix a failing test for constant-folding the nvvm_round intrinsic. The
original implementation added in #141233 used a native libm call to the
"round" function, but on PPC this produces +0.0 if the input is -0.0,
which caused a test failure.

This patch updates it to use APFloat functions instead of native libm
calls to ensure cross-platform consistency.
2025-07-21 17:48:45 +01:00
Lewis Crawford
fd8ae2cb76
Add constant-folding for unary NVVM intrinsics (#141233)
Add support for constant-folding numerous NVVM unary arithmetic
intrinsics (including f, d, and ftz_f variants):
  - nvvm.ceil.*
  - nvvm.fabs.*
  - nvvm.floor.*
  - nvvm.rcp.*
  - nvvm.round.*
  - nvvm.saturate.*
  - nvvm.sqrt.f
  - nvvm.sqrt.rn.*
2025-07-21 11:32:09 +01:00
Jasmine Tang
e7ac49977a
[InstSimplify] Add poison propagation for trivially vectorizable intrinsics (#149243)
Fixes https://github.com/llvm/llvm-project/issues/146769

Test cases added to
`llvm/test/Transforms/InstSimplify/fold-intrinsics.ll`
2025-07-19 19:37:21 -07:00
Teresa Johnson
e57315e6ca
[MemProf] Fix discarding of noncold contexts after inlining (#149599)
When we rebuild the call site tries after inlining of an allocation with
MD_memprof metadata, we don't want to reapply the discarding of small
non-cold contexts (under -memprof-callsite-cold-threshold=) because we
have either no context size info (without -memprof-report-hinted-sizes
or another option that causes us to keep that as metadata), and even
with that information in the metadata, we have imperfect information at
that point as we have already discarded some contexts during matching.

The first case was even worse because we didn't guard our check by
whether the number of cold bytes was 0, leading to very aggressive
pruning during post-inline metadata rebuilding without the context size
information.
2025-07-18 21:11:37 -07:00
Florian Hahn
004c67ea25
[LV] Vectorize maxnum/minnum w/o fast-math flags. (#148239)
Update LV to vectorize maxnum/minnum reductions without fast-math flags,
by adding an extra check in the loop if any inputs to maxnum/minnum are
NaN, due to maxnum/minnum behavior w.r.t to signaling NaNs. Signed-zeros 
are already handled consistently by maxnum/minnum.

If any input is NaN,
 *exit the vector loop,
 *compute the reduction result up to the vector iteration that contained
   NaN inputs and
 * resume in the scalar loop


New recurrence kinds are added for reductions using maxnum/minnum
without fast-math flags.

PR: https://github.com/llvm/llvm-project/pull/148239
2025-07-18 21:58:19 +01:00
S. VenkataKeerthy
61a45d20cf
[IR2Vec][NFC] Add helper methods for numeric ID mapping in Vocabulary (#149212)
Add helper methods to IR2Vec's Vocabulary class for numeric ID mapping and vocabulary size calculation. These APIs will be useful in triplet generation for `llvm-ir2vec` tool (See #149214). 

(Tracking issue - #141817)
2025-07-17 13:40:51 -07:00
Ryotaro Kasuga
2b3a410f5b
[DA] Check element size when analyzing deps between same instruction (#148813)
DependenceAnalysis checks whether the given addresses are divisible by
the element size of corresponding load/store instructions. However, this
check was only executed when the two instructions (Src and Dst) are
different. We must also perform the same check when Src and Dst are the
same instruction.

Fix the test added in #147715.
2025-07-17 21:11:37 +09:00
Min-Yih Hsu
6824bcfdb4
[IA] Relax the requirement of having ExtractValue users on deinterleave intrinsic (#148716)
There are cases where InstCombine / InstSimplify might sink extractvalue
instructions that use a deinterleave intrinsic into successor blocks,
which prevents InterleavedAccess from kicking in because the current
pattern requires deinterleave intrinsic to be used by extractvalue.
However, this requirement is bit too strict while we could have just
replaced the users of deinterleave intrinsic with whatever generated by
the target TLI hooks.
2025-07-16 13:46:02 -07:00
jjasmine
2206c7d4af
[InstSimplify] Fold trig functions call of poison to poison (#148969)
Fold trig functions call of poison to poison.

This includes sin, cos, asin, acos, atan, atan2, sinh, cosh, sincos,
sincospi.

Test cases are fixed and also added to
llvm/test/Transforms/InstSimplify/fold-intrinsics.ll just like in
https://github.com/llvm/llvm-project/pull/146750
2025-07-16 08:35:13 -07:00
Ramkumar Ramachandra
584158f9ae
[LAA] Hoist check for SCEV-uncomputable dist (NFC) (#148841)
Hoist the check for SCEVCouldNotCompute distance into
getDependenceDistanceAndSize.
2025-07-16 15:30:53 +01:00
Ramkumar Ramachandra
10d4652144
[HashRecognize] Track visited in ValueEvolution (#147812)
Require that all Instructions in the Loop are visited by ValueEvolution,
as any stray instructions would complicate life for the optimization.
2025-07-16 15:27:41 +01:00
S. VenkataKeerthy
fad0fbc937
[NFC][IR2Vec] Fix warnings on MSVC compilation (#148911) 2025-07-15 10:54:00 -07:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Horray!
2025-07-15 15:34:10 +01:00
Nikita Popov
7c30897b4c [TLI] Handle cabs without parameters gracefully
Check that the function has at least one parameter before trying
to access its type.

Fixes https://github.com/llvm/llvm-project/issues/148770.
2025-07-15 10:41:32 +02:00
Kazu Hirata
7c83d66719
[llvm] Remove unused includes (NFC) (#148768)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-07-14 22:19:14 -07:00