1457 Commits

Author SHA1 Message Date
Kazu Hirata
cbf5af9668
[llvm] Remove unused includes (NFC) (#154051)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-08-17 23:46:35 -07:00
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
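
The fallback described above can be sketched in standalone C++. This is only an illustration: `LoopProfile`, `BranchWeights`, and both functions are hypothetical stand-ins rather than LLVM's API, and the round-to-nearest formula is an assumption about how a trip count might be derived from latch branch weights.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of the latch branch weights: how often the backedge was
// taken versus how often the loop exited.
struct BranchWeights {
  uint64_t Backedge;
  uint64_t Exit;
};

// Hypothetical model of a loop's profile metadata. An empty
// EstimatedTripCount models "metadata present but without a value".
struct LoopProfile {
  std::optional<uint64_t> EstimatedTripCount;
  std::optional<BranchWeights> LatchWeights;
};

// Assumed formula: backedges taken per exit, rounded to nearest, plus one for
// the final iteration that leaves the loop.
std::optional<uint64_t> estimateFromBranchWeights(const LoopProfile &L) {
  if (!L.LatchWeights || L.LatchWeights->Exit == 0)
    return std::nullopt;
  const BranchWeights &W = *L.LatchWeights;
  return (W.Backedge + W.Exit / 2) / W.Exit + 1;
}

// Prefer the stored estimate; if it has no value, try the weights again.
std::optional<uint64_t> getEstimatedTripCount(const LoopProfile &L) {
  if (L.EstimatedTripCount)
    return L.EstimatedTripCount;
  return estimateFromBranchWeights(L);
}
```

The point of the two-step lookup is that a loop whose form blocked estimation at ingestion time can still yield an estimate later, once its weights become usable.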
2025-07-31 12:28:25 -04:00
Alex Voicu
6bcff9eb13
[HIPSTDPAR] Add handling for math builtins (#140158)
When compiling in `--hipstdpar` mode, the builtins corresponding to the
standard library might end up in code that is expected to execute on the
accelerator (e.g. by using the `std::` prefixed functions from
`<cmath>`). We do not have uniform handling for this in AMDGPU, and the
errors that result are quite arcane. Furthermore, the user-space changes
required to work around this tend to be rather intrusive.

This patch adds an additional `--hipstdpar` specific pass which forwards
to the run time component of HIPSTDPAR the intrinsics / libcalls which
result from the use of the math builtins, and which are not properly
handled. In the long run we will want to stop relying on this and handle
things in the compiler, but it is going to be a rather lengthy journey,
which makes this medium term escape hatch necessary.

The paired change in the run time component is here
<https://github.com/ROCm/rocThrust/pull/551>.
2025-07-28 22:29:31 +01:00
AZero13
f2fe4718aa
[ObjCARC] Completely remove ObjCARCAPElimPass (#150717)
ObjCARCAPElimPass has been made obsolete now that we remove unused
autorelease pools.
2025-07-26 08:07:27 -07:00
Mircea Trofin
df2d2d125b
[PGO] Add ProfileInjector and ProfileVerifier passes (#147388)
Add two passes: one to inject `MD_prof` and one to check its presence. A subsequent patch will add these (similar to debugify) to `opt` (and, eventually, a variant of this to `llc`).

Tracking issue: #147390
2025-07-23 21:34:58 +02:00
Jay Foad
756ac65987
[CodeGen] Add a pass for testing finalizeBundle (#149813)
This allows for unit testing of finalizeBundle with standard MIR tests
using update_mir_test_checks.py.
2025-07-23 11:35:57 +01:00
Madhur Amilkanthwar
2320cddfc2
Reapply "[GVN] memoryssa implies no-memdep (#149473)" (#149767)
Enabling either MemorySSA or MemDep implies the other is off.

Already approved in https://github.com/llvm/llvm-project/pull/149473 but
I had to revert as I missed updating one test.
2025-07-21 14:05:29 +05:30
Madhur Amilkanthwar
f79d6b319d
Revert "[GVN] memoryssa implies no-memdep (#149473)" (#149766)
This reverts commit 60d2d94db253a9fdc7bd111120c803f808564b30.
2025-07-21 11:04:54 +05:30
Madhur Amilkanthwar
60d2d94db2
[GVN] memoryssa implies no-memdep (#149473)
Enabling either MemorySSA or MemDep implies the other is off.
2025-07-21 10:48:03 +05:30
Vikram Hegde
06528070fc
[CodeGen][NPM] Clear MachineFunctions without using PA (#148113)
same as https://github.com/llvm/llvm-project/pull/139517

This replaces the InvalidateAnalysisPass<MachineFunctionAnalysis> pass.

There are no cross-function analysis requirements right now, so clearing
all analyses works for the last pass in the pipeline.

Having InvalidateAnalysisPass<MachineFunctionAnalysis>() in the pipeline
causes a problem with ModuleToCGSCCPassAdaptor: it deletes the machine
functions of other functions, leaving exactly one correctly compiled MF
while the rest vanish.

This is because ModuleToCGSCCPassAdaptor propagates PassPA (received from
the CGSCCToFunctionPassAdaptor that runs the actual codegen pipeline on
MFs) to the next SCC. That causes MFA invalidation on functions in the
next SCC.

For us, PassPA happens to be returned from
invalidate<machine-function-analysis>, which abandons the
MachineFunctionAnalysis. So while the first function runs through the
pipeline normally, invalidate also deletes the functions in the next SCC
before its pipeline is run. (This seems to be the intended mechanism of
the CG adaptor to allow cross-SCC invalidations.)

Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-18 11:58:01 +05:30
Cristian Assaiante
81eb7defa2
[OptBisect][IR] Adding a new OptPassGate for disabling passes via name (#145059)
This commit adds a new pass gate that allows selective disabling
of one or more passes via the clang command line using the
`-opt-disable` option. Passes to be disabled should be specified as a
comma-separated list of their names.
The implementation resides in the same file as the bisection tool. The
`getGlobalPassGate()` function returns the currently enabled gate.

Example: `-opt-disable="PassA,PassB"`

Pass names are matched using case-insensitive comparisons. However, note
that special characters, including spaces, must be included exactly as
they appear in the pass names.

Additionally, a `-opt-disable-enable-verbosity` flag has been introduced to
enable verbose output when this functionality is in use. When enabled,
it prints the status of all passes (either running or NOT running),
similar to the default behavior of `-opt-bisect-limit`. This flag is
disabled by default, which is the opposite of the `-opt-bisect-verbose`
flag (which defaults to enabled).
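
The matching rule described above (comma-separated names, case-insensitive comparison, no trimming of spaces or special characters) can be sketched as follows. This is a standalone illustration, not the code from the patch: `NameDisableGate`, `splitPassList`, and `equalsIgnoreCase` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Split an option value such as "PassA,PassB" at commas. Nothing is trimmed,
// so spaces and special characters must match the pass name exactly.
std::vector<std::string> splitPassList(const std::string &Value) {
  std::vector<std::string> Names;
  std::string::size_type Start = 0;
  while (Start <= Value.size()) {
    std::string::size_type Comma = Value.find(',', Start);
    if (Comma == std::string::npos)
      Comma = Value.size();
    if (Comma > Start) // skip empty entries
      Names.push_back(Value.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
  return Names;
}

// ASCII case-insensitive equality.
bool equalsIgnoreCase(const std::string &A, const std::string &B) {
  return A.size() == B.size() &&
         std::equal(A.begin(), A.end(), B.begin(),
                    [](unsigned char X, unsigned char Y) {
                      return std::tolower(X) == std::tolower(Y);
                    });
}

// Hypothetical gate: a pass runs unless its name is in the disabled list.
class NameDisableGate {
  std::vector<std::string> Disabled;

public:
  explicit NameDisableGate(const std::string &OptionValue)
      : Disabled(splitPassList(OptionValue)) {}

  bool shouldRunPass(const std::string &PassName) const {
    return std::none_of(Disabled.begin(), Disabled.end(),
                        [&](const std::string &Name) {
                          return equalsIgnoreCase(Name, PassName);
                        });
  }
};
```

With the value `"PassA,PassB"`, the name `passa` is disabled, while `Pass A` is not, because the space is significant.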

To validate this functionality, a test file has also been provided. It reuses
the same infrastructure as the opt-bisect test, but disables three
specific passes and checks the output to ensure the expected behavior.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2025-07-16 16:51:58 -07:00
Vikram Hegde
4aa85cc313
[CodeGen][NPM] Port ProcessImplicitDefs to NPM (#148110)
same as https://github.com/llvm/llvm-project/pull/138829

Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-16 13:23:27 +05:30
Vikram Hegde
8cbcaee7fe
[CodeGen][NPM] Register Function Passes (#148109)
same as https://github.com/llvm/llvm-project/pull/138828

Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-15 17:01:28 +05:30
Vikram Hegde
fcd4a2fe7a
[CodeGen][NewPM] Port "PostRAMachineSink" pass to NPM (#129690)
2025-07-10 13:10:46 +05:30
Akshat Oke
b33d95fb8a
[CodeGen][NPM] Port InitUndef to NPM (#138495)
2025-07-09 15:31:31 +05:30
Matt Arsenault
1915fa15c3
Utils: Add pass to declare runtime libcalls (#147534)
This will be useful for testing the set of calls for different systems,
and eventually the product of context specific modifiers applied. In
the future we should also know the type signatures, and be able to
emit the correct one.
2025-07-09 00:52:22 +09:00
Akshat Oke
0fbaeafd7f
[CodeGen][NPM] Allow nested MF pass managers for -passes (#128852)
This allows `machine-function(p1,machine-function(...))` instead of
erroring.

Effectively it is flattened to a single MFPM.
2025-07-07 12:10:28 +05:30
Ryotaro Kasuga
3099b7eb5d
[Passes] Move LoopInterchange into optimization pipeline (#145503)
As mentioned in https://github.com/llvm/llvm-project/pull/145071,
LoopInterchange should be part of the optimization pipeline rather than
the simplification pipeline. This patch moves LoopInterchange into the
optimization pipeline.

More context:

- By default, LoopInterchange attempts to improve data locality,
however, it also takes increasing vectorization opportunities into
account. Given that, it is reasonable to run it as close to
vectorization as possible.
- I looked into previous changes related to the placement of
LoopInterchange, but couldn’t find any strong motivation suggesting that
it benefits other simplifications.
- In the tests I tried (including llvm-test-suite), removing
LoopInterchange from the simplification pipeline does not affect other
simplifications. Therefore, there doesn't seem to be much value in
keeping it there.
- The new position reduces compile-time for ThinLTO, probably because it
only runs once per function in post-link optimization, rather than both
in pre-link and post-link optimization.

I haven't encountered any cases where the positional difference affects
optimization results, so please feel free to revert if you run into any issues.
2025-07-04 20:06:53 +09:00
Meredith Julian
7da8ed8d33
Fix missing/outdated pass options in PassRegistry.def (#146160)
There are a handful of passes in PassRegistry.def with outdated or
missing pass options. These strings describing pass options are used for
the printPassNames() function only, which is likely why they have gotten
out-of-date without being caught. This MR simply changes the few passes
where the option string is out-of-date, fixing the output of
-print-passes. This does not affect functionality of the pipeline
parser, and is hard to verify in a unit test, so no tests were added.
2025-07-01 12:41:26 -07:00
S. VenkataKeerthy
35b80031f4
[NFC] Formatting PassRegistry.def (#144139)
2025-07-01 11:03:43 -07:00
S. VenkataKeerthy
0745eb501d
[IR2Vec] Scale embeddings once in vocab analysis instead of repetitive scaling (#143986)
Scale opcodes, types, and args once in `IR2VecVocabAnalysis` so that we avoid rescaling each time embeddings are computed. This PR also refactors the vocabulary to explicitly define 3 sections (Opcodes, Types, and Arguments) used for computing embeddings.

(Tracking issue - #141817 ; partly fixes - #141832)
2025-06-30 23:09:19 +02:00
Rahul Joshi
b0ff473340
[LLVM] Change ModulePass::skipModule to take a const reference (#146168)
Change `ModulePass::skipModule` to take a const Module reference.
Additionally, make `OptPassGate::shouldRunPass` const as well, since for
most implementations it is a const query. For `OptBisect`, make
`LastBisectNum` mutable so it can be updated in `shouldRunPass`.

Additional minor cleanup: Change all StringRef arguments to simple
StringRef (no const or reference), change `OptBisect::Disabled` to
constexpr.
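
A minimal standalone sketch of the pattern described above: a const query on the gate interface, with a `mutable` counter in the one implementation that has to record state. `PassGate` and `BisectGate` are hypothetical stand-ins for illustration, and the value chosen for `Disabled` here is a placeholder, not necessarily the one LLVM uses.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical gate interface: the query is const because most
// implementations only inspect state.
class PassGate {
public:
  virtual ~PassGate() = default;
  // The name is passed by value, mirroring a plain StringRef parameter.
  virtual bool shouldRunPass(std::string_view PassName) const = 0;
};

// Hypothetical bisecting gate: it must count queries even though the
// interface is const, so the counter is declared mutable.
class BisectGate : public PassGate {
  int Limit;
  mutable int LastBisectNum = 0;

public:
  static constexpr int Disabled = -1; // placeholder sentinel for "no limit"

  explicit BisectGate(int Limit) : Limit(Limit) {}

  bool shouldRunPass(std::string_view) const override {
    int Current = ++LastBisectNum; // allowed in a const method via mutable
    return Limit == Disabled || Current <= Limit;
  }

  int lastBisectNum() const { return LastBisectNum; }
};
```

Marking only the counter `mutable` keeps the rest of the object protected by const, which is the usual trade-off when a logically read-only query needs bookkeeping.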
2025-06-30 07:23:04 -07:00
Nikita Popov
d7a3bdffb9
[PassBuilder][FatLTO] Expose FatLTO pipeline via pipeline string (#146048)
Expose the FatLTO pipeline via `-passes="fatlto-pre-link<Ox>"`, similar
to all the other optimization pipelines. This is to allow reproducing it
outside clang. (Possibly also useful for C API users.)
2025-06-30 12:04:42 +02:00
Florian Mayer
71bc606e95
[LowerAllowCheckPass] allow to specify runtime.check hotness (#145998)
2025-06-27 11:28:07 -07:00
Nikita Popov
7f223d121d
[PassBuilder] Treat pipeline aliases as normal passes (#146038)
Pipelines like `-passes="default<O3>"` are currently parsed in a special
way. Switch them to work like normal, parameterized module passes.
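
To illustrate what treating `default<O3>` as a normal parameterized pass means, here is a standalone sketch that splits such a string into a name and a parameter string. It is not the PassBuilder parser: `ParsedPass` and `parseParameterizedPass` are hypothetical, and nested angle brackets are deliberately not handled.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type: a pass name plus its raw parameter string.
struct ParsedPass {
  std::string Name;
  std::string Params;
};

// Split "name<params>" into its two parts. A string without '<' is a pass
// with no parameters. Malformed input yields std::nullopt.
std::optional<ParsedPass> parseParameterizedPass(std::string_view Text) {
  if (Text.empty())
    return std::nullopt;
  std::string_view::size_type Open = Text.find('<');
  if (Open == std::string_view::npos)
    return ParsedPass{std::string(Text), ""};
  if (Open == 0 || Text.back() != '>')
    return std::nullopt;
  // Drop the '<' and the trailing '>' around the parameter text.
  std::string_view Params = Text.substr(Open + 1, Text.size() - Open - 2);
  return ParsedPass{std::string(Text.substr(0, Open)), std::string(Params)};
}
```

Under this model a pipeline alias needs no special case: `default<O3>` parses to the name `default` with the parameter `O3`, the same shape as any other parameterized pass.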
2025-06-27 12:07:09 +02:00
Mircea Trofin
daa2a587cc
[TRE] Adjust function entry count when using instrumented profiles (#143987)
The entry count of a function needs to be updated after a callsite is elided by TRE: before elision, the entry count accounted for the recursive call at that callsite. After TRE, we need to remove that callsite's contribution.

This patch enables this for instrumented profiling cases because, there, we know the function entry count captured entries before TRE. We cannot currently address this for sample-based profiles, because we don't know whether this function was TRE-ed in the binary that donated samples.
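
The arithmetic behind the adjustment is small enough to sketch. The function name is hypothetical, and clamping at zero is an assumption made here to keep the sketch safe against inconsistent profile data.

```cpp
#include <cassert>
#include <cstdint>

// Remove the elided recursive callsite's contribution from the function's
// entry count. Clamp at zero in case the profile counts are inconsistent.
uint64_t entryCountAfterTRE(uint64_t EntryCount, uint64_t ElidedCallsiteCount) {
  return EntryCount > ElidedCallsiteCount ? EntryCount - ElidedCallsiteCount
                                          : 0;
}
```

For example, if a function was entered 1000 times and 900 of those entries came from the recursive call that TRE turned into a loop, the adjusted entry count is 100.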
2025-06-23 08:07:31 -07:00
Nikita Popov
ae8c85c9ce
[Passes] Remove LoopInterchange from O1 pipeline (#145071)
This is a fairly exotic pass, I don't think it makes a lot of sense to
run it at O1, esp. as vectorization wouldn't run at O1 anyway.
2025-06-23 09:11:03 +02:00
Ramkumar Ramachandra
74054cab7a
[HashRecognize] Make it a non-PM analysis (#144742)
Make HashRecognize a non-PassManager analysis that can be called to get
the result on-demand, creating a new getResult() entry-point. The issue
was discovered when attempting to use the analysis to perform a
transform in LoopIdiomRecognize.
2025-06-19 12:29:58 +01:00
Peter Collingbourne
3fa231f47c
Add SimplifyTypeTests pass.
This pass figures out whether inlining has exposed a constant address to
a lowered type test, and removes the test if so and the address is known
to pass it. Unfortunately this pass ends up needing to reverse
engineer what LowerTypeTests did; this is currently inherent to the design
of ThinLTO importing where LowerTypeTests needs to run at the start.

Reviewers: teresajohnson

Reviewed By: teresajohnson

Pull Request: https://github.com/llvm/llvm-project/pull/141327
2025-06-05 11:09:20 -07:00
Snehasish Kumar
16c7b3c9f5
[MemProf] Split MemProfiler into Instrumentation and Use. (#142811)
Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics etc) I figured it would be good to separate these two implementations.
2025-06-05 07:36:50 -07:00
Kazu Hirata
228f66807d
[llvm] Remove unused includes (NFC) (#142733)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-06-04 12:30:52 -07:00
John Brawn
81d3189891
[LAA] Keep pointer checks on partial analysis (#139719)
Currently if there's any memory access that AccessAnalysis couldn't
analyze then all of the runtime pointer check results are discarded.
This patch makes this able to be controlled with the AllowPartial
option, which makes it so we generate the runtime check information
for those pointers that we could analyze, as transformations may still
be able to make use of the partial information.

Of the transformations that use LoopAccessAnalysis, only
LoopVersioningLICM changes behaviour as a result of this change. This is
because the others either:
* Check canVectorizeMemory, which will return false when we have partial
pointer information as analyzeLoop() will return false.
* Examine the dependencies returned by getDepChecker(), which will be
empty as we exit analyzeLoop if we have partial pointer information
before calling areDepsSafe(), which is what fills in the dependency
information.
2025-06-04 16:47:20 +01:00
Ramkumar Ramachandra
af2f8a8c14
[HashRecognize] Introduce new analysis (#139120)
Introduce a fresh analysis for recognizing polynomial hashes, with the
rationale that several targets have specific instructions to optimize
things like CRC and GHASH (e.g. X86 and the RISC-V crypto extension). We
limit the scope to polynomial hashes computed in a Galois field of
characteristic 2, since this class of operations can also be optimized
in the absence of target-specific instructions to use a lookup table.

At the moment, we only recognize the CRC algorithm.
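
For context, this is the kind of equivalence the analysis makes exploitable: a bit-by-bit CRC loop over GF(2) and a lookup-table version that computes the same value. The sketch below uses the common reflected CRC-32 polynomial as an example. It is not taken from the patch, and the function names are illustrative.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr uint32_t Poly = 0xEDB88320u; // reflected CRC-32 polynomial

// Bit-by-bit form: one conditional XOR with the polynomial per input bit.
uint32_t crc32Bitwise(const unsigned char *Data, size_t Len) {
  uint32_t Crc = 0xFFFFFFFFu;
  for (size_t I = 0; I < Len; ++I) {
    Crc ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ ((Crc & 1u) ? Poly : 0u);
  }
  return ~Crc;
}

// Precompute the effect of the eight inner steps for every byte value.
std::array<uint32_t, 256> makeCrcTable() {
  std::array<uint32_t, 256> Table{};
  for (uint32_t Byte = 0; Byte < 256; ++Byte) {
    uint32_t Crc = Byte;
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ ((Crc & 1u) ? Poly : 0u);
    Table[Byte] = Crc;
  }
  return Table;
}

// Table-driven form: one lookup per input byte, same result as above.
uint32_t crc32Table(const unsigned char *Data, size_t Len) {
  static const std::array<uint32_t, 256> Table = makeCrcTable();
  uint32_t Crc = 0xFFFFFFFFu;
  for (size_t I = 0; I < Len; ++I)
    Crc = (Crc >> 8) ^ Table[(Crc ^ Data[I]) & 0xFFu];
  return ~Crc;
}
```

Both functions return `0xCBF43926` for the ASCII string `123456789`, the standard check value for this CRC-32 variant.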

RFC:
https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268
2025-06-02 08:25:50 +01:00
Rahul Joshi
062353d1f5
[NFC][LLVM] Minor namespace fixes in PassBuilder (#141288)
- No need to prefix `PointerType` with `llvm::`.
- Avoid namespace block to define `PrintPipelinePasses`.
2025-05-27 07:26:28 -07:00
Rahul Joshi
58f78d84fd
[NFC][LLVM] Use formatv automatic index assignment in PassBuilder (#141286)
2025-05-27 07:25:37 -07:00
Rahul Joshi
52c2e45c11
[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101)
2025-05-23 08:30:29 -07:00
S. VenkataKeerthy
58ab005d8d
Adding IR2Vec as an analysis pass (#134004)
This PR introduces IR2Vec as an analysis pass. The changes include:
- Logic for generating Symbolic encodings.
- 75D learned vocabulary.
- lit tests.

Here is the link to the RFC -
https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings

Acknowledgements: contributors -
https://github.com/IITH-Compilers/IR2Vec/graphs/contributors

---------

Co-authored-by: svkeerthy <venkatakeerthy@google.com>
Co-authored-by: Mircea Trofin <mtrofin@google.com>
2025-05-22 09:50:21 -07:00
Min-Yih Hsu
0ab67ec191
[LV][EVL] Introduce the EVLIndVarSimplify Pass for EVL-vectorized loops (#131005)
When we enable EVL-based loop vectorization with predicated tail-folding,
each vectorized loop has effectively two induction variables: one
calculates the step using (VF x vscale) and the other one increases the
IV by values returned from experimental.get.vector.length. The former,
also known as canonical IV, is more favorable for analyses as it's
"countable" in the sense of SCEV; the latter (EVL-based IV), however, is
more favorable to codegen, at least for those that support scalable
vectors like AArch64 SVE and RISC-V.

The idea is that we use the canonical IV all the way until the end of all
vectorizers, where we replace it with the EVL-based IV using the
EVLIndVarSimplify pass introduced here, so that we get the best of both worlds.
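
The difference between the two induction variables can be sketched with plain scalar loops. This is a model under stated assumptions, not the pass itself: `LoopTrace` and both functions are hypothetical, VF stands for the full (VF x vscale) step, and the explicit vector length is modeled as `min(remaining, VF)`, which is only one value a target may return.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical summary of one run of the vectorized loop.
struct LoopTrace {
  uint64_t Iterations;
  uint64_t FinalIV;
  uint64_t ElementsProcessed;
};

// Canonical IV: always steps by the full VF; the tail is handled by masking
// off lanes past N, so the step is a loop-invariant constant.
LoopTrace runCanonicalIV(uint64_t N, uint64_t VF) {
  assert(VF > 0);
  LoopTrace T{0, 0, 0};
  for (uint64_t IV = 0; IV < N; IV += VF) {
    T.ElementsProcessed += std::min(N - IV, VF); // active lanes under the mask
    ++T.Iterations;
    T.FinalIV = IV + VF;
  }
  return T;
}

// EVL-based IV: steps by the explicit vector length chosen each iteration.
LoopTrace runEVLBasedIV(uint64_t N, uint64_t VF) {
  assert(VF > 0);
  LoopTrace T{0, 0, 0};
  uint64_t IV = 0;
  while (IV < N) {
    uint64_t EVL = std::min(N - IV, VF); // assumed model of the intrinsic
    T.ElementsProcessed += EVL;
    ++T.Iterations;
    IV += EVL;
  }
  T.FinalIV = IV;
  return T;
}
```

For N = 10 and VF = 4, both loops run 3 iterations and process 10 elements, but the canonical IV ends at 12 while the EVL-based IV ends at exactly 10. The constant step is what makes the former easy for SCEV to count.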

This pass is enabled by default in RISC-V. However, since we do not yet
vectorize loops with predicated tail-folding by default, this pass
is a no-op at the moment.
2025-05-14 13:49:50 -07:00
David Green
ec406e8674
[GlobalISel] Add a GISelValueTracker printing pass (#139687)
This adds a GISelValueTrackingPrinterPass that can print the known bits
and sign bit of each def in a function. It is built on the new pass
manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older
class to GISelValueTrackingAnalysisLegacy.

The first 2 functions from the AArch64GISelMITest are ported over to a
MIR test to show it working. It also runs successfully on all files in
llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can
hopefully be used to test GlobalISel known bits analysis more directly
in common cases, without jumping through the hoops that the C++ tests
require.
2025-05-14 11:05:04 +01:00
Helena Kotas
c66f401e1e
[DirectX] Implement DXILResourceBindingAnalysis (#137258)
`DXILResourceBindingAnalysis` analyses explicit resource bindings in the
module and puts together lists of used virtual register spaces and
available virtual register slot ranges for each binding type. It also
stores additional information found during the analysis such as whether
the module uses implicit bindings or if any of the bindings overlap.

This information will be used in the `DXILResourceImplicitBindings` pass
(coming soon) to assign register slots to resources with implicit
bindings, and in a post-optimization validation pass that will raise
a diagnostic about overlapping bindings.
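
As a rough sketch of the bookkeeping described above, the following standalone C++ computes the free slot ranges and an overlap flag for the explicit bindings of one binding type in one register space. It is not the analysis's actual data structure: `Range`, `SpaceInfo`, and `analyzeSpace` are hypothetical, and inclusive 32-bit slot ranges are an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical inclusive slot range [first, second].
using Range = std::pair<uint32_t, uint32_t>;

// Hypothetical per-space result: the free ranges and whether any of the
// explicit bindings overlap.
struct SpaceInfo {
  std::vector<Range> Free;
  bool HasOverlap = false;
};

SpaceInfo analyzeSpace(std::vector<Range> Bindings) {
  std::sort(Bindings.begin(), Bindings.end());
  SpaceInfo Info;
  uint64_t NextFree = 0; // 64-bit so that UINT32_MAX + 1 does not wrap
  for (const Range &B : Bindings) {
    if (B.first < NextFree)
      Info.HasOverlap = true; // starts inside an earlier binding
    else if (B.first > NextFree)
      Info.Free.emplace_back(static_cast<uint32_t>(NextFree), B.first - 1);
    NextFree = std::max<uint64_t>(NextFree, uint64_t(B.second) + 1);
  }
  if (NextFree <= UINT32_MAX)
    Info.Free.emplace_back(static_cast<uint32_t>(NextFree), UINT32_MAX);
  return Info;
}
```

A later pass assigning implicit bindings could then pick slots from `Free`, while a validation pass would only need to look at `HasOverlap`.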

Part 1/2 of #136786
2025-05-09 10:42:31 -07:00
Chengjun
94d933676c
[AA] Move Target Specific AA before BasicAA (#125965)
In this change, NVPTX AA is moved before Basic AA to potentially improve
compile time. Additionally, it introduces a flag in the
`ExternalAAWrapper` that allows other backends to run their
target-specific AA passes before Basic AA, if desired.

The change works for both New Pass Manager and Legacy Pass Manager.

Original implementation by Princeton Ferro <pferro@nvidia.com>
2025-05-07 15:25:48 -07:00
Jie Fu
1563d74145
[Passes] Remove extra ';' outside of a function (NFC)
/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1508:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
 1508 | };
      |  ^
1 error generated.
2025-04-30 18:44:45 +08:00
Akshat Oke
e91cbd4f29
[CodeGen][NPM] Port VirtRegRewriter to NPM (#130564)
2025-04-30 14:10:46 +05:30
Vikram Hegde
53a8b89003
[CodeGen][NewPM] Port "ShrinkWrap" pass to NPM (#129880)
2025-04-30 13:11:17 +05:30
paperchalice
159628cc22
[CodeGen] Port MachineUniformityAnalysis to new pass manager (#137578)
- Add new pass manager version of `MachineUniformityAnalysis`.
- Query `TargetTransformInfo` in new pass manager version.
- Use `printAsOperand` when printing machine function name.
2025-04-30 10:44:06 +08:00
Vikram Hegde
86d8e8d9a6
[CodeGen][NewPM] Port "PrologEpilogInserter" to NPM (#130550)
2025-04-29 13:13:45 +05:30
Akshat Oke
31ddaef8d1
[CodeGen][NPM] Port UnreachableMachineBlockElim to NPM (#136127)
2025-04-18 15:06:30 +05:30
Akshat Oke
a388395b86
[CodeGen][NPM] Port StackFrameLayoutAnalysisPass to NPM (#130070)
2025-04-15 12:37:19 +05:30
Akshat Oke
f133eae70c
[CodeGen][NPM] Port MachineSanitizerBinaryMetadata to NPM (#130069)
Didn't find a test for this (but there are tests for the `Function`
version of this pass).
2025-04-14 20:52:26 +05:30