962 Commits

Author SHA1 Message Date
Joel E. Denny
37e03b56b8
Revert "[PGO] Add llvm.loop.estimated_trip_count metadata" (#151585)
Reverts llvm/llvm-project#148758

[As requested.](https://github.com/llvm/llvm-project/pull/148758#pullrequestreview-3076627201)
2025-07-31 15:56:31 -04:00
Joel E. Denny
f7b65011de
[PGO] Add llvm.loop.estimated_trip_count metadata (#148758)
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
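
The estimate and the fallback described above can be sketched as follows. This is a minimal illustration, not LLVM's code: the helper names are hypothetical, and the formula (backedge weight divided by exit weight, rounded to nearest, plus one) is an assumption about how a trip count is typically derived from a latch's `branch_weights`.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not an LLVM API): estimate a trip count from the two
// branch weights on a loop latch. Returns nullopt when the exit weight is
// zero, because no finite estimate can be computed in that case.
std::optional<uint64_t> estimateFromBranchWeights(uint32_t BackedgeWeight,
                                                  uint32_t ExitWeight) {
  if (ExitWeight == 0)
    return std::nullopt;
  // Round-to-nearest division: average backedges taken per loop entry.
  uint64_t Backedges =
      (uint64_t{BackedgeWeight} + ExitWeight / 2) / ExitWeight;
  // The body runs once more than the backedge is taken.
  return Backedges + 1;
}

// Hypothetical model of the fallback: a stored metadata value wins; if the
// metadata carries no value, estimate again from the current branch weights.
std::optional<uint64_t> estimatedTripCount(std::optional<uint64_t> MetadataValue,
                                           uint32_t BackedgeWeight,
                                           uint32_t ExitWeight) {
  if (MetadataValue)
    return MetadataValue;
  return estimateFromBranchWeights(BackedgeWeight, ExitWeight);
}
```

With weights of 99 (backedge) and 1 (exit), this sketch yields an estimate of 100 iterations.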
2025-07-31 12:28:25 -04:00
Mircea Trofin
df2d2d125b
[PGO] Add ProfileInjector and ProfileVerifier passes (#147388)
Adds two passes: one to inject `MD_prof` and one to check for its presence. A subsequent patch will add these (similar to debugify) to `opt` and, eventually, a variant of them to `llc`.

Tracking issue: #147390
2025-07-23 21:34:58 +02:00
Jay Foad
756ac65987
[CodeGen] Add a pass for testing finalizeBundle (#149813)
This allows for unit testing of finalizeBundle with standard MIR tests
using update_mir_test_checks.py.
2025-07-23 11:35:57 +01:00
Madhur Amilkanthwar
2320cddfc2
Reapply "[GVN] memoryssa implies no-memdep (#149473)" (#149767)
Enabling either MemorySSA or MemDep implies that the other is off.

Already approved in https://github.com/llvm/llvm-project/pull/149473 but
I had to revert as I missed updating one test.
2025-07-21 14:05:29 +05:30
Madhur Amilkanthwar
f79d6b319d
Revert "[GVN] memoryssa implies no-memdep (#149473)" (#149766)
This reverts commit 60d2d94db253a9fdc7bd111120c803f808564b30.
2025-07-21 11:04:54 +05:30
Madhur Amilkanthwar
60d2d94db2
[GVN] memoryssa implies no-memdep (#149473)
Enabling either MemorySSA or MemDep implies that the other is off.
2025-07-21 10:48:03 +05:30
Vikram Hegde
4aa85cc313
[CodeGen][NPM] Port ProcessImplicitDefs to NPM (#148110)
same as https://github.com/llvm/llvm-project/pull/138829

Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-16 13:23:27 +05:30
Vikram Hegde
8cbcaee7fe
[CodeGen][NPM] Register Function Passes (#148109)
same as https://github.com/llvm/llvm-project/pull/138828

Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
2025-07-15 17:01:28 +05:30
Vikram Hegde
fcd4a2fe7a
[CodeGen][NewPM] Port "PostRAMachineSink" pass to NPM (#129690) 2025-07-10 13:10:46 +05:30
Akshat Oke
b33d95fb8a
[CodeGen][NPM] Port InitUndef to NPM (#138495) 2025-07-09 15:31:31 +05:30
Matt Arsenault
1915fa15c3
Utils: Add pass to declare runtime libcalls (#147534)
This will be useful for testing the set of calls for different systems,
and eventually the product of context specific modifiers applied. In
the future we should also know the type signatures, and be able to
emit the correct one.
2025-07-09 00:52:22 +09:00
Akshat Oke
0fbaeafd7f
[CodeGen][NPM] Allow nested MF pass managers for -passes (#128852)
This allows `machine-function(p1,machine-function(...))` instead of
erroring.

Effectively it is flattened to a single MFPM.
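
The flattening can be pictured with a small standalone sketch. It is not LLVM's pipeline parser: the function names are hypothetical, and it assumes the only nesting wrapper is `machine-function(...)`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not LLVM's parser): split a pipeline string at
// top-level commas and recursively unwrap `machine-function(...)` elements,
// appending the remaining pass names to Out in order.
void flattenInto(std::string_view Text, std::vector<std::string> &Out) {
  constexpr std::string_view Wrapper = "machine-function(";
  std::size_t Start = 0;
  int Depth = 0;
  for (std::size_t I = 0; I <= Text.size(); ++I) {
    const bool AtEnd = I == Text.size();
    if (!AtEnd && Text[I] == '(')
      ++Depth;
    else if (!AtEnd && Text[I] == ')')
      --Depth;
    // Only a top-level comma or the end of the text closes an element.
    if (!AtEnd && !(Text[I] == ',' && Depth == 0))
      continue;
    std::string_view Elem = Text.substr(Start, I - Start);
    Start = I + 1;
    if (Elem.empty())
      continue;
    if (Elem.substr(0, Wrapper.size()) == Wrapper && Elem.back() == ')')
      flattenInto(
          Elem.substr(Wrapper.size(), Elem.size() - Wrapper.size() - 1), Out);
    else
      Out.emplace_back(Elem);
  }
}

// Returns the flat list of pass names that a single manager would run.
std::vector<std::string> flattenMachineFunctionPipeline(std::string_view Text) {
  std::vector<std::string> Out;
  flattenInto(Text, Out);
  return Out;
}
```

In this sketch, `machine-function(p1,machine-function(p2,p3))` flattens to the list `p1`, `p2`, `p3`.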
2025-07-07 12:10:28 +05:30
Nikita Popov
d7a3bdffb9
[PassBuilder][FatLTO] Expose FatLTO pipeline via pipeline string (#146048)
Expose the FatLTO pipeline via `-passes="fatlto-pre-link<Ox>"`, similar
to all the other optimization pipelines. This is to allow reproducing it
outside clang. (Possibly also useful for C API users.)
2025-06-30 12:04:42 +02:00
Florian Mayer
71bc606e95
[LowerAllowCheckPass] allow to specify runtime.check hotness (#145998) 2025-06-27 11:28:07 -07:00
Nikita Popov
7f223d121d
[PassBuilder] Treat pipeline aliases as normal passes (#146038)
Pipelines like `-passes="default<O3>"` are currently parsed in a special
way. Switch them to work like normal, parameterized module passes.
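
To illustrate what treating `default<O3>` as a parameterized pass means, the sketch below splits an element into a pass name and a parameter string. The `ParsedPass` type and `parsePassElement` function are hypothetical and are not part of `PassBuilder`.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type: a pass name plus its raw parameter text.
struct ParsedPass {
  std::string Name;
  std::string Params;
};

// Hypothetical helper (not the PassBuilder API): split `name<params>` into
// its two parts. A plain `name` yields empty Params; malformed text such as
// a missing closing `>` yields nullopt.
std::optional<ParsedPass> parsePassElement(std::string_view Text) {
  if (Text.empty())
    return std::nullopt;
  const std::size_t Open = Text.find('<');
  if (Open == std::string_view::npos)
    return ParsedPass{std::string(Text), std::string()};
  if (Open == 0 || Text.back() != '>')
    return std::nullopt;
  return ParsedPass{
      std::string(Text.substr(0, Open)),
      std::string(Text.substr(Open + 1, Text.size() - Open - 2))};
}
```

Under this model, an alias and an ordinary parameterized pass go through the same code path: both are a name followed by an optional parameter string.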
2025-06-27 12:07:09 +02:00
Snehasish Kumar
16c7b3c9f5
[MemProf] Split MemProfiler into Instrumentation and Use. (#142811)
Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics etc) I figured it would be good to separate these two implementations.
2025-06-05 07:36:50 -07:00
Kazu Hirata
228f66807d
[llvm] Remove unused includes (NFC) (#142733)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-06-04 12:30:52 -07:00
Ramkumar Ramachandra
af2f8a8c14
[HashRecognize] Introduce new analysis (#139120)
Introduce a fresh analysis for recognizing polynomial hashes, with the
rationale that several targets have specific instructions to optimize
things like CRC and GHASH (e.g. X86 and the RISC-V crypto extension). We
limit the scope to polynomial hashes computed in a Galois field of
characteristic 2, since this class of operations can also be optimized
in the absence of target-specific instructions to use a lookup table.

At the moment, we only recognize the CRC algorithm.

RFC:
https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268
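
For readers unfamiliar with the pattern, here is a sketch of a CRC computed bit by bit over GF(2), next to an equivalent lookup-table version. The function names are illustrative, the standard reflected CRC-32 polynomial `0xEDB88320` is used as an example, and nothing here claims to be the exact loop shape the analysis matches.

```cpp
#include <cstddef>
#include <cstdint>

constexpr uint32_t Poly = 0xEDB88320u; // reflected CRC-32 polynomial

// Bitwise form: one shift and one conditional XOR per input bit.
uint32_t crc32Bitwise(const unsigned char *Data, std::size_t Len) {
  uint32_t Crc = 0xFFFFFFFFu;
  for (std::size_t I = 0; I < Len; ++I) {
    Crc ^= Data[I];
    for (int Bit = 0; Bit < 8; ++Bit)
      Crc = (Crc >> 1) ^ ((Crc & 1u) ? Poly : 0u);
  }
  return Crc ^ 0xFFFFFFFFu;
}

// Table form: the eight inner steps are precomputed for every byte value.
uint32_t crc32Table(const unsigned char *Data, std::size_t Len) {
  uint32_t Table[256];
  for (uint32_t Byte = 0; Byte < 256; ++Byte) {
    uint32_t Entry = Byte;
    for (int Bit = 0; Bit < 8; ++Bit)
      Entry = (Entry >> 1) ^ ((Entry & 1u) ? Poly : 0u);
    Table[Byte] = Entry;
  }
  uint32_t Crc = 0xFFFFFFFFu;
  for (std::size_t I = 0; I < Len; ++I)
    Crc = (Crc >> 8) ^ Table[(Crc ^ Data[I]) & 0xFFu];
  return Crc ^ 0xFFFFFFFFu;
}
```

Both functions return `0xCBF43926` for the ASCII string `123456789`, the standard CRC-32 check value.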
2025-06-02 08:25:50 +01:00
Rahul Joshi
062353d1f5
[NFC][LLVM] Minor namespace fixes in PassBuilder (#141288)
- No need to prefix `PointerType` with `llvm::`.
- Avoid namespace block to define `PrintPipelinePasses`.
2025-05-27 07:26:28 -07:00
Rahul Joshi
58f78d84fd
[NFC][LLVM] Use formatv automatic index assignment in PassBuilder (#141286) 2025-05-27 07:25:37 -07:00
Rahul Joshi
52c2e45c11
[NFC][CodeGen] Adopt MachineFunctionProperties convenience accessors (#141101) 2025-05-23 08:30:29 -07:00
S. VenkataKeerthy
58ab005d8d
Adding IR2Vec as an analysis pass (#134004)
This PR introduces IR2Vec as an analysis pass. The changes include:
- Logic for generating Symbolic encodings.
- 75D learned vocabulary.
- lit tests.

Here is the link to the RFC -
https://discourse.llvm.org/t/rfc-enhancing-mlgo-inlining-with-ir2vec-embeddings

Acknowledgements: contributors -
https://github.com/IITH-Compilers/IR2Vec/graphs/contributors

---------

Co-authored-by: svkeerthy <venkatakeerthy@google.com>
Co-authored-by: Mircea Trofin <mtrofin@google.com>
2025-05-22 09:50:21 -07:00
Min-Yih Hsu
0ab67ec191
[LV][EVL] Introduce the EVLIndVarSimplify Pass for EVL-vectorized loops (#131005)
When we enable EVL-based loop vectorization w/ predicated tail-folding,
each vectorized loop has effectively two induction variables: one
calculates the step using (VF x vscale) and the other one increases the
IV by values returned from experimental.get.vector.length. The former,
also known as canonical IV, is more favorable for analyses as it's
"countable" in the sense of SCEV; the latter (EVL-based IV), however, is
more favorable to codegen, at least for those that support scalable
vectors like AArch64 SVE and RISC-V.

The idea is that we use the canonical IV all the way until the end of all
vectorizers, where we replace it with the EVL-based IV using the
EVLIndVarSimplify pass introduced here, so that we get the best of both worlds.

This pass is enabled by default on RISC-V. However, since we do not yet
vectorize loops with predicated tail-folding by default, this pass
is currently a no-op.
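
The two induction variables can be compared with a small scalar simulation. This is only a model: `simulateEVLLoop` and `IVTrace` are hypothetical names, `VF` stands in for the runtime value of VF x vscale, and the explicit vector length is assumed to be the smaller of the remaining element count and `VF`.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical record of how the two induction variables evolve.
struct IVTrace {
  uint64_t Iterations;  // number of vector iterations executed
  uint64_t CanonicalIV; // advanced by VF on every iteration
  uint64_t EVLBasedIV;  // advanced by the explicit vector length
};

// Simulate a tail-folded loop over N elements with vector width VF.
IVTrace simulateEVLLoop(uint64_t N, uint64_t VF) {
  IVTrace T{0, 0, 0};
  if (VF == 0)
    return T; // a zero-width vector would never make progress
  while (T.EVLBasedIV < N) {
    // Assumed semantics: process min(remaining, VF) elements this iteration.
    const uint64_t EVL = std::min(N - T.EVLBasedIV, VF);
    T.EVLBasedIV += EVL;
    T.CanonicalIV += VF;
    ++T.Iterations;
  }
  return T;
}
```

For `N = 10` and `VF = 4`, both variables take three iterations, but the canonical IV ends at 12 while the EVL-based IV ends at exactly 10. The canonical IV has a constant step, which is what makes it countable for analyses, while the EVL-based IV tracks the elements actually processed.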
2025-05-14 13:49:50 -07:00
David Green
ec406e8674
[GlobalISel] Add a GISelValueTracker printing pass (#139687)
This adds a GISelValueTrackingPrinterPass that can print the known bits
and sign bit of each def in a function. It is built on the new pass
manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older
class to GISelValueTrackingAnalysisLegacy.

The first 2 functions from the AArch64GISelMITest are ported over to an
mir test to show it working. It also runs successfully on all files in
llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can
hopefully be used to test GlobalISel known bits analysis more directly
in common cases, without jumping through the hoops that the C++ tests
require.
2025-05-14 11:05:04 +01:00
Jie Fu
1563d74145
[Passes] Remove extra ';' outside of a function (NFC)
/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1508:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
 1508 | };
      |  ^
1 error generated.
2025-04-30 18:44:45 +08:00
Akshat Oke
e91cbd4f29
[CodeGen][NPM] Port VirtRegRewriter to NPM (#130564) 2025-04-30 14:10:46 +05:30
Vikram Hegde
53a8b89003
[CodeGen][NewPM] Port "ShrinkWrap" pass to NPM (#129880) 2025-04-30 13:11:17 +05:30
paperchalice
159628cc22
[CodeGen] Port MachineUniformityAnalysis to new pass manager (#137578)
- Add new pass manager version of `MachineUniformityAnalysis`.
- Query `TargetTransformInfo` in new pass manager version.
- Use `printAsOperand` when printing machine function name.
2025-04-30 10:44:06 +08:00
Vikram Hegde
86d8e8d9a6
[CodeGen][NewPM] Port "PrologEpilogInserter" to NPM (#130550) 2025-04-29 13:13:45 +05:30
Akshat Oke
31ddaef8d1
[CodeGen][NPM] Port UnreachableMachineBlockElim to NPM (#136127) 2025-04-18 15:06:30 +05:30
Akshat Oke
a388395b86
[CodeGen][NPM] Port StackFrameLayoutAnalysisPass to NPM (#130070) 2025-04-15 12:37:19 +05:30
Akshat Oke
f133eae70c
[CodeGen][NPM] Port MachineSanitizerBinaryMetadata to NPM (#130069)
Didn't find a test for this (but there are tests for the `Function`
version of this pass)
2025-04-14 20:52:26 +05:30
Akshat Oke
e29f986838
[CodeGen][NPM] Port RemoveLoadsIntoFakeUses to NPM (#130068) 2025-04-14 12:58:03 +05:30
Akshat Oke
b283ff7eb1
[CodeGen][NPM] Port BranchRelaxation to NPM (#130067)
This completes the PreEmitPasses.
2025-04-14 10:19:42 +05:30
Akshat Oke
2f6b06b264
[CodeGen][NPM] Port PostRAHazardRecognizer to NPM (#130066) 2025-04-09 16:36:22 +05:30
Matt Arsenault
7e25b24073
IRNormalizer: Replace cl::opts with pass parameters (#133874)
Not sure why the "fold-all" option naming didn't match the
variable "FoldPreOutputs", but I've preserved the difference.

More annoyingly, the pass name "normalize" does not match the pass
name IRNormalizer and should probably be fixed one way or the other.

Also the existing test coverage for the flags is lacking. I've added
a test that shows they parse, but we should have tests that they
do something.
2025-04-01 23:27:20 +07:00
Akshat Oke
4a68702455
[CodeGen][NPM] Port XRayInstrumentation to NPM (#129865) 2025-04-01 15:38:49 +05:30
Matt Arsenault
94122d58fc
Lint: Replace -lint-abort-on-error cl::opt with pass parameter (#132933) 2025-03-31 08:42:51 +07:00
Akshat Oke
174110bf3c
[CodeGen][NPM] Port LiveDebugValues to NPM (#131563) 2025-03-24 11:34:45 +05:30
vporpo
08dda4dcbf
[Analysis][EphemeralValuesCache][InlineCost] Ephemeral values caching for the CallAnalyzer (#130210)
This patch does two things:

1. It implements an ephemeral values cache analysis pass that collects the ephemeral values of a function and caches them for fast lookups. The collection of the ephemeral values is done lazily when the user calls `EphemeralValuesCache::ephValues()`.

2. It adds caching of ephemeral values using the `EphemeralValuesCache` to speed up `CallAnalyzer::analyze()`. Without caching, this can take a long time to run in cases where the function contains a large number of `@llvm.assume()` calls and a large number of callsites. The time is spent in `collectEphemeralValues()`.
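
The lazy collection described in item 1 follows a common caching pattern, sketched below. The `LazyValueCache` class is hypothetical and uses plain integers in place of IR values; it is not the `EphemeralValuesCache` implementation.

```cpp
#include <functional>
#include <set>
#include <utility>

// Hypothetical lazy cache: the expensive collection runs at most once, on the
// first query, and later queries reuse the stored result.
class LazyValueCache {
  std::function<std::set<int>()> Collect; // the expensive collection step
  std::set<int> Values;
  bool Collected = false;

public:
  explicit LazyValueCache(std::function<std::set<int>()> C)
      : Collect(std::move(C)) {}

  // Collects on first use, then returns the cached set.
  const std::set<int> &values() {
    if (!Collected) {
      Values = Collect();
      Collected = true;
    }
    return Values;
  }

  // Drops the cached result so that the next query collects again.
  void clear() {
    Values.clear();
    Collected = false;
  }
};
```

A caller that queries the cache once per callsite pays for the collection only on the first query, which is the saving the patch aims for when a function has many callsites.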
2025-03-19 18:18:45 -07:00
Nikita Popov
8f66fb7842
[GlobalMerge] Fix handling of const options
For the NewPM, the merge-const option was assigned to an unused
option field. Assign it to the correct one. The merge-const-aggressive
option was not supported -- and invalid options were silently ignored.
Accept it and error on invalid options.

For the LegacyPM, the corresponding cl::opt options were ignored when
called via opt rather than llc.
2025-03-18 15:06:39 +01:00
Akshat Oke
687c9d359e
[CodeGen][NPM] Port FEntryInserter to NPM (#129857) 2025-03-17 10:35:53 +05:30
Frederik Harwath
6962cf1700
Rename ExpandLargeFpConvertPass to ExpandFpPass (#131128)
This is meant as a preparation for PR #130988 "[AMDGPU] Implement IR
expansion for frem instruction" which implements the expansion of
another instruction in this pass. The more general name seems more
appropriate given this change and quite reasonable even without it.
2025-03-14 13:11:45 +01:00
Akshat Oke
87916f8c32
[CodeGen][NPM] Port MachineBlockPlacement to NPM (#129828) 2025-03-14 10:31:53 +05:30
Ellis Hoag
2044dd07da
[InstrProf] Remove -forder-file-instrumentation (#130192) 2025-03-13 08:28:16 -07:00
Akshat Oke
5952972c91
[CodeGen][NPM] Port BranchFolder to NPM (#128858)
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.

This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
2025-03-13 13:41:28 +05:30
Guy David
9820248e0a
AddressSanitizer: Add use-after-scope to pass options (#130924) 2025-03-12 17:17:51 +02:00
Akshat Oke
9f617161aa
[CodeGen][NPM] Port PatchableFunction to NPM (#129866) 2025-03-12 15:11:11 +05:30
Akshat Oke
57a90883ca
[CodeGen][NPM] Port DetectDeadLanes to NPM (#130567) 2025-03-12 11:22:02 +05:30