These are identified by misc-include-cleaner. I've filtered out those
whose removal breaks builds. I'm also staying away from llvm-config.h,
config.h, and Compiler.h, since removing them would likely cause platform-
or compiler-specific build failures.
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.
An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
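For illustration, a consumer honoring the metadata-first, branch-weights-fallback behavior described above might look roughly like this. This is a sketch assuming the value is stored as an integer operand of the loop-ID entry; it is not the patch's actual implementation of `llvm::getLoopEstimatedTripCount`:

```c++
#include "llvm/ADT/STLExtras.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/Metadata.h"
#include <optional>
using namespace llvm;

// Sketch: return the value of !llvm.loop.estimated_trip_count if present;
// std::nullopt means the caller should fall back to re-estimating the trip
// count from the loop's current branch_weights.
static std::optional<unsigned> readEstimatedTripCount(const Loop &L) {
  MDNode *LoopID = L.getLoopID();
  if (!LoopID)
    return std::nullopt;
  for (const MDOperand &Op : drop_begin(LoopID->operands())) {
    auto *Entry = dyn_cast<MDNode>(Op);
    if (!Entry || Entry->getNumOperands() == 0)
      continue;
    auto *Name = dyn_cast<MDString>(Entry->getOperand(0));
    if (!Name || Name->getString() != "llvm.loop.estimated_trip_count")
      continue;
    if (Entry->getNumOperands() < 2)
      return std::nullopt; // metadata present, but the value was omitted
    if (auto *CI = mdconst::dyn_extract<ConstantInt>(Entry->getOperand(1)))
      return CI->getZExtValue();
    return std::nullopt;
  }
  return std::nullopt;
}
```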
When compiling in `--hipstdpar` mode, the builtins corresponding to the
standard library might end up in code that is expected to execute on the
accelerator (e.g. by using the `std::` prefixed functions from
`<cmath>`). We do not have uniform handling for this in AMDGPU, and the
resulting errors are quite arcane. Furthermore, the user-space changes
required to work around this tend to be rather intrusive.
This patch adds an additional `--hipstdpar`-specific pass that forwards to
the HIPSTDPAR runtime component the intrinsics / libcalls that result from
using the math builtins and are not properly handled. In the long run we
will want to stop relying on this and handle things in the compiler, but
that is going to be a rather lengthy journey, which makes this medium-term
escape hatch necessary.
The paired change in the runtime component is here:
<https://github.com/ROCm/rocThrust/pull/551>.
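As a rough illustration of the forwarding (the runtime symbol name below, `__hipstdpar_sin_f64`, is invented for this sketch; the real interface is defined by the rocThrust change linked above), an unhandled math intrinsic call could be rewritten into a libcall that the runtime component resolves:

```c++
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Module.h"
using namespace llvm;

// Sketch: replace a call to llvm.sin.f64 with a call to a hypothetical
// runtime-provided forwarding symbol.
static void forwardSinToRuntime(CallInst *CI) {
  Module *M = CI->getModule();
  FunctionCallee RTFn = M->getOrInsertFunction(
      "__hipstdpar_sin_f64", CI->getType(), CI->getArgOperand(0)->getType());
  IRBuilder<> B(CI);
  CallInst *Forwarded = B.CreateCall(RTFn, {CI->getArgOperand(0)});
  CI->replaceAllUsesWith(Forwarded);
  CI->eraseFromParent();
}
```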
This adds two passes: one to inject `MD_prof` and one to check its presence. A subsequent patch will add these (similar to debugify) to `opt` (and, eventually, a variant of this to `llc`).
Tracking issue: #147390
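For context, a minimal sketch of what the checking side could look like (illustrative only; the actual passes added here may differ in what they inject and verify):

```c++
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

// Sketch of a checker: report conditional branches that carry no !prof
// metadata after the injection pass has run.
struct CheckProfMetadataSketchPass
    : PassInfoMixin<CheckProfMetadataSketchPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &) {
    for (BasicBlock &BB : F)
      if (auto *BI = dyn_cast<BranchInst>(BB.getTerminator()))
        if (BI->isConditional() && !BI->hasMetadata(LLVMContext::MD_prof))
          errs() << F.getName() << ": terminator in " << BB.getName()
                 << " has no !prof metadata\n";
    return PreservedAnalyses::all();
  }
};
```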
Same as https://github.com/llvm/llvm-project/pull/139517.
This replaces the InvalidateAnalysisPass<MachineFunctionAnalysis> pass.
There are no cross-function analysis requirements right now, so clearing
all analyses works for the last pass in the pipeline.
Having InvalidateAnalysisPass<MachineFunctionAnalysis>() in the pipeline
causes a problem with ModuleToCGSCCPassAdaptor: machine functions for
other functions get deleted, and we end up with exactly one correctly
compiled MF while the rest vanish.
This is because ModuleToCGSCCPassAdaptor propagates PassPA (received from
the CGSCCToFunctionPassAdaptor that runs the actual codegen pipeline on
MFs) to the next SCC. That causes MFA invalidation on functions in the
next SCC.
For us, PassPA happens to be returned from
invalidate<machine-function-analysis>, which abandons the
MachineFunctionAnalysis. So while the first function runs through the
pipeline normally, the invalidation also deletes the machine functions in
the next SCC before its pipeline is run. (This seems to be the intended
mechanism of the CGSCC adaptor to allow cross-SCC invalidations.)
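A minimal sketch of the replacement, assuming the tail of the per-function codegen pipeline now clears every cached analysis (the exact pipeline wiring in this patch may differ):

```c++
#include "llvm/IR/PassManager.h"
using namespace llvm;

// Sketch: end the per-function pipeline by clearing all analyses rather than
// invalidating only MachineFunctionAnalysis, so the PreservedAnalyses set
// propagated by the CGSCC adaptor cannot delete machine functions that the
// next SCC has not compiled yet.
static void addPipelineTail(FunctionPassManager &FPM) {
  // Before: FPM.addPass(InvalidateAnalysisPass<MachineFunctionAnalysis>());
  FPM.addPass(InvalidateAllAnalysesPass());
}
```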
Co-authored-by: Oke, Akshat <Akshat.Oke@amd.com>
This commit adds a new pass gate that allows selective disabling
of one or more passes via the clang command line using the
`-opt-disable` option. Passes to be disabled should be specified as a
comma-separated list of their names.
The implementation resides in the same file as the bisection tool. The
`getGlobalPassGate()` function returns the currently enabled gate.
Example: `-opt-disable="PassA,PassB"`
Pass names are matched using case-insensitive comparisons. However, note
that special characters, including spaces, must be included exactly as
they appear in the pass names.
Additionally, a `-opt-disable-enable-verbosity` flag has been introduced to
enable verbose output when this functionality is in use. When enabled,
it prints the status of all passes (either running or NOT running),
similar to the default behavior of `-opt-bisect-limit`. This flag is
disabled by default, which is the opposite of the `-opt-bisect-verbose`
flag (which defaults to enabled).
To validate this functionality, a test file has also been provided. It reuses
the same infrastructure as the opt-bisect test, but disables three
specific passes and checks the output to ensure the expected behavior.
---------
Co-authored-by: Nikita Popov <github@npopov.com>
This will be useful for testing the set of calls for different systems,
and eventually the product of context-specific modifiers applied. In the
future we should also know the type signatures and be able to emit the
correct one.
As mentioned in https://github.com/llvm/llvm-project/pull/145071,
LoopInterchange should be part of the optimization pipeline rather than
the simplification pipeline. This patch moves LoopInterchange into the
optimization pipeline.
More context:
- By default, LoopInterchange attempts to improve data locality;
however, it also takes increasing vectorization opportunities into
account. Given that, it is reasonable to run it as close to
vectorization as possible.
- I looked into previous changes related to the placement of
LoopInterchange, but couldn’t find any strong motivation suggesting that
it benefits other simplifications.
- In the tests I tried (including llvm-test-suite), removing
LoopInterchange from the simplification pipeline does not affect other
simplifications. Therefore, there doesn't seem to be much value in
keeping it there.
- The new position reduces compile-time for ThinLTO, probably because it
only runs once per function in post-link optimization, rather than both
in pre-link and post-link optimization.
I haven't encountered any cases where the positional difference affects
optimization results, so please feel free to revert if you run into any issues.
There are a handful of passes in PassRegistry.def with outdated or
missing pass options. These strings describing pass options are used for
the printPassNames() function only, which is likely why they have gotten
out-of-date without being caught. This MR simply changes the few passes
where the option string is out-of-date, fixing the output of
-print-passes. This does not affect functionality of the pipeline
parser, and is hard to verify in a unit test, so no tests were added.
This change scales opcodes, types, and arguments once in `IR2VecVocabAnalysis` so that we can avoid rescaling each time embeddings are computed. This PR refactors the vocabulary to explicitly define three sections (Opcodes, Types, and Arguments) used for computing embeddings.
(Tracking issue: #141817; partly fixes #141832)
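As a rough sketch of the idea (the type and field names below are invented for illustration and are not the actual `IR2VecVocabAnalysis` data structures), each section is scaled by its weight once when the vocabulary is built, so per-instruction embedding computation only sums pre-scaled vectors:

```c++
#include <vector>

// Illustrative sketch: scale each vocabulary section once at construction.
using EmbeddingVec = std::vector<double>;

struct SectionedVocabularySketch {
  std::vector<EmbeddingVec> Opcodes, Types, Arguments;

  static void scaleSection(std::vector<EmbeddingVec> &Section, double Weight) {
    for (EmbeddingVec &E : Section)
      for (double &V : E)
        V *= Weight;
  }

  // Called once by the analysis; embedding computation afterwards just adds
  // the already-scaled entries.
  void scaleOnce(double OpcWeight, double TypeWeight, double ArgWeight) {
    scaleSection(Opcodes, OpcWeight);
    scaleSection(Types, TypeWeight);
    scaleSection(Arguments, ArgWeight);
  }
};
```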
Change `ModulePass::skipModule` to take a const Module reference.
Additionally, make `OptPassGate::shouldRunPass` const as well, since for
most implementations it's a const query. For `OptBisect`, make
`LastBisectNum` mutable so it can be updated in `shouldRunPass`.
Additional minor cleanup: Change all StringRef arguments to simple
StringRef (no const or reference), change `OptBisect::Disabled` to
constexpr.
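In sketch form (simplified declarations, not verbatim from the headers), the affected interfaces now look roughly like this:

```c++
#include "llvm/ADT/StringRef.h"

namespace llvm { class Module; }

// Sketch only, not copied from Pass.h / OptBisect.h.
class ModulePassSketch {
public:
  // Now takes a const Module reference.
  bool skipModule(const llvm::Module &M) const;
};

class OptPassGateSketch {
public:
  virtual ~OptPassGateSketch() = default;
  // Const query; StringRef arguments are passed by value (no const reference).
  virtual bool shouldRunPass(llvm::StringRef PassName,
                             llvm::StringRef IRDescription) const;
};
```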
Expose the FatLTO pipeline via `-passes="fatlto-pre-link<Ox>"`, similar
to all the other optimization pipelines. This is to allow reproducing it
outside clang. (Possibly also useful for C API users.)
The entry count of a function needs to be updated after a callsite is elided by TRE: before elision, the entry count accounted for the recursive call at that callsite. After TRE, we need to remove that callsite's contribution.
This patch enables this for instrumented profiling cases because, there, we know the function entry count captured entries before TRE. We cannot currently address this for sample-based profiling (because we don't know whether this function was TRE-ed in the binary that donated samples).
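In sketch form, assuming a hypothetical helper and that the elided callsite's execution count is already known (this is not the patch's actual code):

```c++
#include "llvm/IR/Function.h"
using namespace llvm;

// Sketch: after TRE elides a recursive call site, remove that site's
// contribution from the instrumented entry count.
static void adjustEntryCountAfterTRE(Function &F, uint64_t ElidedSiteCount) {
  if (auto EC = F.getEntryCount()) {
    uint64_t Old = EC->getCount();
    F.setEntryCount(Old >= ElidedSiteCount ? Old - ElidedSiteCount : 0,
                    EC->getType());
  }
}
```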
Make HashRecognize a non-PassManager analysis that can be called to get
the result on-demand, creating a new getResult() entry-point. The issue
was discovered when attempting to use the analysis to perform a
transform in LoopIdiomRecognize.
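Sketch of the intended on-demand use; only `getResult()` comes from this change, while the header path, constructor arguments, and result handling below are assumptions:

```c++
#include "llvm/Analysis/HashRecognize.h" // assumed header location
using namespace llvm;

// Sketch: construct the analysis directly and query it on demand from a
// transform such as LoopIdiomRecognize.
static void tryRecognizeHash(Loop &L, ScalarEvolution &SE) {
  HashRecognize HR(L, SE);      // assumed constructor arguments
  auto Result = HR.getResult(); // new on-demand entry point
  (void)Result;                 // ... inspect Result and transform the loop ...
}
```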
This pass figures out whether inlining has exposed a constant address to
a lowered type test, and removes the test if so, provided the address is
known to pass the test. Unfortunately this pass ends up needing to
reverse-engineer what LowerTypeTests did; this is currently inherent to
the design of ThinLTO importing, where LowerTypeTests needs to run at the
start.
Reviewers: teresajohnson
Reviewed By: teresajohnson
Pull Request: https://github.com/llvm/llvm-project/pull/141327
Most of the recent development on the MemProfiler has been on the use part; the instrumentation has been quite stable for a while. As the complexity of the use side grows (with undrifting, diagnostics, etc.), I figured it would be good to separate these two implementations.
Currently, if there is any memory access that AccessAnalysis couldn't
analyze, all of the runtime pointer check results are discarded. This
patch makes that controllable with the AllowPartial option: when it is
set, we generate the runtime check information for the pointers that we
could analyze, as transformations may still be able to make use of the
partial information.
Of the transformations that use LoopAccessAnalysis, only
LoopVersioningLICM changes behaviour as a result of this change. This is
because the others either:
* Check canVectorizeMemory, which will return false when we have partial
pointer information as analyzeLoop() will return false.
* Examine the dependencies returned by getDepChecker(), which will be
empty as we exit analyzeLoop if we have partial pointer information
before calling areDepsSafe(), which is what fills in the dependency
information.
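For illustration, a client that tolerates partial information could still consume whatever checks were generated (sketch only; how AllowPartial is threaded through the analysis is defined by the patch itself):

```c++
#include "llvm/Analysis/LoopAccessAnalysis.h"
using namespace llvm;

// Sketch: read the runtime pointer checks that were generated for the
// analyzable pointers, even when some accesses could not be analyzed.
static void usePartialChecks(const LoopAccessInfo &LAI) {
  const RuntimePointerChecking *RtChecking = LAI.getRuntimePointerChecking();
  for (const RuntimePointerCheck &Check : RtChecking->getChecks()) {
    // ... e.g. version the loop with just these checks ...
    (void)Check;
  }
}
```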
Introduce a fresh analysis for recognizing polynomial hashes, with the
rationale that several targets have specific instructions to optimize
things like CRC and GHASH (e.g. the X86 and RISC-V crypto extensions). We
limit the scope to polynomial hashes computed in a Galois field of
characteristic 2, since this class of operations can also be optimized, in
the absence of target-specific instructions, to use a lookup table.
At the moment, we only recognize the CRC algorithm.
RFC:
https://discourse.llvm.org/t/rfc-new-analysis-for-polynomial-hash-recognition/86268
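For reference, this is the shape of bit-at-a-time CRC loop that such an analysis targets (an illustrative example, not taken from the patch's tests); the inner loop is a per-bit reduction in GF(2) that a lookup table or a dedicated CRC instruction can replace:

```c++
#include <cstddef>

// Bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320): each inner step is a
// conditional XOR with the generator polynomial, i.e. a reduction in GF(2).
unsigned crc32_bitwise(const unsigned char *Data, size_t Len) {
  unsigned CRC = 0xFFFFFFFFu;
  for (size_t I = 0; I < Len; ++I) {
    CRC ^= Data[I];
    for (int B = 0; B < 8; ++B)
      CRC = (CRC >> 1) ^ ((CRC & 1) ? 0xEDB88320u : 0);
  }
  return ~CRC;
}
```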
When we enable EVL-based loop vectorization with predicated tail-folding,
each vectorized loop effectively has two induction variables: one
calculates the step using (VF x vscale) and the other increases the IV by
the values returned from `llvm.experimental.get.vector.length`. The former,
also known as canonical IV, is more favorable for analyses as it's
"countable" in the sense of SCEV; the latter (EVL-based IV), however, is
more favorable to codegen, at least for those that support scalable
vectors like AArch64 SVE and RISC-V.
The idea is to use the canonical IV all the way until the end of all
vectorizers, at which point we replace it with the EVL-based IV using the
EVLIVSimplify pass introduced here, so that we get the best of both worlds.
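A scalar sketch of the two induction variables side by side (conceptual only, not IR produced by the vectorizer):

```c++
#include <algorithm>
#include <cstdint>

// Conceptual sketch: the canonical IV advances by a fixed VF * vscale step
// (SCEV-friendly), while the EVL-based IV advances by the per-iteration
// vector length, mirroring llvm.experimental.get.vector.length
// (codegen-friendly on SVE / RVV).
void evlLoopShape(uint64_t TripCount, uint64_t VF, uint64_t VScale) {
  uint64_t CanonicalIV = 0;
  uint64_t EVLBasedIV = 0;
  while (CanonicalIV < TripCount) {
    uint64_t EVL = std::min(TripCount - EVLBasedIV, VF * VScale);
    // ... vector body operating on EVL lanes starting at index EVLBasedIV ...
    CanonicalIV += VF * VScale;
    EVLBasedIV += EVL;
  }
}
```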
This pass is enabled by default for RISC-V. However, since we don't yet
vectorize loops with predicated tail-folding by default, this pass is
currently a no-op.
This adds a GISelValueTrackingPrinterPass that can print the known bits
and sign bit of each def in a function. It is built on the new pass
manager and so adds a NPM GISelValueTrackingAnalysis, renaming the older
class to GISelValueTrackingAnalysisLegacy.
The first two functions from the AArch64GISelMITest are ported over to a
MIR test to show it working. It also runs successfully on all files in
llvm/test/CodeGen/AArch64/GlobalISel/*.mir that are not invalid. It can
hopefully be used to test the GlobalISel known-bits analysis more directly
in common cases, without jumping through the hoops that the C++ tests
require.
`DXILResourceBindingAnalysis` analyses explicit resource bindings in the
module and puts together lists of used virtual register spaces and
available virtual register slot ranges for each binding type. It also
stores additional information found during the analysis such as whether
the module uses implicit bindings or if any of the bindings overlap.
This information will be used in the `DXILResourceImplicitBindings` pass
(coming soon) to assign register slots to resources with implicit
bindings, and in a post-optimization validation pass that will raise
diagnostics about overlapping bindings.
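Roughly, the per-binding-type information gathered looks like the following (field and type names here are invented for illustration and are not the analysis's actual interface):

```c++
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative sketch of the gathered data, not the analysis's real types.
struct RegisterSpaceSketch {
  uint32_t Space;                                        // virtual register space
  std::vector<std::pair<uint32_t, uint32_t>> FreeRanges; // available slot ranges
};

struct BindingInfoSketch {
  // One list of used spaces / free ranges per binding type.
  std::vector<RegisterSpaceSketch> SRV, UAV, CBuffer, Sampler;
  bool HasImplicitBindings = false;    // module uses implicit bindings
  bool HasOverlappingBindings = false; // any explicit bindings overlap
};
```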
Part 1/2 of #136786
In this change, NVPTX AA is moved before Basic AA to potentially improve
compile time. Additionally, it introduces a flag in the
`ExternalAAWrapper` that allows other backends to run their
target-specific AA passes before Basic AA, if desired.
The change works with both the New Pass Manager and the Legacy Pass Manager.
Original implementation by Princeton Ferro <pferro@nvidia.com>
/llvm-project/llvm/lib/Passes/PassBuilder.cpp:1508:2:
error: extra ';' outside of a function is incompatible with C++98 [-Werror,-Wc++98-compat-extra-semi]
1508 | };
| ^
1 error generated.
- Add a new pass manager version of `MachineUniformityAnalysis`.
- Query `TargetTransformInfo` in the new pass manager version.
- Use `printAsOperand` when printing the machine function name.