llvm-project

Author	SHA1	Message	Date
Teresa Johnson	546ec641b4	Restore "[MemProf] Use new option/pass for profile feedback and matching" This restores commit b4a82b62258c5f650a1cccf5b179933e6bae4867, reverted in 3ab7ef28eebf9019eb3d3c4efd7ebfd160106bb1 because it was thought to cause a bot failure, which ended up being unrelated to this patch set. Differential Revision: https://reviews.llvm.org/D154856	2023-07-11 13:16:20 -07:00
JP Lehr	3ab7ef28ee	Revert "[MemProf] Use new option/pass for profile feedback and matching" This reverts commit b4a82b62258c5f650a1cccf5b179933e6bae4867. Broke AMDGPU OpenMP Offload buildbot	2023-07-11 05:44:42 -04:00
Teresa Johnson	b4a82b6225	[MemProf] Use new option/pass for profile feedback and matching Previously the MemProf profile was expected to be in the same profile file as a normal PGO profile, passed via the usual -fprofile-use= option, and was matched in the same pass. To simplify profile preparation, since the raw MemProf profile requires the binary for symbolization and may be simpler to index separately from the raw PGO profile, and also to enable providing a MemProf profile for a SamplePGO build, separate out the MemProf feedback option and matching pass. This patch adds the -fmemory-profile-use=${file} option, and the provided file is passed down to LLVM and ultimately used in a new MemProfUsePass which performs the matching of just the memory profile contents of that file. Note that a single profile file containing both normal PGO and MemProf profile data is still supported, and the relevant profile data is matched by the appropriate matching pass(es) based on which option(s) the profile is provided with (the same profile file can be supplied to both feedback options). Differential Revision: https://reviews.llvm.org/D154856	2023-07-10 16:42:56 -07:00
Arthur Eubanks	72e7e5851f	[MemorySSA] Always perform MemoryUses liveOnEntry optimization on MSSA construction Fixes invariant memory regressions in future DSE patches. Also add a flag to print<memoryssa> to not ensure optimized uses to test this. Noticeable compile time regression [1], but a future DSE change that depends on this more than makes up for it. [1] https://llvm-compile-time-tracker.com/compare.php?from=9d5466849a770eeab222d5a5890376d3596e8ad6&to=95682dbe11d76a3342870437377216e96b167504&stat=instructions:u Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D152859	2023-07-06 14:09:47 -07:00
Matthew Voss	a1ca3af31e	[llvm] A Unified LTO Bitcode Frontend Here's a high-level summary of the changes in this patch. For more information on rationale, see the RFC. (https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774). - Add config parameter to LTO backend, specifying which LTO mode is desired when using unified LTO. - Add unified LTO flag to the summary index for efficiency. Unified LTO modules can be detected without parsing the module. - Make sure that the ModuleID is generated by incorporating more types of symbols. Differential Revision: https://reviews.llvm.org/D123803	2023-07-05 14:53:14 -07:00
Arthur Eubanks	ff79eb3af6	[PassBuilder] Add textual representation for function simplification pipeline Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153784	2023-06-29 09:39:04 -07:00
Paul Kirth	75a1797044	Reland [llvm] Preliminary fat-lto-objects support Fat LTO objects contain both LTO compatible IR, as well as generated object code. This allows users to defer the choice of whether to use LTO or not to link-time. This is a feature available in GCC for some time, and makes the existing -ffat-lto-objects flag functional in the same way as GCC's. Within LLVM, we add a new EmbedBitcodePass that serializes the module to the object file, and expose a new pass pipeline for compiling fat objects. The new pipeline initially clones the module and runs the selected (Thin)LTOPrelink pipeline, after which it will serialize the module into a `.llvm.lto` section of an ELF file. When compiling for (Thin)LTO, this is normally the point at which the compiler would emit an object file containing the bitcode and metadata. After that point we compile the original module using the PerModuleDefaultPipeline used for non-LTO compilation. We generate standard object files at the end of this pipeline, which contain machine code and the new `.llvm.lto` section containing bitcode. Since the two pipelines operate on different copies of the module, we can be sure that the bitcode in the `.llvm.lto` section and object code in `.text` are congruent with the existing output produced by the default and LTO pipelines. Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977 Earlier versions of this patch were missing REQUIRES lines for llc related tests in Transforms/EmbedBitcode. Those tests are now under CodeGen/X86, which should avoid running the check on unsupported platforms. The EmbedBitcodePass also returned PreservedAnalyses::all when adding a metadata section, which failed expensive checks, since it modified the module. This is now corrected. Reviewed By: tejohnson, MaskRay, nikic Differential Revision: https://reviews.llvm.org/D146776	2023-06-28 21:37:50 +00:00
Alex Brachet	6085eb3084	Revert "Reland [llvm] Preliminary fat-lto-objects support" This reverts commit 44265dc3554ef40920b587eeb787a400663af6c7.	2023-06-24 01:15:50 +00:00
Teresa Johnson	200cc952a2	[LTO][GlobalDCE] Use pass parameter instead of module flag for LTO phase D63932 added a module flag to indicate that we are executing the regular LTO post merge pipeline, so that GlobalDCE could perform more aggressive optimization for Dead Virtual Function Elimination. This caused issues trying to reuse bitcode that had already been through the LTO pipeline (see context in D139816). Instead support this by passing down a parameter flag to the GlobalDCEPass constructor, which is the more usual way for indicating this information. Most test changes are to remove incidental uses of this flag. Of the 2 real uses, llvm/test/LTO/ARM/lto-linking-metadata.ll is now obsolete and removed in this patch, and the virtual-functions-visibility-post-lto.ll test is updated to use the regular LTO default pipeline where this parameter is set to true. Differential Revision: https://reviews.llvm.org/D153655	2023-06-23 17:05:07 -07:00
Paul Kirth	44265dc355	Reland [llvm] Preliminary fat-lto-objects support Fat LTO objects contain both LTO compatible IR, as well as generated object code. This allows users to defer the choice of whether to use LTO or not to link-time. This is a feature available in GCC for some time, and makes the existing -ffat-lto-objects flag functional in the same way as GCC's. Within LLVM, we add a new EmbedBitcodePass that serializes the module to the object file, and expose a new pass pipeline for compiling fat objects. The new pipeline initially clones the module and runs the selected (Thin)LTOPrelink pipeline, after which it will serialize the module into a `.llvm.lto` section of an ELF file. When compiling for (Thin)LTO, this is normally the point at which the compiler would emit an object file containing the bitcode and metadata. After that point we compile the original module using the PerModuleDefaultPipeline used for non-LTO compilation. We generate standard object files at the end of this pipeline, which contain machine code and the new `.llvm.lto` section containing bitcode. Since the two pipelines operate on different copies of the module, we can be sure that the bitcode in the `.llvm.lto` section and object code in `.text` are congruent with the existing output produced by the default and LTO pipelines. Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977 Earlier versions of this patch were missing REQUIRES lines for llc related tests in Transforms/EmbedBitcode. Those tests are now under CodeGen/X86, which should avoid running the check on unsupported platforms. Reviewed By: tejohnson, MaskRay, nikic Differential Revision: https://reviews.llvm.org/D146776	2023-06-23 23:23:58 +00:00
Paul Kirth	a3800ad9d8	Revert "[llvm] Preliminary fat-lto-objects support" There seems to be a problem on arm buildbots. Reverting until I can investigate. https://lab.llvm.org/buildbot#builders/245/builds/10184 This reverts commit a67208e1c697649ce432e6497f56a93675273dd8 and dependent commit e54a3112cee5ae0a9117359ecbea878e1388f51e.	2023-06-23 18:43:41 +00:00
Paul Kirth	a67208e1c6	[llvm] Preliminary fat-lto-objects support Fat LTO objects contain both LTO compatible IR, as well as generated object code. This allows users to defer the choice of whether to use LTO or not to link-time. This is a feature available in GCC for some time, and makes the existing -ffat-lto-objects flag functional in the same way as GCC's. Within LLVM, we add a new EmbedBitcodePass that serializes the module to the object file, and expose a new pass pipeline for compiling fat objects. The new pipeline initially clones the module and runs the selected (Thin)LTOPrelink pipeline, after which it will serialize the module into a `.llvm.lto` section of an ELF file. When compiling for (Thin)LTO, this is normally the point at which the compiler would emit an object file containing the bitcode and metadata. After that point we compile the original module using the PerModuleDefaultPipeline used for non-LTO compilation. We generate standard object files at the end of this pipeline, which contain machine code and the new `.llvm.lto` section containing bitcode. Since the two pipelines operate on different copies of the module, we can be sure that the bitcode in the `.llvm.lto` section and object code in `.text` are congruent with the existing output produced by the default and LTO pipelines. Original RFC: https://discourse.llvm.org/t/rfc-ffat-lto-objects-support/63977 Reviewed By: tejohnson, MaskRay, nikic Differential Revision: https://reviews.llvm.org/D146776	2023-06-23 17:51:30 +00:00
Yann Girsberger	1d5651060e	[opt] Exposing the parameters of LoopRotate to the -passes interface There is a gap between running opt -Oz and running opt -passes="OZ_PASSES" where OZ_PASSES is taken from running opt -Oz -print-pipeline-passes. One of the reasons causing this is that -Oz uses a non-default setting for LoopRotate but LoopRotate does not expose its settings when printing the pipeline. This commit fixes this by exposing LoopRotate's parameters. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D153437	2023-06-22 11:09:23 -07:00
Arthur Eubanks	d49984fa4f	[SimplifyCFG] Add option to not speculate blocks Required for phase ordering changes to not regress Rust code with D145265. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D153391	2023-06-22 08:51:40 -07:00
Arthur Eubanks	278d65b2cf	[SimplifyCFG] Add textual pass params for FoldTwoEntryPHINode and SimplifyCondBranch	2023-06-15 14:21:24 -07:00
serge-sans-paille	afa13ba18d	Reapply Move "auto-init" instructions to the dominator of their users Original patch (50b2a113db197a97f60ad2aace8b7382dc9b8c31) ignored the fact that -ftrivial-auto-var-init could affect function parameters with the sret attribute. Just do not move instructions that don't affect alloca. Also add missing test case for volatile instruction. Differential Revision: https://reviews.llvm.org/D148507	2023-04-24 18:10:10 +02:00
pvanhout	ae77aceba5	[Analysis] Remove DA & LegacyDA UniformityAnalysis offers all of the same features and much more, there is no reason left to use the legacy DAs. See RFC: https://discourse.llvm.org/t/rfc-deprecate-divergenceanalysis-legacydivergenceanalysis/69538 - Remove LegacyDivergenceAnalysis.h/.cpp - Remove DivergenceAnalysis.h/.cpp + Unit tests - Remove SyncDependenceAnalysis - it was not a real registered analysis and was only used by DAs - Remove/adjust references to the passes in the docs where applicable - Remove TTI hook associated with those passes. - Move tests to UniformityAnalysis folder. - Remove RUN lines for the DA, leave only the UA ones. - Some tests had to be adjusted/removed depending on how they used the legacy DAs. Reviewed By: foad, sameerds Differential Revision: https://reviews.llvm.org/D148116	2023-04-17 09:01:22 +02:00
Hans Wennborg	a6d9730f40	Revert "Move "auto-init" instructions to the dominator of their users" This could also move initialization of sret args, causing actually initialized parts of such return values to be uninitialized. See discussion on the code review. > As a result of -ftrivial-auto-var-init, clang generates instructions to > set alloca'd memory to a given pattern, right after the allocation site. > In some cases, this (somehow costly) operation could be delayed, leading > to conditional execution in some cases. > > This is not an uncommon situation: it happens ~500 times on the cPython > code base, and much more on the LLVM codebase. The benefit greatly > varies on the execution path, but it should not regress on performance. > > This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with > MemorySSA update fixes. > > Differential Revision: https://reviews.llvm.org/D137707 This reverts commit 50b2a113db197a97f60ad2aace8b7382dc9b8c31 and follow-up commit ad9ad3735c4821ff4651fab7537a75b8f0bb60f8.	2023-04-12 13:37:21 +02:00
serge-sans-paille	50b2a113db	Move "auto-init" instructions to the dominator of their users As a result of -ftrivial-auto-var-init, clang generates instructions to set alloca'd memory to a given pattern, right after the allocation site. In some cases, this (somehow costly) operation could be delayed, leading to conditional execution in some cases. This is not an uncommon situation: it happens ~500 times on the cPython code base, and much more on the LLVM codebase. The benefit greatly varies on the execution path, but it should not regress on performance. This is a recommit of cca01008cc31a891d0ec70aff2201b25d05d8f1b with MemorySSA update fixes. Differential Revision: https://reviews.llvm.org/D137707	2023-04-04 07:30:03 +02:00
serge-sans-paille	11ae47dfc6	Revert "Move "auto-init" instructions to the dominator of their users" This reverts commit cca01008cc31a891d0ec70aff2201b25d05d8f1b. This change breaks memory ssa checks, see https://lab.llvm.org/buildbot#builders/109/builds/60970	2023-04-03 15:46:18 +02:00
serge-sans-paille	cca01008cc	Move "auto-init" instructions to the dominator of their users As a result of -ftrivial-auto-var-init, clang generates instructions to set alloca'd memory to a given pattern, right after the allocation site. In some cases, this (somehow costly) operation could be delayed, leading to conditional execution in some cases. This is not an uncommon situation: it happens ~500 times on the cPython code base, and much more on the LLVM codebase. The benefit greatly varies on the execution path, but it should not regress on performance. Differential Revision: https://reviews.llvm.org/D137707	2023-04-03 15:27:27 +02:00
Teresa Johnson	700cd99061	Restore "[MemProf] Context disambiguation cloning pass [patch 1a/3]" This restores commit d6ad4f01c3dafcab335bca66dac6e36d9eac8421, which was reverted in commit 883dbb9c86be87593a58ef10b070b3a0564c7fee, along with a fix for gcc 12.2 build errors in the original commit. Support for building, printing, and displaying CallsiteContextGraph which represents the MemProf metadata contexts. Uses CRTP to enable support for both IR (regular LTO) and summary (ThinLTO). This patch includes the support for building it in regular LTO mode (from memprof and callsite metadata), and the next patch will add the handling for building it from ThinLTO summaries. Also includes support for dumping the graph to text and to dot files. Follow-on patches will contain the support for cloning on the graph and in the IR. The graph represents the call contexts in all memprof metadata on allocation calls, with nodes for the allocations themselves, as well as for the calls in each context. The graph is initially built from the allocation memprof metadata (or summary) MIBs. It is then updated to match calls with callsite metadata onto the nodes, updating it to reflect any inlining performed on those calls. Each MIB (representing an allocation's call context with allocation behavior) is assigned a unique context id during the graph build. The edges and nodes in the graph are decorated with the context ids they carry. This is used to correctly update the graph when cloning is performed so that we can uniquify the context for a single (possibly cloned) allocation. Differential Revision: https://reviews.llvm.org/D140908	2023-03-22 10:16:06 -07:00
Nikita Popov	883dbb9c86	Revert "[MemProf] Context disambiguation cloning pass [patch 1a/3]" This reverts commit d6ad4f01c3dafcab335bca66dac6e36d9eac8421. Fails to build on at least gcc 12.2: /home/npopov/repos/llvm-project/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:482:1: error: no declaration matches ‘ContextNode<DerivedCCG, FuncTy, CallTy>* CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForInst(const CallInfo&)’ 482 \| CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForInst( \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/npopov/repos/llvm-project/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:393:16: note: candidate is: ‘CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::ContextNode* CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForInst(const CallInfo&)’ 393 \| ContextNode *getNodeForInst(const CallInfo &C); \| ^~~~~~~~~~~~~~ /home/npopov/repos/llvm-project/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp:99:7: note: ‘class CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>’ defined here 99 \| class CallsiteContextGraph { \| ^~~~~~~~~~~~~~~~~~~~	2023-03-22 15:43:46 +01:00
Teresa Johnson	d6ad4f01c3	[MemProf] Context disambiguation cloning pass [patch 1a/3] Support for building, printing, and displaying CallsiteContextGraph which represents the MemProf metadata contexts. Uses CRTP to enable support for both IR (regular LTO) and summary (ThinLTO). This patch includes the support for building it in regular LTO mode (from memprof and callsite metadata), and the next patch will add the handling for building it from ThinLTO summaries. Also includes support for dumping the graph to text and to dot files. Follow-on patches will contain the support for cloning on the graph and in the IR. The graph represents the call contexts in all memprof metadata on allocation calls, with nodes for the allocations themselves, as well as for the calls in each context. The graph is initially built from the allocation memprof metadata (or summary) MIBs. It is then updated to match calls with callsite metadata onto the nodes, updating it to reflect any inlining performed on those calls. Each MIB (representing an allocation's call context with allocation behavior) is assigned a unique context id during the graph build. The edges and nodes in the graph are decorated with the context ids they carry. This is used to correctly update the graph when cloning is performed so that we can uniquify the context for a single (possibly cloned) allocation. Depends on D140786. Differential Revision: https://reviews.llvm.org/D140908	2023-03-22 07:05:27 -07:00
Arthur Eubanks	5558346c2b	[CGSCC] Allow creation of no-rerun CGSCC->function adaptor via textual pipeline Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D145196	2023-03-19 16:26:27 -07:00
Nikita Popov	a8f6b5763e	[PassBuilder] Support O0 in default pipelines The default and pre-link pipeline builders currently require you to call a separate method for optimization level O0, even though they have perfectly well-defined O0 optimization pipelines. Accept O0 optimization level and call buildO0DefaultPipeline() internally, so all consumers don't need to repeat this. Differential Revision: https://reviews.llvm.org/D146200	2023-03-17 10:00:05 +01:00
Arthur Eubanks	0d4a709bb8	[Pipeline] Adjust PostOrderFunctionAttrs placement in simplification pipeline We can infer more attribute information once functions are fully simplified, so move the PostOrderFunctionAttrs pass after the function simplification pipeline. However, just doing this can impact simplification of recursive functions since function simplification takes advantage of function attributes of callees (some LLVM tests are actually impacted by this), so keep a copy of PostOrderFunctionAttrs before the function simplification pipeline that only runs on recursive functions. For example, this fixes the small regression noticed in https://reviews.llvm.org/D128830. This requires some restructuring of the CGSCC NoRerun feature. We need to cache the ShouldNotRunFunctionPassesAnalysis analysis after the simplification is done, which now is after the second PostOrderFunctionAttrs run, rather than after the function simplification pipeline. Compile time impact: https://llvm-compile-time-tracker.com/compare.php?from=33cf40122279342b50f92a3a53f5c185390b6018&to=1bb2a07875634e508a6bdf2ca1b130f55510f060&stat=instructions:u Compile time increase from unconditionally running the first PostOrderFunctionAttrs: https://llvm-compile-time-tracker.com/compare.php?from=1bb2a07875634e508a6bdf2ca1b130f55510f060&to=f4f87e89cc7a35c64e3a103a8036192a84ae002b&stat=instructions:u Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D145210	2023-03-06 09:01:45 -08:00
Jan Dupej	1ceb79e2e0	Port PlaceSafepoints pass to the new pass manager This patch ports the PlaceSafepoints pass to the new pass manager as it is used by .NET/Mono. Compatibility with the legacy pass manager is maintained by adding PlaceSafepointsLegacyPass. This pass also depends on PlaceBackedgeSafepointsLegacyPass, which has been kept in the legacy-only variant, since it is apparently used only from PlaceSafepointsPass. It has been renamed, though, to indicate its legacy interface. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D136163	2023-02-17 09:17:49 -08:00
Sanjay Patel	4ecc6af813	[InstCombine] create a pass options container and add "use-loop-info" argument This is a cleanup/modernization patch requested in D144045 to make loop analysis a proper optional parameter to the pass rather than a semi-arbitrary value inherited via the pass pipeline. It's a bit more complicated than the recent patch I started copying from (D143980) because InstCombine already has an option for MaxIterations (added with D71145). I debated just deleting that option, but it was used by a pair of existing tests, so I put it into a struct (code largely copied from SimplifyCFG's implementation) to make the code more flexible for future options enhancements. I didn't alter the pass manager invocations of InstCombine in this patch because the patch was already getting big, but that will be a small follow-up as noted with the TODO comment. Differential Revision: https://reviews.llvm.org/D144199	2023-02-17 10:30:15 -05:00
Liren Peng	a52432f633	[NFC][SeparateConstOffsetFromGEP] Added flag `lower-gep` We need such a flag to check whether the transformation is correct if LowerGEP was enabled. Reviewed By: nikic, arsenm, spatel Differential Revision: https://reviews.llvm.org/D143980	2023-02-15 02:04:30 +00:00
Samuel Parker	2a58be4239	[HardwareLoops] NewPM support. With the NPM, we're now defaulting to preserving LCSSA, so a couple of tests have changed slightly. Differential Revision: https://reviews.llvm.org/D140982	2023-02-13 09:46:31 +00:00
Arthur Eubanks	4ce34bb2a9	[CGSCC] Add pass which counts the max number of times we visit a function This will help with finding potential pathological CGSCC cases. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D142853	2023-01-30 10:06:53 -08:00
Nikita Popov	edc1bcfc24	[PassBuilder] Detect loop-mssa for licm with parameters (PR60149) When auto-detecting loop-mssa for licm/lnicm, also handle the case where there are pass parameters. Fixes https://github.com/llvm/llvm-project/issues/60149.	2023-01-23 12:11:33 +01:00
Anshil Gandhi	c52f9485b0	[LegacyDivergenceAnalysis] Add NewPM support Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D142161	2023-01-20 15:53:28 -07:00
Matt Arsenault	e7cd42f8e4	Utils: Add utility pass to lower ifuncs Create a global constructor which will initialize a global table of function pointers. For now, this is only used as a reduction technique for llvm-reduce. In the future this may be useful to support ifunc on systems where the program loader doesn't natively support it.	2023-01-17 22:33:56 -05:00
Samuel Parker	615333bc09	[TypePromotion] NewPM support. Differential Revision: https://reviews.llvm.org/D140893	2023-01-03 15:09:29 +00:00
Alexandros Lamprineas	f952bc05fd	[IPSCCP] Create a Pass parameter to control specialization of functions. Required for D140210 in order to disable FuncSpec at {Os, Oz} optimization levels. Differential Revision: https://reviews.llvm.org/D140564	2022-12-23 16:54:45 +00:00
Sameer Sahasrabuddhe	475ce4c200	RFC: Uniformity Analysis for Irreducible Control Flow Uniformity analysis is a generalization of divergence analysis to include irreducible control flow: 1. The proposed spec presents a notion of "maximal convergence" that captures the existing convention of converging threads at the headers of natural loops. 2. Maximal convergence is then extended to irreducible cycles. The identity of irreducible cycles is determined by the choices made in a depth-first traversal of the control flow graph. Uniformity analysis uses criteria that depend only on closed paths and not cycles, to determine maximal convergence. This makes it a conservative analysis that is independent of the effect of DFS on CycleInfo. 3. The analysis is implemented as a template that can be instantiated for both LLVM IR and Machine IR. Validation: - passes existing tests for divergence analysis - passes new tests with irreducible control flow - passes equivalent tests in MIR and GMIR Based on concepts originally outlined by Nicolai Haehnle <nicolai.haehnle@amd.com> With contributions from Ruiling Song <ruiling.song@amd.com> and Jay Foad <jay.foad@amd.com>. Support for GMIR and lit tests for GMIR/MIR added by Yashwant Singh <yashwant.singh@amd.com>. Differential Revision: https://reviews.llvm.org/D130746	2022-12-20 07:22:24 +05:30
Nikita Popov	8005332835	[AA] Remove CFL AA passes The CFL Steens/Anders alias analysis passes are not enabled by default, and to the best of my knowledge have no pathway towards ever being enabled by default. The last significant interest in these passes seems to date back to 2016. Given the little maintenance these have seen in recent times, I also have very little confidence in the correctness of these passes. I don't think we should keep these in-tree. Differential Revision: https://reviews.llvm.org/D139703	2022-12-12 09:34:20 +01:00
Roman Lebedev	4f7e5d2206	[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node, take 2 Currently, SROA is CFG-preserving. Not doing so does not affect any pipeline test. (???) Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call. By design, we can't really SROA alloca if their address escapes somehow, but we have logic to deal with `load` of `select`/`PHI`, where at least one of the possible addresses prevents promotion, by speculating the `load`s and `select`ing between loaded values. As one would expect, that requires ensuring that the speculation is actually legal. Even ignoring complexity bailouts, that logic does not deal with everything, e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`. There can also be cases where the load is genuinely non-speculatable. So if we can't prove that the load can be speculated, unfold the select, produce two-entry phi node, and perform predicated load. Now, that transformation must obviously update Dominator Tree, since we require it later on. Doing so is trivial. Additionally, we don't want to do this for the final SROA invocation (D136806). In the end, this ends up having negative (!) compile-time cost: https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u Though indeed, this only deals with `select`s, `PHI`s are still using speculation. Should we update some more analysis? Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138238 This reverts commit 739611870d3b06605afe25cc07833f6a62de9545, and recommits 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5 with a fixed assertion - we should check that DTU is there, not just assert false...	2022-12-08 20:19:55 +03:00
Roman Lebedev	739611870d	Revert "[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node" The assertion about not modifying the CFG seems to not hold, will recommit in a bit. https://lab.llvm.org/buildbot#builders/139/builds/32412 This reverts commit 03e6d9d9d1d48e43f3efc35eb75369b90d4510d5. This reverts commit 4f90f4ada33718f9025d0870a4fe3fe88276b3da.	2022-12-08 19:51:15 +03:00
Roman Lebedev	03e6d9d9d1	[SROA] For non-speculatable `load`s of `select`s -- split block, insert then/else blocks, form two-entry PHI node Currently, SROA is CFG-preserving. Not doing so does not affect any pipeline test. (???) Internally, SROA requires Dominator Tree, and uses it solely for the final `-mem2reg` call. By design, we can't really SROA alloca if their address escapes somehow, but we have logic to deal with `load` of `select`/`PHI`, where at least one of the possible addresses prevents promotion, by speculating the `load`s and `select`ing between loaded values. As one would expect, that requires ensuring that the speculation is actually legal. Even ignoring complexity bailouts, that logic does not deal with everything, e.g. `isSafeToLoadUnconditionally()` does not recurse into hands of `select`. There can also be cases where the load is genuinely non-speculatable. So if we can't prove that the load can be speculated, unfold the select, produce two-entry phi node, and perform predicated load. Now, that transformation must obviously update Dominator Tree, since we require it later on. Doing so is trivial. Additionally, we don't want to do this for the final SROA invocation (D136806). In the end, this ends up having negative (!) compile-time cost: https://llvm-compile-time-tracker.com/compare.php?from=c6d7e80ec4c17a415673b1cfd25924f98ac83608&to=ddf9600365093ea50d7e278696cbfa01641c959d&stat=instructions:u Though indeed, this only deals with `select`s, `PHI`s are still using speculation. Should we update some more analysis? Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138238	2022-12-08 16:51:32 +03:00
Fangrui Song	4e62072ca1	[Passes] llvm::Optional => std::optional	2022-12-04 20:44:52 +00:00
Kazu Hirata	aadaaface2	[llvm] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 21:11:44 -08:00
Kazu Hirata	410c1f6269	[Passes] Use std::optional in PassBuilder.cpp (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-11-25 12:47:46 -08:00
Sami Tolvanen	cacd3e73d7	Add generic KCFI operand bundle lowering The KCFI sanitizer emits "kcfi" operand bundles to indirect call instructions, which the LLVM back-end lowers into an architecture-specific type check with a known machine instruction sequence. Currently, KCFI operand bundle lowering is supported only on 64-bit X86 and AArch64 architectures. As a lightweight forward-edge CFI implementation that doesn't require LTO is also useful for non-Linux low-level targets on other machine architectures, add a generic KCFI operand bundle lowering pass that's only used when back-end lowering support is not available and allows -fsanitize=kcfi to be enabled in Clang on all architectures. This relands commit eb2a57ebc7aaad551af30462097a9e06c96db925 with fixes. Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D135411	2022-11-22 23:01:18 +00:00
Alexander Shaposhnikov	7059a6c32c	[IR] Split out IR printing passes into IRPrinter This diff splits out (from LLVMCore) IR printing passes into IRPrinter. This structure is similar to what we already have for IRReader and enables us to avoid circular dependencies between LLVMCore and Analysis (this is a preparation for https://reviews.llvm.org/D137768). The legacy interface is left unchanged, once the legacy pass manager is removed (in the future) we will be able to clean it up further. The bazel build configuration has been updated as well. Test plan: 1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc 2/ bazel build --config=generic_clang @llvm-project//... Differential revision: https://reviews.llvm.org/D138081	2022-11-18 01:47:56 +00:00
Fangrui Song	fc91c70593	Revert D135411 "Add generic KCFI operand bundle lowering" This reverts commit eb2a57ebc7aaad551af30462097a9e06c96db925. llvm/include/llvm/Transforms/Instrumentation/KCFI.h including llvm/CodeGen is a layering violation. We should use an approach where Instrumentation/ doesn't need to include CodeGen/. Sorry for not spotting this in the review.	2022-11-17 22:45:30 +00:00
Sami Tolvanen	eb2a57ebc7	Add generic KCFI operand bundle lowering The KCFI sanitizer emits "kcfi" operand bundles to indirect call instructions, which the LLVM back-end lowers into an architecture-specific type check with a known machine instruction sequence. Currently, KCFI operand bundle lowering is supported only on 64-bit X86 and AArch64 architectures. As a lightweight forward-edge CFI implementation that doesn't require LTO is also useful for non-Linux low-level targets on other machine architectures, add a generic KCFI operand bundle lowering pass that's only used when back-end lowering support is not available and allows -fsanitize=kcfi to be enabled in Clang on all architectures. Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D135411	2022-11-17 21:55:00 +00:00
OCHyams	913b561c0a	[Assignment Tracking][6/*] Add trackAssignments function The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Add trackAssignments which adds assignment tracking metadata to a function for a specified set of variables. The intended callers are the inliner and the front end - those calls will be added in separate patches. I've added a pass called declare-to-assign (AssignmentTrackingPass) that converts dbg.declare intrinsics to dbg.assigns using trackAssignments so that the function can be easily tested (see llvm/test/DebugInfo/Generic/track-assignments.ll). The pass could also be used by front ends to easily test out enabling assignment tracking. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D132225	2022-11-08 16:52:11 +00:00

721 Commits