llvm-project

Author	SHA1	Message	Date
Lei Wang	bef3b54ea1	[InstrPGO] Avoid using global variable to fix potential data race (#114364 ) In https://github.com/llvm/llvm-project/pull/109837, it sets a global variable(`PGOInstrumentColdFunctionOnly`) in PassBuilderPipelines.cpp which introduced a data race detected by TSan. To fix this, I decouple the flag setting, the flags are now set separately(`instrument-cold-function-only-path` is required to be used with `--pgo-instrument-cold-function-only`).	2024-10-31 21:28:13 -07:00
Dmitry Chernenkov	d924a9ba03	Revert "[InstrPGO] Support cold function coverage instrumentation (#109837 )" This reverts commit e517cfc531886bf6ed64b4e7109bb3141ac7f430.	2024-10-31 10:55:17 +00:00
Lei Wang	e517cfc531	[InstrPGO] Support cold function coverage instrumentation (#109837 ) This patch adds support for cold function coverage instrumentation based on sampling PGO counts. The major motivation is to detect dead functions for the services that are optimized with sampling PGO. If a function is covered by sampling profile count (e.g., those with an entry count > 0), we choose to skip instrumenting those functions, which significantly reduces the instrumentation overhead. More details about the implementation and flags: - Added a flag `--pgo-instrument-cold-function-only` in `PGOInstrumentation.cpp` as the main switch to control skipping the instrumentation. - Built the extra instrumentation passes(a bundle of passes in `addPGOInstrPasses`) under sampling PGO pipeline. This is controlled by `--instrument-cold-function-only-path` flag. - Added a driver flag `-fprofile-generate-cold-function-coverage`: - 1) Config the flags in one place, i,e. adding `--instrument-cold-function-only-path=<...>` and `--pgo-function-entry-coverage`. Note that the instrumentation file path is passed through `--instrument-sample-cold-function-path`, because we cannot use the `PGOOptions.ProfileFile` as it's already used by `-fprofile-sample-use=<...>`. - 2) makes linker to link `compiler_rt.profile` lib(see [ToolChain.cpp#L1125-L1131](https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChain.cpp#L1125-L1131) ). - Added a flag(`--pgo-cold-instrument-entry-threshold`) to config entry count to determine cold function. Overall, the full command is like: ``` clang++ -O2 -fprofile-generate-cold-function-coverage=<...> -fprofile-sample-use=<...> code.cc -o code ```	2024-10-28 10:13:45 -07:00
Michael O'Farrell	10f0c1aadd	[PGO] Ensure non-zero entry-count after `populateCounters` (#112029 ) With sampled instrumentation (#69535), profile counts may appear corrupt and `fixFuncEntryCount` may assert. In particular a function can have a 0 block count for its entry, while later blocks are non zero. This is only likely to happen for colder functions, so it is reasonable to take any action that does not crash. Here we simply bail from fixing the entry count.	2024-10-22 16:05:40 -07:00
Jay Foad	d9c95efb6c	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546 ) Convert almost every instance of: CreateCall(Intrinsic::getOrInsertDeclaration(...), ...) to the equivalent CreateIntrinsic call.	2024-10-16 15:43:30 +01:00
Howard Roark	e36b22f3bf	Revert "[PGO] Preserve analysis results when nothing was instrumented (#93421 )" This reverts commit 23c64beeccc03c6a8329314ecd75864e09bb6d97.	2024-10-16 10:50:48 +03:00
Pavel Samolysov	23c64beecc	[PGO] Preserve analysis results when nothing was instrumented (#93421 ) The `PGOInstrumentationGen` pass should preserve all analysis results when nothing was actually instrumented. Currently, only modules that contain at least a single function definition are instrumented. When a module contains only function declarations and, optionally, global variable definitions (a module for the regular-LTO phase for thin-LTO when LTOUnit splitting is enabled, for example), such module is not instrumented (yet?) and there is no reason to invalidate any analysis results. NFC.	2024-10-12 06:29:55 +03:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Youngsuk Kim	d31e314131	[llvm] Don't call raw_string_ostream::flush() (NFC) Don't call raw_string_ostream::flush(), which is essentially a no-op. As specified in the docs, raw_string_ostream is always unbuffered. ( 65b13610a5226b84889b923bae884ba395ad084d for further reference )	2024-09-20 12:19:59 -05:00
Antonio Frighetto	2ae968a0d9	[Instrumentation] Move out to Utils (NFC) (#108532 ) Utility functions have been moved out to Utils. Minor opportunity to drop the header where not needed.	2024-09-15 21:07:40 -07:00
Mircea Trofin	82266d3a2b	[nfc][ctx_prof] Factor the callsite instrumentation exclusion criteria (#108471 ) Reusing this in the logic fetching the instrumentation in `CtxProfAnalysis`.	2024-09-13 21:25:47 -07:00
Mircea Trofin	f32e5bdcef	[NFC] Rename the `Nr` abbreviation to `Num` (#107151 ) It's more clear. (This isn't exhaustive).	2024-09-05 12:34:47 -07:00
Avi Kivity	46a4132e16	[Instrumentation] Fix EdgeCounts vector size in SetBranchWeights (#99064 )	2024-08-26 07:56:45 -07:00
Mircea Trofin	960a210b1f	[ctx_prof] Remove the dependency on the "name" GlobalVariable (#105731 ) We don't need that name variable for contextual instrumentation, we just use the function to get its GUID which we pass to the runtime, and rely on metadata to capture it through the various optimization passes. This change removes the need for the name global variable.	2024-08-23 10:31:43 -07:00
Ethan Luis McDonough	fde2d23ee2	[PGO][OpenMP] Instrumentation for GPU devices (Revision of #76587 ) (#102691 ) This pull request is a revised version of #76587. This pull request fixes some build issues that were present in the previous version of this change. > This pull request is the first part of an ongoing effort to extends PGO instrumentation to GPU device code. This PR makes the following changes: > > - Adds blank registration functions to device RTL > - Gives PGO globals protected visibility when targeting a supported GPU > - Handles any addrspace casts for PGO calls > - Implements PGO global extraction in GPU plugins (currently only dumps info) > > These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.	2024-08-22 01:10:54 -05:00
Mircea Trofin	3f18a0a71c	[nfc] Improve testability of PGOInstrumentationGen (#104490 ) Passing to the `PGOInstrumentationGen` pass whether it needs to produce contextual profiling instrumentation as a flag, in the process restructuring a bit the places that need to be aware of that (some were unnecessarily in `PGOInstrumentationUse`)	2024-08-16 09:45:29 -07:00
Mircea Trofin	4a2bf05980	Reapply "[ctx_prof] Fix the pre-thinlink "use" case (#102511 )" This reverts commit 967185eeb85abb77bd6b6cdd2b026d5c54b7d4f3. The problem was link dependencies, moved `UseCtxProfile` to `Analysis`.	2024-08-08 17:04:00 -07:00
Aiden Grossman	967185eeb8	Revert "[ctx_prof] Fix the pre-thinlink "use" case (#102511 )" This reverts commit 1a6d60e0162b3ef767c87c95512dd453bf4f4746. Broke some buildbots.	2024-08-08 21:14:56 +00:00
Mircea Trofin	1a6d60e016	[ctx_prof] Fix the pre-thinlink "use" case (#102511 ) Didn't notice in #101338 that the instrumentation in `llvm/test/Transforms/PGOProfile/ctx-prof-use-prelink.ll` was actually incorrect.	2024-08-08 16:45:04 -04:00
Mingming Liu	a634171896	[InstrPGO][TypeProf]Annotate vtable types when they are present in the profile (#99402 ) Before this change, when `file.profdata` have vtable profiles but `--enable-vtable-value-profiling` is not on for optimized build, warnings from this line [1] will show up. They are benign for performance but confusing. It's better to automatically annotate vtable profiles if `file.profdata` has them. This PR implements it in profile use pass. * If `-icp-max-num-vtables` is zero (default value is 6), vtable profiles won't be annotated. [1] `464d321ee8/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1762-L1768)`	2024-07-22 11:57:36 -07:00
xur-llvm	b1ca2a9546	[PGO] Sampled instrumentation in PGO to speed up instrumentation binary (#69535 ) In comparison to non-instrumented binaries, PGO instrumentation binaries can be significantly slower. For highly threaded programs, this slowdown can reach 10x due to data races or false sharing within counters. This patch incorporates sampling into the PGO instrumentation process to enhance the speed of instrumentation binaries. The fundamental concept is similar to the one proposed in https://reviews.llvm.org/D63949. Three sampling modes are introduced: 1. Simple Sampling: When '-sampled-instr-bust-duration' is set to 1. 2. Fast Burst Sampling: When not using simple sampling, and '-sampled-instr-period' is set to 65535. This is the default mode of sampling. 3. Full Burst Sampling: When neither simple nor fast burst sampling is used. Utilizing this sampled instrumentation significantly improves the binary's execution speed. Measurements show up to 5x speedup with default settings. Fast burst sampling now results in only around 20% to 30% slowdown (compared to 8 to 10x slowdown without sampling). Out tests show that profile quality remains good with sampling, with edge counts typically showing more than 90% overlap. For applications whose behavior changes due to binary speed, sampling instrumentation can enhance performance. Observations have shown some apps experiencing up to a ~2% improvement in PGO. A potential drawback of this patch is the increased binary size and compilation time. The Sampling method in this patch does not improve single threaded program instrumentation binary speed.	2024-07-22 09:19:17 -07:00
Youngsuk Kim	2051736f7b	[llvm][Transforms] Avoid 'raw_string_ostream::str' (NFC) Since `raw_string_ostream` doesn't own the string buffer, it is desirable (in terms of memory safety) for users to directly reference the string buffer rather than use `raw_string_ostream::str()`. Work towards TODO comment to remove `raw_string_ostream::str()`.	2024-06-30 09:03:29 -05:00
Mingming Liu	1518b260ce	[TypeProf][InstrFDO]Implement more efficient comparison sequence for indirect-call-promotion with vtable profiles. (#81442 ) Clang's `-fwhole-program-vtables` is required for this optimization to take place. If `-fwhole-program-vtables` is not enabled, this change is no-op. * Function-comparison (before): ``` %vtable = load ptr, ptr %obj %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1 %func = load ptr, ptr %vfn %cond = icmp eq ptr %func, @callee br i1 %cond, label bb1, label bb2: bb1: call @callee bb2: call %func ``` * VTable-comparison (after): ``` %vtable = load ptr, ptr %obj %cond = icmp eq ptr %vtable, @vtable-address-point br i1 %cond, label bb1, label bb2: bb1: call @callee bb2: %vfn = getelementptr inbounds ptr, ptr %vtable, i64 1 %func = load ptr, ptr %vfn call %func ``` Key changes: 1. Find out virtual calls and the vtables they come from. - The ICP relies on type intrinsic `llvm.type.test` to find out virtual calls and the compatible vtables, and relies on type metadata to find the address point for comparison. 2. ICP pass does cost-benefit analysis and compares vtable only when the number of vtables for a function candidate is within (option specified) threshold. 3. Sink the function addressing and vtable load instruction to indirect fallback. - The sink helper functions are simplified versions of `InstCombinerImpl::tryToSinkInstruction`. Currently debug intrinsics are not handled. Ideally `InstCombinerImpl::tryToSinkInstructionDbgValues` and `InstCombinerImpl::tryToSinkInstructionDbgVariableRecords` could be moved into Transforms/Utils/Local.cpp (or another util cpp file) to handle debug intrinsics when moving instructions across basic blocks. 4. Keep value profiles updated 1) Update vtable value profiles after inline 2) For either function-based comparison or vtable-based comparison, update both vtable and indirect call value profiles.	2024-06-29 23:21:33 -07:00
Ethan Luis McDonough	2c8b912f63	Revert "[PGO][OpenMP] Instrumentation for GPU devices (#76587 )" This reverts commit 5fd2af38e461445c583d7ffc2fe23858966eee76. It caused build issues and broke the buildbot.	2024-06-28 12:30:45 -05:00
Ethan Luis McDonough	5fd2af38e4	[PGO][OpenMP] Instrumentation for GPU devices (#76587 ) This pull request is the first part of an ongoing effort to extends PGO instrumentation to GPU device code. This PR makes the following changes: - Adds blank registration functions to device RTL - Gives PGO globals protected visibility when targeting a supported GPU - Handles any addrspace casts for PGO calls - Implements PGO global extraction in GPU plugins (currently only dumps info) These changes can be tested by supplying `-fprofile-instrument=clang` while targeting a GPU.	2024-06-28 10:42:19 -05:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Nikita Popov	48ef912e2b	[VFS] Avoid <stack> include (NFC) Directly use a vector instead of wrapping it in a stack, like we do in most places.	2024-06-21 15:17:41 +02:00
Paul Kirth	294f3ce5dd	Reapply "[llvm][IR] Extend BranchWeightMetadata to track provenance o… (#95281 ) …f weights" #95136 Reverts #95060, and relands #86609, with the unintended code generation changes addressed. This patch implements the changes to LLVM IR discussed in https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032 In this patch, we add an optional field to MD_prof meatdata nodes for branch weights, which can be used to distinguish weights added from llvm.expect* intrinsics from those added via other methods, e.g. from profiles or inserted by the compiler. One of the major motivations, is for use with MisExpect diagnostics, which need to know if branch_weight metadata originates from an llvm.expect intrinsic. Without that information, we end up checking branch weights multiple times in the case if ThinLTO + SampleProfiling, leading to some inaccuracy in how we report MisExpect related diagnostics to users. Since we change the format of MD_prof metadata in a fundamental way, we need to update code handling branch weights in a number of places. We also update the lang ref for branch weights to reflect the change.	2024-06-12 12:52:28 -07:00
Paul Kirth	607afa0b63	Revert "[llvm][IR] Extend BranchWeightMetadata to track provenance of weights" (#95060 ) Reverts llvm/llvm-project#86609 This change causes compile-time regressions for stage2 builds (https://llvm-compile-time-tracker.com/compare.php?from=3254f31a66263ea9647c9547f1531c3123444fcd&to=c5978f1eb5eeca8610b9dfce1fcbf1f473911cd8&stat=instructions:u). It also introduced unintended changes to `.text` which should be addressed before relanding.	2024-06-11 08:06:06 +02:00
Paul Kirth	c5978f1eb5	[llvm][IR] Extend BranchWeightMetadata to track provenance of weights (#86609 ) This patch implements the changes to LLVM IR discussed in https://discourse.llvm.org/t/rfc-update-branch-weights-metadata-to-allow-tracking-branch-weight-origins/75032 In this patch, we add an optional field to MD_prof metadata nodes for branch weights, which can be used to distinguish weights added from `llvm.expect*` intrinsics from those added via other methods, e.g. from profiles or inserted by the compiler. One of the major motivations, is for use with MisExpect diagnostics, which need to know if branch_weight metadata originates from an llvm.expect intrinsic. Without that information, we end up checking branch weights multiple times in the case if ThinLTO + SampleProfiling, leading to some inaccuracy in how we report MisExpect related diagnostics to users. Since we change the format of MD_prof metadata in a fundamental way, we need to update code handling branch weights in a number of places. We also update the lang ref for branch weights to reflect the change.	2024-06-10 11:27:21 -07:00
Kazu Hirata	677dddebae	[Transforms] Use StringRef::operator== instead of StringRef::equals (NFC) (#91072 ) I'm planning to remove StringRef::equals in favor of StringRef::operator==. - StringRef::operator==/!= outnumber StringRef::equals by a factor of 31 under llvm/ in terms of their usage. - The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals. - S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".	2024-05-04 12:33:12 -07:00
Mircea Trofin	c2d892668b	[llvm][ctx_profile] Add instrumentation (#90136 ) This adds instrumenting callsites to PGOInstrumentation, if contextual profiling is requested. The latter also enables inserting counters in the entry basic block and disables value profiling (the latter is a point in time change) This change adds the skeleton of the contextual profiling lowering pass, just so we can introduce the flag controlling that and the API to check that. The actual lowering pass will be introduced in a subsequent patch. (Tracking Issue: #89287, RFC referenced there)	2024-05-01 14:47:49 -07:00
Ellis Hoag	abfb4915e1	[NFC][InstrProf] Increment valid profile stat in populateCoverage (#89660 ) We increment `NumOfCSPGOFunc` and `NumOfPGOFunc` in `PGOUseFunc::readCounters()` already. We should do the same in `PGOUseFunc::populateCoverage`. `83bc7b5771/llvm/lib/Transforms/Instrumentation/PGOInstrumentation.cpp (L1331)`	2024-04-23 11:51:24 -05:00
Mingming Liu	1351d17826	[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825 ) (The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691) * For InstrFDO value profiling, implement instrumentation and lowering for virtual table address. * This is controlled by `-enable-vtable-value-profiling` and off by default. * When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads. * Implement profile reader and writer support * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols. * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab. * Indexed profile writer collects the list of vtable names, and stores that to index profiles. * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type. * `llvm-profdata show -show-vtables <args> <profile>` is implemented. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7	2024-04-01 08:52:35 -07:00
Heejin Ahn	2b7289d48a	[PGO] Use isScopedEHPersonality for funclet check (#85671 ) This line should be `isScopedEHPersonality` rather than `isFuncletEHPersonality` because this line is used for checking whether we need to add `funclet` op bundles to newly added calls, and Wasm EH needs that too. The new test case is adapted from https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/PGOProfile/memop_profile_funclet.ll.	2024-03-20 14:22:14 -07:00
Mircea Trofin	ce95a56db5	[pgo][nfc] Model `Count` as a `std::optional` in `PGOUseEdge` (#83505 ) Similar to PR #83364.	2024-03-01 09:44:22 -08:00
Mircea Trofin	1a2960bab6	[pgo][nfc] Model `Count` as a `std::optional` in `PGOUseBBInfo` (#83364 ) Simpler code, compared to tracking state of 2 variables and the ambiguity of "0" CountValue (is it 0 or is it invalid?)	2024-02-29 15:12:07 -08:00
Kazu Hirata	4eb68f53db	[Instrumentation] Use a range-based for loop (NFC)	2024-01-10 19:05:25 -08:00
Kazu Hirata	3a8a9267c5	[Instrumentation] Remove redundant LLVM_DEBUG (NFC)	2024-01-09 12:54:39 -08:00
Kazu Hirata	b2b4ffbc9b	[Instrumentation] Remove -pgo-instr-old-cfg-hashing (#77357 ) It's been more than 3 years since -pgo-instr-old-cfg-hashing was introduced by: commit 120e66b3418b37b95fc1dbbb23e296a602a24fa8 Author: Hiroshi Yamauchi <yamauchi@google.com> Date: Tue Jul 28 10:09:49 2020 -0700 I don't think anyone really cares about the ability to use the old CFG hashing at this point.	2024-01-08 21:05:57 -08:00
Zequan Wu	ab3430f891	[Profile] Add binary profile correlation for code coverage. (#69493 ) ## Motivation Since we don't need the metadata sections at runtime, we can somehow offload them from memory at runtime. Initially, I explored [debug info correlation](https://discourse.llvm.org/t/instrprofiling-lightweight-instrumentation/59113), which is used for PGO with value profiling disabled. However, it currently only works with DWARF and it's be hard to add such artificial debug info for every function in to CodeView which is used on Windows. So, offloading profile metadata sections at runtime seems to be a platform independent option. ## Design The idea is to use new section names for profile name and data sections and mark them as metadata sections. Under this mode, the new sections are non-SHF_ALLOC in ELF. So, they are not loaded into memory at runtime and can be stripped away as a post-linking step. After the process exits, the generated raw profiles will contains only headers + counters. llvm-profdata can be used correlate raw profiles with the unstripped binary to generate indexed profile. ## Data For chromium base_unittests with code coverage on linux, the binary size overhead due to instrumentation reduced from 64M to 38.8M (39.4%) and the raw profile files size reduce from 128M to 68M (46.9%) ``` $ bloaty out/cov/base_unittests.stripped -- out/no-cov/base_unittests.stripped FILE SIZE VM SIZE -------------- -------------- +121% +30.4Mi +121% +30.4Mi .text [NEW] +14.6Mi [NEW] +14.6Mi __llvm_prf_data [NEW] +10.6Mi [NEW] +10.6Mi __llvm_prf_names [NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts +95% +1.75Mi +95% +1.75Mi .eh_frame +108% +400Ki +108% +400Ki .eh_frame_hdr +9.5% +211Ki +9.5% +211Ki .rela.dyn +9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro +5.0% +87.3Ki +5.0% +87.3Ki .rodata [ = ] 0 +13% +47.0Ki .bss +40% +1.78Ki +40% +1.78Ki .got +12% +1.49Ki +12% +1.49Ki .gcc_except_table [ = ] 0 +65% +1.23Ki .relro_padding +62% +1.20Ki [ = ] 0 [Unmapped] +13% +448 +19% +448 .init_array +8.8% +192 [ = ] 0 [ELF Section Headers] +0.0% +136 +0.0% +80 [7 Others] +0.1% +96 +0.1% +96 .dynsym +1.2% +96 +1.2% +96 .rela.plt +1.5% +80 +1.2% +64 .plt [ = ] 0 -99.2% -3.68Ki [LOAD #5 [RW]] +195% +64.0Mi +194% +64.0Mi TOTAL $ bloaty out/cov-cor/base_unittests.stripped -- out/no-cov/base_unittests.stripped FILE SIZE VM SIZE -------------- -------------- +121% +30.4Mi +121% +30.4Mi .text [NEW] +5.86Mi [NEW] +5.86Mi __llvm_prf_cnts +95% +1.75Mi +95% +1.75Mi .eh_frame +108% +400Ki +108% +400Ki .eh_frame_hdr +9.5% +211Ki +9.5% +211Ki .rela.dyn +9.2% +95.0Ki +9.2% +95.0Ki .data.rel.ro +5.0% +87.3Ki +5.0% +87.3Ki .rodata [ = ] 0 +13% +47.0Ki .bss +40% +1.78Ki +40% +1.78Ki .got +12% +1.49Ki +12% +1.49Ki .gcc_except_table +13% +448 +19% +448 .init_array +0.1% +96 +0.1% +96 .dynsym +1.2% +96 +1.2% +96 .rela.plt +1.2% +64 +1.2% +64 .plt +2.9% +64 [ = ] 0 [ELF Section Headers] +0.0% +40 +0.0% +40 .data +1.2% +32 +1.2% +32 .got.plt +0.0% +24 +0.0% +8 [5 Others] [ = ] 0 -22.9% -872 [LOAD #5 [RW]] -74.5% -1.44Ki [ = ] 0 [Unmapped] [ = ] 0 -76.5% -1.45Ki .relro_padding +118% +38.8Mi +117% +38.8Mi TOTAL ``` A few things to note: 1. llvm-profdata doesn't support filter raw profiles by binary id yet, so when a raw profile doesn't belongs to the binary being digested by llvm-profdata, merging will fail. Once this is implemented, llvm-profdata should be able to only merge raw profiles with the same binary id as the binary and discard the rest (with mismatched/missing binary id). The workflow I have in mind is to have scripts invoke llvm-profdata to get all binary ids for all raw profiles, and selectively choose the raw pnrofiles with matching binary id and the binary to llvm-profdata for merging. 2. Note: In COFF, currently they are still loaded into memory but not used. I didn't do it in this patch because I noticed that `.lcovmap` and `.lcovfunc` are loaded into memory. A separate patch will address it. 3. This should works with PGO when value profiling is disabled as debug info correlation currently doing, though I haven't tested this yet.	2023-12-14 14:16:38 -05:00
serge-sans-paille	4b5224a27e	Disable PGO instrumentation on naked function (#75224 ) We only allow for assembly code in naked function, and PGO instrumentation (esp. temporal instrumentation that introduces a function call) can wreak havoc in this. Fix #74573	2023-12-13 05:53:52 +00:00
Matthias Braun	cb4627d150	Add setBranchWeigths convenience function. NFC (#72446 ) Add `setBranchWeights` convenience function to ProfDataUtils.h and use it where appropriate.	2023-11-16 10:55:19 -08:00
Mircea Trofin	6c2bde9bb9	[nfc][instr] Encapsulate CFGMST (#72207 ) Very little of it needs to be public.	2023-11-14 19:22:21 -08:00
Youngsuk Kim	57dd23bc0a	[llvm] Remove no-op ptr-to-ptr bitcasts (NFC) Opaque ptr cleanup effort (NFC).	2023-11-14 10:43:00 -06:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029 ) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
Ellis Hoag	890335bb28	[InstrProf] Do not block functions from PGOUse (#71106 ) The `skipPGO()` function was added in https://reviews.llvm.org/D137184. Unfortunately, it also blocked functions from being annotated (PGOUse), which I believe will cause confusion to users if a function has a profile but it is not PGO'd. The docs for `noprofile` and `skipprofile` only claim to block instrumentation, not PGO optimization: https://llvm.org/docs/LangRef.html	2023-11-03 09:41:26 -07:00
Zequan Wu	db7a1ed9a2	Revert "[Profile] Refactor profile correlation. (#70712 )" This reverts commit 4b383d0af93136b80841fc140da0823dfc441dd4.	2023-10-31 10:53:45 -04:00
Zequan Wu	4b383d0af9	[Profile] Refactor profile correlation. (#70712 ) Refactor some code from https://github.com/llvm/llvm-project/pull/69493. Rebase of https://github.com/llvm/llvm-project/pull/69656 on top of main as it was messed up.	2023-10-31 10:41:01 -04:00

1 2 3 4 5 ...

308 Commits