llvm-project

Author	SHA1	Message	Date
Kazu Hirata	4d1bb7699b	[memprof] Fix a typo in writeMemProfV1 (#87890) This patch borrows memprof-merge.test to test --memprof-version.	2024-04-07 15:06:13 -07:00
Mingming Liu	1351d17826	[InstrFDO][TypeProf] Implement binary instrumentation and profile read/write (#66825 ) (The profile format change is split into a standalone change into https://github.com/llvm/llvm-project/pull/81691) * For InstrFDO value profiling, implement instrumentation and lowering for virtual table address. * This is controlled by `-enable-vtable-value-profiling` and off by default. * When the option is on, raw profiles will carry serialized `VTableProfData` structs and compressed vtables as payloads. * Implement profile reader and writer support * Raw profile reader is used by `llvm-profdata` but not compiler. Raw profile reader will construct InstrProfSymtab with symbol names, and map profiled runtime address to vtable symbols. * Indexed profile reader is used by `llvm-profdata` and compiler. When initialized, the reader stores a pointer to the beginning of in-memory compressed vtable names and the length of string. When used in `llvm-profdata`, reader decompress the string to show symbols of a profiled site. When used in compiler, string decompression doesn't happen since IR is used to construct InstrProfSymtab. * Indexed profile writer collects the list of vtable names, and stores that to index profiles. * Text profile reader and writer support are added but mostly follow the implementation for indirect-call value type. * `llvm-profdata show -show-vtables <args> <profile>` is implemented. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600#pick-instrumentation-points-and-instrument-runtime-types-7	2024-04-01 08:52:35 -07:00
Teresa Johnson	08ddd2ce40	[PGO] Add support for writing previous indexed format (#84505) Enable temporary support to ease use of new llvm-profdata with slightly older indexed profiles after 16e74fd48988ac95551d0f64e1b36f78a82a89a2, which bumped the indexed format for type profiling.	2024-03-08 12:27:46 -08:00
Alan Zhao	922a431e10	[profdata][nfc] Disable several tests on Windows (#83907) Several profdata tests pass the byte 012 to printf. This causes these tests to fail when using GnuWin32's version of printf because printf will detect that 012 is the LF character and will prepend the byte 015 (CR) in front of LF. This change is required after https://github.com/llvm/llvm-project/pull/82711 which bumped the version number.	2024-03-04 13:15:50 -08:00
Mingming Liu	16e74fd489	Reland "[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling." (#82711 ) New change on top of [reviewed patch](https://github.com/llvm/llvm-project/pull/81691) are [in commits after this one](`d0757f46b3`). Previous commits are restored from the remote branch with timestamps. 1. Fix build breakage for non-ELF platforms, by defining the missing functions {`__llvm_profile_begin_vtables`, `__llvm_profile_end_vtables`, `__llvm_profile_begin_vtabnames `, `__llvm_profile_end_vtabnames`} everywhere. * Tested on mac laptop (for darwins) and Windows. Specifically, functions in `InstrProfilingPlatformWindows.c` returns `NULL` to make it more explicit that type prof isn't supported; see comments for the reason. * For the rest (AIX, other), mostly follow existing examples (like this [one](`f95b2f1acf`)) 2. Rename `__llvm_prf_vtabnames` -> `__llvm_prf_vns` for shorter section name, and make returned pointers [const](`a825d2a4ec (diff-4de780ce726d76b7abc9d3353aef95013e7b21e7bda01be8940cc6574fb0b5ffR120-R121)`) Original Description * Raw profile format - Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it `VTableProfData`). Header also records padded bytes of each section. - Payload: adds a section for compressed vtable names, and a section to store `VTableProfData`. Both sections are padded so the size is a multiple of 8. * Indexed profile format - Header: records the byte offset of compressed vtable names. - Payload: adds a section to store compressed vtable names. This section is used by `llvm-profdata` to show the list of vtables profiled for an instrumented site. [The originally reviewed patch](https://github.com/llvm/llvm-project/pull/66825) will have profile reader/write change and llvm-profdata change. 
- To ensure this PR has all the necessary profile format change along with profile version bump, created a copy of the originally reviewed patch in https://github.com/llvm/llvm-project/pull/80761. The copy doesn't have profile format change, but it has the set of tests which covers type profile generation, profile read and profile merge. Tests pass there. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600 --------- Co-authored-by: modiking <modiking213@gmail.com>	2024-02-27 11:07:40 -08:00
Mingming Liu	0e8d1877cd	Revert type profiling change as compiler-rt test break on Windows. (#82583) Examples https://lab.llvm.org/buildbot/#/builders/127/builds/62532/steps/8/logs/stdio	2024-02-21 21:41:33 -08:00
Mingming Liu	db7e9e6841	[TypeProf][InstrPGO] Introduce raw and instr profile format change for type profiling. (#81691 ) * Raw profile format - Header: records the byte size of compressed vtable names, and the number of profiled vtable entries (call it `VTableProfData`). Header also records padded bytes of each section. - Payload: adds a section for compressed vtable names, and a section to store `VTableProfData`. Both sections are padded so the size is a multiple of 8. * Indexed profile format - Header: records the byte offset of compressed vtable names. - Payload: adds a section to store compressed vtable names. This section is used by `llvm-profdata` to show the list of vtables profiled for an instrumented site. [The originally reviewed patch](https://github.com/llvm/llvm-project/pull/66825) will have profile reader/write change and llvm-profdata change. - To ensure this PR has all the necessary profile format change along with profile version bump, created a copy of the originally reviewed patch in https://github.com/llvm/llvm-project/pull/80761. The copy doesn't have profile format change, but it has the set of tests which covers type profile generation, profile read and profile merge. Tests pass there. rfc in https://discourse.llvm.org/t/rfc-dynamic-type-profiling-and-optimizations-in-llvm/74600 --------- Co-authored-by: modiking <modiking213@gmail.com>	2024-02-21 20:59:42 -08:00
William Junda Huang	2b8649fbec	Added feature in llvm-profdata merge to filter functions from the profile (#78378) `--function=<regex>` Include functions matching regex in the output `--no-function=<regex>` Exclude functions matching regex from the output If both are specified, `--no-function` has a higher precedence if a function name matches both filters	2024-01-23 16:19:45 -05:00
Mingming Liu	665e46c268	[llvm-profdata] Use semicolon as the delimiter for supplementary profiles. (#75080) When merging instrFDO profiles with afdo profile as supplementary, instrFDO counters for static functions are stored with function's PGO name (with filename.cpp; prefix). - This pull request fixes the delimiter used when a PGO function name is 'normalized' for AFDO look-up.	2024-01-04 15:03:18 -08:00
Kazu Hirata	1c1eaf75f5	[llvm-profdata] Make tests more readable (NFC) This patch splits a couple of lines of printf into four for readability so that each corresponds to one field or padding. They correspond to NumCounters, NumValueSites, NumBitmapBytes, and padding, respectively.	2023-12-27 13:28:37 -08:00
Kazu Hirata	ce02357795	[llvm-profdata] Make tests more readable (NFC) This patch splits a couple of lines of printf into four for readability so that each corresponds to one field or padding. They correspond to NumCounters, NumValueSites, NumBitmapBytes, and padding, respectively.	2023-12-27 13:19:09 -08:00
Kazu Hirata	b8424eaede	[llvm-profdata] Make tests more readable (NFC) These tests generally use one printf for each field of RawInstrProf::ProfileData except the lines being touched in this patch. These lines print two fields, namely NumValueSites and NumBitmapBytes, with one printf, which is very confusing. (Note that the 4-byte printf at the end of the group is padding to make the struct size a multiple of 8 bytes.) This patch makes the tests a little more readable by splitting NumValueSites and NumBitmapBytes into two separate lines.	2023-12-26 23:31:21 -08:00
Mingming Liu	493e2400ca	[nfc][llvm-profdata] Use cl::Subcommand to organize subcommand and options in llvm-profdata (#71328 ) - The motivation is to reduce the number of arguments passed around (e.g., from `show_main` to `show*Profile`). In order to do this, move function-defined options to global variables, and create `cl::SubCommand` for {show, merge, overlap, order} to organize options. - The side-effect by extracting function local options to a C++ namespace is that the extracted options are no longer (lazily) initialized when the enclosing function runs for the first time. - `cl::Subcommand` support (introduced in https://lists.llvm.org/pipermail/llvm-dev/2016-June/101804.html) could put options in a per-subcommand namespace. - One option could belong to multiple subcommand. This patch defines most of the options once and associates them with multiple subcommands except 1. `overlap` and `show` both has `value-cutoff` with different default values ([former](`64f62de966/llvm/tools/llvm-profdata/llvm-profdata.cpp (L2352)`) vs [latter](`64f62de966/llvm/tools/llvm-profdata/llvm-profdata.cpp (L3009)`)). Define 'OverlapValueCutoff' and 'ShowValueCutoff' respectively. 2. `show` supports three profile formats in `ProfileKind` while {`merge`, `overlap`} supports two. Define separate options. - Clean up obsolete code as a result, including `-h` and `--version` customizations. These two options are supported for all commands. Results pasted. - [-h and --help](https://gist.github.com/minglotus-6/387490e5eeda2dd2f9c440a424d6f360) output. - [--version](https://gist.github.com/minglotus-6/f905abcc3a346957bd797f2f84c18c1b) - [llvm-profdata show --help](https://gist.github.com/minglotus-6/f143079f02af243a94758138c0af471a) This PR should be `llvm-profdata` only. It depends on https://github.com/llvm/llvm-project/pull/71981	2023-11-14 10:19:13 -08:00
Zequan Wu	89a2e70159	[llvm-profdata] Emit warning when counter value is greater than 2^56. (#69513) Fixes #65416	2023-10-31 16:40:51 -04:00
Alan Phipps	f95b2f1acf	Reland "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)" Part 1 of 3. This includes the LLVM back-end processing and profile reading/writing components. compiler-rt changes are included. Differential Revision: https://reviews.llvm.org/D138846	2023-10-30 11:15:02 -05:00
Zequan Wu	3c34245c47	[Profile] Use upper 32 bits of profile version for profile variants. (#67695) Currently all upper 8 bits are reserved for different profile variants. We need more bits for new modes in the future. Context: https://discourse.llvm.org/t/how-to-add-a-new-mode-to-llvm-raw-profile-version/73688	2023-10-03 10:15:22 -04:00
Hans Wennborg	53a2923bf6	Revert "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)" This seems to cause Clang to crash, see comments on the code review. Reverting until the problem can be investigated. > Part 1 of 3. This includes the LLVM back-end processing and profile > reading/writing components. compiler-rt changes are included. > > Differential Revision: https://reviews.llvm.org/D138846 This reverts commit a50486fd736ab2fe03fcacaf8b98876db77217a7.	2023-09-21 12:20:24 +02:00
Alan Phipps	a50486fd73	[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3) Part 1 of 3. This includes the LLVM back-end processing and profile reading/writing components. compiler-rt changes are included. Differential Revision: https://reviews.llvm.org/D138846	2023-09-19 17:07:23 -05:00
William Huang	7624de5bea	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). 
Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740	2023-08-17 20:10:45 +00:00
Aaron Ballman	1a53b5c367	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 66ba71d913df7f7cd75e92c0c4265932b7c93292. Addressing issues found by: https://lab.llvm.org/buildbot/#/builders/245/builds/11732 https://lab.llvm.org/buildbot/#/builders/187/builds/12251 https://lab.llvm.org/buildbot/#/builders/186/builds/11099 https://lab.llvm.org/buildbot/#/builders/182/builds/6976	2023-07-28 09:41:38 -04:00
William Huang	66ba71d913	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). 
Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740	2023-07-27 23:08:27 +00:00
Fangrui Song	4c2980c1a3	[llvm-profdata] Stabilize iteration order for InstrProfWriter If two functions are inserted to the same bucket, their order in the serialized profile is dependent on StringMap iteration order, which is not guaranteed to be deterministic. (https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h). Use a sort like we do in writeText.	2023-07-20 18:31:41 -07:00
Haojian Wu	58056ae299	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 12e9c7aaa66b7624b5d7666ce2794d912bf9e4b7. The commit has broken the buildbot, see comment https://reviews.llvm.org/D147740#4451540	2023-06-27 15:19:35 +02:00
William Huang	12e9c7aaa6	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). 
Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740	2023-06-27 00:06:05 +00:00
Douglas Yung	c9a8a0e8a9	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 31af18bccea95fe1ae8aa2c51cf7c8e92a1c208e. This change is causing build failures on many Windows build bots: https://lab.llvm.org/buildbot/#/builders/216/builds/22833 https://lab.llvm.org/buildbot/#/builders/123/builds/19602 https://lab.llvm.org/buildbot/#/builders/172/builds/28315 https://lab.llvm.org/buildbot/#/builders/119/builds/13870 https://lab.llvm.org/buildbot/#/builders/233/builds/794 https://lab.llvm.org/buildbot/#/builders/235/builds/387 https://lab.llvm.org/buildbot/#/builders/13/builds/36921 https://lab.llvm.org/buildbot/#/builders/127/builds/50510	2023-06-23 17:58:22 -07:00
William Huang	31af18bcce	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code ((instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduce the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing interface allowing using SampleContext as key, so that existing code still work. It will check for MD5 collision (unlikely but not too unlikely, since we only takes the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (see exception at (5) ) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensure an invariant that in SampleProfileMap, the key is equal to the Context of the value, for profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). 
Since the key became MD5 hash, only the value keeps the context now, in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (one to index into FuncOffsetTable, the other into SampleProfileMap, more if there are additional sections), in this case the SampleProfileMap is directly accessed with MD5 value so that we don't recalculate it each time (expensive) Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740	2023-06-23 21:48:52 +00:00
Ellis Hoag	c1d935ece3	[InstrProf] Fix BalancedPartitioning when threads are disabled In https://reviews.llvm.org/D147812 we introduced the class `BalancedPartitioning` which includes some threading code. The tests in that diff run forever when built with `-DLLVM_ENABLE_THREADS=OFF` so some bots were broken. These tests were skipped in https://reviews.llvm.org/rGa4845eaf2e9aa18dd900d7cbeff4e5ff52e4b50e because of this. This diff disables the threading code if `LLVM_ENABLE_THREADS` is disabled so we can re-enable the tests. Reviewed By: luporl Differential Revision: https://reviews.llvm.org/D152390	2023-06-07 12:04:35 -07:00
Leandro Lupori	a4845eaf2e	[InstrProf] Skip Balanced Partitioning tests on ARM Balanced Partitioning tests, added by 1794532bb942, currently hang on ARM, which causes check-all to never finish. Issue #63168 Differential Revision: https://reviews.llvm.org/D147812	2023-06-07 17:39:41 +00:00
Ellis Hoag	1117b9a284	[InstrProf] Use BalancedPartitioning to order temporal profiling trace data In [0] we described an algorithm called //BalancedPartitioning// (bp) to consume function traces [1] and compute a function order that reduces the number of page faults during startup. This patch adds the `order` command to the `llvm-profdata` tool which uses bp to output a function order that can be passed to the linker via `--symbol-ordering-file=`. Special thanks to Sergey Pupyrev and Julian Mestre for designing this balanced partitioning algorithm. [0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 [1] https://reviews.llvm.org/D147287 Reviewed By: spupyrev Differential Revision: https://reviews.llvm.org/D147812	2023-06-06 11:59:57 -07:00
Michael Platings	1023d7e40d	[llvm-profdata] Fix test on Windows Output on Windows is "llvm-profdata.exe"	2023-05-22 16:00:44 +01:00
Michael Platings	6521905389	[llvm-profdata] Accept --version argument The `llvm-profdata --version` output now looks like: llvm-profdata LLVM (http://llvm.org/): LLVM version 17.0.0git Optimized build with assertions. This makes llvm-profdata more consistent with other tools. Reviewed By: simon_tatham Differential Revision: https://reviews.llvm.org/D150964	2023-05-22 14:44:03 +01:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
William Huang	4357824c63	[llvm-profdata] ProfileReader cleanup - preparation for MD5 refactoring Cleanup profile reader classes to prepare for complex refactoring as proposed in D147740 (Use MD5 as key for profile map). Change is too complicated so I am cleaning up the reader implementation first with these goals. - Reduce duplicated/similar logic - Reduce virtual functions, changing them to non-virtual - Reduce unnecessary checks, indirections, and dead writes. This is patch 1/n. This patch refactors NameTable Explaining several decisions here 1. useMD5() means whether names of the profiles (the ProfileMap) are represented as MD5. It is NOT whether the input profile format is MD5. This function is an interface for IPO passes to decide whether to match function names or function MD5. There are two motives here: (a) Eventually we want to use MD5 to represent all function contexts because it is much faster to use it as a key for lookup tables (prototype implementation D147740), so in compilation mode we call setProfileUseMD5() to force use MD5. While in tools mode (llvm-profdata) we want to keep the function name info if it's in the input profile. (b) We also propose to allow multiple name tables and profile sections in ExtBinary format, and it could consist of name tables with or without using MD5, in this case MD5 prevails and other name tables are converted to MD5. 2. MD5 handling logic is pushed up to BinaryReader base class, because this trades a non-devirtualized virtual function call with a predictable branch. ReadStringFromTable() accounts for >5% time when loading a full 1 GB profile, it should not be virtual. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D148868	2023-05-06 00:21:03 +00:00
William Huang	d38d6ca179	[llvm-profdata] Deprecate Compact Binary Sample Profile Format Remove support for compact binary sample profile format Reviewed By: davidxl, wenlei Differential Revision: https://reviews.llvm.org/D149400	2023-05-01 17:10:08 +00:00
Jessica Paquette	17cfd2e025	[profiling] Improve error message for raw profile header mismatches When a user uses a mismatched clang + llvm-profdata, they didn't get a very informative error message. It would just say "unsupported version". As a result, users are often confused as to what they are supposed to do and tend to assume that it's a bug in the profiling runtime. This patch improves the error message by: - Adding a new class of error (`raw_profile_version_mismatch`) to make it clear that, specifically, the raw profile version is unsupported because of a tool mismatch. - Adding an error message that tells the user which raw profile version was encountered, which version was expected, and instructs them to align their tool versions. To support this, this patch also updates `InstrProfError::take` to also propagate the optional error message. Differential Revision: https://reviews.llvm.org/D149361	2023-04-27 14:51:38 -07:00
Snehasish Kumar	932d7b9ddd	[memprof] Print out profile build ids in the error message. When no --profiled-binary flag is provided we can print out the build ids of the modules in the profile. This can help the user fetch the correct binary from e.g. remote object store. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D148301	2023-04-17 17:53:57 +00:00
Ellis Hoag	4bddef4117	[InstrProf][Temporal] Add weight field to traces As discussed in [0], add a `weight` field to temporal profiling traces found in profiles. This allows users to use the `--weighted-input=` flag in the `llvm-profdata merge` command to weight traces from different scenarios differently. Note that this is a breaking change, but since [1] landed very recently and there is no way to "use" this trace data, there should be no users of this feature. We believe it is acceptable to land this change without bumping the profile format version. [0] https://reviews.llvm.org/D147812#4259507 [1] https://reviews.llvm.org/D147287 Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D148150	2023-04-13 10:37:05 -07:00
Ellis Hoag	244be0b0de	[InstrProf] Temporal Profiling As described in [0], this extends IRPGO to support //Temporal Profiling//. When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile. Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively. In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup. Special thanks to Julian Mestre for helping with reservoir sampling. [0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D147287	2023-04-11 08:30:52 -07:00
Mingming Liu	f91b0f23c2	[AutoFDO]Merge called target in body samples when flattening profiles - Body samples could have call targets, merge them as well. Differential Revision: https://reviews.llvm.org/D147297	2023-03-31 13:24:59 -07:00
wlei	3520f6d666	Fix a missing checksum field	2023-03-31 10:16:06 -07:00
wlei	339b8a0019	[AutoFDO] Use flattened profiles for profile staleness metrics For profile staleness report, before it only counts for the top-level function samples in the nested profile, the samples in the inlinees are ignored. This could affect the quality of the metrics when there are heavily inlined functions. This change adds a feature to flatten the nested profile and we're changing to use flatten profile as the input for stale profile detection and matching. Example for profile flattening: ``` Original profile: _Z3bazi:20301:1000 1: 1000 3: 2000 5: inline1:1600 1: 600 3: inline2:500 1: 500 Flattened profile: _Z3bazi:18701:1000 1: 1000 3: 2000 5: 600 inline1:600 inline1:1100:600 1: 600 3: 500 inline2: 500 inline2:500:500 1: 500 ``` This feature could be useful for offline analysis, like understanding the hotness of each individual function. So I'm adding the support to `llvm-profdata merge` under `--gen-flattened-profile`. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D146452	2023-03-30 11:05:10 -07:00
Snehasish Kumar	cef71d0105	[memprof] Support symbolization of PIE binaries. Support symbolization of PIE binaries in memprof. We assume that the profiled binary has one executable text segment for simplicity. Update the memprof-pic test to now expect the same output as the memprof-basic test. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D146181	2023-03-21 20:13:18 +00:00
Snehasish Kumar	a1bbf5ac3c	[memprof] Record BuildIDs in the raw profile. This patch adds support for recording BuildIds using the sanitizer ListOfModules API. We add another entry to the SegmentEntry struct and change the memprof raw version. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145190	2023-03-14 18:16:38 +00:00
Snehasish Kumar	debe80cb8d	Revert "[memprof] Record BuildIDs in the raw profile." This reverts commit 287177a47a396ca6cc0bef7696108cdaa0c68e5f.	2023-03-13 20:09:46 +00:00
Snehasish Kumar	287177a47a	[memprof] Record BuildIDs in the raw profile. This patch adds support for recording BuildIds using the sanitizer ListOfModules API. We add another entry to the SegmentEntry struct and change the memprof raw version. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145190	2023-03-13 19:28:38 +00:00
Kirill Stoimenov	011b4d4706	[HWASAN][LSAN] Disable tests which don't pass in HWASAN+LSAN mode Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D145727	2023-03-10 00:51:55 +00:00
Snehasish Kumar	e99b5ad383	[memprof] Add scripts to automate testdata regeneration. The memprof profiles and binaries need to be updated in case of version updates. This change adds three scripts for llvm-profdata, clang and llvm tests where memprof profiles are used as inputs. Also update the tests, profiles and binaries in this change. Change based on the review suggestions in D145023. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145644	2023-03-09 19:54:23 +00:00
Snehasish Kumar	e1b569b96a	Revert "[memprof] Refactor tests to generate binaries and profiles on the fly." This reverts commit 599b7690fa917ea4e9cd67275e34d0b5a0f51aa9. Since adding a cross project dependency is a concern.	2023-03-06 23:48:52 +00:00
Snehasish Kumar	599b7690fa	[memprof] Refactor tests to generate binaries and profiles on the fly. This change replaces the binary profiles and executables used for testing the memprof profile reader with tests where the profiles are generated on the fly. This reduces toil when the profile version changes. The tests are moved from tools/llvm-profdata to compiler-rt/test/memprof due to the following reasons: 1. Adding dependency on memprof lit.cfg.py for llvm-profdata is preferable to adding a dependency on compiler-rt for llvm/test. 2. All the tests can now be run with `ninja check-memprof`. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D145023	2023-03-06 21:24:40 +00:00
Erik Desjardins	87d02e0dfd	Recommit "[Support] change StringMap hash function from djbHash to xxHash" This reverts commit 37eb9d13f891f7656f811516e765b929b169afe0. Test failures have been fixed: - ubsan failure fixed by 72eac42f21c0f45a27f3eaaff9364cbb5189b9e4 - warn-unsafe-buffer-usage-fixits-local-var-span.cpp fixed by 03cc52dfd1dbb4a59b479da55e87838fb93d2067 (wasn't related) - test-output-format.ll failure was spurious, build failed at https://lab.llvm.org/buildbot/#/builders/54/builds/3545 (b4431b2d945b6fc19b1a55ac6ce969a8e06e1e93) but passed at https://lab.llvm.org/buildbot/#/builders/54/builds/3546 (5ae99be0377248c74346096dc475af254a3fc799) which is before my revert `b4431b2d94...5ae99be037` Original commit message: Depends on https://reviews.llvm.org/D142861. Alternative to https://reviews.llvm.org/D137601. xxHash is much faster than djbHash. This makes a simple Rust test case with a large constant string 10% faster to compile. Previous attempts at changing this hash function (e.g. https://reviews.llvm.org/D97396) had to be reverted due to breaking tests that depended on iteration order. No additional tests fail with this patch compared to `main` when running `check-all` with `-DLLVM_ENABLE_PROJECTS="all"` (on a Linux host), so I hope I found everything that needs to be changed. Differential Revision: https://reviews.llvm.org/D142862	2023-02-19 16:52:26 -05:00

312 Commits