llvm-project

Author	SHA1	Message	Date
Andrew Rogers	ee2d2bda71	[llvm] add Passes to LLVM_LINK_COMPONENTS for LLVMMCATests (#145617 ) ## Purpose Add `Passes` to `LLVM_LINK_COMPONENTS` for `LLVMMCATests` so that it links properly when LLVM is built as a Windows DLL. ## Background `LLVMPasses` appears to be a missing dependency of `LLVMMCATests`, but when LLVM is built statically it picks up the required `LLVMPasses` symbols from a transitive dependency (presumably). When LLVM is built as a Windows DLL, `LLVMMCATests` fails to link 4 symbols from `LLVMPasses` without this change: ``` LLVMX86CodeGen.lib(X86CodeGenPassBuilder.cpp.obj) : error LNK2019: unresolved external symbol "public: __cdecl llvm::ModuleInlinerWrapperPass::ModuleInlinerWrapperPass(struct llvm::InlineParams,bool,struct llvm::InlineContext,enum llvm::InliningAdvisorMode,unsigned int)" (??0ModuleInlinerWrapperPass@llvm@@QEAA@UInlineParams@1@_NUInlineContext@1@W4InliningAdvisorMode@1@I@Z) referenced in function "public: void __cdecl llvm::ModuleInlinerWrapperPass::`default constructor closure'(void)" (??_FModuleInlinerWrapperPass@llvm@@QEAAXXZ) LLVMX86CodeGen.lib(X86CodeGenPassBuilder.cpp.obj) : error LNK2019: unresolved external symbol "public: __cdecl llvm::PipelineTuningOptions::PipelineTuningOptions(void)" (??0PipelineTuningOptions@llvm@@QEAA@XZ) referenced in function "public: void __cdecl llvm::PassBuilder::`default constructor closure'(void)" (??_FPassBuilder@llvm@@QEAAXXZ) LLVMX86CodeGen.lib(X86CodeGenPassBuilder.cpp.obj) : error LNK2019: unresolved external symbol "public: __cdecl llvm::PassBuilder::PassBuilder(class llvm::TargetMachine *,class llvm::PipelineTuningOptions,class std::optional<struct llvm::PGOOptions>,class llvm::PassInstrumentationCallbacks *)" (??0PassBuilder@llvm@@QEAA@PEAVTargetMachine@1@VPipelineTuningOptions@1@V?$optional@UPGOOptions@llvm@@@std@@PEAVPassInstrumentationCallbacks@1@@Z) referenced in function "public: void __cdecl llvm::PassBuilder::`default constructor closure'(void)" (??_FPassBuilder@llvm@@QEAAXXZ) LLVMX86CodeGen.lib(X86InsertPrefetch.cpp.obj) : error LNK2019: unresolved external symbol "public: __cdecl llvm::SampleProfileLoaderPass::SampleProfileLoaderPass(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,enum llvm::ThinOrFullLTOPhase,class llvm::IntrusiveRefCntPtr<class llvm::vfs::FileSystem>,bool,bool)" (??0SampleProfileLoaderPass@llvm@@QEAA@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@0W4ThinOrFullLTOPhase@1@V?$IntrusiveRefCntPtr@VFileSystem@vfs@llvm@@@1@_N3@Z) referenced in function "public: void __cdecl llvm::SampleProfileLoaderPass::`default constructor closure'(void)" (??_FSampleProfileLoaderPass@llvm@@QEAAXXZ) unittests\tools\llvm-mca\LLVMMCATests.exe : fatal error LNK1120: 4 unresolved externals ``` ## Validation Local builds and tests to validate cross-platform compatibility. This included llvm, clang, and lldb on the following configurations: - Windows with MSVC - Windows with Clang - Linux with GCC - Linux with Clang - Darwin with Clang	2025-07-08 08:48:42 -07:00
Kazu Hirata	b4fac94181	[llvm] Remove unused using decls (NFC) (#138386 )	2025-05-03 07:05:02 -07:00
Roman Belenov	c9d90f15af	[Exegesis][AArch64] Use more generic cycles counter (#133376 ) The CPU_CYCLES counter does not work on some AArch64 CPUs; CYCLES is more generic and is equivalent to CPU_CYCLES where the latter is supported. Longer story: CPU_CYCLES works only on CPU models explicitly recognized by libpfm4 (via the pfm_arm_detect_*() functions in https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/pfmlib_arm_armv8.c ), and its name is consistent with the ARM documentation. However, the counter is architectural and is supported on all ARMv8 CPUs; libpfm4 recognizes a generic PMU on unknown ARMv8 CPUs, but does not provide the CPU_CYCLES event. Instead, CYCLES is provided (an alias for PERF_COUNT_HW_CPU_CYCLES). Physically, it is the same event with code 0x11. On supported architectures CYCLES also works, so the change should not introduce a regression.	2025-03-30 16:59:53 -07:00
Nikita Popov	f137c3d592	[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940 ) This avoids doing a Triple -> std::string -> Triple round trip in lots of places, now that the Module stores a Triple.	2025-03-12 17:35:09 +01:00
Nikita Popov	979c275097	[IR] Store Triple in Module (NFC) (#129868 ) The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibility purposes, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.	2025-03-06 10:27:47 +01:00
Aiden Grossman	fa6e976602	[llvm-exegesis] Use TestBase for TargetTest (#121895 ) This patch makes the PPC and X86 Exegesis TargetTests use TestBase to provide initial setup rather than doing it themselves. This promotes code reuse a little bit and makes the tests a bit more consistent (with MIPS and with the initial RISC-V tests landing soon).	2025-01-27 15:41:31 -08:00
Craig Topper	3173a4fc3a	[llvm-exegesis] Remove implicit conversions of MCRegister to unsigned. NFC (#123223 ) -Use MCRegister::id() for BitVector index. -Replace std::unordered_set<unsigned> with std::set<MCRegister>. There are other std::sets for Register. None for MCRegister before this. I'm assuming we can have operator<(MCRegister, MCRegister). This avoids needing to add std::hash<MCRegister>. -Use MCRegister::isValid() to avoid comparing to 0.	2025-01-16 11:59:51 -08:00
Craig Topper	afa8aeeeec	[RISCV][llvm-exegesis] Add default Pfm cycle counter. (#121866 ) Also tested with Ubuntu on SiFive's HiFive Premier P550 board. Curiously latency is reporting ~1.5 on basic scalar arithmetic, scalar mul is ~3.5, and div is ~36.5. This is 0.5 cycles higher than I expect.	2025-01-07 09:51:34 -08:00
Craig Topper	71ddde8ba5	[RISCV][llvm-exegesis] Add unittests. NFC (#121862 ) This is largely based on Mips and PowerPC.	2025-01-07 07:14:41 -08:00
Aiden Grossman	842fd15375	[llvm-exegesis] Add explicit support for setting DF in X86 (#115644 ) While llvm-exegesis has explicit support for setting EFLAGS which contains DF, it can be nice sometimes to explicitly set DF, especially given that it is modeled as a separate register within LLVM. This patch adds the ability to do that by lowering setting the value to 0 or 1 to cld and std respectively.	2024-11-18 12:06:52 -08:00
Matin Raayai	bb3f5e1fed	Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234 ) Following discussions in #110443, and the following earlier discussions in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html, https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine` interface classes. More specifically: 1. Makes `TargetMachine` the only class implemented under `TargetMachine.h` in the `Target` library. 2. `TargetMachine` contains target-specific interface functions that relate to IR/CodeGen/MC constructs, whereas before (at least on paper) it was supposed to have only IR/MC constructs. Any Target that doesn't want to use the independent code generator simply does not implement them, and returns either `false` or `nullptr`. 3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming aims to make the purpose of `LLVMTargetMachine` clearer. Its interface was moved under the CodeGen library, to further emphasize its usage in Targets that use CodeGen directly. 4. Makes `TargetMachine` the only interface used across LLVM and its projects. With these changes, `CodeGenCommonTMImpl` is simply a set of shared function implementations of `TargetMachine`, and CodeGen users don't need to static cast to `LLVMTargetMachine` every time they need a CodeGen-specific feature of the `TargetMachine`. 5. More importantly, does not change any requirements regarding library linking. cc @arsenm @aeubanks	2024-11-14 13:30:05 -08:00
Aiden Grossman	074209034f	[llvm-exegesis] Use older instructions to load lower vregs (#114768 ) This patch makes X86 llvm-exegesis unconditionally use older instructions to load the lower vector registers, rather than trying to use AVX512 for everything when available. This fixes a case where we would try and load AVX512 registers using the older instructions if such a snippet was constructed while -mcpu was set to something that did not support AVX512. This would lead to a machine code verification error rather than resulting in incomplete snippet setup, which seems to be the intention of how this should work. Fixes #114691.	2024-11-04 09:04:31 -08:00
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
JOE1994	459a82e689	[llvm][unittests] Don't call raw_string_ostream::flush() (NFC) raw_string_ostream::flush() is essentially a no-op (also specified in docs). Don't call it in tests that aren't meant to test 'raw_string_ostream' itself. p.s. remove a few redundant calls to raw_string_ostream::str()	2024-09-13 19:55:44 -04:00
Aiden Grossman	fac87b889c	[llvm-exegesis] Switch from intptr_t to uintptr_t in most cases (#102860 ) This patch switches most of the uses of intptr_t to uintptr_t within llvm-exegesis for the subprocess memory support. In the vast majority of cases we do not want a signed component of the address, hence making intptr_t undesirable. intptr_t is left for error handling, for example when making syscalls and we need to see if the syscall returned -1.	2024-08-27 11:19:44 -07:00
Rainer Orth	a417083e27	[llvm-exegesis][unittests] Also disable SubprocessMemoryTest on SPARC (#102755 ) Three `llvm-exegesis` tests ``` LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/DefinitionFillsCompletely LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/MultipleDefinitions LLVM-Unit :: tools/llvm-exegesis/./LLVMExegesisTests/SubprocessMemoryTest/OneDefinition ``` `FAIL` on Linux/sparc64 like ``` llvm/unittests/tools/llvm-exegesis/X86/SubprocessMemoryTest.cpp:68: Failure Expected equality of these values: SharedMemoryMapping[I] Which is: '\0' ExpectedValue[I] Which is: '\xAA' (170) ``` It seems like this test only works on little-endian hosts: three sub-tests are already disabled on powerpc and s390x (both big-endian), and the fourth is additionally guarded against big-endian hosts (making the other guards unnecessary). However, since it's not been analyzed if this is really an endianness issue, this patch disables the whole test on powerpc and s390x as before, adding sparc to the mix. Tested on `sparc64-unknown-linux-gnu` and `x86_64-pc-linux-gnu`.	2024-08-10 22:54:07 +02:00
Simon Pilgrim	c6e264952e	[llvm-exegesis] Fix -Wdangling-else gcc warning. NFC.	2024-06-28 18:48:30 +01:00
Aiden Grossman	370555c02c	[MCA] Parameterize variant scheduling classes by hash (#92849 ) This patch looks up variant scheduling classes using a hash of the instruction. Keying by the pointer breaks certain use cases that might occur out of tree, like decoding an execution trace instruction by instruction and creating MCA instructions as one goes along, like in the MCAD case. In this case, the MCInst will always have the same address and thus all instructions with the same variant scheduling class will end up with the same instruction description, leading to undesired behavior (assertions, uses after free, invalid results, etc.).	2024-06-28 10:46:46 -07:00
Michael Kruse	4ecbfacf9e	[llvm] Revise IDE folder structure (#89741 ) Update the folder titles for targets in the monorepository that have not been taken care of for some time. These are the folders in which targets are organized in Visual Studio and Xcode (`set_property(TARGET <target> PROPERTY FOLDER "<title>")`) when using the respective CMake IDE generator. * Ensure that every target is in a folder * Use a folder hierarchy with each LLVM subproject as a top-level folder * Use consistent folder names between subprojects * When using target-creating functions from AddLLVM.cmake, automatically deduce the folder. This reduces the number of `set_property`/`set_target_property` calls, but these are still necessary when `add_custom_target`, `add_executable`, `add_library`, etc. are used. A LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's root CMakeLists.txt.	2024-05-25 13:28:30 +02:00
Chinmay Deshpande	848bef5d85	[llvm-mca] Add command line option -call-latency (#92958 ) Currently we assume a constant latency of 100 cycles for call instructions. This commit allows the user to specify a custom value for the same as a command line argument. Default latency is set to 100.	2024-05-22 13:51:55 -07:00
Aiden Grossman	50e6218132	Reland "[llvm-exegesis] Add thread IDs to subprocess memory names (#84451 )" This reverts commit 1fe9c417a0bf143f9bb9f9e1fbf7b20f44196883. This relands commit 6bbe8a296ee91754d423c59c35727eaa624f7140. This was causing build failures on one of the ARMv8 builders. Still not completely sure why, but relanding it to see if the failure pops up again. If it does, the plan is to fix forward by disabling tests on ARM temporarily as llvm-exegesis does not currently use SubprocessMemory on ARM.	2024-03-22 11:51:09 -07:00
Aiden Grossman	1fe9c417a0	Revert "Reland "[llvm-exegesis] Add thread IDs to subprocess memory names (#84451 )"" This reverts commit 8003f553a01a9a2a7eb09fe07e88f1ba9ee7d3a7. This (and/or a related commit) was causing build failures on one of the buildbots that needs more investigation. More information is available at https://lab.llvm.org/buildbot/#/builders/178/builds/7015.	2024-03-13 00:49:23 -07:00
Aiden Grossman	8003f553a0	Reland "[llvm-exegesis] Add thread IDs to subprocess memory names (#84451 )" This reverts commit aefad27096bba513f06162fac2763089578f3de4. This relands commit 6bbe8a296ee91754d423c59c35727eaa624f7140. This patch was causing build failures on non-Linux platforms due to the default implementations for the functions not being updated. This ended up causing out-of-line definition errors. Fixed for the relanding.	2024-03-12 16:35:05 -07:00
Florian Hahn	aefad27096	Revert "[llvm-exegesis] Add thread IDs to subprocess memory names (#84451 )" This reverts commit 6bbe8a296ee91754d423c59c35727eaa624f7140. This breaks building LLVM on macOS, failing with llvm/tools/llvm-exegesis/lib/SubprocessMemory.cpp:146:33: error: out-of-line definition of 'setupAuxiliaryMemoryInSubprocess' does not match any declaration in 'llvm::exegesis::SubprocessMemory' Expected<int> SubprocessMemory::setupAuxiliaryMemoryInSubprocess(	2024-03-12 08:52:29 +00:00
Aiden Grossman	6bbe8a296e	[llvm-exegesis] Add thread IDs to subprocess memory names (#84451 ) This patch adds the thread ID to the subprocess memory shared memory names. This avoids conflicts for downstream consumers that might want to consume llvm-exegesis across multiple threads, which would otherwise run into conflicts due to the same PID running multiple instances.	2024-03-12 01:24:21 -07:00
Aiden Grossman	1d1186de34	[llvm-exegesis] Add loop-register snippet annotation (#82873 ) This patch adds a LLVM-EXEGESIS-LOOP-REGISTER snippet annotation which allows a user to specify the register to use for the loop counter in the loop repetition mode. This allows for executing snippets that don't work with the default value (currently R8 on X86).	2024-02-27 12:28:25 -08:00
Aiden Grossman	415bf200a7	[llvm-exegesis] Replace --num-repetitions with --min-instructions (#77153 ) This patch replaces --num-repetitions with --min-instructions to make it more clear that the value refers to the minimum number of instructions in the final assembled snippet rather than the number of repetitions of the snippet. This patch also refactors some llvm-exegesis internal variable names to reflect the name change. Fixes #76890.	2024-02-01 01:58:27 -08:00
Aiden Grossman	d8b61d7168	[llvm-exegesis] Add middle half repetition mode (#77020 ) This patch adds two new repetition modes to llvm-exegesis, particularly loop and duplicate repetition modes of what I am terming the middle half repetition mode. The middle half repetition mode essentially runs each measurement twice, one with twice the number of iterations of the other. These two measurements are then aggregated by taking their difference. This subtracts away any setup/overhead that is unrelated to the code in the snippet, providing more accurate results. Using this mode on a couple toy examples, I am able to get exact (integer) throughput values on all of them in contrast to the default duplicate/loop repetition modes which show a little bit of noise on the snippet value.	2024-01-30 12:42:35 -08:00
Aiden Grossman	c1a155bf78	[llvm-exegesis] Refactor BenchmarkMeasure instantiation in tests This patch refactors the instantiation of BenchmarkMeasure within all the unit tests to use BenchmarkMeasure::Create rather than direct struct instantiation. This allows us to change what values are stored in BenchmarkMeasure without getting compiler warnings on every instantiation in the unit tests, and is also just a cleanup in general as the Create function didn't seem to exist at the time the unit tests were originally written.	2024-01-26 17:00:57 -08:00
Aiden Grossman	2b31a673de	[llvm-exegesis] Make duplicate snippet repetitor produce whole snippets (#77224 ) Currently, the duplicate snippet repetitor will truncate snippets that do not exactly divide the minimum number of instructions. This patch corrects that behavior by making the duplicate snippet repetitor duplicate the snippet in its entirety until the minimum number of instructions has been reached. This makes the behavior consistent with the loop snippet repetitor, which will execute at least `--num-repetitions` (soon to be renamed `--min-instructions`) instructions.	2024-01-19 11:34:16 -08:00
Aiden Grossman	f670112a59	[llvm-exegesis] Add support for validation counters (#76653 ) This patch adds support for validation counters. Validation counters can be used to measure events that occur during snippet execution like cache misses to ensure that certain assumed invariants about the benchmark actually hold. Validation counters are setup within a perf event group, so are turned on and off at exactly the same time as the "group leader" counter that measures the desired value.	2024-01-19 02:00:33 -08:00
Aiden Grossman	d9c8edf08a	[llvm-exegesis] Add matcher for register initial values (#76666 ) Currently, the unit tests for the BenchmarkResult struct do not check if the register initial values can be parsed back in. This patch adds a matcher and tests that the register initial values can be parsed correctly. This exercises code already contained within the benchmark to yaml infrastructure.	2024-01-05 13:38:05 -08:00
Aiden Grossman	a25b66217f	[NFC][llvm-exegesis] Remove redundant register initial values argument This patch removes the redundant RegisterInitialValues parameter from assembleToStream and friends as it is included within the BenchmarkKey struct that is also passed to all the functions that need this information.	2024-01-03 15:25:21 -08:00
Aiden Grossman	5f423b7d1c	[llvm-exegesis] Adjust page size in unit tests to fix ppc failures The llvm-exegesis unit tests currently fail on PPC after ceb196d9903f4db7250bbc6c8da13eeae1b85886 landed as the default page size on most common linux distributions for PPC is 64kb rather than 4kb. This patch changes the memory mappings to have addresses as multiples of 64kb rather than multiples of 4kb to fix this issue.	2023-12-15 13:11:07 -08:00
Aiden Grossman	3194928c3c	[llvm-exegesis] Refactor MMAP platform-specific preprocessor directives (#75422 ) This patch refactors the MMAP platform-specific preprocessor directives in llvm-exegesis to a single file instead of having duplicate code split across multiple files. These originally got introduced to get buildbots green again due to platform specific failures.	2023-12-14 12:07:46 -08:00
Aiden Grossman	f1963fde9f	Reland "[llvm-exegesis] Add in snippet address annotation (#74218 )" This reverts commit 30d700117b772d94d8474ec56bd6f9cc423fc613. This relands commit 3ab41f912a6c219a93b87c257139822ea07c8863. When I was updating the patch to use llvm::to_integer, I only ran the lit tests and didn't run the unit tests, one of which started to fail. This patch fixes the broken unit test.	2023-12-07 00:20:24 -08:00
Aiden Grossman	30d700117b	Revert "[llvm-exegesis] Add in snippet address annotation (#74218 )" This reverts commit 3ab41f912a6c219a93b87c257139822ea07c8863. Unit tests break after recent changes. Will investigate/reland.	2023-12-06 11:25:03 -08:00
Aiden Grossman	3ab41f912a	[llvm-exegesis] Add in snippet address annotation (#74218 )	2023-12-06 11:05:33 -08:00
Kazu Hirata	c630f95f33	[llvm-exegesis] Remove unnecessary includes (NFC) Identified with clangd.	2023-12-05 23:28:09 -08:00
Aiden Grossman	38f75d606f	[llvm-exegesis] Removed useless test This test was an exact duplicate of the one above, providing no value, and also adding confusion as it referred to a behavior that (was presumably) moved around and tested differently during the review process with the test being forgotten about.	2023-12-02 18:04:03 -08:00
Aiden Grossman	8a02b70324	[llvm-exegesis] Refactor ExecutableFunction to use a named constructor (#72837 ) This patch refactors ExecutableFunction to use a named constructor pattern, namely adding the create function, so that errors occurring during the creation of an ExecutableFunction can be propagated back up rather than having to deal with them in report_fatal_error.	2023-11-24 02:15:34 -08:00
Kazu Hirata	3b34c117db	[llvm] Remove unused using decls (NFC) Identified with misc-unused-using-decls.	2023-10-03 23:21:50 -07:00
Kazu Hirata	534c096ec9	[llvm-exegesis] Remove unused using decls (NFC) Identified with misc-unused-using-decls.	2023-09-24 00:21:33 -07:00
William Junda Huang	f4f85e0ab4	[llvm-profdata] Remove MD5 collision check in D147740 (#66544 ) This is the patch at https://reviews.llvm.org/D153692, migrating to GitHub. After testing D147740 with multiple industrial projects with ~10 million FunctionSamples, no MD5 collision has been found. In perfect hashing, the probability of collision for N symbols over K possible hash values is 1 - K!/((K-N)! * K^N). When N is 1 million and K is 2^64, the probability is 3*10^-8; when N is 10 million the probability is 3*10^-6, so we are probably not going to find an actual case in a real-world application. (However, if K is 2^32, the probability of collision is almost 1; this is indeed a problem if anyone still uses a large profile on a 32-bit machine, as hash_code is tied to size_t.) Furthermore, when a collision happens we can't do anything to recover from it, unless using a multi-map, but that is significantly slower, which contradicts the purpose of optimizing the profile reader. One more thing: we have been using profiles with MD5 names, and they have to come from non-MD5 sources, so if a hash collision is to happen, it already happened when we converted a non-MD5 profile to an MD5 one, so there's no point in checking for that in the reader, and this feature can be removed.	2023-09-15 22:30:51 +00:00
Nico Weber	e6e69f3bd4	[cfi-verify tests]: Skip two x86-only tests if x86 is not enabled With this, check-llvm passes on an arm mac if x86 isn't in LLVM_TARGETS_TO_BUILD. This pattern to skip the tests if x86 isn't enabled is used in every other test in this file.	2023-09-11 14:31:39 -07:00
Bjorn Pettersson	7b3f6e64a0	[llvm-exegesis] Fix in SubprocessMemoryTest after commit adb01dea6a5 Make sure the TestCount const definition is guarded the same way as the use of the constant. This is an attempt to fix buildbot failures related to -Wunused-const-variable.	2023-09-08 22:49:33 +02:00
Aiden Grossman	adb01dea6a	[llvm-exegesis] Make SubprocessMemoryTest use PIDs (#65245 ) This patch makes SubprocessMemoryTest use process PIDs during creation of the SubprocessMemory objects within the tests so that there isn't interference between multiple instances of the test running at the same time which could potentially occur in multi-user environments. This is a continuation of the review in https://reviews.llvm.org/D154680.	2023-09-08 00:09:29 -07:00
Khem Raj	01a92f06f2	[llvm-exegesis] Use mmap2 when mmap is unavailable to fix riscv32 build Some 32-bit architectures don't have mmap and define mmap2 instead. E.g. on riscv32 we may get ``` | /mnt/b/yoe/master/build/tmp/work-shared/llvm-project-source-17.0.0-r0/git/llvm/tools/llvm-exegesis/lib/X86/Target.cpp:1116:19: error: use of undeclared identifier 'SYS_mmap' | 1116 | generateSyscall(SYS_mmap, MmapCode); | | ^ | /mnt/b/yoe/master/build/tmp/work-shared/llvm-project-source-17.0.0-r0/git/llvm/tools/llvm-exegesis/lib/X86/Target.cpp:1134:19: error: use of undeclared identifier 'SYS_mmap' | 1134 | generateSyscall(SYS_mmap, GeneratedCode); | | ^ | 1 warning and 2 errors generated. ``` Co-Authored-By: Fangrui Song <i@maskray.me> Differential Revision: https://reviews.llvm.org/D158375	2023-08-25 10:43:00 -07:00
William Huang	7624de5bea	[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map This is phase 1 of multiple planned improvements on the sample profile loader. The major change is to use MD5 hash code (instead of the function itself) as the key to look up the function offset table and the profiles, which significantly reduces the time it takes to construct the map. The optimization is based on the fact that many practical sample profiles are using MD5 values for function names to reduce profile size, so we shouldn't need to convert the MD5 to a string and then to a SampleContext and use it as the map's key, because it's extremely slow. Several changes to note: (1) For non-CS SampleContext, if it is already an MD5 string, the hash value will be its integral value, instead of hashing the MD5 again. In phase 2 this is going to be optimized further using a union to represent MD5 function (without converting it to string) and regular function names. (2) The SampleProfileMap is a wrapper to *map<uint64_t, FunctionSamples>, while providing an interface allowing using SampleContext as key, so that existing code still works. It will check for MD5 collision (unlikely but not too unlikely, since we only take the lower 64 bits) and handle it to at least guarantee compilation correctness (conflicting old profile is dropped, instead of returning an old profile with inconsistent context). Other code should not try to use MD5 as key to access the map directly, because it will not be able to handle MD5 collision at all. (See the exception at (5).) (3) Any SampleProfileMap::emplace() followed by SampleContext assignment if newly inserted, should be replaced with SampleProfileMap::Create(), which does the same thing. (4) Previously we ensured an invariant that in SampleProfileMap, the key is equal to the Context of the value, for a profile map that is eventually being used for output (as in llvm-profdata/llvm-profgen). Since the key became an MD5 hash, only the value keeps the context now; in several places where an intermediate SampleProfileMap is created, each new FunctionSample's context is set immediately after insertion, which is necessary to "remember" the context that is otherwise irretrievable. (5) When reading a profile, we cache the MD5 values of all functions, because they are used at least twice (once to index into FuncOffsetTable, once into SampleProfileMap, more if there are additional sections); in this case the SampleProfileMap is directly accessed with the MD5 value so that we don't recalculate it each time (expensive). Performance impact: When reading a ~1GB extbinary profile (fixed length MD5, not compressed) with 10 million function names and 2.5 million top level functions (non CS functions, each function has varying nesting level from 0 to 20), this patch improves the function offset table loading time by 20%, and improves full profile read by 5%. Reviewed By: davidxl, snehasish Differential Revision: https://reviews.llvm.org/D147740	2023-08-17 20:10:45 +00:00
Aaron Ballman	1a53b5c367	Revert "[llvm-profdata] Refactoring Sample Profile Reader to increase FDO build speed using MD5 as key to Sample Profile map" This reverts commit 66ba71d913df7f7cd75e92c0c4265932b7c93292. Addressing issues found by: https://lab.llvm.org/buildbot/#/builders/245/builds/11732 https://lab.llvm.org/buildbot/#/builders/187/builds/12251 https://lab.llvm.org/buildbot/#/builders/186/builds/11099 https://lab.llvm.org/buildbot/#/builders/182/builds/6976	2023-07-28 09:41:38 -04:00
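
Note on f4f85e0ab4: the collision estimates quoted in that commit message can be checked numerically. The sketch below is only an illustration. It uses the standard birthday-bound approximation 1 - exp(-N(N-1)/(2K)) in place of the exact factorial expression, which cannot be evaluated directly at these sizes. The name `collisionProbability` is hypothetical and is not part of LLVM.

```cpp
#include <cassert>
#include <cmath>

// Approximate probability that at least two of n uniformly hashed symbols
// collide among k possible hash values (birthday-bound approximation).
// Hypothetical helper for illustration only; not an LLVM API.
double collisionProbability(double n, double k) {
  assert(n >= 0.0 && k > 0.0 && "counts must be non-negative, k positive");
  // -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
  return -std::expm1(-n * (n - 1.0) / (2.0 * k));
}
```

With k = 2^64 (for example `std::ldexp(1.0, 64)`), this yields about 2.7e-8 for one million symbols and about 2.7e-6 for ten million, in line with the rounded figures in the commit message. With k = 2^32 and one million symbols the result is effectively 1.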
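
Note on d8b61d7168: the arithmetic behind the middle half repetition mode can be sketched as follows, assuming each run costs a fixed overhead plus a per-iteration cost. This is not the llvm-exegesis implementation; `middleHalfPerIteration` and its parameters are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Estimate the per-iteration cost from two runs, where the long run executes
// twice as many iterations as the short run. Fixed setup/teardown overhead
// appears in both totals and cancels in the subtraction.
// Hypothetical helper for illustration only; not llvm-exegesis code.
double middleHalfPerIteration(double shortRunTotal, double longRunTotal,
                              unsigned shortRunIterations) {
  assert(shortRunIterations > 0 && "need at least one iteration");
  // long  = overhead + 2 * n * cost
  // short = overhead +     n * cost
  // long - short = n * cost
  return (longRunTotal - shortRunTotal) / shortRunIterations;
}
```

For instance, with an overhead of 50 cycles and a true cost of 2 cycles per iteration, runs of 100 and 200 iterations total 250 and 450 cycles, and the difference divided by 100 recovers exactly 2.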
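
Note on 2b31a673de: the "whole snippets" behavior described there amounts to rounding the minimum instruction count up to a multiple of the snippet length. A minimal sketch of that rounding, under the assumption that the snippet is simply repeated end to end; `wholeSnippetCopies` is a hypothetical name, not a function in llvm-exegesis.

```cpp
#include <cassert>
#include <cstddef>

// Number of complete snippet copies needed so that the total instruction
// count is at least minInstructions (ceiling division).
// Hypothetical helper for illustration only; not llvm-exegesis code.
std::size_t wholeSnippetCopies(std::size_t minInstructions,
                               std::size_t snippetSize) {
  assert(snippetSize > 0 && "snippet must contain at least one instruction");
  return (minInstructions + snippetSize - 1) / snippetSize;
}
```

A 3-instruction snippet with a minimum of 10 instructions is therefore emitted 4 times (12 instructions) rather than being cut off after the tenth instruction.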


291 Commits