Claiming AArch64 support for llvm-exegesis is a bit of a stretch in my
opinion as only a couple of opcodes with GPR64 operands will work for
snippet benchmarking, so I propose to clarify that AArch64 support is
very experimental. Also added some clarifications about its libpfm4
dependency.
This patch makes X86 llvm-exegesis unconditionally use older
instructions to load the lower vector registers, rather than trying to
use AVX512 for everything when available. This fixes a case where we
would try and load AVX512 registers using the older instructions if such
a snippet was constructed while -mcpu was set to something that did not
support AVX512. This would lead to a machine code verification error
rather than resulting in incomplete snippet setup, which seems to be the
intention of how this should work.
Fixes#114691.
This reverts commit 2cd20c255684257b86940bdda6861897f0bf3c00.
This relands commit 9886788a8a500a1b429a6db64397c849b112251c.
This was causing more buildbot failures due to getcpu not being
available with glibc <=2.29. This patch fixes that by directly making
the syscall, assuming the syscall number macro is available.
This reverts commit 5e3d48a68096a0017a0fa4bb89f2d48767c8a7e4.
This relands commit 9886788a8a500a1b429a6db64397c849b112251c.
This was originally causing build failures on more esoteric platforms
that have different definitions of getcpu. This is only intended to be
supported on x86-64 currently, so just use preprocessor definitions to
special case the function.
This patch adds in support for pinning a benchmarking process to a
specific CPU (in the subprocess benchmarking mode on Linux). This is
intended to be used in environments where a certain set of CPUs is
isolated from the scheduler using something like cgroups and thus should
present less potential for noise than normal. This also opens up the
door for doing multithreaded benchmarking as we can now pin benchmarking
processes to specific CPUs that we know won't interfere with each other.
This patch removes the getter for the mentioned mapping. This was only
kept around to keep things in sync for some downstream codebases (that
didn't even end up needing it), so removing it now that it is not needed
anymore.
This patch refactors the procedure of getting the register number from a
register name to LLVMState rather than having individual users get the
values themselves by getting a reference to the map from LLVMState. This
is primarily intended to make some downstream usage in Gematria simpler,
but also cleans up a little bit upstream by pulling the actual map
searching out and just leaving error handling to the clients.
The original getter is left to enable downstream migration in Gematria,
particularly before it gets imported into google internal.
This patch switches most of the uses of intptr_t to uintptr_t within
llvm-exegesis for the subprocess memory support. In the vast majority of
cases we do not want a signed component of the address, hence making
intptr_t undesirable. intptr_t is left for error handling, for example
when making syscalls and we need to see if the syscall returned -1.
Glibc v2.40 changes the definition of __rseq_size to the usable area of
the struct rather than the actual size of the struct to accommodate
users trying to figure out what features can be used. This change breaks
llvm-exegesis trying to disable rseq as the size registered in the
kernel is no longer equal to __rseq_size. This patch adds a check to see
if __rseq_size is less than 32 bytes and uses 32 as a value if it is
given alignment requirements.
Fixes#100791.
llvm-exegesis currently links and initializes all targets, even though
most of them are not supported by llvm-exegesis. This is particularly
unfortunate because llvm-exegesis does not support the LLVM dylib, so
llvm-exegesis essentially ends up doing a complete relink of all of
LLVM, which is not fun if you use LTO.
Instead, only link and initialize the targets that are part of
LLVM_EXEGESIS_TARGETS.
In the SubprocessMemory destructor, I was using a normal std::string to
hold the name of the current shared memory name, but a const reference
works just as well in this situation while having better performance
characteristics.
Fixes#90289
There are several insstances in the subprocess executor in llvm-exegesis
where we fail to close file descriptors after using them. This leaves
them open, which can cause issues later on if a third-party binary is
using the exegesis libraries and executing many blocks before exiting.
Leaving the descriptors open until process exit is also bad practice.
This patch fixes that by explicitly calling close() when we are done
with a specific file descriptor.
This patch fixes#86583.
This patch defines SYS_gettid as __NR_gettid if SYS_gettid is not
available to avoid compile time errors due to SYS_gettid not being
defined. This happens with certain libcs (like bionic) that do not
define SYS_gettid.
This patch changes the preprocessor directives surrounding
getCurrentTID, particularly moving it out of the block that is only
defined when not building for Android. The getCurrentTID function is
called in places that only require Linux definitions, so this function
should have the same preprocessor scoping around it to prevent link time
failures.
This patch adds error handling for shm_open failures in one case where
they were not handled before and also makes an error handler in another
case report the value of errno for diagnosis.
This reverts commit 1fe9c417a0bf143f9bb9f9e1fbf7b20f44196883.
This relands commit 6bbe8a296ee91754d423c59c35727eaa624f7140.
This was causing build failures on one of the ARMv8 builders. Still not
completely sure why, but relanding it to see if the failure pops up
again. If it does, the plan is to fix forward by disabling tests on ARM
temporarily as llvm-exegesis does not currently use SubprocessMemory
on ARM.
This reverts commit c3a41aac5f32475b9a0499e6e888e713763566dc.
This relands commit bd493756fa51e538575fc320aae50d75394f0567.
Apparently I forgot to update a couple values, so this change failed on
every builder that builds those sections (should be every Linux
platform) rather than something architecture specific like originally
thought. I swore I updated the values and ran check-llvm before merging.
Wondering If I forgot to push those changes...
Before this patch, llvm-exegesis would leave processes lingering that
experienced signals like segmentation faults. They would up in a
signal-delivery-stop state under the ptrace and never exit. This does
not cause problems (or at least many) in llvm-exegesis as they are
cleaned up after the main process exits, which usually happens quickly.
However, in downstream use, when many blocks are being executed (many of
which run into signals) within a single process, these processes stay
around and can easily exhaust the process limit on some systems.
This patch cleans them up by sending SIGKILL after information about the
signal that was sent has been gathered.
This patch removes an explicit use of dbgs() and exit when logging an
error to explicitly use ExitOnErr as this is the canonical way to do
this within exegesis.
This reverts commit 8003f553a01a9a2a7eb09fe07e88f1ba9ee7d3a7.
This (and/or a related commit) was causing build failures on one of the
buildbots that needs more investigation. More information is available
at https://lab.llvm.org/buildbot/#/builders/178/builds/7015.
This reverts commit 1c3b15e9f5bc671e40bcf5d3475f5425466754ce.
This (and/or) a related patch was causing build failures on one of the
buildbots. More information is available at
https://lab.llvm.org/buildbot/#/builders/178/builds/7015.
This patch switches from manually using the Linux syscall to get the
current thread ID to using the relevant LLVM Support libraries that
abstract over the low level system details.
This reverts commit aefad27096bba513f06162fac2763089578f3de4.
This relands commit 6bbe8a296ee91754d423c59c35727eaa624f7140.
This patch was casuing build failures on non-Linux platforms due to the
default implementations for the functions not being updated. This ended
up causing out-of-line definition errors. Fixed for the relanding.
This reverts commit 6bbe8a296ee91754d423c59c35727eaa624f7140.
This breaks building LLVM on macOS, failing with
llvm/tools/llvm-exegesis/lib/SubprocessMemory.cpp:146:33: error: out-of-line definition of 'setupAuxiliaryMemoryInSubprocess' does not match any declaration in 'llvm::exegesis::SubprocessMemory'
Expected<int> SubprocessMemory::setupAuxiliaryMemoryInSubprocess(
This patch adds the thread ID to the subprocess memory shared memory
names. This avoids conflicts for downstream consumers that might want to
consume llvm-exegesis across multiple threads, which would otherwise run
into conflicts due to the same PID running multiple instances.
APX assembly syntax recommendations:
https://cdrdv2.intel.com/v1/dl/getContent/817241
NOTE:
The change in llvm/tools/llvm-exegesis/lib/X86/Target.cpp is for test
LLVM ::
tools/llvm-exegesis/X86/latency/latency-SETCCr-cond-codes-sweep.s
For `SETcc`, llvm-exegesis would randomly choose 1 other instruction to
test with `SETcc`, after selecting the instruction, llvm-exegesis would
check if the operand is initialized and valid, if not
`randomizeTargetMCOperand` would choose a value for invalid operand, it
misses support for condition code operand, which cause the flaky failure
after `CCMP` supported.
llvm-exegesis can choose `CCMP` without specifying ccmp feature b/c it
use `MCSubtarget` and only16/32/64 bit is considered.
llvm-exegesis doesn't choose other instructions b/c requirement in
`hasAliasingRegistersThrough`: the instruction should use GPR (defined
by `SETcc`) and define `EFLAGS` (used by `SETcc`).
This patch adds a LLVM-EXEGESIS-LOOP-REGISTER snippet annotation which
allows a user to specify the register to use for the loop counter in the
loop repetition mode. This allows for executing snippets that don't work
with the default value (currently R8 on X86).
This patch removes the exegesis:: prefix within the exegesis namespace
in llvm-exegesis.cpp as it isn't necessary due to the code already being
wrapped in the namespace.
…table.
All data is derived from a single table rather than being spread out
over an enum, a table and the main entry point.
This is intended as a replacement for #82092.
This patch adds a debug option to print per measurement latency and
validation counter values. This makes it easier to debug certain
transient issues that can be hard to spot just using the summary view at
the end.
I've hacked print statements into this part of the code base enough
times for debugging various things that I think it makes sense to be a
proper debug macro.
There were a couple things in the comments of BenchmarkRunner.cpp (and
maybe other files, I can't really remember) that were bugging me. This
patch fixes a couple of minor issues specifically in BenchmarkRunner
like typos and makes a couple instances more clear.
This patch replaces --num-repetitions with --min-instructions to make it
more clear that the value refers to the minimum number of instructions
in the final assembled snippet rather than the number of repetitions of
the snippet. This patch also refactors some llvm-exegesis internal
variable names to reflect the name change.
Fixes#76890.
This patch adds two new repetition modes to llvm-exegesis, particularly
loop and duplicate repetition modes of what I am terming the middle half
repetition mode. The middle half repetition mode essentially runs each
measurement twice, one with twice the number of iterations of the other.
These two measurements are then agregated by taking their difference.
This subtracts away any setup/overhead that is unrelated to the code in
the snippet, providing more accurate results.
Using this mode on a couple toy examples, I am able to get exact
(integer) throughput values on all of them in contrast to the default
duplicate/loop repetition modes which show a little bit of noise on the
snippet value.
This patch removes the llvm:: prefix within llvm-exegesis where it is
not necessary. This is most occurrences of the prefix within exegesis as
exegesis is within the llvm namespace. This patch makes things more
consistent as the vast majority of the code did not use the llvm::
prefix for anything.